On Fri, Feb 10, 2017 at 2:56 PM, Gianluca Cecchi <gianluca.cecchi(a)gmail.com>
wrote:
On Fri, Feb 10, 2017 at 2:27 PM, Martin Perina
<mperina(a)redhat.com> wrote:
> Hi Gianluca,
>
> so generally speaking, when a host is in Maintenance status, the engine
> doesn't communicate with it, so you can do pretty much anything to it.
>
Thanks for the detailed answer, Martin.
It seemed to me that in the past I did a reboot at the host OS level while
in maintenance and this generated a series of fencing/reboots of the host
itself, so I wanted to be sure.
It is possible that at that time I also used the "Confirm host has been
rebooted" button, because I initially misunderstood its purpose, and that
action is what generated the fencing loops (2 or 3, I don't remember).
Then in the Admin manual for 4.0 I saw that it is only meant to be used
for an unresponsive host, correct?
"
If a host unpredictably goes into a non-responsive state, for example, due
to a hardware failure; it can significantly affect the performance of the
environment. If you do not have a power management device, or it is
incorrectly configured, you can reboot the host manually.

Do not use the Confirm host has been rebooted option unless you have
manually rebooted the host. Using this option while the host is still
running can lead to a virtual machine image corruption.

Procedure 7.20. Manually fencing or isolating a non-responsive host
1. On the Hosts tab, select the host. The status must display as
   non-responsive.
2. Manually reboot the host. This could mean physically entering the lab
   and rebooting the host.
3. On the Administration Portal, right-click the host entry and select the
   Confirm Host has been rebooted button.
4. A message displays prompting you to ensure that the host has been shut
   down or rebooted. Select the Approve Operation check box and click OK.
"
Initially I misread it and focused only on "Manually fencing" and
"Manually reboot", but point 1 seems quite clear:
The status must display as non-responsive.
So there is no other use for "Confirm Host has been rebooted".
I will verify the workflow again.
Right, by executing "Confirm Host has been rebooted" you take
responsibility and tell the engine that everything running on the host is
really turned off. After that the engine can change VM statuses to Down and
restart HA VMs. But this option has to be used very carefully, otherwise
you could cause data loss or even split brain. IMO this option should be
used only if your host does not support Power Management (in which case you
cannot have HA VMs) or you have some hardware failure which prevents power
management from turning off the host.
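
As a rough illustration of the rule described above, the check below models
when confirming a reboot is appropriate. It is only a sketch: the function
name, the status string and the flag are hypothetical and are not part of
any oVirt API.

```python
def may_confirm_host_rebooted(host_status: str, manually_rebooted: bool) -> bool:
    """Return True only when "Confirm Host has been rebooted" is appropriate.

    Hypothetical helper: the status string and the flag are illustrative,
    not oVirt engine identifiers.
    """
    # The admin guide procedure requires the host to display as non-responsive.
    if host_status != "non-responsive":
        return False
    # The operator must really have rebooted (or powered off) the host;
    # otherwise VMs may still be running and their images can be corrupted.
    return manually_rebooted
```

With this model, a host in Maintenance that was rebooted over SSH never
qualifies: it simply gets activated again.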
>
> About the upgrade flow (host 4.0 -> 4.1), here's the proper way to do
> it:
>
Yes, my question was a general one, e.g. when I go from 4.1.0.4 to
4.1.1.
>
> 1. Put host to Maintenance
> 2. Add 4.1 repositories to the host
> 3. Upgrade the host - you have 2 options here:
> a. UI
> - Go to Hosts tab in webadmin
> - Select host and click on Reinstall inside Installation menu
> b. Command line
> - Connect to the host using SSH and upgrade using yum
>
Sometimes at this point there is a new kernel to boot because of the
update just completed, so point 4 is not what I would like to do right
away.
Yes, but this is an OS upgrade, which should be done only manually via SSH
and only when the host is in Maintenance.
By "host upgrade" in oVirt terms we usually mean only the upgrade of VDSM
and dependent packages like libvirt or qemu-kvm ...
> 4. Activate the host
>
> The main difference between the UI and command line options is that with
> the UI option only necessary packages like VDSM and its dependencies are
> upgraded.
>
> Now regarding restart:
> 1. If host is in Maintenance, restart using SSH and shutdown/reboot
> execution is no problem. But if host is Up, then executing shutdown/reboot
> using SSH will make host NonResponsive and fencing will be executed
>
> 2. If you select Restart using SSH from Maintenance menu, it just
> connects using SSH and executes shutdown the same way as it's done manually
>
> 3. If you select Restart using Power Management from Maintenance
> menu, it will execute restart using the PM device and further actions are
> hardware dependent. On some servers it will send a shutdown signal to the
> OS, much as if you had executed reboot, and afterwards turn off power to
> the server (so the result may be the same as using shutdown from command
> line). But on different hardware it can just turn off power to the server.
>
> So for a planned restart, restarting over SSH, either from the UI or from
> the command line, is the preferred way.
>
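
The three restart cases above can be summarized in a small lookup. This is
a minimal sketch of the behaviour as described in this thread, not engine
code; the function name and the "ssh" / "power_management" labels are
assumptions made for illustration.

```python
def restart_outcome(host_status: str, method: str) -> str:
    """Describe what a restart leads to, per the cases listed in the thread.

    Hypothetical model: status and method labels are illustrative only.
    """
    if method == "ssh":
        if host_status == "maintenance":
            # Engine does not talk to a host in Maintenance, so nothing reacts.
            return "clean reboot, no fencing"
        if host_status == "up":
            # Engine loses contact with a host it expects to be running.
            return "host becomes NonResponsive, fencing is executed"
        raise ValueError(f"unmodelled host status: {host_status!r}")
    if method == "power_management":
        # The PM device decides: graceful shutdown signal or plain power off.
        return "restart via PM device, behaviour is hardware dependent"
    raise ValueError(f"unknown restart method: {method!r}")


if __name__ == "__main__":
    for status, method in [("maintenance", "ssh"), ("up", "ssh"),
                           ("maintenance", "power_management")]:
        print(f"{status:12} {method:17} -> {restart_outcome(status, method)}")
```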
Ok, so a normal reboot is fine as point 4.
And then after the reboot:
5. Activate (and not "Confirm Host Has Rebooted"... ;-)
Right
Regarding Power Mgmt ---> Restart
I thought that in general I should completely disable, on the host, the
feature that initiates a shutdown when the power button is pressed, as in
RHCS or similar environments:
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Cluster_Administration/s2-acpi-disable-boot-CA.html
Is that not the case in oVirt? I think that in case of fencing it is
necessary to have the host down as soon as possible.
My initial question was to confirm whether Power Mgmt --> Restart implies
Power Off / Power On by design, or some other workflow.
By design, restart using power management should use a device that is
independent of the host OS, so we can restart the host even though you
cannot connect to the OS.
Thanks in advance,
Gianluca