(so noted) ...or anyone else who knows the answer ;)
-----Original Message-----
From: Michal Skrivanek [mailto:michal.skrivanek@redhat.com]
Sent: Friday, April 29, 2016 9:02 AM
To: Will Dennis
Cc: users(a)ovirt.org
Subject: Re: [ovirt-users] Hosts temporarily in "Non Operational" state after
upgrade
On 29 Apr 2016, at 14:46, Will Dennis <wdennis(a)nec-labs.com>
wrote:
Bump - can any RHAT folks comment on this?
note oVirt is a community project;-)
-----Original Message-----
From: Will Dennis
Sent: Wednesday, April 27, 2016 11:00 PM
To: users(a)ovirt.org
Subject: Hosts temporarily in "Non Operational" state after upgrade
Hi all,
I ran updates tonight on my three oVirt hosts (3.6 hyperconverged), and two of them
went into the “Non Operational” state for a few minutes each before springing back to
life… The synopsis was this:
- Ran updates through the web Admin UI ...then I got the following series of messages
via the “Events” tab in the UI:
what exactly did you do in the UI?
- Updates successfully ran
- VDSM “command failed: Heartbeat exceeded” message
- host is not responding message
- "Failed to connect to hosted_storage" message
- “The error message for connection localhost:/engine returned by VDSM was: Problem while
trying to mount target”
- "Host <name> reports about one of the Active Storage Domains as
Problematic”
- “Host <name> cannot access the Storage Domain(s) hosted_storage attached to the
data center Default. Setting host state to Non-Operational.”
- "Detected change in status of brick {…} of volume {…} from DOWN to UP.” (once for
every brick on the host for every Gluster volume.)
- "Host <name> was autorecovered.”
- "Status of host <name> was set to Up.”
so..it was not in Maintenance when you ran the update?
You should avoid doing that, as an update to any package may interfere with running guests.
E.g. a qemu rpm update can (and likely will) simply kill all your VMs. I suppose the same
applies to Gluster: before updating anything, the volumes should be put in some kind of
maintenance mode as well.
(BTW, it would be awesome if the UI’s Events log could be copied and pasted… Doesn’t work
for me at least…)
Duration of the outage was ~3 mins for each affected host. It didn’t happen on the first
host I upgraded, but did on the last two.
I know I’m a little over the bleeding edge running hyperconverged on 3.6 :) but, should
this behavior be expected?
Also, if I go onto the hosts directly and run a ‘yum update’ after this upgrade process
(not that I went thru with it, just wanted to see what was available to be upgraded) I see
a bunch of ovirt-* packages that can be upgraded, which didn’t get updated thru the web
UI’s upgrade process —
ovirt-engine-sdk-python    noarch  3.6.5.0-1.el7.centos  ovirt-3.6       480 k
ovirt-hosted-engine-ha     noarch  1.3.5.3-1.1.el7       centos-ovirt36  295 k
ovirt-hosted-engine-setup  noarch  1.3.5.0-1.1.el7       centos-ovirt36  270 k
ovirt-release36            noarch  007-1                 ovirt-3.6       9.5 k
Are these packages not related to the “Upgrade” process available thru the web UI?
FYI, here’s what did get updated thru the web UI “Upgrade” process:
Apr 27 21:36:28 Updated: libvirt-client-1.2.17-13.el7_2.4.x86_64
Apr 27 21:36:28 Updated: libvirt-daemon-1.2.17-13.el7_2.4.x86_64
Apr 27 21:36:28 Updated: libvirt-daemon-driver-network-1.2.17-13.el7_2.4.x86_64
Apr 27 21:36:28 Updated: libvirt-daemon-driver-qemu-1.2.17-13.el7_2.4.x86_64
Apr 27 21:36:28 Updated: libvirt-daemon-driver-nwfilter-1.2.17-13.el7_2.4.x86_64
Apr 27 21:36:28 Updated: vdsm-infra-4.17.26-1.el7.noarch
Apr 27 21:36:28 Updated: vdsm-python-4.17.26-1.el7.noarch
Apr 27 21:36:28 Updated: vdsm-xmlrpc-4.17.26-1.el7.noarch
Apr 27 21:36:28 Updated: libvirt-daemon-config-nwfilter-1.2.17-13.el7_2.4.x86_64
Apr 27 21:36:29 Updated: mom-0.5.3-1.1.el7.noarch
Apr 27 21:36:29 Updated: libvirt-lock-sanlock-1.2.17-13.el7_2.4.x86_64
Apr 27 21:36:29 Updated: libvirt-daemon-driver-secret-1.2.17-13.el7_2.4.x86_64
Apr 27 21:36:29 Updated: libvirt-daemon-driver-nodedev-1.2.17-13.el7_2.4.x86_64
Apr 27 21:36:29 Updated: libvirt-daemon-driver-interface-1.2.17-13.el7_2.4.x86_64
Apr 27 21:36:29 Updated: libvirt-daemon-driver-storage-1.2.17-13.el7_2.4.x86_64
Apr 27 21:36:29 Updated: libvirt-daemon-kvm-1.2.17-13.el7_2.4.x86_64
Apr 27 21:36:29 Updated: 1:libguestfs-1.28.1-1.55.el7.centos.2.x86_64
Apr 27 21:36:29 Updated: 1:libguestfs-tools-c-1.28.1-1.55.el7.centos.2.x86_64
Apr 27 21:36:29 Installed: libguestfs-winsupport-7.2-1.el7.x86_64
Apr 27 21:36:29 Updated: vdsm-yajsonrpc-4.17.26-1.el7.noarch
Apr 27 21:36:29 Updated: vdsm-jsonrpc-4.17.26-1.el7.noarch
Apr 27 21:36:29 Installed: unzip-6.0-15.el7.x86_64
Apr 27 21:36:30 Installed: gtk2-2.24.28-8.el7.x86_64
Apr 27 21:36:31 Installed: 1:virt-v2v-1.28.1-1.55.el7.centos.2.x86_64
Apr 27 21:36:31 Updated: safelease-1.0-7.el7.x86_64
Apr 27 21:36:31 Updated: vdsm-hook-vmfex-dev-4.17.26-1.el7.noarch
Apr 27 21:36:32 Updated: vdsm-4.17.26-1.el7.noarch
Apr 27 21:36:32 Updated: vdsm-gluster-4.17.26-1.el7.noarch
Apr 27 21:36:32 Updated: vdsm-cli-4.17.26-1.el7.noarch
Perhaps libvirtd restarted because of those updates, which causes a vdsm restart as well,
dropping the host connection temporarily
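
To check that guess against a log like the one above, here is a minimal Python sketch that pulls the entries out of yum.log-style text and lists the updated packages whose names start with "libvirt-daemon" or "vdsm". The prefix list and the function names are illustrative assumptions, not anything oVirt or yum provides, and a match only marks a package as worth a look; it does not prove that a service restarted.

```python
import re

# One yum.log-style entry: "<Mon> <day> <HH:MM:SS> <Action>: <package>".
# \s+ between fields tolerates entries that were wrapped or joined on one line.
ENTRY = re.compile(
    r"([A-Z][a-z]{2}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s+"
    r"(Updated|Installed|Erased):\s+(\S+)"
)

# Assumption: name prefixes of packages whose update might restart a host daemon.
RESTART_PREFIXES = ("libvirt-daemon", "vdsm")


def parse_entries(text):
    """Return (timestamp, action, package) tuples found anywhere in the text."""
    return ENTRY.findall(text)


def restart_suspects(entries):
    """Return updated packages whose name starts with a watched prefix."""
    suspects = []
    for _, action, package in entries:
        name = package.split(":", 1)[-1]  # drop a leading epoch such as "1:"
        if action == "Updated" and name.startswith(RESTART_PREFIXES):
            suspects.append(name)
    return suspects


sample = (
    "Apr 27 21:36:28 Updated: libvirt-client-1.2.17-13.el7_2.4.x86_64\n"
    "Apr 27 21:36:28 Updated: libvirt-daemon-1.2.17-13.el7_2.4.x86_64\n"
    "Apr 27 21:36:29 Updated: 1:libguestfs-1.28.1-1.55.el7.centos.2.x86_64\n"
    "Apr 27 21:36:32 Updated: vdsm-4.17.26-1.el7.noarch Apr 27 21:36:32 Updated:\n"
    "vdsm-gluster-4.17.26-1.el7.noarch\n"
)

entries = parse_entries(sample)
suspects = restart_suspects(entries)
for name in suspects:
    print(name)
```

The sample deliberately includes two entries joined on one line and wrapped, as in the pasted log, to show that the pattern does not depend on one entry per line.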
Thanks,
michal
Thanks,
Will
_______________________________________________
Users mailing list
Users(a)ovirt.org
http://lists.ovirt.org/mailman/listinfo/users