
Quoting Yaniv Kaul <ykaul@redhat.com>:
On Sun, Dec 24, 2017 at 4:34 AM, Jason Keltz <jas@cse.yorku.ca> wrote:
On 12/23/2017 5:38 PM, Jason Keltz wrote:
Hi..
I took the plunge to 4.2, but I think maybe I should have waited a bit...
Can you specify what did you upgrade, and in which order? Engine, hosts? Cluster level, etc.?
I was running 4.1.8 everywhere. I upgraded the engine (standalone) to 4.2, then the 4 hosts. I stopped ovirt-engine, added the new 4.2 repo, ran yum update on the ovirt-engine setup packages, then ran engine-setup, and that process worked flawlessly - no errors. I had just upgraded to 4.1.8 a few days ago, so all my oVirt infrastructure was on the latest oVirt, and with that last 4.1.8 update I had also brought the engine and hosts to the latest CentOS and latest kernel. I then upgraded the cluster level. The VMs were going to pick up the new level as they were rebooted, and since it's the reboot that breaks the console, and a reinstall brings it back, I'm going to assume it's the switch from a 4.1 to a 4.2 cluster that breaks it. If I submit this as a bug, which log/logs should I attach?
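In case it helps, here's roughly how I'd bundle the engine-side logs for a bug report - just a sketch, assuming the stock log locations on a 4.2 engine (vdsm.log lives on each host and would have to be copied over separately):

```shell
#!/bin/sh
# Sketch: gather the engine-side logs typically attached to a bug report.
# Paths are the stock 4.2 locations; anything missing is skipped quietly.
mkdir -p ovirt-bug-logs
for f in /var/log/ovirt-engine/engine.log \
         /var/log/ovirt-engine/setup/*.log \
         /var/log/ovirt-engine/host-deploy/*.log; do
    if [ -e "$f" ]; then
        cp "$f" ovirt-bug-logs/
    fi
done
tar czf ovirt-bug-logs.tar.gz ovirt-bug-logs
echo "wrote ovirt-bug-logs.tar.gz"
```

Then the tarball can go straight onto the Bugzilla attachment.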
Initially, after upgrade to 4.2, the status of many of my hosts changed from "server" to "desktop". That's okay - I can change them back.
You mean the VM type?
Yes. VM type. Most of the VMs switched from desktop to server after the update.
My first VM, "archive", still had console access after the upgrade. I rebooted archive, and I lost it (the option is grayed out). The VM boots, but I need access to the console.
My second VM is called "dist". oVirt says it's running, but I can't access it, can't ping it, and there's no console either, so I literally can't get to it. I can reboot it and shut it down, but it would be helpful to be able to access it. What should I do?
I reinstalled "dist" because I needed the VM to be accessible on the network. I was going to try detaching the disk from the existing dist server and attaching it to a new dist VM, but I ended up inadvertently deleting the disk image. I can't believe that under "storage" you can't detach a disk from a VM - you can only delete it.
After reinstalling dist, I got console and network access back! I tried rebooting it several times, and the console remains... so the loss of console has something to do with a VM moving from 4.1 to 4.2.
I'm very afraid to reboot my engine, because it seems that when I reboot hosts, I lose access to the console.
I rebooted one more VM that still had console access, and again, I lost it (at least network access remains). Now that this situation is repeatable, I'm hoping one of the oVirt gurus can send me the magical DB command to fix it. Reinstalling my 37 VMs from kickstart is probably not a solution... that would be a headache.
In addition, when I try to check for "host updates", I get an error that it can't check for host updates. I ran a yum update on the hosts (after switching the repo to 4.2), and all I'm looking for it to do is clear the update status, but it doesn't seem to work.
The error in engine.log when I try to update any of the hosts is:
2017-12-23 19:11:36,479-05 INFO  [org.ovirt.engine.core.bll.hostdeploy.HostUpgradeCheckCommand] (default task-156) [ae11a704-3b40-45d3-9850-932f6ed91ed9] Running command: HostUpgradeCheckCommand internal: false. Entities affected : ID: 45f8b331-842e-48e7-9df8-56adddb93836 Type: VDSAction group EDIT_HOST_CONFIGURATION with role type ADMIN
2017-12-23 19:11:36,496-05 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-156) [] EVENT_ID: HOST_AVAILABLE_UPDATES_STARTED(884), Started to check for available updates on host virt1.
2017-12-23 19:11:36,500-05 INFO  [org.ovirt.engine.core.bll.hostdeploy.HostUpgradeCheckInternalCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-7) [ae11a704-3b40-45d3-9850-932f6ed91ed9] Running command: HostUpgradeCheckInternalCommand internal: true. Entities affected : ID: 45f8b331-842e-48e7-9df8-56adddb93836 Type: VDS
2017-12-23 19:11:36,504-05 INFO  [org.ovirt.engine.core.common.utils.ansible.AnsibleExecutor] (EE-ManagedThreadFactory-commandCoordinator-Thread-7) [ae11a704-3b40-45d3-9850-932f6ed91ed9] Executing Ansible command: ANSIBLE_STDOUT_CALLBACK=hostupgradeplugin [/usr/bin/ansible-playbook, --check, --private-key=/etc/pki/ovirt-engine/keys/engine_id_rsa, --inventory=/tmp/ansible-inventory1039100972039373314, /usr/share/ovirt-engine/playbooks/ovirt-host-upgrade.yml] [Logfile: null]
2017-12-23 19:11:37,897-05 INFO  [org.ovirt.engine.core.common.utils.ansible.AnsibleExecutor] (EE-ManagedThreadFactory-commandCoordinator-Thread-7) [ae11a704-3b40-45d3-9850-932f6ed91ed9] Ansible playbook command has exited with value: 4
2017-12-23 19:11:37,897-05 ERROR [org.ovirt.engine.core.bll.host.HostUpgradeManager] (EE-ManagedThreadFactory-commandCoordinator-Thread-7) [ae11a704-3b40-45d3-9850-932f6ed91ed9] Failed to run check-update of host 'virt1-mgmt'.
2017-12-23 19:11:37,897-05 ERROR [org.ovirt.engine.core.bll.hostdeploy.HostUpdatesChecker] (EE-ManagedThreadFactory-commandCoordinator-Thread-7) [ae11a704-3b40-45d3-9850-932f6ed91ed9] Failed to check if updates are available for host 'virt1' with error message 'Failed to run check-update of host 'virt1-mgmt'.'
2017-12-23 19:11:37,904-05 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-commandCoordinator-Thread-7) [ae11a704-3b40-45d3-9850-932f6ed91ed9] EVENT_ID: HOST_AVAILABLE_UPDATES_FAILED(839), Failed to check for available updates on host virt1 with message 'Failed to run check-update of host 'virt1-mgmt'.'.
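For what it's worth, pulling the exit status out of engine.log is easy to script - a small sketch (the log excerpt is inlined here so it runs standalone, but the same sed works directly on /var/log/ovirt-engine/engine.log):

```shell
#!/bin/sh
# Sketch: extract the ansible-playbook exit status from an engine.log
# line. The excerpt is inlined; point the sed at the real log instead.
code=$(sed -n 's/.*Ansible playbook command has exited with value: \([0-9]*\).*/\1/p' <<'EOF'
2017-12-23 19:11:37,897-05 INFO [org.ovirt.engine.core.common.utils.ansible.AnsibleExecutor] Ansible playbook command has exited with value: 4
EOF
)
echo "ansible-playbook exit code: $code"
```

If I understand the Ansible docs right, an exit value of 4 means a host was unreachable, so the first thing I'll check is whether the engine can still ssh to virt1-mgmt with its deploy key - but I'm guessing here.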
Can you share the complete logs? Best if you could file a bug about it, with the logs attached. Y.
I will do that later today. Thanks! Jason.