[ovirt-users] took the plunge to 4.2 but not so sure it was a good idea
Yaniv Kaul
ykaul at redhat.com
Sun Dec 24 12:41:51 UTC 2017
On Sun, Dec 24, 2017 at 1:53 PM, Jason Keltz <jas at cse.yorku.ca> wrote:
> Quoting Yaniv Kaul <ykaul at redhat.com>:
>
>> On Sun, Dec 24, 2017 at 4:34 AM, Jason Keltz <jas at cse.yorku.ca> wrote:
>>
>>
>>> On 12/23/2017 5:38 PM, Jason Keltz wrote:
>>>
>>>> Hi..
>>>>
>>>> I took the plunge to 4.2, but I think maybe I should have waited a
>>>> bit...
>>>>
>>>>
>> Can you specify what you upgraded, and in which order? Engine, hosts?
>> Cluster level, etc.?
>>
>>
> I was running 4.1.8 everywhere. I upgraded the engine (standalone) to 4.2,
> then the 4 hosts. I stopped ovirt-engine, added the new repo for 4.2, ran
> the yum update of the oVirt setup packages, ran engine-setup, and that
> process worked flawlessly. No errors. I had just upgraded to 4.1.8 a few
> days ago, so all my oVirt infrastructure was running the latest oVirt, and
> with the last 4.1.8 update I had also brought the engine and hosts to the
> latest CentOS and latest kernel.
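>
> For reference, the engine-side steps were roughly the following (from
> memory, so treat the release RPM URL and the package glob as approximate):
>
>   # systemctl stop ovirt-engine
>   # yum install http://resources.ovirt.org/pub/yum-repo/ovirt-release42.rpm
>   # yum update ovirt\*setup\*
>   # engine-setup
>   # yum update    (remaining packages; engine-setup brings the engine back up)
>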
> I then upgraded the cluster level. All the VMs were going to be upgraded
> as they were rebooted, and since it's the reboot that breaks the console,
> and since a reinstall brings it back, I'm going to assume it's the switch
> from the 4.1 to the 4.2 cluster level that breaks it. If I submit this as
> a bug, which log or logs should I include?
I think the vdsm log is going to be very helpful, but also the console log
(and potentially the engine log). ovirt-log-collector should collect
everything needed.
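
Something like this, run on the engine machine, should gather everything
into one archive (if I remember right, it prompts for the engine admin
password so it can reach the hosts):

  # ovirt-log-collector
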
Thanks,
Y.
>
>>>> Initially, after the upgrade to 4.2, the status of many of my hosts changed
>>>> from "server" to "desktop". That's okay - I can change them back.
>>>>
>>>>
>> You mean the VM type?
>>
>>
> Yes. VM type. Most of the VMs switched from desktop to server after the
> update.
>
>
>>>> With my first VM, "archive", I had the ability to access the console
>>>> after the upgrade. I rebooted archive, and I lost the ability (the option
>>>> is grayed out). The VM boots, but I need access to the console.
>>>>
>>>> My second VM is called "dist". That one, oVirt says is running, but I
>>>> can't access it, can't ping it, and there's no console either, so I
>>>> literally can't get to it. I can reboot it, and shut it down, but it
>>>> would be helpful to be able to access it. What to do?
>>>>
>>> I reinstalled "dist" because I needed the VM to be accessible on the
>>> network. I was going to try detaching the disk from the existing dist
>>> server, and attaching it to a new dist VM, but I ended up inadvertently
>>> deleting the disk image. I can't believe that under "storage" you can't
>>> detach a disk from a VM - you can only delete the disk.
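>>>
>>> (For what it's worth, it looks like the REST API / Python SDK can detach a
>>> disk without deleting it - a rough, untested sketch, with the engine URL
>>> and credentials made up, and assuming the SDK's detach_only flag does what
>>> its name suggests:)
>>>
>>>   import ovirtsdk4 as sdk
>>>
>>>   # connect to the engine API (URL/credentials here are placeholders)
>>>   connection = sdk.Connection(
>>>       url='https://engine.example.org/ovirt-engine/api',
>>>       username='admin@internal',
>>>       password='***',
>>>       ca_file='/etc/pki/ovirt-engine/ca.pem',
>>>   )
>>>   vms_service = connection.system_service().vms_service()
>>>   vm = vms_service.list(search='name=dist')[0]
>>>   attachments = vms_service.vm_service(vm.id).disk_attachments_service()
>>>   for attachment in attachments.list():
>>>       # detach_only=True should drop the attachment but keep the disk
>>>       attachments.attachment_service(attachment.id).remove(detach_only=True)
>>>   connection.close()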
>>>
>>> After reinstalling dist, I got back the console and network access! I tried
>>> rebooting it several times, and the console remains... so the loss of the
>>> console has something to do with switching a VM from 4.1 to 4.2.
>>>
>>>> I'm very afraid to reboot my engine because it seems like when I reboot
>>>> VMs, I lose access to the console.
>>>>
>>> I rebooted one more VM for which I had console access, and again, I've
>>> lost it (at least network access remains). Now that this situation is
>>> repeatable, I'm hoping one of the oVirt gurus can send me the magical DB
>>> command to fix it. Reinstalling my 37 VMs from kickstart is probably not
>>> a solution... that would be a headache.
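>>>
>>> In the meantime, a read-only peek at the engine DB might show what the
>>> reboot changed - this is me guessing that the console state hangs off the
>>> VM's graphics device in the vm_device table (and assuming the default DB
>>> name 'engine'):
>>>
>>>   # su - postgres -c "psql engine -c \"select vm_id, device, type from vm_device where type='graphics';\""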
>>>
>>> In addition, when I try to check for "host updates", I get an error that
>>> it can't check for host updates. I ran a yum update on the hosts (after
>>> switching the repo to 4.2), and all I'm looking for it to do is clear
>>> the status, but it doesn't seem to work.
>>>
>>> The error in engine.log when I try to check for updates on any of the
>>> hosts is:
>>>
>>> 2017-12-23 19:11:36,479-05 INFO [org.ovirt.engine.core.bll.hostdeploy.HostUpgradeCheckCommand] (default task-156) [ae11a704-3b40-45d3-9850-932f6ed91ed9] Running command: HostUpgradeCheckCommand internal: false. Entities affected : ID: 45f8b331-842e-48e7-9df8-56adddb93836 Type: VDSAction group EDIT_HOST_CONFIGURATION with role type ADMIN
>>> 2017-12-23 19:11:36,496-05 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-156) [] EVENT_ID: HOST_AVAILABLE_UPDATES_STARTED(884), Started to check for available updates on host virt1.
>>> 2017-12-23 19:11:36,500-05 INFO [org.ovirt.engine.core.bll.hostdeploy.HostUpgradeCheckInternalCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-7) [ae11a704-3b40-45d3-9850-932f6ed91ed9] Running command: HostUpgradeCheckInternalCommand internal: true. Entities affected : ID: 45f8b331-842e-48e7-9df8-56adddb93836 Type: VDS
>>> 2017-12-23 19:11:36,504-05 INFO [org.ovirt.engine.core.common.utils.ansible.AnsibleExecutor] (EE-ManagedThreadFactory-commandCoordinator-Thread-7) [ae11a704-3b40-45d3-9850-932f6ed91ed9] Executing Ansible command: ANSIBLE_STDOUT_CALLBACK=hostupgradeplugin [/usr/bin/ansible-playbook, --check, --private-key=/etc/pki/ovirt-engine/keys/engine_id_rsa, --inventory=/tmp/ansible-inventory1039100972039373314, /usr/share/ovirt-engine/playbooks/ovirt-host-upgrade.yml] [Logfile: null]
>>> 2017-12-23 19:11:37,897-05 INFO [org.ovirt.engine.core.common.utils.ansible.AnsibleExecutor] (EE-ManagedThreadFactory-commandCoordinator-Thread-7) [ae11a704-3b40-45d3-9850-932f6ed91ed9] Ansible playbook command has exited with value: 4
>>> 2017-12-23 19:11:37,897-05 ERROR [org.ovirt.engine.core.bll.host.HostUpgradeManager] (EE-ManagedThreadFactory-commandCoordinator-Thread-7) [ae11a704-3b40-45d3-9850-932f6ed91ed9] Failed to run check-update of host 'virt1-mgmt'.
>>> 2017-12-23 19:11:37,897-05 ERROR [org.ovirt.engine.core.bll.hostdeploy.HostUpdatesChecker] (EE-ManagedThreadFactory-commandCoordinator-Thread-7) [ae11a704-3b40-45d3-9850-932f6ed91ed9] Failed to check if updates are available for host 'virt1' with error message 'Failed to run check-update of host 'virt1-mgmt'.'
>>> 2017-12-23 19:11:37,904-05 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-commandCoordinator-Thread-7) [ae11a704-3b40-45d3-9850-932f6ed91ed9] EVENT_ID: HOST_AVAILABLE_UPDATES_FAILED(839), Failed to check for available updates on host virt1 with message 'Failed to run check-update of host 'virt1-mgmt'.'.
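>>>
>>> If I read that log right, the same check can be reproduced by hand from the
>>> engine box - the command below is lifted straight from the log, minus the
>>> engine's custom stdout callback (which I assume only reformats output), and
>>> with the inventory swapped out, since the engine writes that to a temporary
>>> file; I'd point it at my own file listing virt1-mgmt:
>>>
>>>   # /usr/bin/ansible-playbook --check \
>>>         --private-key=/etc/pki/ovirt-engine/keys/engine_id_rsa \
>>>         --inventory=/path/to/inventory-listing-virt1-mgmt \
>>>         /usr/share/ovirt-engine/playbooks/ovirt-host-upgrade.yml
>>>
>>> (Exit value 4 from ansible-playbook normally means a host was unreachable,
>>> if I remember the exit codes right.)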
>>>
>>>
>> Can you share the complete logs? Best if you could file a bug about it,
>> with the logs attached.
>> Y.
>
> I will do that later today.
>
> Thanks!
>
> Jason.