migration failures

I have an ovirt cluster running ovirt 4.0 and I am seeing several errors when I attempt to put one of our nodes into maintenance mode. The logs on the source server show errors as follows:

Feb 23 10:15:08 ovirt-node-production3.example.com libvirtd[18800]: operation aborted: migration job: canceled by client
Feb 23 10:15:08 ovirt-node-production3.example.com libvirtd[18800]: internal error: qemu unexpectedly closed the monitor: 2017-02-23T15:12:58.289459Z qemu-kvm: warning: CPU(s) not present in any NUMA nodes: 2 3 4 5 6 7 8 9 10 11 12 13 14 15
2017-02-23T15:12:58.289684Z qemu-kvm: warning: All CPU(s) up to maxcpus should be described in NUMA config
2017-02-23T15:15:07.889891Z qemu-kvm: Unknown combination of migration flags: 0
2017-02-23T15:15:07.890821Z qemu-kvm: error while loading state section id 2(ram)
2017-02-23T15:15:07.892357Z qemu-kvm: load of migration failed: Invalid argument

This cluster does *not* have NUMA enabled, so I am not sure why this error is happening. Some migrations did succeed after being restarted on a different host; however, I have two VMs that appear to be stuck. Is there a way to resolve this?
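One way to inspect what a stuck migration job is actually doing, read-only on the source host, would be something like the following (the VM name is a placeholder):

    # list running domains on the source host
    virsh -r list
    # show the state and progress of the migration job for a stuck VM
    virsh -r domjobinfo <vm-name>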

On 02/23/2017 04:20 PM, Michael Watters wrote:
I have an ovirt cluster running ovirt 4.0 and I am seeing several errors when I attempt to put one of our nodes into maintenance mode. The logs on the source server show errors as follows.
Feb 23 10:15:08 ovirt-node-production3.example.com libvirtd[18800]: operation aborted: migration job: canceled by client
Feb 23 10:15:08 ovirt-node-production3.example.com libvirtd[18800]: internal error: qemu unexpectedly closed the monitor: 2017-02-23T15:12:58.289459Z qemu-kvm: warning: CPU(s) not present in any NUMA nodes: 2 3 4 5 6 7 8 9 10 11 12 13 14 15
2017-02-23T15:12:58.289684Z qemu-kvm: warning: All CPU(s) up to maxcpus should be described in NUMA config
2017-02-23T15:15:07.889891Z qemu-kvm: Unknown combination of migration flags: 0
2017-02-23T15:15:07.890821Z qemu-kvm: error while loading state section id 2(ram)
2017-02-23T15:15:07.892357Z qemu-kvm: load of migration failed: Invalid argument
This cluster does *not* have NUMA enabled so I am not sure why this error is happening.
That is an implementation detail: NUMA is enabled transparently because it is required for memory hotplug support. It should be fully transparent.
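If you want to double-check, the CPU/NUMA topology that actually ends up in the generated guest configuration can be inspected read-only on the host (the VM name is a placeholder):

    # dump the libvirt domain XML generated for the guest and look for NUMA-related elements
    virsh -r dumpxml <vm-name> | grep -i -A5 numa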
Some migrations did succeed after being restarted on a different host; however, I have two VMs that appear to be stuck. Is there a way to resolve this?
The load/save state errors are most often seen when the two sides of the migration have different and incompatible versions of QEMU. In turn, this is quite often a bug, because forward migrations (e.g. from 2.3.0 to 2.4.0) are always supported, for obvious upgrade needs.

So, which versions of libvirt and QEMU do you have on the two sides of the failing migration paths?

Bests,
--
Francesco Romani
Red Hat Engineering Virtualization R&D
IRC: fromani

On 02/23/2017 10:28 AM, Francesco Romani wrote:
The load/save state errors are most often seen when the two sides of the migration have different and incompatible versions of QEMU. In turn, this is quite often a bug, because forward migrations (e.g. from 2.3.0 to 2.4.0) are always supported, for obvious upgrade needs.
I think you're on to something there. The destination server is running ovirt 3.6 while the source server is on 4.0. The cluster compatibility level is also set to 3.6 since I have not upgraded every host node yet.
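To pin down the exact hypervisor-level versions being asked about, something like the following on each host (assuming EL7-based oVirt hosts using the qemu-kvm-ev builds) would show them:

    # run on both the source and the destination host
    rpm -qa | grep -E 'qemu-kvm|libvirt-daemon|^vdsm' | sort
    /usr/libexec/qemu-kvm --version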

I canceled the migration and manually moved the VM to another host running ovirt 4.0. The source node was then able to set itself to maintenance mode without any errors.

On 02/23/2017 10:46 AM, Michael Watters wrote:
On 02/23/2017 10:28 AM, Francesco Romani wrote:
The load/save state errors are most often seen when the two sides of the migration have different and incompatible versions of QEMU. In turn, this is quite often a bug, because forward migrations (e.g. from 2.3.0 to 2.4.0) are always supported, for obvious upgrade needs.

I think you're on to something there. The destination server is running ovirt 3.6 while the source server is on 4.0. The cluster compatibility level is also set to 3.6 since I have not upgraded every host node yet.

On 23 Feb 2017, at 17:08, Michael Watters <wattersm@watters.ws> wrote:
I canceled the migration and manually moved the VM to another host running ovirt 4.0. The source node was then able to set itself to maintenance mode without any errors.
On 02/23/2017 10:46 AM, Michael Watters wrote:
On 02/23/2017 10:28 AM, Francesco Romani wrote:
The load/save state errors are most often seen when the two sides of the migration have different and incompatible versions of QEMU. In turn, this is quite often a bug, because forward migrations (e.g. from 2.3.0 to 2.4.0) are always supported, for obvious upgrade needs.

I think you're on to something there. The destination server is running ovirt 3.6 while the source server is on 4.0. The cluster compatibility level is also set to 3.6 since I have not upgraded every host node yet.
That is supported and works when the versions are really the latest ones. I suggest checking the repos; it might be that you are not getting updates for the CentOS or qemu-kvm-ev packages on either the 3.6 or the 4.0 side.
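A quick way to verify that on each host (assuming yum-based CentOS hosts) is something like:

    # check which repositories are enabled on the host
    yum repolist enabled
    # see whether newer qemu-kvm-ev / libvirt builds are available but not yet installed
    yum check-update 'qemu-kvm-ev*' 'libvirt*'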
participants (3)
- Francesco Romani
- Michael Watters
- Michal Skrivanek