
On 07/29/2013 06:06 PM, Nicholas Kesick wrote:
Date: Mon, 29 Jul 2013 09:56:30 +0200
From: mkletzan@redhat.com
To: danken@redhat.com
CC: cybertimber2000@hotmail.com; users@ovirt.org
Subject: Re: [Users] oVirt 3.2 - Migration failed due to error: migrateerr
On 07/27/2013 09:50 PM, Dan Kenigsberg wrote:
On Fri, Jul 26, 2013 at 02:03:28PM -0400, Nicholas Kesick wrote:
Date: Fri, 26 Jul 2013 05:52:44 +0300
From: iheim@redhat.com
To: cybertimber2000@hotmail.com
CC: danken@redhat.com; users@ovirt.org
Subject: Re: [Users] oVirt 3.2 - Migration failed due to error: migrateerr
On 07/26/2013 05:40 AM, Nicholas Kesick wrote:
Replies inline.

> Date: Thu, 25 Jul 2013 22:27:17 +0300
> From: danken@redhat.com
> To: cybertimber2000@hotmail.com
> CC: users@ovirt.org
> Subject: Re: [Users] oVirt 3.2 - Migration failed due to error: migrateerr
>
> On Thu, Jul 25, 2013 at 11:54:40AM -0400, Nicholas Kesick wrote:
> > When I try to migrate a VM, any VM, between my two hosts, I receive an
> > error that says "Migration failed due to error: migrateerr". Looking in
> > the log, I don't see anything that jumps out other than the final message:
> >
> > VDSGenericException: VDSErrorException: Failed to MigrateStatusVDS, error = Fatal error during migration
> >
> > ovirt-engine is version 3.2.2-1.1.fc18.noarch, firewalld is disabled, and selinux is permissive.
>
> Please do not say this in public, you're hurting Dan Walsh's feelings ;-)

I recall seeing his blog posts, and I agree. Not sure when I set it to permissive... maybe to get the 3.2 install with firewalld to complete? I remember that was fixed in 3.2.1. I'll set it back to enforcing.

> > ovirt-node version is 2.6.1 on both hosts.
> >
> > Any suggestions would be welcome!
>
> I'd love to see /etc/vdsm/vdsm.log from source and destination. The
> interesting parts start with vmMigrate at the source and with
> vmMigrationCreate at the destination.

Hmm, I probably should have pulled that sooner. So I cleared the active vdsm.log (while nothing was running) and libvirtd.log, booted one VM, and tried to migrate it. The logs are attached. It looks like it boils down to this (from the source):

Traceback (most recent call last):
  File "/usr/share/vdsm/vm.py", line 271, in run
  File "/usr/share/vdsm/libvirtvm.py", line 505, in _startUnderlyingMigration
  File "/usr/share/vdsm/libvirtvm.py", line 541, in f
  File "/usr/lib64/python2.7/site-packages/vdsm/libvirtconnection.py", line 111, in wrapper
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1178, in migrateToURI2
libvirtError: internal error Attempt to migrate guest to the same host localhost

Does this mean my UUIDs are the same? http://vaunaspada.babel.it/blog/?p=613

As for the destination, I don't really understand what's going on between "Destination VM creation succeeded" and ":destroy Called" that would lead to it failing, except for what's after the traceback:

Traceback (most recent call last):
  File "/usr/share/vdsm/vm.py", line 696, in _startUnderlyingVm
  File "/usr/share/vdsm/libvirtvm.py", line 1907, in _waitForIncomingMigrationFinish
  File "/usr/lib64/python2.7/site-packages/vdsm/libvirtconnection.py", line 111, in wrapper
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 2822, in lookupByUUIDString
libvirtError: Domain not found: no domain with matching uuid '50171e1b-cf21-41d8-80f3-88ab1b980091'

But that is the ID of the VM, by the looks of it. Sorry Itamar, nothing was written to libvirtd.log after I cleared it.
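(As an aside, a quick way to pull out the sections Dan mentions; this sketch assumes vdsm's default log location, /var/log/vdsm/vdsm.log, rather than /etc/vdsm:)

    # on the source host: the outgoing migration attempt starts at vmMigrate
    grep -n 'vmMigrate' /var/log/vdsm/vdsm.log
    # on the destination host: the incoming side starts at vmMigrationCreate
    grep -n 'vmMigrationCreate' /var/log/vdsm/vdsm.log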
It could be that libvirtd is still writing to the files that you removed from the filesystem. To make sure libvirtd writes to your new file, restart the service. There may be clues in there as to why libvirt thinks the source and destination are one and the same.
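(A minimal sketch of that reset, assuming libvirtd's default log path of /var/log/libvirt/libvirtd.log:)

    # truncate in place; the running daemon keeps its open file handle
    > /var/log/libvirt/libvirtd.log
    # restart so libvirtd is certain to be writing to the file you just cleared
    systemctl restart libvirtd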
When clearing the logs, it should be enough to do '> /path/to/libvirtd.log' (in bash).

Just checked, and it seems some things were logged in there during my testing on Friday. I'll attach those.
Thread-800::ERROR::2013-07-26 01:57:16,198::vm::198::vm.Vm::(_recover) vmId=`50171e1b-cf21-41d8-80f3-88ab1b980091`::internal error Attempt to migrate guest to the same host localhost
Thread-800::ERROR::2013-07-26 01:57:16,377::vm::286::vm.Vm::(run) vmId=`50171e1b-cf21-41d8-80f3-88ab1b980091`::Failed to migrate
Traceback (most recent call last):
  File "/usr/share/vdsm/vm.py", line 271, in run
  File "/usr/share/vdsm/libvirtvm.py", line 505, in _startUnderlyingMigration
  File "/usr/share/vdsm/libvirtvm.py", line 541, in f
  File "/usr/lib64/python2.7/site-packages/vdsm/libvirtconnection.py", line 111, in wrapper
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1178, in migrateToURI2
libvirtError: internal error Attempt to migrate guest to the same host localhost
What are your hostnames?
"host001" on 192.168.0.103 and "host002" on 192.168.0.104 Even tried changing it, no luck.
Are they resolving properly on those hosts? Is there a DNS or /etc/hosts entry related to this?

There are /etc/hosts entries on both hosts for each other, and "ping host001" / "ping host002" resolve correctly. I do, however, note that the terminal session says root@localhost. I wonder if running hostnamectl set-hostname {name} will fix anything.

...and after running hostnamectl set-hostname {name}, migrations are working!

I think I may have found a bug: with the node in maintenance mode, from the oVirt Node console (SSH or local), if you go to Network, change the hostname to {newname}, and then go down to the configured system NIC and press Enter, it says it is "Setting Hostname". Now, if you press F2, the terminal will show root@{newname}. If you reboot, however, the Network page will still say {newname}, but pressing F2 for the terminal will show root@localhost. If it's localhost, it won't migrate. So it looks like the hostname isn't being written persistently; even a hostnamectl set-hostname {name} is lost on reboot. Or am I doing something wrong?
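(If the stateless oVirt Node image is at fault here, the following is a hedged workaround sketch; `persist` is the ovirt-node utility that copies a file into the config partition so it is restored on boot. Whether /etc/hostname is the right file to persist on Node 2.6.1 is an assumption on my part:)

    # hypothetical workaround for a hostname that resets to localhost on reboot
    hostnamectl set-hostname host001   # set it for the running system
    persist /etc/hostname              # ask ovirt-node to keep it across reboots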
Could it be because the oVirt Node Network tab does not have any DNS servers specified?
I do not think so. We do not see "name resolution" errors, or name resolutions at all.
What does libvirt (src + dst) say about
virsh -r capabilities|grep uuid
? If the UUIDs happen to be the same, you get the bug that you are reporting.

host001:
virsh -r capabilities | grep uuid
<uuid>a4dc7de7-e2d3-45f5-b75a-7101f71d2b17</uuid>

host002:
virsh -r capabilities | grep uuid
<uuid>ce66bb7f-fbbb-432b-9f62-5bcf5cb732e4</uuid>
Although, as Dan says, having the same UUID on both hosts would produce the same error even when the hostnames are different.
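(A quick way to compare both hosts from one machine, as a sketch: it assumes libvirt is reachable read-only over TLS, which is how oVirt normally sets hosts up, and reuses the hostnames above:)

    # print the host UUID libvirt advertises on each node; if they matched,
    # migration would be refused with "Attempt to migrate guest to the same host"
    for h in host001 host002; do
        printf '%s: ' "$h"
        virsh -r -c "qemu+tls://$h/system" capabilities | grep '<uuid>'
    done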
Do you have a UUID set in your libvirtd.conf? What do you have in the following files (if they exist) on both hosts?
/sys/devices/virtual/dmi/id/product_uuid

host001:
cat /sys/devices/virtual/dmi/id/product_uuid
44454C4C-3700-1047-8048-C3C04F4C4631

host002:
cat /sys/devices/virtual/dmi/id/product_uuid
44454C4C-5900-1054-8034-B3C04F4E4631

Those match what I see in the webadmin under Hosts > hostname > Hardware Information.
/sys/class/dmi/id/product_uuid

host001:
cat /sys/class/dmi/id/product_uuid
44454C4C-3700-1047-8048-C3C04F4C4631

host002:
cat /sys/class/dmi/id/product_uuid
44454C4C-5900-1054-8034-B3C04F4E4631
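(For reference: if the DMI UUID were ever missing or duplicated, for example on cloned hardware, libvirt can be given an explicit identity instead; a hedged sketch, reusing host002's UUID purely as a placeholder:)

    # /etc/libvirt/libvirtd.conf
    # host_uuid overrides the value libvirt otherwise reads from
    # /sys/devices/virtual/dmi/id/product_uuid; restart libvirtd after editing
    host_uuid = "ce66bb7f-fbbb-432b-9f62-5bcf5cb732e4"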
Oops, there's something more wrong than I thought. Did you manage to catch the libvirtd logs this time?

Martin