
On 07/29/2013 06:06 PM, Nicholas Kesick wrote:
Date: Mon, 29 Jul 2013 09:56:30 +0200
From: mkletzan@redhat.com
To: danken@redhat.com
CC: cybertimber2000@hotmail.com; users@ovirt.org
Subject: Re: [Users] oVirt 3.2 - Migration failed due to error: migrateerr
On 07/27/2013 09:50 PM, Dan Kenigsberg wrote:
On Fri, Jul 26, 2013 at 02:03:28PM -0400, Nicholas Kesick wrote:
Date: Fri, 26 Jul 2013 05:52:44 +0300
From: iheim@redhat.com
To: cybertimber2000@hotmail.com
CC: danken@redhat.com; users@ovirt.org
Subject: Re: [Users] oVirt 3.2 - Migration failed due to error: migrateerr
On 07/26/2013 05:40 AM, Nicholas Kesick wrote:
Replies inline.

> Date: Thu, 25 Jul 2013 22:27:17 +0300
> From: danken@redhat.com
> To: cybertimber2000@hotmail.com
> CC: users@ovirt.org
> Subject: Re: [Users] oVirt 3.2 - Migration failed due to error: migrateerr
>
> On Thu, Jul 25, 2013 at 11:54:40AM -0400, Nicholas Kesick wrote:
> > When I try to migrate a VM, any VM, between my two hosts, I receive an error that says Migration failed due to error: migrateerr. Looking in the log I don't see anything that jumps out other than the final message
> >
> > VDSGenericException: VDSErrorException: Failed to MigrateStatusVDS, error = Fatal error during migration
> >
> > Ovirt-engine is version 3.2.2-1.1.fc18.noarch, firewalld is disabled, and selinux is permissive.
>
> Please do not say this in public, you're hurting Dan Walsh's feelings ;-)

I recall seeing his blog posts, and I agree. Not sure when I set it to permissive... maybe to get the 3.2 install w/ Firewalld setup to complete? I remember that was fixed in 3.2.1. I'll set it back to enforcing.

> > ovirt-node version is 2.6.1 on both hosts.
> >
> > Any suggestions would be welcome!
>
> I'd love to see /etc/vdsm/vdsm.log from source and destination. The
> interesting parts start with vmMigrate at the source and with
> vmMigrationCreate at the destination.

Hmm, I probably should have pulled that sooner. So, I cleared the active VDSM log (while nothing was running) and libvirtd.log, booted one VM, and tried to migrate it. Attached are the logs.
It looks like it boils down to (from the source):

Traceback (most recent call last):
  File "/usr/share/vdsm/vm.py", line 271, in run
  File "/usr/share/vdsm/libvirtvm.py", line 505, in _startUnderlyingMigration
  File "/usr/share/vdsm/libvirtvm.py", line 541, in f
  File "/usr/lib64/python2.7/site-packages/vdsm/libvirtconnection.py", line 111, in wrapper
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1178, in migrateToURI2
libvirtError: internal error Attempt to migrate guest to the same host localhost

Does this mean my UUIDs are the same? http://vaunaspada.babel.it/blog/?p=613

As for the destination, I really don't understand what is going on there between "Destination VM creation succeeded" and ":destroy Called" that would lead to it failing, except for what comes after the traceback:

Traceback (most recent call last):
  File "/usr/share/vdsm/vm.py", line 696, in _startUnderlyingVm
  File "/usr/share/vdsm/libvirtvm.py", line 1907, in _waitForIncomingMigrationFinish
  File "/usr/lib64/python2.7/site-packages/vdsm/libvirtconnection.py", line 111, in wrapper
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 2822, in lookupByUUIDString
libvirtError: Domain not found: no domain with matching uuid '50171e1b-cf21-41d8-80f3-88ab1b980091'

But that is the ID of the VM by the looks of it.

Sorry Itamar, nothing was written to libvirtd.log after I cleared it.
It could be that libvirtd is still writing to the files that you removed from the filesystem. To make sure libvirtd writes to your new file, restart the service. There may be clues there on why libvirt thinks that the source and destination are one and the same.
When clearing the logs, it should be enough to do '> /path/to/libvirtd.log' (in bash).

Just checked and it seems some things were logged in there during my testing on Friday. I'll attach those.
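To illustrate why truncating in place differs from deleting the log, here is a minimal Python sketch. It is not taken from libvirt or VDSM: the helper names (truncate_in_place, demo) are made up for illustration, a temporary file stands in for libvirtd.log, and it assumes the writing process opened its log in append mode.

```python
import os
import tempfile


def truncate_in_place(path):
    """Empty the file but keep the same inode, like '> path' in bash."""
    with open(path, "w"):
        pass


def demo():
    """Show that a writer holding the file open keeps logging after truncation."""
    fd, path = tempfile.mkstemp(suffix=".log")
    os.close(fd)
    try:
        # Stand-in for a daemon that keeps its log open in append mode (assumption).
        writer = open(path, "a")
        writer.write("old entry\n")
        writer.flush()
        inode_before = os.stat(path).st_ino

        truncate_in_place(path)

        # The writer never reopened the file, yet its next entry is still visible.
        writer.write("new entry\n")
        writer.flush()
        writer.close()

        same_inode = os.stat(path).st_ino == inode_before
        with open(path) as f:
            return same_inode, f.read()
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

If the file had been removed and recreated instead, the writer would keep writing to the old, unlinked file until it was restarted, which matches the advice to restart the service.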
Thread-800::ERROR::2013-07-26 01:57:16,198::vm::198::vm.Vm::(_recover) vmId=`50171e1b-cf21-41d8-80f3-88ab1b980091`::internal error Attempt to migrate guest to the same host localhost
Thread-800::ERROR::2013-07-26 01:57:16,377::vm::286::vm.Vm::(run) vmId=`50171e1b-cf21-41d8-80f3-88ab1b980091`::Failed to migrate
Traceback (most recent call last):
  File "/usr/share/vdsm/vm.py", line 271, in run
  File "/usr/share/vdsm/libvirtvm.py", line 505, in _startUnderlyingMigration
  File "/usr/share/vdsm/libvirtvm.py", line 541, in f
  File "/usr/lib64/python2.7/site-packages/vdsm/libvirtconnection.py", line 111, in wrapper
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1178, in migrateToURI2
libvirtError: internal error Attempt to migrate guest to the same host localhost
what are your hostnames?
"host001" on 192.168.0.103 and "host002" on 192.168.0.104 Even tried changing it, no luck.
Are they resolving properly on those hosts? Is there a DNS or /etc/hosts entry related to this?

There are /etc/hosts entries on both hosts for each other, and "ping host001" and "ping host002" resolve correctly. I do note, however, that the terminal session says root@localhost. I wonder if running hostnamectl set-hostname {name} will fix anything.

...and after running hostnamectl set-hostname {name}, migrations are working!

I think maybe I found a bug: With the node in maintenance mode, from the oVirt Node (SSH or local), if you go to Network, change the hostname ({newname}), and then go down to the configured System NIC and press Enter, it says it is Setting Hostname. Now, if you press F2, the terminal will show root@{newname}. If you reboot, however, under Network it will say {newname}, but pressing F2 for the terminal will show root@localhost. If it's localhost, it won't migrate.

So, it looks like the hostname isn't getting written persistently. Even a hostnamectl set-hostname {name} gets lost on reboot. Or am I doing something wrong?
Could it be because the oVirt Node - Network tab - does not have any DNS servers specified?
I do not think so. We do not see "name resolution" errors, or name resolutions at all.
No name resolution errors, but name resolution is the problem. Looking at the logs, the hostname that the source libvirtd sends to the destination is "localhost", which means it is unable to properly determine its own hostname. You should set up hostname resolution properly. It's good that hostnames resolve to IP addresses properly on both hosts, but when 'gethostname()' or 'hostname' (in the shell) gives you 'localhost', the daemon can't get its hostname properly. You can probably also check that by running:

python2 -c 'import socket; print socket.gethostname()'

This should return the proper hostname of the machine.

Martin
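The same check can be written as a small Python 3 script that flags a hostname that still looks unset. This is only a sketch: the function names and the list of placeholder names are assumptions made for illustration, not something libvirt or VDSM defines.

```python
import socket

# Names treated as "hostname was never set" (illustrative list, an assumption).
PLACEHOLDER_NAMES = {"localhost", "localhost.localdomain"}


def is_placeholder_hostname(name):
    """Return True if the name looks like an unset, loopback-style hostname."""
    return name.strip().lower() in PLACEHOLDER_NAMES


def describe_hostname(name):
    """Build a one-line summary for the given hostname."""
    if is_placeholder_hostname(name):
        return "%r looks unset; migration from this host will probably fail" % name
    return "%r looks usable as a migration source name" % name


if __name__ == "__main__":
    # Ask the kernel for the current hostname, as the one-liner above does.
    print(describe_hostname(socket.gethostname()))
```

Running it on each host before and after a reboot should show whether the name set with hostnamectl actually survived.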