> Date: Mon, 29 Jul 2013 09:56:30 +0200
> From: mkletzan(a)redhat.com
> To: danken(a)redhat.com
> CC: cybertimber2000(a)hotmail.com; users(a)ovirt.org
> Subject: Re: [Users] oVirt 3.2 - Migration failed due to error: migrateerr
>
> On 07/27/2013 09:50 PM, Dan Kenigsberg wrote:
>> On Fri, Jul 26, 2013 at 02:03:28PM -0400, Nicholas Kesick wrote:
>>>> Date: Fri, 26 Jul 2013 05:52:44 +0300
>>>> From: iheim(a)redhat.com
>>>> To: cybertimber2000(a)hotmail.com
>>>> CC: danken(a)redhat.com; users(a)ovirt.org
>>>> Subject: Re: [Users] oVirt 3.2 - Migration failed due to error: migrateerr
>>>>
>>>> On 07/26/2013 05:40 AM, Nicholas Kesick wrote:
>>>>>
>>>>> Replies inline.
>>>>> > Date: Thu, 25 Jul 2013 22:27:17 +0300
>>>>> > From: danken(a)redhat.com
>>>>> > To: cybertimber2000(a)hotmail.com
>>>>> > CC: users(a)ovirt.org
>>>>> > Subject: Re: [Users] oVirt 3.2 - Migration failed due to error: migrateerr
>>>>> >
>>>>> > On Thu, Jul 25, 2013 at 11:54:40AM -0400, Nicholas Kesick wrote:
>>>>> > > When I try to migrate a VM, any VM, between my two hosts, I receive
>>>>> > > an error that says "Migration failed due to error: migrateerr".
>>>>> > > Looking in the log, I don't see anything that jumps out other than
>>>>> > > the final message:
>>>>> > >
>>>>> > > VDSGenericException: VDSErrorException: Failed to MigrateStatusVDS,
>>>>> > > error = Fatal error during migration
>>>>> > >
>>>>> > > Ovirt-engine is version 3.2.2-1.1.fc18.noarch, firewalld is
>>>>> > > disabled, and selinux is permissive.
>>>>> >
>>>>> > Please do not say this in public, you're hurting Dan Walsh's feelings ;-)
>>>>> >
>>>>> I recall seeing his blog posts, and I agree. Not sure when I set it to
>>>>> permissive... maybe to get the 3.2 install w/ Firewalld setup to
>>>>> complete? I remember that was fixed in 3.2.1. I'll set it back to
>>>>> enforcing.
>>>>> > >
>>>>> > > ovirt-node version is 2.6.1 on both hosts.
>>>>> > >
>>>>> > > Any suggestions would be welcome!
>>>>> > >
>>>>> >
>>>>> > I'd love to see /etc/vdsm/vdsm.log from source and destination. The
>>>>> > interesting parts start with vmMigrate at the source and with
>>>>> > vmMigrationCreate at the destination.
>>>>> Hmm, I probably should have pulled that sooner. So, I cleared the active
>>>>> VDSM (while nothing was running) and libvirtd.log, booted one vm, and
>>>>> tried to migrate it. Attached are the logs. It looks like it boils down
>>>>> to (from the source):
>>>>> Traceback (most recent call last):
>>>>>   File "/usr/share/vdsm/vm.py", line 271, in run
>>>>>   File "/usr/share/vdsm/libvirtvm.py", line 505, in _startUnderlyingMigration
>>>>>   File "/usr/share/vdsm/libvirtvm.py", line 541, in f
>>>>>   File "/usr/lib64/python2.7/site-packages/vdsm/libvirtconnection.py", line 111, in wrapper
>>>>>   File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1178, in migrateToURI2
>>>>> libvirtError: internal error Attempt to migrate guest to the same host localhost
>>>>> Does this mean my UUIDs are the same?
>>>>>
>>>>> http://vaunaspada.babel.it/blog/?p=613
>>>>> As for the destination, I really don't understand what's going on
>>>>> there between "Destination VM creation succeeded" and
>>>>> ":destroy Called" that would lead to it failing, except for what's
>>>>> after the traceback:
>>>>> Traceback (most recent call last):
>>>>>   File "/usr/share/vdsm/vm.py", line 696, in _startUnderlyingVm
>>>>>   File "/usr/share/vdsm/libvirtvm.py", line 1907, in _waitForIncomingMigrationFinish
>>>>>   File "/usr/lib64/python2.7/site-packages/vdsm/libvirtconnection.py", line 111, in wrapper
>>>>>   File "/usr/lib64/python2.7/site-packages/libvirt.py", line 2822, in lookupByUUIDString
>>>>> libvirtError: Domain not found: no domain with matching uuid '50171e1b-cf21-41d8-80f3-88ab1b980091'
>>>>> But that is the ID of the VM by the looks of it.
>>>>> Sorry Itamar, nothing was written to libvirtd.log after I cleared it.
>>
>> It could be that libvirtd is still writing to the files that you removed
>> from the filesystem. To make sure libvirtd writes to your new file,
>> restart the service. There may be clues there on why libvirt thinks that
>> the source and destination are one and the same.
>>
>
> When clearing the logs, it should be enough to do
> '> /path/to/libvirtd.log' (in bash).

Just checked and it seems some things were logged in there during my testing
on Friday. I'll attach those.
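
The difference between removing a log file and truncating it in place (the '> /path/to/libvirtd.log' suggestion above) can be illustrated with a minimal Python sketch. It assumes a POSIX filesystem and a writer that opened the log in append mode; whether libvirtd opens its log exactly this way is an assumption, and the file names are hypothetical.

```python
import os
import tempfile


def demo_remove_vs_truncate():
    """Return the size of the visible log file after each kind of 'clearing'.

    Illustrative only: the "daemon" here is just an open file handle in
    append mode, standing in for a long-running process such as libvirtd.
    """
    sizes = {}
    with tempfile.TemporaryDirectory() as tmpdir:
        # Case 1: the log is removed and recreated while the writer is running.
        path = os.path.join(tmpdir, "removed.log")
        writer = open(path, "a")
        writer.write("old entry\n")
        writer.flush()
        os.remove(path)           # the writer still holds the old, unlinked file
        open(path, "w").close()   # a fresh, empty file under the same name
        writer.write("new entry\n")
        writer.flush()
        writer.close()
        sizes["removed"] = os.path.getsize(path)

        # Case 2: the log is truncated in place, like '> /path/to/log' in bash.
        path = os.path.join(tmpdir, "truncated.log")
        writer = open(path, "a")
        writer.write("old entry\n")
        writer.flush()
        open(path, "w").close()   # same file, length reset to zero
        writer.write("new entry\n")
        writer.flush()
        writer.close()
        sizes["truncated"] = os.path.getsize(path)
    return sizes


if __name__ == "__main__":
    print(demo_remove_vs_truncate())
```

In the first case the new file stays empty because the writer keeps writing to the file it already had open, which is why restarting the service is needed after removing a log. In the second case the writer's next entry lands in the truncated file, so no restart is required.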
>>>>
>>>> Thread-800::ERROR::2013-07-26 01:57:16,198::vm::198::vm.Vm::(_recover)
>>>> vmId=`50171e1b-cf21-41d8-80f3-88ab1b980091`::internal error Attempt to
>>>> migrate guest to the same host localhost
>>>> Thread-800::ERROR::2013-07-26 01:57:16,377::vm::286::vm.Vm::(run)
>>>> vmId=`50171e1b-cf21-41d8-80f3-88ab1b980091`::Failed to migrate
>>>> Traceback (most recent call last):
>>>>   File "/usr/share/vdsm/vm.py", line 271, in run
>>>>   File "/usr/share/vdsm/libvirtvm.py", line 505, in _startUnderlyingMigration
>>>>   File "/usr/share/vdsm/libvirtvm.py", line 541, in f
>>>>   File "/usr/lib64/python2.7/site-packages/vdsm/libvirtconnection.py", line 111, in wrapper
>>>>   File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1178, in migrateToURI2
>>>> libvirtError: internal error Attempt to migrate guest to the same host localhost
>>>>
>>>> what are your hostnames?
>>>
>>> "host001" on 192.168.0.103 and "host002" on 192.168.0.104
>>> Even tried changing it, no luck.
>>>
>
> Are they resolving properly on those hosts? Is there a DNS or
> /etc/hosts entry related to this?

There are /etc/hosts entries on both hosts to each other, and a "ping
host001" "ping host002" resolves correctly. I do however note that the
terminal session says root@localhost. I wonder if running
hostnamectl set-hostname {name} will fix anything.

...and after running hostnamectl set-hostname {name}, migrations are
working!

I think maybe I found a bug. With the node in maintenance mode, from the
oVirt Node (SSH or local), if you go to Network, change the hostname
({newname}), and then go down to the configured System NIC and press Enter,
it says it is Setting Hostname. Now, if you press F2, the terminal will show
root@{newname}. If you reboot, however, Network will still say {newname},
but pressing F2 for the terminal will show root@localhost. If it's
localhost, it won't migrate.

So, it looks like the hostname isn't getting written persistently. Even a
hostnamectl set-hostname {name} gets lost on reboot. Or am I doing something
wrong?

>>> Could it be because the oVirt Node - Network tab - does not have any DNS
>>> servers specified?
>>
>> I do not think so. We do not see "name resolution" errors, or name
>> resolutions at all.
>>
>> What does libvirt (src + dst) say about
>>
>> virsh -r capabilities|grep uuid
>>
>> ? If uuids happen to be the same, you get the bug that you are
>> reporting.

host001:
virsh -r capabilities|grep uuid
    <uuid>a4dc7de7-e2d3-45f5-b75a-7101f71d2b17</uuid>

host002:
virsh -r capabilities|grep uuid
    <uuid>ce66bb7f-fbbb-432b-9f62-5bcf5cb732e4</uuid>

>
> Although, as Dan says, having the same UUID for both hosts will report
> the same error even when hostnames are different.
>
> Do you have an UUID set in your libvirtd.conf? What do you have in the
> following files (if they exist) on both hosts?
>
> /sys/devices/virtual/dmi/id/product_uuid

host001:
cat /sys/devices/virtual/dmi/id/product_uuid
44454C4C-3700-1047-8048-C3C04F4C4631

host002:
cat /sys/devices/virtual/dmi/id/product_uuid
44454C4C-5900-1054-8034-B3C04F4E4631

Those match what I see on the webadmin under hosts > hostname > hardware
information.

> /sys/class/dmi/id/product_uuid

host001:
cat /sys/class/dmi/id/product_uuid
44454C4C-3700-1047-8048-C3C04F4C4631

host002:
cat /sys/class/dmi/id/product_uuid
44454C4C-5900-1054-8034-B3C04F4E4631

> Oops, there's something more wrong than I thought. Did you manage to
> catch libvirtd logs this time?
>
> Martin
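
For reference, the two causes discussed in this thread for "Attempt to migrate guest to the same host" (a host that reports its name as localhost, or two hosts sharing one UUID) can be written down as a small Python check. This is only a sketch: the function name is hypothetical, it is not part of vdsm or libvirt, and the values are assumed to have been collected by hand from each host (for example from the shell prompt and from 'virsh -r capabilities|grep uuid').

```python
def same_host_suspects(src_hostname, dst_hostname, src_uuid, dst_uuid):
    """List reasons why source and destination might look like one host.

    Hypothetical helper mirroring only the two causes named in the thread;
    it does not query libvirt itself.
    """
    reasons = []
    for role, name in (("source", src_hostname), ("destination", dst_hostname)):
        # "localhost" and "localhost.localdomain" both count as unset names.
        if name.split(".")[0].lower() == "localhost":
            reasons.append("%s hostname is %r" % (role, name))
    # UUIDs are compared case-insensitively, ignoring surrounding whitespace.
    if src_uuid.strip().lower() == dst_uuid.strip().lower():
        reasons.append("both hosts report UUID %s" % src_uuid.strip())
    return reasons


if __name__ == "__main__":
    # Values taken from this thread: distinct UUIDs, but a localhost prompt.
    for reason in same_host_suspects(
        "localhost",
        "host002",
        "a4dc7de7-e2d3-45f5-b75a-7101f71d2b17",
        "ce66bb7f-fbbb-432b-9f62-5bcf5cb732e4",
    ):
        print(reason)
```

With the values reported above, only the hostname check fires, which matches the observation that migrations started working once the hostname was set.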