[Users] oVirt 3.2 - Migration failed due to error: migrateerr
Martin Kletzander
mkletzan at redhat.com
Mon Jul 29 13:18:01 EDT 2013
On 07/29/2013 06:06 PM, Nicholas Kesick wrote:
>
>
>> Date: Mon, 29 Jul 2013 09:56:30 +0200
>> From: mkletzan at redhat.com
>> To: danken at redhat.com
>> CC: cybertimber2000 at hotmail.com; users at ovirt.org
>> Subject: Re: [Users] oVirt 3.2 - Migration failed due to error: migrateerr
>>
>> On 07/27/2013 09:50 PM, Dan Kenigsberg wrote:
>>> On Fri, Jul 26, 2013 at 02:03:28PM -0400, Nicholas Kesick wrote:
>>>>> Date: Fri, 26 Jul 2013 05:52:44 +0300
>>>>> From: iheim at redhat.com
>>>>> To: cybertimber2000 at hotmail.com
>>>>> CC: danken at redhat.com; users at ovirt.org
>>>>> Subject: Re: [Users] oVirt 3.2 - Migration failed due to error: migrateerr
>>>>>
>>>>> On 07/26/2013 05:40 AM, Nicholas Kesick wrote:
>>>>>>
>>>>>> Replies inline.
>>>>>> > Date: Thu, 25 Jul 2013 22:27:17 +0300
>>>>>> > From: danken at redhat.com
>>>>>> > To: cybertimber2000 at hotmail.com
>>>>>> > CC: users at ovirt.org
>>>>>> > Subject: Re: [Users] oVirt 3.2 - Migration failed due to error:
>>>>>> migrateerr
>>>>>> >
>>>>>> > On Thu, Jul 25, 2013 at 11:54:40AM -0400, Nicholas Kesick wrote:
>>>>>> > > When I try to migrate a VM, any VM, between my two hosts, I receive
>>>>>> > > an error that says "Migration failed due to error: migrateerr".
>>>>>> > > Looking in the log, I don't see anything that jumps out other than
>>>>>> > > the final message:
>>>>>> > >
>>>>>> > > VDSGenericException: VDSErrorException: Failed to MigrateStatusVDS,
>>>>>> > > error = Fatal error during migration
>>>>>> > >
>>>>>> > > Ovirt-engine is version 3.2.2-1.1.fc18.noarch, firewalld is
>>>>>> > > disabled, and selinux is permissive.
>>>>>> >
>>>>>> > Please do not say this in public, you're hurting Dan Walsh's feelings ;-)
>>>>>> >
>>>>>> I recall seeing his blog posts, and I agree. Not sure when I set it to
>>>>>> permissive... maybe to get the 3.2 install w/ Firewalld setup to
>>>>>> complete? I remember that was fixed in 3.2.1. I'll set it back to enforcing.
>>>>>> > >
>>>>>> > > ovirt-node version is 2.6.1 on both hosts.
>>>>>> > >
>>>>>> > > Any suggestions would be welcome!
>>>>>> > >
>>>>>> >
>>>>>> > I'd love to see /etc/vdsm/vdsm.log from source and destination. The
>>>>>> > interesting parts start with vmMigrate at the source and with
>>>>>> > vmMigrationCreate at the destination.
>>>>>> Hmm, I probably should have pulled that sooner. So, I cleared the active
>>>>>> VDSM (while nothing was running) and libvirtd.log, booted one vm, and
>>>>>> tried to migrate it. Attached are the logs. It looks like it boils down
>>>>>> to (from the source):
>>>>>> Traceback (most recent call last):
>>>>>> File "/usr/share/vdsm/vm.py", line 271, in run
>>>>>> File "/usr/share/vdsm/libvirtvm.py", line 505, in
>>>>>> _startUnderlyingMigration
>>>>>> File "/usr/share/vdsm/libvirtvm.py", line 541, in f
>>>>>> File "/usr/lib64/python2.7/site-packages/vdsm/libvirtconnection.py",
>>>>>> line 111, in wrapper
>>>>>> File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1178, in
>>>>>> migrateToURI2
>>>>>> libvirtError: internal error Attempt to migrate guest to the same host
>>>>>> localhost
>>>>>> Does this mean my UUIDs are the same?
>>>>>> http://vaunaspada.babel.it/blog/?p=613
>>>>>> As far as the destination, I'm really not understanding what's going on
>>>>>> on the destination between "Destination VM creation succeeded" and
>>>>>> ":destroy Called" that would lead to it failing, except for what's after
>>>>>> the traceback:
>>>>>> Traceback (most recent call last):
>>>>>> File "/usr/share/vdsm/vm.py", line 696, in _startUnderlyingVm
>>>>>> File "/usr/share/vdsm/libvirtvm.py", line 1907, in
>>>>>> _waitForIncomingMigrationFinish
>>>>>> File "/usr/lib64/python2.7/site-packages/vdsm/libvirtconnection.py",
>>>>>> line 111, in wrapper
>>>>>> File "/usr/lib64/python2.7/site-packages/libvirt.py", line 2822, in
>>>>>> lookupByUUIDString
>>>>>> libvirtError: Domain not found: no domain with matching uuid
>>>>>> '50171e1b-cf21-41d8-80f3-88ab1b980091'
>>>>>> But that is the ID of the VM by the looks of it.
>>>>>> Sorry Itamar, nothing was written to libvirtd.log after I cleared it.
>>>
>>> It could be that libvirtd is still writing to the files that you removed
>>> from the filesystem. To make sure libvirtd writes to your new file,
>>> restart the service. There may be clues there on why libvirt thinks that
>>> the source and destination are one and the same.
>>>
>>
>> When clearing the logs, it should be enough to do
>> '> /path/to/libvirtd.log' (in bash).
> Just checked, and it seems some things were logged in there during my
> testing on Friday. I'll attach those.
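A minimal sketch of why truncating in place keeps the log usable while removing the file does not. It uses a scratch file and file descriptor 3 as a stand-in for the daemon; both are illustrative, and it assumes the writer opened the log in append mode, which is an assumption about libvirtd and not something stated in this thread.

```shell
#!/bin/sh
# Scratch file standing in for /path/to/libvirtd.log (hypothetical demo only).
log=$(mktemp)

# fd 3 plays the role of the daemon: it keeps the file open in append mode.
exec 3>>"$log"
echo "old entry" >&3

# Truncate in place. The inode stays the same, so the open fd is still valid.
: > "$log"

# The "daemon" keeps writing through its old fd and the entry lands in the file.
echo "new entry" >&3

result=$(cat "$log")
exec 3>&-          # close the stand-in fd
echo "$result"     # prints: new entry
```

Had the file been removed with rm and recreated instead, fd 3 would still point at the deleted inode and the new file would stay empty until the writer is restarted, which matches Dan's advice above.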
>>>>>
>>>>> Thread-800::ERROR::2013-07-26 01:57:16,198::vm::198::vm.Vm::(_recover)
>>>>> vmId=`50171e1b-cf21-41d8-80f3-88ab1b980091`::internal error Attempt to
>>>>> migrate guest to the same host localhost
>>>>> Thread-800::ERROR::2013-07-26 01:57:16,377::vm::286::vm.Vm::(run)
>>>>> vmId=`50171e1b-cf21-41d8-80f3-88ab1b980091`::Failed to migrate
>>>>> Traceback (most recent call last):
>>>>> File "/usr/share/vdsm/vm.py", line 271, in run
>>>>> File "/usr/share/vdsm/libvirtvm.py", line 505, in
>>>>> _startUnderlyingMigration
>>>>> File "/usr/share/vdsm/libvirtvm.py", line 541, in f
>>>>> File "/usr/lib64/python2.7/site-packages/vdsm/libvirtconnection.py",
>>>>> line 111, in wrapper
>>>>> File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1178, in
>>>>> migrateToURI2
>>>>> libvirtError: internal error Attempt to migrate guest to the same host
>>>>> localhost
>>>>>
>>>>> what are your hostnames?
>>>>
>>>> "host001" on 192.168.0.103 and "host002" on 192.168.0.104
>>>> Even tried changing it, no luck.
>>>>
>>
>> Are they resolving properly on those hosts? Is there a DNS or
>> /etc/hosts entry related to this?
> There are /etc/hosts entries on both hosts for each other, and "ping
> host001" and "ping host002" resolve correctly. I do, however, note that the
> terminal session says root@localhost. I wonder if running
> hostnamectl set-hostname {name} will fix anything.
>
> ...and after running hostnamectl set-hostname {name}, migrations are
> working!
>
> I think maybe I found a bug. With the node in maintenance mode, from the
> oVirt Node (SSH or local), if you go to Network, change the hostname
> ({newname}), and then go down to the configured System NIC and press
> Enter, it says it is Setting Hostname. Now, if you press F2, the terminal
> will show root@{newname}. If you reboot, however, under Network it will
> say {newname}, but pressing F2 for the terminal will show root@localhost.
> If it's localhost, it won't migrate.
>
> So, it looks like the hostname isn't getting written persistently. Even a
> hostnamectl set-hostname {name} gets lost on reboot. Or am I doing
> something wrong?
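A quick pre-migration sanity check along the lines of what was found above might look like the sketch below. The function name is_migratable_hostname is hypothetical, and the rule it encodes (reject an empty name or anything starting with "localhost") is an assumption drawn from the behaviour seen in this thread, not a description of libvirt's exact logic.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the given name looks like a real
# host name and not the "localhost" default.
is_migratable_hostname() {
    case "$1" in
        ""|localhost|localhost.*) return 1 ;;   # empty or localhost: not usable
        *) return 0 ;;
    esac
}

# Examples using the names from this thread.
for name in host001 host002 localhost localhost.localdomain; do
    if is_migratable_hostname "$name"; then
        echo "$name: ok"
    else
        echo "$name: still localhost, fix the hostname before migrating"
    fi
done
```

On a real node the argument would be "$(hostname)", checked again after a reboot, since the thread suggests the configured name does not survive one.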
>>>> Could it be because the oVirt Node - Network tab - does not have any DNS servers specified?
>>>
>>> I do not think so. We do not see "name resolution" errors, or name
>>> resolutions at all.
>>>
>>> What does libvirt (src + dst) say about
>>>
>>> virsh -r capabilities|grep uuid
>>>
>>> ? If uuids happen to be the same, you get the bug that you are
>>> reporting.
> host001:
> virsh -r capabilities|grep uuid
> <uuid>a4dc7de7-e2d3-45f5-b75a-7101f71d2b17</uuid>
> host002:
> virsh -r capabilities|grep uuid
> <uuid>ce66bb7f-fbbb-432b-9f62-5bcf5cb732e4</uuid>
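The comparison Dan asks for can be sketched as a small script. The extract_uuid helper is hypothetical, and the XML snippets below are trimmed stand-ins built from the two values reported above rather than full capabilities output; on real hosts the input would come from "virsh -r capabilities" on each side.

```shell
#!/bin/sh
# Hypothetical helper: print the first <uuid> value found on stdin.
extract_uuid() {
    sed -n 's:.*<uuid>\(.*\)</uuid>.*:\1:p' | head -n 1
}

# Stand-in input using the values posted in this thread.
src=$(printf '%s\n' '<host>' \
    '  <uuid>a4dc7de7-e2d3-45f5-b75a-7101f71d2b17</uuid>' \
    '</host>' | extract_uuid)
dst=$(printf '%s\n' '<host>' \
    '  <uuid>ce66bb7f-fbbb-432b-9f62-5bcf5cb732e4</uuid>' \
    '</host>' | extract_uuid)

# Identical host UUIDs would make libvirt treat both ends as the same host.
if [ "$src" = "$dst" ]; then
    verdict="same"
else
    verdict="different"
fi
echo "source=$src destination=$dst verdict=$verdict"
```

With the values from this thread the verdict is "different", so a duplicated host UUID does not explain the failure here.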
>>
>> Although, as Dan says, having the same UUID for both hosts will report
>> the same error even when hostnames are different.
>>
>> Do you have an UUID set in your libvirtd.conf? What do you have in the
>> following files (if they exist) on both hosts?
>>
>> /sys/devices/virtual/dmi/id/product_uuid
> host001:
> cat /sys/devices/virtual/dmi/id/product_uuid
> 44454C4C-3700-1047-8048-C3C04F4C4631
> host002:
> cat /sys/devices/virtual/dmi/id/product_uuid
> 44454C4C-5900-1054-8034-B3C04F4E4631
> Those match what I see on the webadmin under hosts > hostname > hardware
> information.
>
>> /sys/class/dmi/id/product_uuid
> host001:
> cat /sys/class/dmi/id/product_uuid
> 44454C4C-3700-1047-8048-C3C04F4C4631
> host002:
> cat /sys/class/dmi/id/product_uuid
> 44454C4C-5900-1054-8034-B3C04F4E4631
>
Oops, there's something more wrong than I thought. Did you manage to
catch libvirtd logs this time?
Martin