[Users] Migration issues with ovirt 3.3

Itamar Heim iheim at redhat.com
Fri Oct 4 09:29:22 UTC 2013


On 10/03/2013 06:12 PM, Gianluca Cecchi wrote:
> On Wed, Oct 2, 2013 at 12:07 AM, Jason Brooks wrote:
>> I'm having this issue on my ovirt 3.3 setup (two node, one is AIO,
>> GlusterFS storage, both on F19) as well.
>>
>> Jason
>
> Me too, with an oVirt 3.3 setup and a GlusterFS DataCenter:
> one dedicated engine + 2 vdsm hosts, all Fedora 19 + oVirt stable.
> The VM I'm trying to migrate is CentOS 6.4, fully updated.

The CentOS qemu/libvirt versions don't support the GlusterFS storage domain yet; 
CentOS 6.5 would be needed. We're trying to find a solution until then.
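
If you want to check what a given host actually has before retrying, here is a
minimal sketch using the libvirt Python bindings; the minimum version numbers
in it are illustrative placeholders, not official requirements:

# Sketch: query the libvirt and qemu versions on a host. Versions are
# encoded as major*1000000 + minor*1000 + release.
import libvirt

MIN_LIBVIRT = 1 * 1000000 + 0 * 1000 + 0   # placeholder minimum, not authoritative
MIN_QEMU = 1 * 1000000 + 3 * 1000 + 0      # placeholder minimum, not authoritative

conn = libvirt.open("qemu:///system")
try:
    libvirt_ver = conn.getLibVersion()   # libvirt version on this host
    qemu_ver = conn.getVersion()         # hypervisor (qemu) version
    print("libvirt: %d  qemu: %d" % (libvirt_ver, qemu_ver))
    if libvirt_ver < MIN_LIBVIRT or qemu_ver < MIN_QEMU:
        print("this host likely cannot use gluster:// network disks")
finally:
    conn.close()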

> After trying to migrate in the webadmin GUI I get:
> VM c6s is down. Exit message: 'iface'.
>
> For a few moments the VM appears as down in the GUI, but an ssh
> session I had opened before is still alive.
> The qemu process on the source host is also still there.
> After a little while, the VM shows as "Up" again on the source node in the GUI.
>
> Actually it never stopped:
> [root@c6s ~]# uptime
>   16:36:56 up 19 min,  1 user,  load average: 0.00, 0.00, 0.00
>
>
> At the target, many errors such as:
> Thread-8609::ERROR::2013-10-03
> 16:17:13,086::task::850::TaskManager.Task::(_setError)
> Task=`a102541e-fbe5-46a3-958c-e5f4026cac8c`::Unexpected error
> Traceback (most recent call last):
>    File "/usr/share/vdsm/storage/task.py", line 857, in _run
>      return fn(*args, **kargs)
>    File "/usr/share/vdsm/logUtils.py", line 45, in wrapper
>      res = f(*args, **kwargs)
>    File "/usr/share/vdsm/storage/hsm.py", line 2123, in getAllTasksStatuses
>      allTasksStatus = sp.getAllTasksStatuses()
>    File "/usr/share/vdsm/storage/securable.py", line 66, in wrapper
>      raise SecureError()
> SecureError
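
(For context: that SecureError comes from vdsm's guard around SPM-only pool
methods; they refuse to run while the pool is not secured, i.e. while the host
is not the SPM. A simplified sketch of the pattern, not vdsm's actual code:)

# Simplified illustration of the guard behind the SecureError above.
import functools

class SecureError(Exception):
    pass

def secured(fn):
    """Allow the call only when the object is in 'secured' (SPM) state."""
    @functools.wraps(fn)
    def wrapper(self, *args, **kwargs):
        if not getattr(self, "_secured", False):
            raise SecureError()
        return fn(self, *args, **kwargs)
    return wrapper

class StoragePool(object):
    def __init__(self):
        self._secured = False   # becomes True only after startSpm succeeds

    @secured
    def getAllTasksStatuses(self):
        return {}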
>
> But perhaps the relevant part is around the time of the migration (16:33-16:34):
>
> Thread-9968::DEBUG::2013-10-03
> 16:33:38,811::task::1168::TaskManager.Task::(prepare)
> Task=`6c1d3161-edcd-4344-a32a-4a18f75f5ba3`::finished: {'taskStatus':
> {'code': 0, 'message': 'Task is initializing', 'taskState': 'running',
> 'taskResult': '', 'taskID': '0eaac2f3-3d25-4c8c-9738-708aba290404'}}
> Thread-9968::DEBUG::2013-10-03
> 16:33:38,811::task::579::TaskManager.Task::(_updateState)
> Task=`6c1d3161-edcd-4344-a32a-4a18f75f5ba3`::moving from state
> preparing -> state finished
> Thread-9968::DEBUG::2013-10-03
> 16:33:38,811::resourceManager::939::ResourceManager.Owner::(releaseAll)
> Owner.releaseAll requests {} resources {}
> Thread-9968::DEBUG::2013-10-03
> 16:33:38,812::resourceManager::976::ResourceManager.Owner::(cancelAll)
> Owner.cancelAll requests {}
> Thread-9968::DEBUG::2013-10-03
> 16:33:38,812::task::974::TaskManager.Task::(_decref)
> Task=`6c1d3161-edcd-4344-a32a-4a18f75f5ba3`::ref 0 aborting False
> 0eaac2f3-3d25-4c8c-9738-708aba290404::ERROR::2013-10-03
> 16:33:38,847::task::850::TaskManager.Task::(_setError)
> Task=`0eaac2f3-3d25-4c8c-9738-708aba290404`::Unexpected error
> Traceback (most recent call last):
>    File "/usr/share/vdsm/storage/task.py", line 857, in _run
>      return fn(*args, **kargs)
>    File "/usr/share/vdsm/storage/task.py", line 318, in run
>      return self.cmd(*self.argslist, **self.argsdict)
>    File "/usr/share/vdsm/storage/sp.py", line 272, in startSpm
>      self.masterDomain.acquireHostId(self.id)
>    File "/usr/share/vdsm/storage/sd.py", line 458, in acquireHostId
>      self._clusterLock.acquireHostId(hostId, async)
>    File "/usr/share/vdsm/storage/clusterlock.py", line 189, in acquireHostId
>      raise se.AcquireHostIdFailure(self._sdUUID, e)
> AcquireHostIdFailure: Cannot acquire host id:
> ('d0b96d4a-62aa-4e9f-b50e-f7a0cb5be291', SanlockException(5, 'Sanlock
> lockspace add failure', 'Input/output error'))
> 0eaac2f3-3d25-4c8c-9738-708aba290404::DEBUG::2013-10-03
> 16:33:38,847::task::869::TaskManager.Task::(_run)
> Task=`0eaac2f3-3d25-4c8c-9738-708aba290404`::Task._run:
> 0eaac2f3-3d25-4c8c-9738-708aba290404 () {} failed - stopping task
> 0eaac2f3-3d25-4c8c-9738-708aba290404::DEBUG::2013-10-03
> 16:33:38,847::task::1194::TaskManager.Task::(stop)
> Task=`0eaac2f3-3d25-4c8c-9738-708aba290404`::stopping in state running
> (force False)
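
(The EIO from sanlock is the interesting part here: adding the lockspace means
reading and writing the domain's ids file on the gluster mount with direct I/O,
and that I/O failed on the target. Roughly, and much simplified from what
clusterlock.acquireHostId does, assuming the sanlock Python bindings:)

# Simplified sketch of acquiring a host id via sanlock; not vdsm verbatim.
import sanlock

class AcquireHostIdFailure(Exception):
    pass

def acquire_host_id(sd_uuid, host_id, ids_path):
    # ids_path would be the 'ids' file under the domain's dom_md directory
    try:
        sanlock.add_lockspace(sd_uuid, host_id, ids_path)
    except sanlock.SanlockException as e:
        # errno 5 (EIO), as in the log above, means the ids file could not
        # be accessed on the underlying mount
        raise AcquireHostIdFailure(sd_uuid, e)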
>
> At the source, instead:
>
> Thread-10402::ERROR::2013-10-03
> 16:35:03,713::vm::244::vm.Vm::(_recover)
> vmId=`4147e0d3-19a7-447b-9d88-2ff19365bec0`::migration destination
> error: Error creating the requested VM
> Thread-10402::ERROR::2013-10-03 16:35:03,740::vm::324::vm.Vm::(run)
> vmId=`4147e0d3-19a7-447b-9d88-2ff19365bec0`::Failed to migrate
> Traceback (most recent call last):
>    File "/usr/share/vdsm/vm.py", line 311, in run
>      self._startUnderlyingMigration()
>    File "/usr/share/vdsm/vm.py", line 347, in _startUnderlyingMigration
>      response['status']['message'])
> RuntimeError: migration destination error: Error creating the requested VM
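
(That RuntimeError is just the source side reporting the destination's failure:
the source asks the destination vdsm to prepare the incoming VM and aborts the
migration when that call returns an error status. Schematically, with the call
and parameter names here being illustrative only:)

# Schematic view of the source-side step that raised the RuntimeError above.
def start_underlying_migration(dest_server, vm_params):
    # ask the destination host to create the incoming VM
    response = dest_server.migrationCreate(vm_params)
    if response['status']['code'] != 0:
        raise RuntimeError('migration destination error: '
                           + response['status']['message'])
    # ... only then would the actual libvirt migration be started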
> Thread-1161::DEBUG::2013-10-03
> 16:35:04,243::fileSD::238::Storage.Misc.excCmd::(getReadDelay)
> '/usr/bin/dd iflag=direct
> if=/rhev/data-center/mnt/glusterSD/f18ovn01.mydomain:gvdata/d0b96d4a-62aa-4e9f-b50e-f7a0cb5be291/dom_md/metadata
> bs=4096 count=1' (cwd None)
> Thread-1161::DEBUG::2013-10-03
> 16:35:04,262::fileSD::238::Storage.Misc.excCmd::(getReadDelay)
> SUCCESS: <err> = '0+1 records in\n0+1 records out\n512 bytes (512 B)
> copied, 0.0015976 s, 320 kB/s\n'; <rc> = 0
> Thread-1161::INFO::2013-10-03
> 16:35:04,269::clusterlock::174::SANLock::(acquireHostId) Acquiring
> host id for domain d0b96d4a-62aa-4e9f-b50e-f7a0cb5be291 (id: 2)
> Thread-1161::DEBUG::2013-10-03
> 16:35:04,270::clusterlock::192::SANLock::(acquireHostId) Host id for
> domain d0b96d4a-62aa-4e9f-b50e-f7a0cb5be291 successfully acquired (id:
> 2)
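
(The getReadDelay lines above are the periodic domain health probe: a small
direct-I/O read of the domain metadata, timed from dd's output, and here it
succeeds on the source. A rough sketch of that probe, using the path from the
log and simplified parsing:)

# Rough sketch of the read-delay probe seen in the log above.
import re
import subprocess

METADATA = ("/rhev/data-center/mnt/glusterSD/f18ovn01.mydomain:gvdata/"
            "d0b96d4a-62aa-4e9f-b50e-f7a0cb5be291/dom_md/metadata")

def read_delay(path=METADATA):
    out = subprocess.run(
        ["dd", "iflag=direct", "if=" + path, "bs=4096", "count=1"],
        stderr=subprocess.PIPE, check=True).stderr.decode()
    # dd reports e.g. "... copied, 0.0015976 s, 320 kB/s" on stderr
    m = re.search(r"copied, ([\d.]+) s", out)
    return float(m.group(1)) if m else None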
>
>
> Full logs here:
> https://docs.google.com/file/d/0BwoPbcrMv8mvNVFBaGhDeUdMOFE/edit?usp=sharing
>
> in the zip file:
> engine.log
> vdsm_source.log
> vdsm_target.log
> ovirt gluster 3.3 datacenter status.png
> (note that, as on many other occasions, from a GUI point of view the
> DC seems to be set to non-responsive...?)
>
> One strange thing:
> In Clusters --> Gluster01 --> Volumes I see my gluster volume (gvdata) as up,
>
> while in DC --> Gluster (my DC name) --> Storage --> gvdata (my storage
> domain name) it is marked as down, yet the related CentOS 6 VM is still
> up and running???
>
> source:
> [root@f18ovn03 vdsm]# df -h | grep rhev
> f18ovn01.mydomain:gvdata                  30G  4.7G   26G  16%
> /rhev/data-center/mnt/glusterSD/f18ovn01.mydomain:gvdata
> f18engine.mydomain:/var/lib/exports/iso   35G  9.4G   24G  29%
> /rhev/data-center/mnt/f18engine.mydomain:_var_lib_exports_iso
>
> target:
> [root@f18ovn01 vdsm]# df -h | grep rhev
> f18ovn01.mydomain:gvdata                  30G  4.7G   26G  16%
> /rhev/data-center/mnt/glusterSD/f18ovn01.mydomain:gvdata
> f18engine.mydomain:/var/lib/exports/iso   35G  9.4G   24G  29%
> /rhev/data-center/mnt/f18engine.mydomain:_var_lib_exports_iso
>
> I didn't try to activate it again because I didn't want to make things worse:
> it seems strange to me that both hosts are marked as up while the only
> existing Storage Domain (marked as "Data (Master)") in the GUI is
> down.....
>
> Gianluca
> _______________________________________________
> Users mailing list
> Users at ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>



