On Wed, Oct 2, 2013 at 12:07 AM, Jason Brooks wrote:
> I'm having this issue on my ovirt 3.3 setup (two node, one is AIO,
> GlusterFS storage, both on F19) as well.
>
> Jason
Me too, with an oVirt 3.3 setup and a GlusterFS datacenter:
one dedicated engine + 2 VDSM hosts, all Fedora 19 + oVirt stable.
The VM I'm trying to migrate is CentOS 6.4, fully updated.
CentOS qemu/libvirt versions don't support the GlusterFS storage domain yet;
6.5 would be needed. We're trying to find a solution until then.
After trying to migrate in the webadmin GUI I get:
VM c6s is down. Exit message: 'iface'.
For a few moments the VM appears as down in the GUI, but an ssh
session I had opened earlier is still alive, and the qemu process on
the source host is still there.
After a while the VM shows as "Up" again on the source node in the GUI.
In fact it never stopped:
[root@c6s ~]# uptime
16:36:56 up 19 min, 1 user, load average: 0.00, 0.00, 0.00
On the target there are many errors such as:
Thread-8609::ERROR::2013-10-03
16:17:13,086::task::850::TaskManager.Task::(_setError)
Task=`a102541e-fbe5-46a3-958c-e5f4026cac8c`::Unexpected error
Traceback (most recent call last):
File "/usr/share/vdsm/storage/task.py", line 857, in _run
return fn(*args, **kargs)
File "/usr/share/vdsm/logUtils.py", line 45, in wrapper
res = f(*args, **kwargs)
File "/usr/share/vdsm/storage/hsm.py", line 2123, in getAllTasksStatuses
allTasksStatus = sp.getAllTasksStatuses()
File "/usr/share/vdsm/storage/securable.py", line 66, in wrapper
raise SecureError()
SecureError
But perhaps more relevant is what happens around the time of the migration (16:33-16:34):
Thread-9968::DEBUG::2013-10-03
16:33:38,811::task::1168::TaskManager.Task::(prepare)
Task=`6c1d3161-edcd-4344-a32a-4a18f75f5ba3`::finished: {'taskStatus':
{'code': 0, 'message': 'Task is initializing',
'taskState': 'running',
'taskResult': '', 'taskID':
'0eaac2f3-3d25-4c8c-9738-708aba290404'}}
Thread-9968::DEBUG::2013-10-03
16:33:38,811::task::579::TaskManager.Task::(_updateState)
Task=`6c1d3161-edcd-4344-a32a-4a18f75f5ba3`::moving from state
preparing -> state finished
Thread-9968::DEBUG::2013-10-03
16:33:38,811::resourceManager::939::ResourceManager.Owner::(releaseAll)
Owner.releaseAll requests {} resources {}
Thread-9968::DEBUG::2013-10-03
16:33:38,812::resourceManager::976::ResourceManager.Owner::(cancelAll)
Owner.cancelAll requests {}
Thread-9968::DEBUG::2013-10-03
16:33:38,812::task::974::TaskManager.Task::(_decref)
Task=`6c1d3161-edcd-4344-a32a-4a18f75f5ba3`::ref 0 aborting False
0eaac2f3-3d25-4c8c-9738-708aba290404::ERROR::2013-10-03
16:33:38,847::task::850::TaskManager.Task::(_setError)
Task=`0eaac2f3-3d25-4c8c-9738-708aba290404`::Unexpected error
Traceback (most recent call last):
File "/usr/share/vdsm/storage/task.py", line 857, in _run
return fn(*args, **kargs)
File "/usr/share/vdsm/storage/task.py", line 318, in run
return self.cmd(*self.argslist, **self.argsdict)
File "/usr/share/vdsm/storage/sp.py", line 272, in startSpm
self.masterDomain.acquireHostId(self.id)
File "/usr/share/vdsm/storage/sd.py", line 458, in acquireHostId
self._clusterLock.acquireHostId(hostId, async)
File "/usr/share/vdsm/storage/clusterlock.py", line 189, in acquireHostId
raise se.AcquireHostIdFailure(self._sdUUID, e)
AcquireHostIdFailure: Cannot acquire host id:
('d0b96d4a-62aa-4e9f-b50e-f7a0cb5be291', SanlockException(5, 'Sanlock
lockspace add failure', 'Input/output error'))
0eaac2f3-3d25-4c8c-9738-708aba290404::DEBUG::2013-10-03
16:33:38,847::task::869::TaskManager.Task::(_run)
Task=`0eaac2f3-3d25-4c8c-9738-708aba290404`::Task._run:
0eaac2f3-3d25-4c8c-9738-708aba290404 () {} failed - stopping task
0eaac2f3-3d25-4c8c-9738-708aba290404::DEBUG::2013-10-03
16:33:38,847::task::1194::TaskManager.Task::(stop)
Task=`0eaac2f3-3d25-4c8c-9738-708aba290404`::stopping in state running
(force False)
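The AcquireHostIdFailure above means sanlock got an I/O error ('Input/output error') while adding the lockspace on the target. A minimal sketch of the checks I would run on the target host, assuming only the storage domain UUID and gluster mount path shown in the logs above:

```shell
# Storage domain UUID and gluster mount, taken from the logs above
SD_UUID=d0b96d4a-62aa-4e9f-b50e-f7a0cb5be291
MNT=/rhev/data-center/mnt/glusterSD/f18ovn01.mydomain:gvdata

# Show which lockspaces sanlock currently holds on this host
sanlock client status

# sanlock accesses the ids file with O_DIRECT; an EIO on this read
# would point at the gluster mount rather than at sanlock itself
dd iflag=direct if=$MNT/$SD_UUID/dom_md/ids bs=512 count=1 of=/dev/null
```

If the direct-I/O read of the ids file fails only on one host, that would explain why that host cannot acquire a host id there.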
On the source, instead:
Thread-10402::ERROR::2013-10-03
16:35:03,713::vm::244::vm.Vm::(_recover)
vmId=`4147e0d3-19a7-447b-9d88-2ff19365bec0`::migration destination
error: Error creating the requested VM
Thread-10402::ERROR::2013-10-03 16:35:03,740::vm::324::vm.Vm::(run)
vmId=`4147e0d3-19a7-447b-9d88-2ff19365bec0`::Failed to migrate
Traceback (most recent call last):
File "/usr/share/vdsm/vm.py", line 311, in run
self._startUnderlyingMigration()
File "/usr/share/vdsm/vm.py", line 347, in _startUnderlyingMigration
response['status']['message'])
RuntimeError: migration destination error: Error creating the requested VM
Thread-1161::DEBUG::2013-10-03
16:35:04,243::fileSD::238::Storage.Misc.excCmd::(getReadDelay)
'/usr/bin/dd iflag=direct
if=/rhev/data-center/mnt/glusterSD/f18ovn01.mydomain:gvdata/d0b96d4a-62aa-4e9f-b50e-f7a0cb5be291/dom_md/metadata
bs=4096 count=1' (cwd None)
Thread-1161::DEBUG::2013-10-03
16:35:04,262::fileSD::238::Storage.Misc.excCmd::(getReadDelay)
SUCCESS: <err> = '0+1 records in\n0+1 records out\n512 bytes (512 B)
copied, 0.0015976 s, 320 kB/s\n'; <rc> = 0
Thread-1161::INFO::2013-10-03
16:35:04,269::clusterlock::174::SANLock::(acquireHostId) Acquiring
host id for domain d0b96d4a-62aa-4e9f-b50e-f7a0cb5be291 (id: 2)
Thread-1161::DEBUG::2013-10-03
16:35:04,270::clusterlock::192::SANLock::(acquireHostId) Host id for
domain d0b96d4a-62aa-4e9f-b50e-f7a0cb5be291 successfully acquired (id:
2)
Full logs here:
https://docs.google.com/file/d/0BwoPbcrMv8mvNVFBaGhDeUdMOFE/edit?usp=sharing
in the zip file:
engine.log
vdsm_source.log
vdsm_target.log
ovirt gluster 3.3 datacenter status.png
(note that many times, from a GUI point of view, the DC seems to be set to
non-responsive...?)
One strange thing:
in Clusters --> Gluster01 --> Volumes I see my gluster volume (gvdata) as up,
while in DC --> Gluster (my DC name) --> Storage --> gvdata (my storage
domain name) it is marked as down, yet the related CentOS 6 VM is still
up and running???
source:
[root@f18ovn03 vdsm]# df -h | grep rhev
f18ovn01.mydomain:gvdata 30G 4.7G 26G 16%
/rhev/data-center/mnt/glusterSD/f18ovn01.mydomain:gvdata
f18engine.mydomain:/var/lib/exports/iso 35G 9.4G 24G 29%
/rhev/data-center/mnt/f18engine.mydomain:_var_lib_exports_iso
target:
[root@f18ovn01 vdsm]# df -h | grep rhev
f18ovn01.mydomain:gvdata 30G 4.7G 26G 16%
/rhev/data-center/mnt/glusterSD/f18ovn01.mydomain:gvdata
f18engine.mydomain:/var/lib/exports/iso 35G 9.4G 24G 29%
/rhev/data-center/mnt/f18engine.mydomain:_var_lib_exports_iso
I didn't try to activate it again because I didn't want to make things worse:
it seems strange to me that both hosts are marked as up while the only
existing storage domain (marked as "Data (Master)") in the GUI is
down.....
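To reconcile the volume showing as up in the cluster view while the storage domain shows as down, some gluster-side checks might help (just a sketch; the volume name is the one from above):

```shell
VOL=gvdata   # gluster volume name from above

# Per-brick status of the volume: are all brick processes online?
gluster volume status $VOL

# Entries pending self-heal, which can cause I/O errors on one node
gluster volume heal $VOL info
```

An offline brick or pending heals on the brick behind one host would fit the pattern of sanlock failing on one node only.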
Gianluca
_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users