On Wed, Oct 2, 2013 at 12:07 AM, Jason Brooks wrote:
I'm having this issue on my oVirt 3.3 setup (two-node, one is AIO, GlusterFS storage, both on F19) as well.
Jason
Me too, with an oVirt 3.3 setup and a GlusterFS DataCenter:
one dedicated engine + 2 vdsm hosts, all Fedora 19 + ovirt stable.
The VM I'm trying to migrate is CentOS 6.4, fully updated.
After trying to migrate from the webadmin GUI I get:
VM c6s is down. Exit message: 'iface'.
For a few moments the VM appears as down in the GUI, but an ssh session I
had open before is still alive, and the qemu process on the source host is
still there.
After a little while the VM shows as "Up" again on the source node in the GUI.
Actually it never stopped:
[root@c6s ~]# uptime
16:36:56 up 19 min, 1 user, load average: 0.00, 0.00, 0.00
On the target host there are many errors such as:
Thread-8609::ERROR::2013-10-03
16:17:13,086::task::850::TaskManager.Task::(_setError)
Task=`a102541e-fbe5-46a3-958c-e5f4026cac8c`::Unexpected error
Traceback (most recent call last):
File "/usr/share/vdsm/storage/task.py", line 857, in _run
return fn(*args, **kargs)
File "/usr/share/vdsm/logUtils.py", line 45, in wrapper
res = f(*args, **kwargs)
File "/usr/share/vdsm/storage/hsm.py", line 2123, in getAllTasksStatuses
allTasksStatus = sp.getAllTasksStatuses()
File "/usr/share/vdsm/storage/securable.py", line 66, in wrapper
raise SecureError()
SecureError
But perhaps more relevant is what happens around the time of the migration (16:33-16:34):
Thread-9968::DEBUG::2013-10-03
16:33:38,811::task::1168::TaskManager.Task::(prepare)
Task=`6c1d3161-edcd-4344-a32a-4a18f75f5ba3`::finished: {'taskStatus':
{'code': 0, 'message': 'Task is initializing',
'taskState': 'running',
'taskResult': '', 'taskID':
'0eaac2f3-3d25-4c8c-9738-708aba290404'}}
Thread-9968::DEBUG::2013-10-03
16:33:38,811::task::579::TaskManager.Task::(_updateState)
Task=`6c1d3161-edcd-4344-a32a-4a18f75f5ba3`::moving from state
preparing -> state finished
Thread-9968::DEBUG::2013-10-03
16:33:38,811::resourceManager::939::ResourceManager.Owner::(releaseAll)
Owner.releaseAll requests {} resources {}
Thread-9968::DEBUG::2013-10-03
16:33:38,812::resourceManager::976::ResourceManager.Owner::(cancelAll)
Owner.cancelAll requests {}
Thread-9968::DEBUG::2013-10-03
16:33:38,812::task::974::TaskManager.Task::(_decref)
Task=`6c1d3161-edcd-4344-a32a-4a18f75f5ba3`::ref 0 aborting False
0eaac2f3-3d25-4c8c-9738-708aba290404::ERROR::2013-10-03
16:33:38,847::task::850::TaskManager.Task::(_setError)
Task=`0eaac2f3-3d25-4c8c-9738-708aba290404`::Unexpected error
Traceback (most recent call last):
File "/usr/share/vdsm/storage/task.py", line 857, in _run
return fn(*args, **kargs)
File "/usr/share/vdsm/storage/task.py", line 318, in run
return self.cmd(*self.argslist, **self.argsdict)
File "/usr/share/vdsm/storage/sp.py", line 272, in startSpm
self.masterDomain.acquireHostId(self.id)
File "/usr/share/vdsm/storage/sd.py", line 458, in acquireHostId
self._clusterLock.acquireHostId(hostId, async)
File "/usr/share/vdsm/storage/clusterlock.py", line 189, in acquireHostId
raise se.AcquireHostIdFailure(self._sdUUID, e)
AcquireHostIdFailure: Cannot acquire host id:
('d0b96d4a-62aa-4e9f-b50e-f7a0cb5be291', SanlockException(5, 'Sanlock
lockspace add failure', 'Input/output error'))
0eaac2f3-3d25-4c8c-9738-708aba290404::DEBUG::2013-10-03
16:33:38,847::task::869::TaskManager.Task::(_run)
Task=`0eaac2f3-3d25-4c8c-9738-708aba290404`::Task._run:
0eaac2f3-3d25-4c8c-9738-708aba290404 () {} failed - stopping task
0eaac2f3-3d25-4c8c-9738-708aba290404::DEBUG::2013-10-03
16:33:38,847::task::1194::TaskManager.Task::(stop)
Task=`0eaac2f3-3d25-4c8c-9738-708aba290404`::stopping in state running
(force False)
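The sanlock "Input/output error" when adding the lockspace makes me think the
target host cannot do direct I/O against the domain's lockspace file on the
gluster mount. If it helps, I can run something like the following on the
target (f18ovn01) and post the output -- just a sketch, assuming the lockspace
is the usual dom_md/ids file next to the dom_md/metadata one that vdsm reads
in the log below:

# current sanlock state and recent log entries
sanlock client status
tail -n 50 /var/log/sanlock.log
# direct-I/O read of the lockspace file, mirroring the dd vdsm uses for metadata
dd iflag=direct if=/rhev/data-center/mnt/glusterSD/f18ovn01.mydomain:gvdata/d0b96d4a-62aa-4e9f-b50e-f7a0cb5be291/dom_md/ids bs=4096 count=1 of=/dev/null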
On the source host instead:
Thread-10402::ERROR::2013-10-03
16:35:03,713::vm::244::vm.Vm::(_recover)
vmId=`4147e0d3-19a7-447b-9d88-2ff19365bec0`::migration destination
error: Error creating the requested VM
Thread-10402::ERROR::2013-10-03 16:35:03,740::vm::324::vm.Vm::(run)
vmId=`4147e0d3-19a7-447b-9d88-2ff19365bec0`::Failed to migrate
Traceback (most recent call last):
File "/usr/share/vdsm/vm.py", line 311, in run
self._startUnderlyingMigration()
File "/usr/share/vdsm/vm.py", line 347, in _startUnderlyingMigration
response['status']['message'])
RuntimeError: migration destination error: Error creating the requested VM
Thread-1161::DEBUG::2013-10-03
16:35:04,243::fileSD::238::Storage.Misc.excCmd::(getReadDelay)
'/usr/bin/dd iflag=direct
if=/rhev/data-center/mnt/glusterSD/f18ovn01.mydomain:gvdata/d0b96d4a-62aa-4e9f-b50e-f7a0cb5be291/dom_md/metadata
bs=4096 count=1' (cwd None)
Thread-1161::DEBUG::2013-10-03
16:35:04,262::fileSD::238::Storage.Misc.excCmd::(getReadDelay)
SUCCESS: <err> = '0+1 records in\n0+1 records out\n512 bytes (512 B)
copied, 0.0015976 s, 320 kB/s\n'; <rc> = 0
Thread-1161::INFO::2013-10-03
16:35:04,269::clusterlock::174::SANLock::(acquireHostId) Acquiring
host id for domain d0b96d4a-62aa-4e9f-b50e-f7a0cb5be291 (id: 2)
Thread-1161::DEBUG::2013-10-03
16:35:04,270::clusterlock::192::SANLock::(acquireHostId) Host id for
domain d0b96d4a-62aa-4e9f-b50e-f7a0cb5be291 successfully acquired (id:
2)
Full logs here:
https://docs.google.com/file/d/0BwoPbcrMv8mvNVFBaGhDeUdMOFE/edit?usp=sharing
in the zip file:
engine.log
vdsm_source.log
vdsm_target.log
ovirt gluster 3.3 datacenter status.png
(note that, as on several other occasions, from the GUI point of view the DC
seems to go non-responsive...?)
One strange thing:
in Clusters --> Gluster01 --> Volumes I see my gluster volume (gvdata) as up,
while in DC --> Gluster (my DC name) --> Storage, gvdata (my storage domain
name) is marked as down; yet the related C6 VM is still
up and running???
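If useful, I can also check the volume from the gluster side on one of the
nodes; a sketch of what I'd run, assuming the volume name gvdata as above:

# brick/process status and any pending self-heal entries
gluster volume status gvdata
gluster volume heal gvdata info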
source:
[root@f18ovn03 vdsm]# df -h | grep rhev
f18ovn01.mydomain:gvdata 30G 4.7G 26G 16%
/rhev/data-center/mnt/glusterSD/f18ovn01.mydomain:gvdata
f18engine.mydomain:/var/lib/exports/iso 35G 9.4G 24G 29%
/rhev/data-center/mnt/f18engine.mydomain:_var_lib_exports_iso
target:
[root@f18ovn01 vdsm]# df -h | grep rhev
f18ovn01.mydomain:gvdata 30G 4.7G 26G 16%
/rhev/data-center/mnt/glusterSD/f18ovn01.mydomain:gvdata
f18engine.mydomain:/var/lib/exports/iso 35G 9.4G 24G 29%
/rhev/data-center/mnt/f18engine.mydomain:_var_lib_exports_iso
I didn't try to activate it again because I didn't want to make things worse:
it seems strange to me that both hosts are marked as up while the only
existing Storage Domain (marked as "Data (Master)") in the GUI is
down...
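Before touching anything, would it make sense to query the domain directly
from each host with vdsClient? Something like this (assuming the sdUUID from
the logs above):

# ask vdsm what it thinks about the master domain and the storage repo
vdsClient -s 0 getStorageDomainInfo d0b96d4a-62aa-4e9f-b50e-f7a0cb5be291
vdsClient -s 0 repoStats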
Gianluca