Exception during VM recovery causes VMs not being properly recovered

Hi,

With the current master of VDSM, after restarting VDSM (e.g. after upgrading) I noticed that the VMs were not properly initialized and were left in PAUSED state. On checking the logs I found that the cause was here:

Thread-13::INFO::2014-07-10 12:11:56,400::vm::2244::vm.Vm::(_startUnderlyingVm) vmId=`db614831-3b4b-4010-a989-f7a5ae6fa5d0`::Skipping errors on recovery
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/vm.py", line 2228, in _startUnderlyingVm
    self._run()
  File "/usr/share/vdsm/virt/vm.py", line 3312, in _run
    self._domDependentInit()
  File "/usr/share/vdsm/virt/vm.py", line 3204, in _domDependentInit
    self._syncVolumeChain(drive)
  File "/usr/share/vdsm/virt/vm.py", line 5686, in _syncVolumeChain
    volumes = self._driveGetActualVolumeChain(drive)
  File "/usr/share/vdsm/virt/vm.py", line 5665, in _driveGetActualVolumeChain
    sourceAttr = ('file', 'dev')[drive.blockDev]
TypeError: tuple indices must be integers, not NoneType

The reason seems to be this:

Thread-13::DEBUG::2014-07-10 12:11:56,393::vm::1349::vm.Vm::(blockDev) vmId=`db614831-3b4b-4010-a989-f7a5ae6fa5d0`::Unable to determine if the path '/rhev/data-center/00000002-0002-0002-0002-000000000002/41b6de4e-23da-481d-904d-9af24fc5f3ab/images/17206f99-38ab-45bc-ae9b-d36a66b00e4c/7b05de43-9d85-435f-8ae9-6ccde21548e4' is a block device
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/vm.py", line 1346, in blockDev
    self._blockDev = utils.isBlockDevice(self.path)
  File "/usr/lib64/python2.6/site-packages/vdsm/utils.py", line 99, in isBlockDevice
    return stat.S_ISBLK(os.stat(path).st_mode)
OSError: [Errno 2] No such file or directory: '/rhev/data-center/00000002-0002-0002-0002-000000000002/41b6de4e-23da-481d-904d-9af24fc5f3ab/images/17206f99-38ab-45bc-ae9b-d36a66b00e4c/7b05de43-9d85-435f-8ae9-6ccde21548e4'

I am running the host on RHEL 6.5.

Note: I just rebooted the host and started a few more VMs, and when I restart VDSM I get the same errors again.
--
Regards,
Vinzenz Feenstra | Senior Software Engineer
RedHat Engineering Virtualization R & D
Phone: +420 532 294 625
IRC: vfeenstr or evilissimo
Better technology. Faster innovation. Powered by community collaboration.
See how it works at redhat.com
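
The two tracebacks in the report chain together: when os.stat() fails on a missing path, the blockDev lookup logs the error and yields None, and that None is then used as a tuple index. The following is a minimal sketch of that failure mode, not the VDSM code itself. The Drive class and the is_block_device, source_attr and source_attr_checked helpers are hypothetical stand-ins; only the ('file', 'dev')[drive.blockDev] expression and the stat.S_ISBLK(os.stat(path).st_mode) check come from the logs.

```python
import os
import stat


def is_block_device(path):
    """Return True if path is a block device; raises OSError if it is missing."""
    return stat.S_ISBLK(os.stat(path).st_mode)


class Drive(object):
    """Hypothetical stand-in for a VM drive object (not the real VDSM class)."""

    def __init__(self, path):
        self.path = path
        self._blockDev = None

    @property
    def blockDev(self):
        # Assumed behaviour, inferred from the DEBUG log: a failed stat is
        # swallowed and the property stays None instead of True/False.
        if self._blockDev is None:
            try:
                self._blockDev = is_block_device(self.path)
            except OSError:
                pass
        return self._blockDev


def source_attr(drive):
    # Same expression as in the traceback: works for True/False,
    # raises TypeError when blockDev is None.
    return ('file', 'dev')[drive.blockDev]


def source_attr_checked(drive):
    # Hypothetical guarded variant: report "unknown" as None so the
    # caller can decide to skip the drive instead of crashing.
    block_dev = drive.blockDev
    if block_dev is None:
        return None
    return 'dev' if block_dev else 'file'
```

With a path that does not exist, source_attr() raises TypeError as in the report, while source_attr_checked() returns None and leaves the decision to the caller. Whether skipping the drive is the right policy during recovery is a separate question that this sketch does not answer.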

On 10/07/14 13:35 +0200, Vinzenz Feenstra wrote:
> Hi,
> With the current master of VDSM, after restarting VDSM (e.g. after upgrading) I noticed that the VMs were not properly initialized and were left in PAUSED state. On checking the logs I found that the cause was here:

Thanks for the report and sorry about the problem. This is related to my earlier thread about VM recovery. I am working with Federico to resolve it. We need to find a cleaner way to handle our corner-case negative flows. For now, you can disable the _syncVolumeChain call in recovery to avoid the problem.
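
As a rough illustration of that workaround, the sketch below skips the volume chain sync while a VM is being recovered. It is an assumption-laden stand-in, not a patch against vm.py: dom_dependent_init, the recovering flag and the sync_volume_chain callback are hypothetical names, and the real code may signal recovery differently.

```python
def dom_dependent_init(drives, recovering, sync_volume_chain):
    """Hypothetical stand-in for the per-drive part of _domDependentInit.

    Returns the list of drives whose volume chain was synced.
    """
    synced = []
    for drive in drives:
        if recovering:
            # Workaround: do not sync the volume chain during recovery,
            # so a drive with an unresolvable path cannot raise here.
            continue
        sync_volume_chain(drive)
        synced.append(drive)
    return synced
```

The trade-off is that a recovered VM keeps whatever volume chain information it had before the restart until a later sync runs.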
-- Adam Litke