[Users] Local storage domain fails to attach after host reboot
Patrick Hurrelmann
patrick.hurrelmann at lobster.de
Fri Jan 25 17:13:05 UTC 2013
On 24.01.2013 18:05, Patrick Hurrelmann wrote:
> Hi list,
>
> after rebooting one host (single-host DC with local storage), the local
> storage domain can't be attached again. The host was set to maintenance
> mode and all running VMs were shut down prior to the reboot.
>
> Vdsm keeps logging the following errors:
>
> Thread-1266::ERROR::2013-01-24 17:51:46,042::task::853::TaskManager.Task::(_setError)
> Task=`a0c11f61-8bcf-4f76-9923-43e8b9cc1424`::Unexpected error
> Traceback (most recent call last):
>   File "/usr/share/vdsm/storage/task.py", line 861, in _run
>     return fn(*args, **kargs)
>   File "/usr/share/vdsm/logUtils.py", line 38, in wrapper
>     res = f(*args, **kwargs)
>   File "/usr/share/vdsm/storage/hsm.py", line 817, in connectStoragePool
>     return self._connectStoragePool(spUUID, hostID, scsiKey, msdUUID, masterVersion, options)
>   File "/usr/share/vdsm/storage/hsm.py", line 859, in _connectStoragePool
>     res = pool.connect(hostID, scsiKey, msdUUID, masterVersion)
>   File "/usr/share/vdsm/storage/sp.py", line 641, in connect
>     self.__rebuild(msdUUID=msdUUID, masterVersion=masterVersion)
>   File "/usr/share/vdsm/storage/sp.py", line 1109, in __rebuild
>     self.masterDomain = self.getMasterDomain(msdUUID=msdUUID, masterVersion=masterVersion)
>   File "/usr/share/vdsm/storage/sp.py", line 1448, in getMasterDomain
>     raise se.StoragePoolMasterNotFound(self.spUUID, msdUUID)
> StoragePoolMasterNotFound: Cannot find master domain: 'spUUID=c9b86219-0d51-44c3-a7de-e0fe07e2c9e6, msdUUID=00ed91f3-43be-41be-8c05-f3786588a1ad'
>
> and
>
> Thread-1268::ERROR::2013-01-24 17:51:49,073::task::853::TaskManager.Task::(_setError)
> Task=`95b7f58b-afe0-47bd-9ebd-21d3224f5165`::Unexpected error
> Traceback (most recent call last):
>   File "/usr/share/vdsm/storage/task.py", line 861, in _run
>     return fn(*args, **kargs)
>   File "/usr/share/vdsm/logUtils.py", line 38, in wrapper
>     res = f(*args, **kwargs)
>   File "/usr/share/vdsm/storage/hsm.py", line 528, in getSpmStatus
>     pool = self.getPool(spUUID)
>   File "/usr/share/vdsm/storage/hsm.py", line 265, in getPool
>     raise se.StoragePoolUnknown(spUUID)
> StoragePoolUnknown: Unknown pool id, pool not connected: ('c9b86219-0d51-44c3-a7de-e0fe07e2c9e6',)
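>
> [Aside, not part of the original report: the two UUIDs in these tracebacks identify the storage pool (spUUID) and the master storage domain (msdUUID) that vdsm can no longer find. A small illustrative sketch for pulling both identifiers out of such an error line:]

```python
import re

# Final line of the first traceback, verbatim.
LOG_LINE = ("StoragePoolMasterNotFound: Cannot find master domain: "
            "'spUUID=c9b86219-0d51-44c3-a7de-e0fe07e2c9e6, "
            "msdUUID=00ed91f3-43be-41be-8c05-f3786588a1ad'")

def extract_uuids(line):
    """Return a dict mapping each *UUID field name to its UUID value."""
    pattern = r"(\w*UUID)=([0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})"
    return dict(re.findall(pattern, line))

uuids = extract_uuids(LOG_LINE)
print(uuids["spUUID"], uuids["msdUUID"])
```

> [The msdUUID printed here is the domain whose metadata vdsm expects to find on the host's local storage path.]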
>
> while engine logs:
>
> 2013-01-24 17:51:46,050 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (QuartzScheduler_Worker-43) [49026692] Command org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStoragePoolVDSCommand return value
> Class Name: org.ovirt.engine.core.vdsbroker.vdsbroker.StatusOnlyReturnForXmlRpc
> mStatus Class Name: org.ovirt.engine.core.vdsbroker.vdsbroker.StatusForXmlRpc
> mCode 304
> mMessage Cannot find master domain: 'spUUID=c9b86219-0d51-44c3-a7de-e0fe07e2c9e6, msdUUID=00ed91f3-43be-41be-8c05-f3786588a1ad'
>
>
> Vdsm and engine logs are also attached. I set the affected host back to
> maintenance. How can I recover from this and attach the storage domain
> again? If more information is needed, please do not hesitate to request it.
>
> This is on CentOS 6.3 using Dreyou's rpms. Installed versions on host:
>
> vdsm.x86_64 4.10.0-0.44.14.el6
> vdsm-cli.noarch 4.10.0-0.44.14.el6
> vdsm-python.x86_64 4.10.0-0.44.14.el6
> vdsm-xmlrpc.noarch 4.10.0-0.44.14.el6
>
> Engine:
>
> ovirt-engine.noarch 3.1.0-3.19.el6
> ovirt-engine-backend.noarch 3.1.0-3.19.el6
> ovirt-engine-cli.noarch 3.1.0.7-1.el6
> ovirt-engine-config.noarch 3.1.0-3.19.el6
> ovirt-engine-dbscripts.noarch 3.1.0-3.19.el6
> ovirt-engine-genericapi.noarch 3.1.0-3.19.el6
> ovirt-engine-jbossas711.x86_64 1-0
> ovirt-engine-notification-service.noarch 3.1.0-3.19.el6
> ovirt-engine-restapi.noarch 3.1.0-3.19.el6
> ovirt-engine-sdk.noarch 3.1.0.5-1.el6
> ovirt-engine-setup.noarch 3.1.0-3.19.el6
> ovirt-engine-tools-common.noarch 3.1.0-3.19.el6
> ovirt-engine-userportal.noarch 3.1.0-3.19.el6
> ovirt-engine-webadmin-portal.noarch 3.1.0-3.19.el6
> ovirt-image-uploader.noarch 3.1.0-16.el6
> ovirt-iso-uploader.noarch 3.1.0-16.el6
> ovirt-log-collector.noarch 3.1.0-16.el6
>
>
> Thanks and regards
> Patrick
OK, managed to solve it. I force-removed the datacenter and reinstalled
the host. I added a new local storage domain to it and re-created the VMs
(the disk images were moved and renamed from the old, non-working local
storage). So this host is up and running again.
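
[For anyone hitting the same situation: the "moved and renamed" step above is essentially a copy of each disk image from the old domain's directory tree into the layout the new domain expects. The sketch below uses throwaway placeholder paths and UUIDs (none of them from this thread) just to show the shape of the operation:]

```shell
# Hypothetical sketch of the image move described above. A vdsm local
# storage domain keeps volumes under <domain>/images/<imgUUID>/<volUUID>.
# All paths and UUIDs below are placeholders, not values from this thread.
set -e
OLD_SD=$(mktemp -d)   # stands in for the old, defunct domain directory
NEW_SD=$(mktemp -d)   # stands in for the freshly created domain directory

# Fake one volume in the old domain's layout.
mkdir -p "$OLD_SD/images/old-img-uuid"
echo "disk-data" > "$OLD_SD/images/old-img-uuid/old-vol-uuid"

# Copy the volume into the new domain, renaming it to the image/volume
# UUIDs that the new domain's metadata expects.
mkdir -p "$NEW_SD/images/new-img-uuid"
cp "$OLD_SD/images/old-img-uuid/old-vol-uuid" \
   "$NEW_SD/images/new-img-uuid/new-vol-uuid"
```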
Regards
Patrick
--
Lobster LOGsuite GmbH, Münchner Straße 15a, D-82319 Starnberg
HRB 178831, Amtsgericht München
Geschäftsführer: Dr. Martin Fischer, Rolf Henrich