
Hello,

I've been struggling to set up an oVirt cluster and am now bumping into this problem: when I try to create a new (Gluster) storage domain, it fails to attach to the data center. The error on the node, from vdsm.log:

Thread-13::DEBUG::2014-06-21 16:17:14,157::BindingXMLRPC::251::vds::(wrapper) client [192.168.10.119] flowID [6e44c0a3]
Thread-13::DEBUG::2014-06-21 16:17:14,159::task::595::TaskManager.Task::(_updateState) Task=`97b78287-45d2-4d5a-8336-460987df3840`::moving from state init -> state preparing
Thread-13::INFO::2014-06-21 16:17:14,160::logUtils::44::dispatcher::(wrapper) Run and protect: connectStorageServer(domType=7, spUUID='00000000-0000-0000-0000-000000000000', conList=[{'port': '', 'connection': '192.168.10.120:/vmimage', 'iqn': '', 'user': '', 'tpgt': '1', 'vfs_type': 'glusterfs', 'password': '******', 'id': '901b15ec-6b05-43c1-8a50-06b34c8ffdbd'}], options=None)
Thread-13::DEBUG::2014-06-21 16:17:14,172::hsm::2340::Storage.HSM::(__prefetchDomains) glusterDomPath: glusterSD/*
Thread-13::DEBUG::2014-06-21 16:17:14,185::hsm::2352::Storage.HSM::(__prefetchDomains) Found SD uuids: ('dc661957-c0c1-44ba-a5b9-e6558904207b',)
Thread-13::DEBUG::2014-06-21 16:17:14,185::hsm::2408::Storage.HSM::(connectStorageServer) knownSDs: {dc661957-c0c1-44ba-a5b9-e6558904207b: storage.glusterSD.findDomain}
Thread-13::INFO::2014-06-21 16:17:14,186::logUtils::47::dispatcher::(wrapper) Run and protect: connectStorageServer, Return response: {'statuslist': [{'status': 0, 'id': '901b15ec-6b05-43c1-8a50-06b34c8ffdbd'}]}
Thread-13::DEBUG::2014-06-21 16:17:14,186::task::1185::TaskManager.Task::(prepare) Task=`97b78287-45d2-4d5a-8336-460987df3840`::finished: {'statuslist': [{'status': 0, 'id': '901b15ec-6b05-43c1-8a50-06b34c8ffdbd'}]}
Thread-13::DEBUG::2014-06-21 16:17:14,187::task::595::TaskManager.Task::(_updateState) Task=`97b78287-45d2-4d5a-8336-460987df3840`::moving from state preparing -> state finished
Thread-13::DEBUG::2014-06-21 16:17:14,187::resourceManager::940::ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}
Thread-13::DEBUG::2014-06-21 16:17:14,187::resourceManager::977::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-13::DEBUG::2014-06-21 16:17:14,188::task::990::TaskManager.Task::(_decref) Task=`97b78287-45d2-4d5a-8336-460987df3840`::ref 0 aborting False
Thread-13::DEBUG::2014-06-21 16:17:14,195::BindingXMLRPC::251::vds::(wrapper) client [192.168.10.119] flowID [6e44c0a3]
Thread-13::DEBUG::2014-06-21 16:17:14,195::task::595::TaskManager.Task::(_updateState) Task=`d815e5e5-0202-4137-94be-21dc5e2b61c9`::moving from state init -> state preparing
Thread-13::INFO::2014-06-21 16:17:14,196::logUtils::44::dispatcher::(wrapper) Run and protect: createStoragePool(poolType=None, spUUID='806d2356-12cf-437c-8917-dd13ee823e36', poolName='testing', masterDom='dc661957-c0c1-44ba-a5b9-e6558904207b', domList=['dc661957-c0c1-44ba-a5b9-e6558904207b'], masterVersion=2, lockPolicy=None, lockRenewalIntervalSec=5, leaseTimeSec=60, ioOpTimeoutSec=10, leaseRetries=3, options=None)
Thread-13::DEBUG::2014-06-21 16:17:14,196::misc::756::SamplingMethod::(__call__) Trying to enter sampling method (storage.sdc.refreshStorage)
Thread-13::DEBUG::2014-06-21 16:17:14,197::misc::758::SamplingMethod::(__call__) Got in to sampling method
Thread-13::DEBUG::2014-06-21 16:17:14,197::misc::756::SamplingMethod::(__call__) Trying to enter sampling method (storage.iscsi.rescan)
Thread-13::DEBUG::2014-06-21 16:17:14,198::misc::758::SamplingMethod::(__call__) Got in to sampling method
Thread-13::DEBUG::2014-06-21 16:17:14,198::iscsi::407::Storage.ISCSI::(rescan) Performing SCSI scan, this will take up to 30 seconds
Thread-13::DEBUG::2014-06-21 16:17:14,199::iscsiadm::92::Storage.Misc.excCmd::(_runCmd) '/usr/bin/sudo -n /sbin/iscsiadm -m session -R' (cwd None)
Thread-13::DEBUG::2014-06-21 16:17:14,228::misc::766::SamplingMethod::(__call__) Returning last result
Thread-13::DEBUG::2014-06-21 16:17:14,229::multipath::110::Storage.Misc.excCmd::(rescan) '/usr/bin/sudo -n /sbin/multipath -r' (cwd None)
Thread-13::DEBUG::2014-06-21 16:17:14,294::multipath::110::Storage.Misc.excCmd::(rescan) SUCCESS: <err> = ''; <rc> = 0
Thread-13::DEBUG::2014-06-21 16:17:14,295::lvm::497::OperationMutex::(_invalidateAllPvs) Operation 'lvm invalidate operation' got the operation mutex
Thread-13::DEBUG::2014-06-21 16:17:14,295::lvm::499::OperationMutex::(_invalidateAllPvs) Operation 'lvm invalidate operation' released the operation mutex
Thread-13::DEBUG::2014-06-21 16:17:14,296::lvm::508::OperationMutex::(_invalidateAllVgs) Operation 'lvm invalidate operation' got the operation mutex
Thread-13::DEBUG::2014-06-21 16:17:14,296::lvm::510::OperationMutex::(_invalidateAllVgs) Operation 'lvm invalidate operation' released the operation mutex
Thread-13::DEBUG::2014-06-21 16:17:14,297::lvm::528::OperationMutex::(_invalidateAllLvs) Operation 'lvm invalidate operation' got the operation mutex
Thread-13::DEBUG::2014-06-21 16:17:14,297::lvm::530::OperationMutex::(_invalidateAllLvs) Operation 'lvm invalidate operation' released the operation mutex
Thread-13::DEBUG::2014-06-21 16:17:14,298::misc::766::SamplingMethod::(__call__) Returning last result
Thread-13::DEBUG::2014-06-21 16:17:14,318::fileSD::150::Storage.StorageDomain::(__init__) Reading domain in path /rhev/data-center/mnt/glusterSD/192.168.10.120:_vmimage/dc661957-c0c1-44ba-a5b9-e6558904207b
Thread-13::DEBUG::2014-06-21 16:17:14,322::persistentDict::192::Storage.PersistentDict::(__init__) Created a persistent dict with FileMetadataRW backend
Thread-13::DEBUG::2014-06-21 16:17:14,328::persistentDict::234::Storage.PersistentDict::(refresh) read lines (FileMetadataRW)=['CLASS=Data', 'DESCRIPTION=vmimage', 'IOOPTIMEOUTSEC=10', 'LEASERETRIES=3', 'LEASETIMESEC=60', 'LOCKPOLICY=', 'LOCKRENEWALINTERVALSEC=5', 'POOL_UUID=', 'REMOTE_PATH=192.168.10.120:/vmimage', 'ROLE=Regular', 'SDUUID=dc661957-c0c1-44ba-a5b9-e6558904207b', 'TYPE=GLUSTERFS', 'VERSION=3', '_SHA_CKSUM=9fdc035c398d2cd8b5c31bf5eea2882c8782ed57']
Thread-13::DEBUG::2014-06-21 16:17:14,334::fileSD::609::Storage.StorageDomain::(imageGarbageCollector) Removing remnants of deleted images []
Thread-13::INFO::2014-06-21 16:17:14,335::sd::383::Storage.StorageDomain::(_registerResourceNamespaces) Resource namespace dc661957-c0c1-44ba-a5b9-e6558904207b_imageNS already registered
Thread-13::INFO::2014-06-21 16:17:14,335::sd::391::Storage.StorageDomain::(_registerResourceNamespaces) Resource namespace dc661957-c0c1-44ba-a5b9-e6558904207b_volumeNS already registered
Thread-13::INFO::2014-06-21 16:17:14,336::fileSD::350::Storage.StorageDomain::(validate) sdUUID=dc661957-c0c1-44ba-a5b9-e6558904207b
Thread-13::DEBUG::2014-06-21 16:17:14,340::persistentDict::234::Storage.PersistentDict::(refresh) read lines (FileMetadataRW)=['CLASS=Data', 'DESCRIPTION=vmimage', 'IOOPTIMEOUTSEC=10', 'LEASERETRIES=3', 'LEASETIMESEC=60', 'LOCKPOLICY=', 'LOCKRENEWALINTERVALSEC=5', 'POOL_UUID=', 'REMOTE_PATH=192.168.10.120:/vmimage', 'ROLE=Regular', 'SDUUID=dc661957-c0c1-44ba-a5b9-e6558904207b', 'TYPE=GLUSTERFS', 'VERSION=3', '_SHA_CKSUM=9fdc035c398d2cd8b5c31bf5eea2882c8782ed57']
Thread-13::DEBUG::2014-06-21 16:17:14,341::resourceManager::198::ResourceManager.Request::(__init__) ResName=`Storage.806d2356-12cf-437c-8917-dd13ee823e36`ReqID=`de2ede47-22fa-43b8-9f3b-dc714a45b450`::Request was made in '/usr/share/vdsm/storage/hsm.py' line '980' at 'createStoragePool'
Thread-13::DEBUG::2014-06-21 16:17:14,342::resourceManager::542::ResourceManager::(registerResource) Trying to register resource 'Storage.806d2356-12cf-437c-8917-dd13ee823e36' for lock type 'exclusive'
Thread-13::DEBUG::2014-06-21 16:17:14,342::resourceManager::601::ResourceManager::(registerResource) Resource 'Storage.806d2356-12cf-437c-8917-dd13ee823e36' is free. Now locking as 'exclusive' (1 active user)
Thread-13::DEBUG::2014-06-21 16:17:14,343::resourceManager::238::ResourceManager.Request::(grant) ResName=`Storage.806d2356-12cf-437c-8917-dd13ee823e36`ReqID=`de2ede47-22fa-43b8-9f3b-dc714a45b450`::Granted request
Thread-13::DEBUG::2014-06-21 16:17:14,343::task::827::TaskManager.Task::(resourceAcquired) Task=`d815e5e5-0202-4137-94be-21dc5e2b61c9`::_resourcesAcquired: Storage.806d2356-12cf-437c-8917-dd13ee823e36 (exclusive)
Thread-13::DEBUG::2014-06-21 16:17:14,344::task::990::TaskManager.Task::(_decref) Task=`d815e5e5-0202-4137-94be-21dc5e2b61c9`::ref 1 aborting False
Thread-13::DEBUG::2014-06-21 16:17:14,345::resourceManager::198::ResourceManager.Request::(__init__) ResName=`Storage.dc661957-c0c1-44ba-a5b9-e6558904207b`ReqID=`71bf6917-b501-4016-ad8e-8b84849da8cb`::Request was made in '/usr/share/vdsm/storage/hsm.py' line '982' at 'createStoragePool'
Thread-13::DEBUG::2014-06-21 16:17:14,345::resourceManager::542::ResourceManager::(registerResource) Trying to register resource 'Storage.dc661957-c0c1-44ba-a5b9-e6558904207b' for lock type 'exclusive'
Thread-13::DEBUG::2014-06-21 16:17:14,346::resourceManager::601::ResourceManager::(registerResource) Resource 'Storage.dc661957-c0c1-44ba-a5b9-e6558904207b' is free. Now locking as 'exclusive' (1 active user)
Thread-13::DEBUG::2014-06-21 16:17:14,346::resourceManager::238::ResourceManager.Request::(grant) ResName=`Storage.dc661957-c0c1-44ba-a5b9-e6558904207b`ReqID=`71bf6917-b501-4016-ad8e-8b84849da8cb`::Granted request
Thread-13::DEBUG::2014-06-21 16:17:14,347::task::827::TaskManager.Task::(resourceAcquired) Task=`d815e5e5-0202-4137-94be-21dc5e2b61c9`::_resourcesAcquired: Storage.dc661957-c0c1-44ba-a5b9-e6558904207b (exclusive)
Thread-13::DEBUG::2014-06-21 16:17:14,347::task::990::TaskManager.Task::(_decref) Task=`d815e5e5-0202-4137-94be-21dc5e2b61c9`::ref 1 aborting False
Thread-13::INFO::2014-06-21 16:17:14,347::sp::133::Storage.StoragePool::(setBackend) updating pool 806d2356-12cf-437c-8917-dd13ee823e36 backend from type NoneType instance 0x39e278bf00 to type StoragePoolDiskBackend instance 0x7f764c093cb0
Thread-13::INFO::2014-06-21 16:17:14,348::sp::548::Storage.StoragePool::(create) spUUID=806d2356-12cf-437c-8917-dd13ee823e36 poolName=testing master_sd=dc661957-c0c1-44ba-a5b9-e6558904207b domList=['dc661957-c0c1-44ba-a5b9-e6558904207b'] masterVersion=2 {'LEASETIMESEC': 60, 'IOOPTIMEOUTSEC': 10, 'LEASERETRIES': 3, 'LOCKRENEWALINTERVALSEC': 5}
Thread-13::INFO::2014-06-21 16:17:14,348::fileSD::350::Storage.StorageDomain::(validate) sdUUID=dc661957-c0c1-44ba-a5b9-e6558904207b
Thread-13::DEBUG::2014-06-21 16:17:14,352::persistentDict::234::Storage.PersistentDict::(refresh) read lines (FileMetadataRW)=['CLASS=Data', 'DESCRIPTION=vmimage', 'IOOPTIMEOUTSEC=10', 'LEASERETRIES=3', 'LEASETIMESEC=60', 'LOCKPOLICY=', 'LOCKRENEWALINTERVALSEC=5', 'POOL_UUID=', 'REMOTE_PATH=192.168.10.120:/vmimage', 'ROLE=Regular', 'SDUUID=dc661957-c0c1-44ba-a5b9-e6558904207b', 'TYPE=GLUSTERFS', 'VERSION=3', '_SHA_CKSUM=9fdc035c398d2cd8b5c31bf5eea2882c8782ed57']
Thread-13::DEBUG::2014-06-21 16:17:14,357::persistentDict::234::Storage.PersistentDict::(refresh) read lines (FileMetadataRW)=['CLASS=Data', 'DESCRIPTION=vmimage', 'IOOPTIMEOUTSEC=10', 'LEASERETRIES=3', 'LEASETIMESEC=60', 'LOCKPOLICY=', 'LOCKRENEWALINTERVALSEC=5', 'POOL_UUID=', 'REMOTE_PATH=192.168.10.120:/vmimage', 'ROLE=Regular', 'SDUUID=dc661957-c0c1-44ba-a5b9-e6558904207b', 'TYPE=GLUSTERFS', 'VERSION=3', '_SHA_CKSUM=9fdc035c398d2cd8b5c31bf5eea2882c8782ed57']
Thread-13::WARNING::2014-06-21 16:17:14,358::fileUtils::167::Storage.fileUtils::(createdir) Dir /rhev/data-center/806d2356-12cf-437c-8917-dd13ee823e36 already exists
Thread-13::DEBUG::2014-06-21 16:17:14,358::persistentDict::167::Storage.PersistentDict::(transaction) Starting transaction
Thread-13::DEBUG::2014-06-21 16:17:14,359::persistentDict::175::Storage.PersistentDict::(transaction) Finished transaction
Thread-13::INFO::2014-06-21 16:17:14,359::clusterlock::184::SANLock::(acquireHostId) Acquiring host id for domain dc661957-c0c1-44ba-a5b9-e6558904207b (id: 250)
Thread-24::DEBUG::2014-06-21 16:17:14,394::task::595::TaskManager.Task::(_updateState) Task=`c4430b80-31d9-4a1d-bee8-fae01a438da6`::moving from state init -> state preparing
Thread-24::INFO::2014-06-21 16:17:14,395::logUtils::44::dispatcher::(wrapper) Run and protect: repoStats(options=None)
Thread-24::INFO::2014-06-21 16:17:14,395::logUtils::47::dispatcher::(wrapper) Run and protect: repoStats, Return response: {}
Thread-24::DEBUG::2014-06-21 16:17:14,396::task::1185::TaskManager.Task::(prepare) Task=`c4430b80-31d9-4a1d-bee8-fae01a438da6`::finished: {}
Thread-24::DEBUG::2014-06-21 16:17:14,396::task::595::TaskManager.Task::(_updateState) Task=`c4430b80-31d9-4a1d-bee8-fae01a438da6`::moving from state preparing -> state finished
Thread-24::DEBUG::2014-06-21 16:17:14,396::resourceManager::940::ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}
Thread-24::DEBUG::2014-06-21 16:17:14,396::resourceManager::977::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-24::DEBUG::2014-06-21 16:17:14,397::task::990::TaskManager.Task::(_decref) Task=`c4430b80-31d9-4a1d-bee8-fae01a438da6`::ref 0 aborting False
Thread-13::ERROR::2014-06-21 16:17:15,361::task::866::TaskManager.Task::(_setError) Task=`d815e5e5-0202-4137-94be-21dc5e2b61c9`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 873, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 45, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 988, in createStoragePool
    leaseParams)
  File "/usr/share/vdsm/storage/sp.py", line 573, in create
    self._acquireTemporaryClusterLock(msdUUID, leaseParams)
  File "/usr/share/vdsm/storage/sp.py", line 515, in _acquireTemporaryClusterLock
    msd.acquireHostId(self.id)
  File "/usr/share/vdsm/storage/sd.py", line 467, in acquireHostId
    self._clusterLock.acquireHostId(hostId, async)
  File "/usr/share/vdsm/storage/clusterlock.py", line 199, in acquireHostId
    raise se.AcquireHostIdFailure(self._sdUUID, e)
AcquireHostIdFailure: Cannot acquire host id: ('dc661957-c0c1-44ba-a5b9-e6558904207b', SanlockException(90, 'Sanlock lockspace add failure', 'Message too long'))
Thread-13::DEBUG::2014-06-21 16:17:15,363::task::885::TaskManager.Task::(_run) Task=`d815e5e5-0202-4137-94be-21dc5e2b61c9`::Task._run: d815e5e5-0202-4137-94be-21dc5e2b61c9 (None, '806d2356-12cf-437c-8917-dd13ee823e36', 'testing', 'dc661957-c0c1-44ba-a5b9-e6558904207b', ['dc661957-c0c1-44ba-a5b9-e6558904207b'], 2, None, 5, 60, 10, 3) {} failed - stopping task
Thread-13::DEBUG::2014-06-21 16:17:15,364::task::1211::TaskManager.Task::(stop) Task=`d815e5e5-0202-4137-94be-21dc5e2b61c9`::stopping in state preparing (force False)
Thread-13::DEBUG::2014-06-21 16:17:15,364::task::990::TaskManager.Task::(_decref) Task=`d815e5e5-0202-4137-94be-21dc5e2b61c9`::ref 1 aborting True
Thread-13::INFO::2014-06-21 16:17:15,365::task::1168::TaskManager.Task::(prepare) Task=`d815e5e5-0202-4137-94be-21dc5e2b61c9`::aborting: Task is aborted: 'Cannot acquire host id' - code 661
Thread-13::DEBUG::2014-06-21 16:17:15,365::task::1173::TaskManager.Task::(prepare) Task=`d815e5e5-0202-4137-94be-21dc5e2b61c9`::Prepare: aborted: Cannot acquire host id
Thread-13::DEBUG::2014-06-21 16:17:15,365::task::990::TaskManager.Task::(_decref) Task=`d815e5e5-0202-4137-94be-21dc5e2b61c9`::ref 0 aborting True
Thread-13::DEBUG::2014-06-21 16:17:15,366::task::925::TaskManager.Task::(_doAbort) Task=`d815e5e5-0202-4137-94be-21dc5e2b61c9`::Task._doAbort: force False
Thread-13::DEBUG::2014-06-21 16:17:15,366::resourceManager::977::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-13::DEBUG::2014-06-21 16:17:15,366::task::595::TaskManager.Task::(_updateState) Task=`d815e5e5-0202-4137-94be-21dc5e2b61c9`::moving from state preparing -> state aborting
Thread-13::DEBUG::2014-06-21 16:17:15,366::task::550::TaskManager.Task::(__state_aborting) Task=`d815e5e5-0202-4137-94be-21dc5e2b61c9`::_aborting: recover policy none
Thread-13::DEBUG::2014-06-21 16:17:15,367::task::595::TaskManager.Task::(_updateState) Task=`d815e5e5-0202-4137-94be-21dc5e2b61c9`::moving from state aborting -> state failed
Thread-13::DEBUG::2014-06-21 16:17:15,367::resourceManager::940::ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {'Storage.dc661957-c0c1-44ba-a5b9-e6558904207b': < ResourceRef 'Storage.dc661957-c0c1-44ba-a5b9-e6558904207b', isValid: 'True' obj: 'None'>, 'Storage.806d2356-12cf-437c-8917-dd13ee823e36': < ResourceRef 'Storage.806d2356-12cf-437c-8917-dd13ee823e36', isValid: 'True' obj: 'None'>}
Thread-13::DEBUG::2014-06-21 16:17:15,367::resourceManager::977::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-13::DEBUG::2014-06-21 16:17:15,368::resourceManager::616::ResourceManager::(releaseResource) Trying to release resource 'Storage.dc661957-c0c1-44ba-a5b9-e6558904207b'
Thread-13::DEBUG::2014-06-21 16:17:15,369::resourceManager::635::ResourceManager::(releaseResource) Released resource 'Storage.dc661957-c0c1-44ba-a5b9-e6558904207b' (0 active users)
Thread-13::DEBUG::2014-06-21 16:17:15,369::resourceManager::641::ResourceManager::(releaseResource) Resource 'Storage.dc661957-c0c1-44ba-a5b9-e6558904207b' is free, finding out if anyone is waiting for it.
Thread-13::DEBUG::2014-06-21 16:17:15,369::resourceManager::649::ResourceManager::(releaseResource) No one is waiting for resource 'Storage.dc661957-c0c1-44ba-a5b9-e6558904207b', Clearing records.
Thread-13::DEBUG::2014-06-21 16:17:15,370::resourceManager::616::ResourceManager::(releaseResource) Trying to release resource 'Storage.806d2356-12cf-437c-8917-dd13ee823e36'
Thread-13::DEBUG::2014-06-21 16:17:15,370::resourceManager::635::ResourceManager::(releaseResource) Released resource 'Storage.806d2356-12cf-437c-8917-dd13ee823e36' (0 active users)
Thread-13::DEBUG::2014-06-21 16:17:15,370::resourceManager::641::ResourceManager::(releaseResource) Resource 'Storage.806d2356-12cf-437c-8917-dd13ee823e36' is free, finding out if anyone is waiting for it.
Thread-13::DEBUG::2014-06-21 16:17:15,371::resourceManager::649::ResourceManager::(releaseResource) No one is waiting for resource 'Storage.806d2356-12cf-437c-8917-dd13ee823e36', Clearing records.
Thread-13::ERROR::2014-06-21 16:17:15,371::dispatcher::65::Storage.Dispatcher.Protect::(run) {'status': {'message': "Cannot acquire host id: ('dc661957-c0c1-44ba-a5b9-e6558904207b', SanlockException(90, 'Sanlock lockspace add failure', 'Message too long'))", 'code': 661}}

My oVirt version: 3.4.2-1.el6 (CentOS 6.5). The hypervisor hosts run GlusterFS 3.5.0-3.fc19 (Fedora 19). The two storage servers run GlusterFS 3.5.0-2.el6 (CentOS 6.5).

So I am NOT using the hypervisor hosts' local storage for the GlusterFS bricks.

What can I do to solve this error?

-- 
Tiemen Ruiten
Systems Engineer
R&D Media
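For reference, a SanlockException(90, 'Message too long') can be examined outside oVirt; sanlock accesses the domain's lease files with direct I/O, so on a Gluster mount this often comes down to O_DIRECT support. A minimal diagnostic sketch, assuming the mount path and domain UUID from the log above (dom_md/ids is the standard oVirt lease file location):

# Ask sanlock itself which lockspaces it holds, and check its own log:
sanlock client status
tail -n 50 /var/log/sanlock.log

# Test whether the Gluster mount accepts O_DIRECT reads on the ids file
# (read-only; path assumed from the vdsm.log above):
dd if=/rhev/data-center/mnt/glusterSD/192.168.10.120:_vmimage/dc661957-c0c1-44ba-a5b9-e6558904207b/dom_md/ids \
   of=/dev/null bs=512 count=1 iflag=direct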

On 06/21/14 16:37, Tiemen Ruiten wrote:
Hello,
I've been struggling to set up an oVirt cluster and am now bumping into this problem:
When I try to create a new (Gluster) storage domain, it fails to attach to the data center. The error on the node from vdsm.log:
[vdsm.log snipped; quoted in full in the original message above]
My oVirt version: 3.4.2-1.el6 (CentOS 6.5). The hypervisor hosts run GlusterFS 3.5.0-3.fc19 (Fedora 19). The two storage servers run GlusterFS 3.5.0-2.el6 (CentOS 6.5).
So I am NOT using the hypervisor hosts' local storage for the GlusterFS bricks.
What can I do to solve this error?
By the way, the options on the GlusterFS volume are as follows:

Volume Name: vmimage
Type: Replicate
Volume ID: 348e1d45-1b80-420b-91c2-93f0d764f227
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 192.168.10.120:/export/gluster01/brick
Brick2: 192.168.10.149:/export/gluster01/brick
Options Reconfigured:
network.ping-timeout: 10
cluster.quorum-count: 1
cluster.quorum-type: auto
server.allow-insecure: on
storage.owner-uid: 36
storage.owner-gid: 36

-- 
Tiemen Ruiten
Systems Engineer
R&D Media
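For anyone recreating this volume by hand, the "Options Reconfigured" entries above correspond to plain gluster volume set calls. A sketch, assuming the vmimage volume from the listing (uid/gid 36 are the vdsm user and kvm group on oVirt hosts):

gluster volume set vmimage storage.owner-uid 36     # vdsm user
gluster volume set vmimage storage.owner-gid 36     # kvm group
gluster volume set vmimage server.allow-insecure on
gluster volume set vmimage cluster.quorum-type auto
gluster volume set vmimage cluster.quorum-count 1
gluster volume set vmimage network.ping-timeout 10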

On 06/21/14 16:57, Tiemen Ruiten wrote:
On 06/21/14 16:37, Tiemen Ruiten wrote:
Hello,
I've been struggling to set up an oVirt cluster and am now bumping into this problem:
When I try to create a new (Gluster) storage domain, it fails to attach to the data center. The error on the node from vdsm.log:
[vdsm.log snipped; quoted in full in the original message above]
My oVirt version: 3.4.2-1.el6 (CentOS 6.5). The hypervisor hosts run GlusterFS 3.5.0-3.fc19 (Fedora 19). The two storage servers run GlusterFS 3.5.0-2.el6 (CentOS 6.5).
So I am NOT using the hypervisor hosts' local storage for the GlusterFS bricks.
What can I do to solve this error?
By the way, the options on the GlusterFS volume are as follows:
Volume Name: vmimage
Type: Replicate
Volume ID: 348e1d45-1b80-420b-91c2-93f0d764f227
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 192.168.10.120:/export/gluster01/brick
Brick2: 192.168.10.149:/export/gluster01/brick
Options Reconfigured:
network.ping-timeout: 10
cluster.quorum-count: 1
cluster.quorum-type: auto
server.allow-insecure: on
storage.owner-uid: 36
storage.owner-gid: 36
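For context, a two-brick replica volume with this layout would typically have been created along the following lines; this is a sketch reconstructed from the brick list above, not the exact commands that were run:

# create a 1x2 replicated volume over TCP from the two bricks listed above
gluster volume create vmimage replica 2 transport tcp \
    192.168.10.120:/export/gluster01/brick \
    192.168.10.149:/export/gluster01/brick
# give vdsm (uid/gid 36) ownership of the volume, then start it
gluster volume set vmimage storage.owner-uid 36
gluster volume set vmimage storage.owner-gid 36
gluster volume start vmimage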
OK, fixed it. For someone else's reference, I had to set the following options on the gluster volume:

network.remote-dio: on
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
cluster.eager-lock: enable

Apparently that's done by the 'optimize for virt store' checkbox, but obviously not when the volume is created manually. Having this in the documentation on ovirt.org would have saved me a lot of time and frustration.

--
Tiemen Ruiten
Systems Engineer
R&D Media
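For reference, the same fix from the gluster CLI; a minimal sketch, assuming the volume name vmimage from the volume info above:

# each option is set individually on the running volume
gluster volume set vmimage network.remote-dio on
gluster volume set vmimage performance.io-cache off
gluster volume set vmimage performance.read-ahead off
gluster volume set vmimage performance.quick-read off
gluster volume set vmimage cluster.eager-lock enable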

It's a wiki, how about adding this for the next guy?

thanks, Itamar

Hi,

You can also use the below method on every gluster node:

For Group-virt (optimize for virt store):

1. Create the file /var/lib/glusterd/groups/virt
2. Paste into it all the contents from this location: https://raw.githubusercontent.com/gluster/glusterfs/master/extras/group-virt...
3. service glusterd restart
4. service vdsmd restart

--------------
quick-read=off
read-ahead=off
io-cache=off
stat-prefetch=off
eager-lock=enable
remote-dio=enable
quorum-type=auto
server-quorum-type=server
--------------

Thanks,
Punit Dambiwal
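Putting those four steps together, a sketch of what this looks like on one node (run as root; the option list is the one pasted above, which may differ from the current upstream group-virt file):

# steps 1-2: create the virt group file with the settings listed above
cat > /var/lib/glusterd/groups/virt <<'EOF'
quick-read=off
read-ahead=off
io-cache=off
stat-prefetch=off
eager-lock=enable
remote-dio=enable
quorum-type=auto
server-quorum-type=server
EOF
# steps 3-4: restart the daemons so the new group definition is picked up
service glusterd restart
service vdsmd restart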

I found the relevant Red Hat documentation: https://access.redhat.com/site/documentation/en-US/Red_Hat_Storage/2.0/html/...

So for Punit's suggestion to work, you should also execute this command:

gluster volume set VOLNAME group virt

Hope this helps someone!

Tiemen
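For the volume in this thread that would be, as a quick sketch (gluster volume info then shows the applied settings under 'Options Reconfigured'):

# apply the virt option group to the vmimage volume, then verify
gluster volume set vmimage group virt
gluster volume info vmimage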
Hi,
You can also use the below method on every gluster node :-
For Group-virt (optimize for virt store)
1. Create the file name /var/lib/glusterd/groups/virt 2. And paste all the contents from this location to this file :- https://raw.githubusercontent.com/gluster/glusterfs/master/extras/group-virt... 3. service glusterd restart 4. service vdsmd restart
-------------- quick-read=off read-ahead=off io-cache=off stat-prefetch=off eager-lock=enable remote-dio=enable quorum-type=auto server-quorum-type=server --------------
Thanks, Punit Dambiwal
On Mon, Jun 23, 2014 at 4:35 PM, Itamar Heim <iheim@redhat.com <mailto:iheim@redhat.com>> wrote:
On 06/22/2014 06:38 PM, Tiemen Ruiten wrote:
On 06/21/14 16:57, Tiemen Ruiten wrote:
On 06/21/14 16:37, Tiemen Ruiten wrote:
Hello,
I've been struggling to set up an Ovirt cluster and am now bumping into this problem:
When I try to create a new (Gluster) storage domain, it fails to attach to the data center. The error on the node from vdsm.log:
Thread-13::DEBUG::2014-06-21 16:17:14,157::BindingXMLRPC::251::vds::(wrapper) client [192.168.10.119] flowID [6e44c0a3] Thread-13::DEBUG::2014-06-21 16:17:14,159::task::595::TaskManager.Task::(_updateState) Task=`97b78287-45d2-4d5a-8336-460987df3840`::moving from state init -> state preparing Thread-13::INFO::2014-06-21 16:17:14,160::logUtils::44::dispatcher::(wrapper) Run and protect: connectStorageServer(domType=7, spUUID='00000000-0000-0000-0000-000000000000', conList=[{'port': '', 'connection': '192.168.10.120:/vmimage', 'iqn': '', 'user': '', 'tpgt': '1', 'vfs_type': 'glusterfs', 'password': '******', 'id': '901b15ec-6b05-43c1-8a50-06b34c8ffdbd'}], options=None) Thread-13::DEBUG::2014-06-21 16:17:14,172::hsm::2340::Storage.HSM::(__prefetchDomains) glusterDomPath: glusterSD/* Thread-13::DEBUG::2014-06-21 16:17:14,185::hsm::2352::Storage.HSM::(__prefetchDomains) Found SD uuids: ('dc661957-c0c1-44ba-a5b9-e6558904207b',) Thread-13::DEBUG::2014-06-21 16:17:14,185::hsm::2408::Storage.HSM::(connectStorageServer) knownSDs: {dc661957-c0c1-44ba-a5b9-e6558904207b: storage.glusterSD.findDomain} Thread-13::INFO::2014-06-21 16:17:14,186::logUtils::47::dispatcher::(wrapper) Run and protect: connectStorageServer, Return response: {'statuslist': [{'status': 0, 'id': '901b15ec-6b05-43c1-8a50-06b34c8ffdbd'}]} Thread-13::DEBUG::2014-06-21 16:17:14,186::task::1185::TaskManager.Task::(prepare) Task=`97b78287-45d2-4d5a-8336-460987df3840`::finished: {'statuslist': [{'status': 0, 'id': '901b15ec-6b05-43c1-8a50-06b34c8ffdbd'}]} Thread-13::DEBUG::2014-06-21 16:17:14,187::task::595::TaskManager.Task::(_updateState) Task=`97b78287-45d2-4d5a-8336-460987df3840`::moving from state preparing -> state finished Thread-13::DEBUG::2014-06-21 16:17:14,187::resourceManager::940::ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {} Thread-13::DEBUG::2014-06-21 16:17:14,187::resourceManager::977::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {} Thread-13::DEBUG::2014-06-21 16:17:14,188::task::990::TaskManager.Task::(_decref) Task=`97b78287-45d2-4d5a-8336-460987df3840`::ref 0 aborting False Thread-13::DEBUG::2014-06-21 16:17:14,195::BindingXMLRPC::251::vds::(wrapper) client [192.168.10.119] flowID [6e44c0a3] Thread-13::DEBUG::2014-06-21 16:17:14,195::task::595::TaskManager.Task::(_updateState) Task=`d815e5e5-0202-4137-94be-21dc5e2b61c9`::moving from state init -> state preparing Thread-13::INFO::2014-06-21 16:17:14,196::logUtils::44::dispatcher::(wrapper) Run and protect: createStoragePool(poolType=None, spUUID='806d2356-12cf-437c-8917-dd13ee823e36', poolName='testing', masterDom='dc661957-c0c1-44ba-a5b9-e6558904207b', domList=['dc661957-c0c1-44ba-a5b9-e6558904207b'], masterVersion=2, lockPolicy=None, lockRenewalIntervalSec=5, leaseTimeSec=60, ioOpTimeoutSec=10, leaseRetries=3, options=None) Thread-13::DEBUG::2014-06-21 16:17:14,196::misc::756::SamplingMethod::(__call__) Trying to enter sampling method (storage.sdc.refreshStorage) Thread-13::DEBUG::2014-06-21 16:17:14,197::misc::758::SamplingMethod::(__call__) Got in to sampling method Thread-13::DEBUG::2014-06-21 16:17:14,197::misc::756::SamplingMethod::(__call__) Trying to enter sampling method (storage.iscsi.rescan) Thread-13::DEBUG::2014-06-21 16:17:14,198::misc::758::SamplingMethod::(__call__) Got in to sampling method Thread-13::DEBUG::2014-06-21 16:17:14,198::iscsi::407::Storage.ISCSI::(rescan) Performing SCSI scan, this will take up to 30 seconds Thread-13::DEBUG::2014-06-21 
16:17:14,199::iscsiadm::92::Storage.Misc.excCmd::(_runCmd) '/usr/bin/sudo -n /sbin/iscsiadm -m session -R' (cwd None) Thread-13::DEBUG::2014-06-21 16:17:14,228::misc::766::SamplingMethod::(__call__) Returning last result Thread-13::DEBUG::2014-06-21 16:17:14,229::multipath::110::Storage.Misc.excCmd::(rescan) '/usr/bin/sudo -n /sbin/multipath -r' (cwd None) Thread-13::DEBUG::2014-06-21 16:17:14,294::multipath::110::Storage.Misc.excCmd::(rescan) SUCCESS: <err> = ''; <rc> = 0 Thread-13::DEBUG::2014-06-21 16:17:14,295::lvm::497::OperationMutex::(_invalidateAllPvs) Operation 'lvm invalidate operation' got the operation mutex Thread-13::DEBUG::2014-06-21 16:17:14,295::lvm::499::OperationMutex::(_invalidateAllPvs) Operation 'lvm invalidate operation' released the operation mutex Thread-13::DEBUG::2014-06-21 16:17:14,296::lvm::508::OperationMutex::(_invalidateAllVgs) Operation 'lvm invalidate operation' got the operation mutex Thread-13::DEBUG::2014-06-21 16:17:14,296::lvm::510::OperationMutex::(_invalidateAllVgs) Operation 'lvm invalidate operation' released the operation mutex Thread-13::DEBUG::2014-06-21 16:17:14,297::lvm::528::OperationMutex::(_invalidateAllLvs) Operation 'lvm invalidate operation' got the operation mutex Thread-13::DEBUG::2014-06-21 16:17:14,297::lvm::530::OperationMutex::(_invalidateAllLvs) Operation 'lvm invalidate operation' released the operation mutex Thread-13::DEBUG::2014-06-21 16:17:14,298::misc::766::SamplingMethod::(__call__) Returning last result Thread-13::DEBUG::2014-06-21 16:17:14,318::fileSD::150::Storage.StorageDomain::(__init__) Reading domain in path /rhev/data-center/mnt/glusterSD/192.168.10.120:_vmimage/dc661957-c0c1-44ba-a5b9-e6558904207b Thread-13::DEBUG::2014-06-21 16:17:14,322::persistentDict::192::Storage.PersistentDict::(__init__) Created a persistent dict with FileMetadataRW backend Thread-13::DEBUG::2014-06-21 16:17:14,328::persistentDict::234::Storage.PersistentDict::(refresh) read lines (FileMetadataRW)=['CLASS=Data', 'DESCRIPTION=vmimage', 'IOOPTIMEOUTSEC=10', 'LEASERETRIES=3', 'LEASETIMESEC=60', 'LOCKPOLICY=', 'LOCKRENEWALINTERVALSEC=5', 'POOL_UUID=', 'REMOTE_PATH=192.168.10.120:/vmimage', 'ROLE=Regular', 'SDUUID=dc661957-c0c1-44ba-a5b9-e6558904207b', 'TYPE=GLUSTERFS', 'VERSION=3', '_SHA_CKSUM=9fdc035c398d2cd8b5c31bf5eea2882c8782ed57'] Thread-13::DEBUG::2014-06-21 16:17:14,334::fileSD::609::Storage.StorageDomain::(imageGarbageCollector) Removing remnants of deleted images [] Thread-13::INFO::2014-06-21 16:17:14,335::sd::383::Storage.StorageDomain::(_registerResourceNamespaces) Resource namespace dc661957-c0c1-44ba-a5b9-e6558904207b_imageNS already registered Thread-13::INFO::2014-06-21 16:17:14,335::sd::391::Storage.StorageDomain::(_registerResourceNamespaces) Resource namespace dc661957-c0c1-44ba-a5b9-e6558904207b_volumeNS already registered Thread-13::INFO::2014-06-21 16:17:14,336::fileSD::350::Storage.StorageDomain::(validate) sdUUID=dc661957-c0c1-44ba-a5b9-e6558904207b Thread-13::DEBUG::2014-06-21 16:17:14,340::persistentDict::234::Storage.PersistentDict::(refresh) read lines (FileMetadataRW)=['CLASS=Data', 'DESCRIPTION=vmimage', 'IOOPTIMEOUTSEC=10', 'LEASERETRIES=3', 'LEASETIMESEC=60', 'LOCKPOLICY=', 'LOCKRENEWALINTERVALSEC=5', 'POOL_UUID=', 'REMOTE_PATH=192.168.10.120:/vmimage', 'ROLE=Regular', 'SDUUID=dc661957-c0c1-44ba-a5b9-e6558904207b', 'TYPE=GLUSTERFS', 'VERSION=3', '_SHA_CKSUM=9fdc035c398d2cd8b5c31bf5eea2882c8782ed57'] Thread-13::DEBUG::2014-06-21 16:17:14,341::resourceManager::198::ResourceManager.Request::(__init__) 
ResName=`Storage.806d2356-12cf-437c-8917-dd13ee823e36`ReqID=`de2ede47-22fa-43b8-9f3b-dc714a45b450`::Request was made in '/usr/share/vdsm/storage/hsm.py' line '980' at 'createStoragePool' Thread-13::DEBUG::2014-06-21 16:17:14,342::resourceManager::542::ResourceManager::(registerResource) Trying to register resource 'Storage.806d2356-12cf-437c-8917-dd13ee823e36' for lock type 'exclusive' Thread-13::DEBUG::2014-06-21 16:17:14,342::resourceManager::601::ResourceManager::(registerResource) Resource 'Storage.806d2356-12cf-437c-8917-dd13ee823e36' is free. Now locking as 'exclusive' (1 active user) Thread-13::DEBUG::2014-06-21 16:17:14,343::resourceManager::238::ResourceManager.Request::(grant) ResName=`Storage.806d2356-12cf-437c-8917-dd13ee823e36`ReqID=`de2ede47-22fa-43b8-9f3b-dc714a45b450`::Granted request Thread-13::DEBUG::2014-06-21 16:17:14,343::task::827::TaskManager.Task::(resourceAcquired) Task=`d815e5e5-0202-4137-94be-21dc5e2b61c9`::_resourcesAcquired: Storage.806d2356-12cf-437c-8917-dd13ee823e36 (exclusive) Thread-13::DEBUG::2014-06-21 16:17:14,344::task::990::TaskManager.Task::(_decref) Task=`d815e5e5-0202-4137-94be-21dc5e2b61c9`::ref 1 aborting False Thread-13::DEBUG::2014-06-21 16:17:14,345::resourceManager::198::ResourceManager.Request::(__init__) ResName=`Storage.dc661957-c0c1-44ba-a5b9-e6558904207b`ReqID=`71bf6917-b501-4016-ad8e-8b84849da8cb`::Request was made in '/usr/share/vdsm/storage/hsm.py' line '982' at 'createStoragePool' Thread-13::DEBUG::2014-06-21 16:17:14,345::resourceManager::542::ResourceManager::(registerResource) Trying to register resource 'Storage.dc661957-c0c1-44ba-a5b9-e6558904207b' for lock type 'exclusive' Thread-13::DEBUG::2014-06-21 16:17:14,346::resourceManager::601::ResourceManager::(registerResource) Resource 'Storage.dc661957-c0c1-44ba-a5b9-e6558904207b' is free. 
Now locking as 'exclusive' (1 active user)
Thread-13::DEBUG::2014-06-21 16:17:14,346::resourceManager::238::ResourceManager.Request::(grant) ResName=`Storage.dc661957-c0c1-44ba-a5b9-e6558904207b`ReqID=`71bf6917-b501-4016-ad8e-8b84849da8cb`::Granted request
Thread-13::DEBUG::2014-06-21 16:17:14,347::task::827::TaskManager.Task::(resourceAcquired) Task=`d815e5e5-0202-4137-94be-21dc5e2b61c9`::_resourcesAcquired: Storage.dc661957-c0c1-44ba-a5b9-e6558904207b (exclusive)
Thread-13::DEBUG::2014-06-21 16:17:14,347::task::990::TaskManager.Task::(_decref) Task=`d815e5e5-0202-4137-94be-21dc5e2b61c9`::ref 1 aborting False
Thread-13::INFO::2014-06-21 16:17:14,347::sp::133::Storage.StoragePool::(setBackend) updating pool 806d2356-12cf-437c-8917-dd13ee823e36 backend from type NoneType instance 0x39e278bf00 to type StoragePoolDiskBackend instance 0x7f764c093cb0
Thread-13::INFO::2014-06-21 16:17:14,348::sp::548::Storage.StoragePool::(create) spUUID=806d2356-12cf-437c-8917-dd13ee823e36 poolName=testing master_sd=dc661957-c0c1-44ba-a5b9-e6558904207b domList=['dc661957-c0c1-44ba-a5b9-e6558904207b'] masterVersion=2 {'LEASETIMESEC': 60, 'IOOPTIMEOUTSEC': 10, 'LEASERETRIES': 3, 'LOCKRENEWALINTERVALSEC': 5}
Thread-13::INFO::2014-06-21 16:17:14,348::fileSD::350::Storage.StorageDomain::(validate) sdUUID=dc661957-c0c1-44ba-a5b9-e6558904207b
Thread-13::DEBUG::2014-06-21 16:17:14,352::persistentDict::234::Storage.PersistentDict::(refresh) read lines (FileMetadataRW)=['CLASS=Data', 'DESCRIPTION=vmimage', 'IOOPTIMEOUTSEC=10', 'LEASERETRIES=3', 'LEASETIMESEC=60', 'LOCKPOLICY=', 'LOCKRENEWALINTERVALSEC=5', 'POOL_UUID=', 'REMOTE_PATH=192.168.10.120:/vmimage', 'ROLE=Regular', 'SDUUID=dc661957-c0c1-44ba-a5b9-e6558904207b', 'TYPE=GLUSTERFS', 'VERSION=3', '_SHA_CKSUM=9fdc035c398d2cd8b5c31bf5eea2882c8782ed57']
Thread-13::DEBUG::2014-06-21 16:17:14,357::persistentDict::234::Storage.PersistentDict::(refresh) read lines (FileMetadataRW)=['CLASS=Data', 'DESCRIPTION=vmimage', 'IOOPTIMEOUTSEC=10', 'LEASERETRIES=3', 'LEASETIMESEC=60', 'LOCKPOLICY=', 'LOCKRENEWALINTERVALSEC=5', 'POOL_UUID=', 'REMOTE_PATH=192.168.10.120:/vmimage', 'ROLE=Regular', 'SDUUID=dc661957-c0c1-44ba-a5b9-e6558904207b', 'TYPE=GLUSTERFS', 'VERSION=3', '_SHA_CKSUM=9fdc035c398d2cd8b5c31bf5eea2882c8782ed57']
Thread-13::WARNING::2014-06-21 16:17:14,358::fileUtils::167::Storage.fileUtils::(createdir) Dir /rhev/data-center/806d2356-12cf-437c-8917-dd13ee823e36 already exists
Thread-13::DEBUG::2014-06-21 16:17:14,358::persistentDict::167::Storage.PersistentDict::(transaction) Starting transaction
Thread-13::DEBUG::2014-06-21 16:17:14,359::persistentDict::175::Storage.PersistentDict::(transaction) Finished transaction
Thread-13::INFO::2014-06-21 16:17:14,359::clusterlock::184::SANLock::(acquireHostId) Acquiring host id for domain dc661957-c0c1-44ba-a5b9-e6558904207b (id: 250)
Thread-24::DEBUG::2014-06-21 16:17:14,394::task::595::TaskManager.Task::(_updateState) Task=`c4430b80-31d9-4a1d-bee8-fae01a438da6`::moving from state init -> state preparing
Thread-24::INFO::2014-06-21 16:17:14,395::logUtils::44::dispatcher::(wrapper) Run and protect: repoStats(options=None)
Thread-24::INFO::2014-06-21 16:17:14,395::logUtils::47::dispatcher::(wrapper) Run and protect: repoStats, Return response: {}
Thread-24::DEBUG::2014-06-21 16:17:14,396::task::1185::TaskManager.Task::(prepare) Task=`c4430b80-31d9-4a1d-bee8-fae01a438da6`::finished: {}
Thread-24::DEBUG::2014-06-21 16:17:14,396::task::595::TaskManager.Task::(_updateState) Task=`c4430b80-31d9-4a1d-bee8-fae01a438da6`::moving from state preparing -> state finished
Thread-24::DEBUG::2014-06-21 16:17:14,396::resourceManager::940::ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}
Thread-24::DEBUG::2014-06-21 16:17:14,396::resourceManager::977::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-24::DEBUG::2014-06-21 16:17:14,397::task::990::TaskManager.Task::(_decref) Task=`c4430b80-31d9-4a1d-bee8-fae01a438da6`::ref 0 aborting False
Thread-13::ERROR::2014-06-21 16:17:15,361::task::866::TaskManager.Task::(_setError) Task=`d815e5e5-0202-4137-94be-21dc5e2b61c9`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 873, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 45, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 988, in createStoragePool
    leaseParams)
  File "/usr/share/vdsm/storage/sp.py", line 573, in create
    self._acquireTemporaryClusterLock(msdUUID, leaseParams)
  File "/usr/share/vdsm/storage/sp.py", line 515, in _acquireTemporaryClusterLock
    msd.acquireHostId(self.id)
  File "/usr/share/vdsm/storage/sd.py", line 467, in acquireHostId
    self._clusterLock.acquireHostId(hostId, async)
  File "/usr/share/vdsm/storage/clusterlock.py", line 199, in acquireHostId
    raise se.AcquireHostIdFailure(self._sdUUID, e)
AcquireHostIdFailure: Cannot acquire host id: ('dc661957-c0c1-44ba-a5b9-e6558904207b', SanlockException(90, 'Sanlock lockspace add failure', 'Message too long'))
Thread-13::DEBUG::2014-06-21 16:17:15,363::task::885::TaskManager.Task::(_run) Task=`d815e5e5-0202-4137-94be-21dc5e2b61c9`::Task._run: d815e5e5-0202-4137-94be-21dc5e2b61c9 (None, '806d2356-12cf-437c-8917-dd13ee823e36', 'testing', 'dc661957-c0c1-44ba-a5b9-e6558904207b', ['dc661957-c0c1-44ba-a5b9-e6558904207b'], 2, None, 5, 60, 10, 3) {} failed - stopping task
Thread-13::DEBUG::2014-06-21 16:17:15,364::task::1211::TaskManager.Task::(stop) Task=`d815e5e5-0202-4137-94be-21dc5e2b61c9`::stopping in state preparing (force False)
Thread-13::DEBUG::2014-06-21 16:17:15,364::task::990::TaskManager.Task::(_decref) Task=`d815e5e5-0202-4137-94be-21dc5e2b61c9`::ref 1 aborting True
Thread-13::INFO::2014-06-21 16:17:15,365::task::1168::TaskManager.Task::(prepare) Task=`d815e5e5-0202-4137-94be-21dc5e2b61c9`::aborting: Task is aborted: 'Cannot acquire host id' - code 661
Thread-13::DEBUG::2014-06-21 16:17:15,365::task::1173::TaskManager.Task::(prepare) Task=`d815e5e5-0202-4137-94be-21dc5e2b61c9`::Prepare: aborted: Cannot acquire host id
Thread-13::DEBUG::2014-06-21 16:17:15,365::task::990::TaskManager.Task::(_decref) Task=`d815e5e5-0202-4137-94be-21dc5e2b61c9`::ref 0 aborting True
Thread-13::DEBUG::2014-06-21 16:17:15,366::task::925::TaskManager.Task::(_doAbort) Task=`d815e5e5-0202-4137-94be-21dc5e2b61c9`::Task._doAbort: force False
Thread-13::DEBUG::2014-06-21 16:17:15,366::resourceManager::977::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-13::DEBUG::2014-06-21 16:17:15,366::task::595::TaskManager.Task::(_updateState) Task=`d815e5e5-0202-4137-94be-21dc5e2b61c9`::moving from state preparing -> state aborting
Thread-13::DEBUG::2014-06-21 16:17:15,366::task::550::TaskManager.Task::(__state_aborting) Task=`d815e5e5-0202-4137-94be-21dc5e2b61c9`::_aborting: recover policy none
Thread-13::DEBUG::2014-06-21 16:17:15,367::task::595::TaskManager.Task::(_updateState) Task=`d815e5e5-0202-4137-94be-21dc5e2b61c9`::moving from state aborting -> state failed
Thread-13::DEBUG::2014-06-21 16:17:15,367::resourceManager::940::ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {'Storage.dc661957-c0c1-44ba-a5b9-e6558904207b': < ResourceRef 'Storage.dc661957-c0c1-44ba-a5b9-e6558904207b', isValid: 'True' obj: 'None'>, 'Storage.806d2356-12cf-437c-8917-dd13ee823e36': < ResourceRef 'Storage.806d2356-12cf-437c-8917-dd13ee823e36', isValid: 'True' obj: 'None'>}
Thread-13::DEBUG::2014-06-21 16:17:15,367::resourceManager::977::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-13::DEBUG::2014-06-21 16:17:15,368::resourceManager::616::ResourceManager::(releaseResource) Trying to release resource 'Storage.dc661957-c0c1-44ba-a5b9-e6558904207b'
Thread-13::DEBUG::2014-06-21 16:17:15,369::resourceManager::635::ResourceManager::(releaseResource) Released resource 'Storage.dc661957-c0c1-44ba-a5b9-e6558904207b' (0 active users)
Thread-13::DEBUG::2014-06-21 16:17:15,369::resourceManager::641::ResourceManager::(releaseResource) Resource 'Storage.dc661957-c0c1-44ba-a5b9-e6558904207b' is free, finding out if anyone is waiting for it.
Thread-13::DEBUG::2014-06-21 16:17:15,369::resourceManager::649::ResourceManager::(releaseResource) No one is waiting for resource 'Storage.dc661957-c0c1-44ba-a5b9-e6558904207b', Clearing records.
Thread-13::DEBUG::2014-06-21 16:17:15,370::resourceManager::616::ResourceManager::(releaseResource) Trying to release resource 'Storage.806d2356-12cf-437c-8917-dd13ee823e36'
Thread-13::DEBUG::2014-06-21 16:17:15,370::resourceManager::635::ResourceManager::(releaseResource) Released resource 'Storage.806d2356-12cf-437c-8917-dd13ee823e36' (0 active users)
Thread-13::DEBUG::2014-06-21 16:17:15,370::resourceManager::641::ResourceManager::(releaseResource) Resource 'Storage.806d2356-12cf-437c-8917-dd13ee823e36' is free, finding out if anyone is waiting for it.
Thread-13::DEBUG::2014-06-21 16:17:15,371::resourceManager::649::ResourceManager::(releaseResource) No one is waiting for resource 'Storage.806d2356-12cf-437c-8917-dd13ee823e36', Clearing records.
Thread-13::ERROR::2014-06-21 16:17:15,371::dispatcher::65::Storage.Dispatcher.Protect::(run) {'status': {'message': "Cannot acquire host id: ('dc661957-c0c1-44ba-a5b9-e6558904207b', SanlockException(90, 'Sanlock lockspace add failure', 'Message too long'))", 'code': 661}}
My oVirt version: 3.4.2-1.el6 (CentOS 6.5)
The hypervisor hosts run GlusterFS 3.5.0-3.fc19 (Fedora 19)
The two storage servers run GlusterFS 3.5.0-2.el6 (CentOS 6.5)
So I am NOT using local storage of the hypervisor hosts for the GlusterFS bricks.
What can I do to solve this error?
By the way, the options on the GlusterFS volume are as follows:
Volume Name: vmimage
Type: Replicate
Volume ID: 348e1d45-1b80-420b-91c2-93f0d764f227
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 192.168.10.120:/export/gluster01/brick
Brick2: 192.168.10.149:/export/gluster01/brick
Options Reconfigured:
network.ping-timeout: 10
cluster.quorum-count: 1
cluster.quorum-type: auto
server.allow-insecure: on
storage.owner-uid: 36
storage.owner-gid: 36
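For what it's worth, sanlock does its lockspace reads and writes with O_DIRECT, so a quick sanity check is whether the mounted domain accepts a direct write at all. A minimal sketch (the mount path is taken from the vdsm.log above; the test file name is made up):

dd if=/dev/zero of=/rhev/data-center/mnt/glusterSD/192.168.10.120:_vmimage/dd-test.img bs=512 count=1 oflag=direct

If that fails while an ordinary buffered write to the same directory succeeds, the volume is refusing direct I/O and sanlock cannot add its lockspace.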
OK, fixed it. For someone else's reference, I had to set the following options on the gluster volume:
network.remote-dio: on
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
cluster.eager-lock: enable
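For anyone applying these by hand, each option is set with 'gluster volume set'; a sketch for the vmimage volume from this thread (run on any one gluster node, the settings propagate cluster-wide):

# allow O_DIRECT from clients, which is what sanlock needs
gluster volume set vmimage network.remote-dio on
# disable the caching translators that interfere with VM image I/O
gluster volume set vmimage performance.io-cache off
gluster volume set vmimage performance.read-ahead off
gluster volume set vmimage performance.quick-read off
gluster volume set vmimage cluster.eager-lock enable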
Apparently that's done by the 'optimize for virt store' checkbox, but obviously not when the volume is created manually. Having this in the documentation on ovirt.org would have saved me a lot of time and frustration.
It's a wiki, how about adding this for the next guy?

thanks,
Itamar
I found the relevant Redhat documentation:
https://access.redhat.com/site/documentation/en-US/Red_Hat_Storage/2.0/html/Quick_Start_Guide/chap-Quick_Start_Guide-Virtual_Preparation.html

So for Punit's suggestion to work, you should also execute this command:

gluster volume set VOLNAME group virt

Hope this helps someone!

Tiemen

On 06/24/14 04:11, Punit Dambiwal wrote:

Hi,

You can also use the below method on every gluster node:

For Group-virt (optimize for virt store):

1. Create the file /var/lib/glusterd/groups/virt
2. Paste into it all the contents from https://raw.githubusercontent.com/gluster/glusterfs/master/extras/group-virt.example
3. service glusterd restart
4. service vdsmd restart

--------------
quick-read=off
read-ahead=off
io-cache=off
stat-prefetch=off
eager-lock=enable
remote-dio=enable
quorum-type=auto
server-quorum-type=server
--------------

Thanks,
Punit Dambiwal
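Punit's four steps lend themselves to a small per-node script. A minimal sketch, assuming curl is installed, the node has outbound HTTPS access, and it is run as root (VOLNAME is a placeholder for your volume name):

# fetch the stock group-virt option set shipped with glusterfs
curl -o /var/lib/glusterd/groups/virt https://raw.githubusercontent.com/gluster/glusterfs/master/extras/group-virt.example
# restart the daemons so glusterd picks up the new group file
service glusterd restart
service vdsmd restart
# the whole group can then be applied to a volume in one command
gluster volume set VOLNAME group virt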