[Engine-devel] Unable to add second Data domain to my Data Center.

Einav Cohen ecohen at redhat.com
Mon Jan 21 14:16:24 UTC 2013


Hi, pinging on this issue - any chance someone can help with it?

some more info:

The host is Fedora 17, with the following installed:
vdsm-4.10.0-10.fc17.x86_64
sanlock-2.4-3.fc17.x86_64
libvirt-0.9.11.8-2.fc17.x86_64

The engine is the latest upstream oVirt-engine.

When starting vdsm, I get the following error:

2013-01-18 07:56:18-0500 20 [1184]: sanlock daemon started 2.4 aio 1 10 renew 20 80 host 8828a575-4ace-4da0-b78e-53b26559a507.host1.loca time 1358513778
2013-01-18 07:56:18-0500 20 [1184]: wdmd connect failed for watchdog handling
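
(A side note, and an assumption on my part rather than a summary of [1]: "wdmd connect failed" means sanlock could not reach the watchdog multiplexing daemon. On a host without a hardware watchdog, the workaround typically looks roughly like the sketch below - softdog and the service names are the usual Fedora ones, not something I verified against [1]:)

    # Rough sketch only - not necessarily the exact steps from [1].
    # Load the software watchdog module and (re)start wdmd before sanlock,
    # so that sanlock can open a connection to the watchdog daemon.
    import subprocess

    subprocess.check_call(["modprobe", "softdog"])
    for unit in ("wdmd", "sanlock"):
        subprocess.check_call(["systemctl", "restart", unit])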

The steps in [1] resolve the wdmd error above; afterwards, the first Data NFS storage domain (V3) can be added to the Data Center successfully.
Attempting to add a second Data NFS storage domain (V3) to the same Data Center fails - see the vdsm log [2] and the sanlock log [3], plus my reading of the flow right below.
Attempting to add an ISO NFS storage domain (V1) succeeds.
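
My (possibly wrong) reading of the flow: attachSD() acquires the domain's cluster lock through sanlock, and a resource lease can only be acquired inside a lockspace that the host has already joined. A rough sketch of that two-step model using the sanlock Python binding - the paths, offset, host id and resource name below are made-up illustrations, not the exact values vdsm uses:

    # Rough sketch of the sanlock model behind acquireClusterLock(),
    # for illustration only; paths, offset and names are assumptions.
    import sanlock

    DOMAIN = "807c2e91-52c3-4779-9ebb-771b96bf5a4c"   # storage domain UUID
    IDS = "/rhev/data-center/mnt/server:_export/dom_md/ids"        # hypothetical path
    LEASES = "/rhev/data-center/mnt/server:_export/dom_md/leases"  # hypothetical path
    HOST_ID = 1

    # Step 1: the host joins the domain's lockspace (delta lease on "ids").
    sanlock.add_lockspace(DOMAIN, HOST_ID, IDS)

    # Step 2: only then can a resource lease (e.g. the SDM lease on
    # "leases") be acquired within that lockspace.
    fd = sanlock.register()
    sanlock.acquire(DOMAIN, "SDM", [(LEASES, 0)], slkfd=fd)

If step 1 never happened for the second domain, step 2 fails, and sanlock reports that as "invalid lockspace" / errno 28 ("No space left on device") - which is what [2] and [3] look like to me.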

any idea?

thanks in advance.


[1] http://comments.gmane.org/gmane.comp.emulators.ovirt.vdsm.devel/1395

[2] vdsm log:

> Thread-2801::ERROR::2013-01-17 16:55:22,340::task::853::TaskManager.Task::(_setError) Task=`fe1ac39e-7681-4043-b2b7-b6e2611e1c10`::Unexpected error
> Traceback (most recent call last):
>   File "/usr/share/vdsm/storage/task.py", line 861, in _run
>     return fn(*args, **kargs)
>   File "/usr/share/vdsm/logUtils.py", line 38, in wrapper
>     res = f(*args, **kwargs)
>   File "/usr/share/vdsm/storage/hsm.py", line 960, in attachStorageDomain
>     pool.attachSD(sdUUID)
>   File "/usr/share/vdsm/storage/securable.py", line 63, in wrapper
>     return f(self, *args, **kwargs)
>   File "/usr/share/vdsm/storage/sp.py", line 919, in attachSD
>     dom.acquireClusterLock(self.id)
>   File "/usr/share/vdsm/storage/sd.py", line 429, in acquireClusterLock
>     self._clusterLock.acquire(hostID)
>   File "/usr/share/vdsm/storage/safelease.py", line 205, in acquire
>     "Cannot acquire cluster lock", str(e))
> AcquireLockFailure: Cannot obtain lock: "id=807c2e91-52c3-4779-9ebb-771b96bf5a4c, rc=28, out=Cannot acquire cluster lock, err=(28, 'Sanlock resource not acquired', 'No space left on device')"
> Thread-2801::DEBUG::2013-01-17 16:55:22,341::task::872::TaskManager.Task::(_run) Task=`fe1ac39e-7681-4043-b2b7-b6e2611e1c10`::Task._run: fe1ac39e-7681-4043-b2b7-b6e2611e1c10 ('807c2e91-52c3-4779-9ebb-771b96bf5a4c', '5849b030-626e-47cb-ad90-3ce782d831b3') {} failed - stopping task

[3] sanlock log:

> 2013-01-17 16:55:22-0500 4161 [3187]: r12 cmd_acquire 3,14,3830 invalid lockspace found -1 failed 0 name 807c2e91-52c3-4779-9ebb-771b96bf5a4c
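
One thing that might help narrow it down (again, an assumption on my part, not a verified procedure): dump the sanlock daemon state on the host and check whether the UUID of the failing domain shows up as a joined lockspace at all, e.g.:

    # Small diagnostic sketch (assumes it runs as root on the host).
    import subprocess

    FAILING_DOMAIN = "807c2e91-52c3-4779-9ebb-771b96bf5a4c"

    # "sanlock client status" prints the lockspaces and resources the
    # daemon currently knows about.
    status = subprocess.check_output(["sanlock", "client", "status"]).decode()
    print(status)
    if FAILING_DOMAIN not in status:
        print("lockspace for %s was never added on this host" % FAILING_DOMAIN)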


----- Original Message -----
> From: "Alexander Wels" <awels at redhat.com>
> To: engine-devel at ovirt.org
> Sent: Thursday, January 17, 2013 5:00:13 PM
> Subject: [Engine-devel] Unable to add second Data domain to my Data Center.
> 
> Hello,
> 
> I am trying to set up a host for a separate oVirt engine/node
> environment (version 3.1). I managed to get my host up and running
> and active in my Data Center, and to create 4 storage domains on
> that host: 2 Data domains, 1 ISO domain, and 1 Export domain.
> 
> I successfully attached the ISO, Export, and smaller Data domains to
> my Data Center; the web admin interface indicates they are active. I
> activated my smaller test Data domain because I was having some
> issues earlier getting my storage attached. I worked through those
> issues and now want to attach my actual Data storage so I can detach
> the test one. Whenever I attempt to attach the storage I get an
> error message, and in the vdsm log I see the following:
> 
> Thread-2801::ERROR::2013-01-17 16:55:22,340::task::853::TaskManager.Task::(_setError) Task=`fe1ac39e-7681-4043-b2b7-b6e2611e1c10`::Unexpected error
> Traceback (most recent call last):
>   File "/usr/share/vdsm/storage/task.py", line 861, in _run
>     return fn(*args, **kargs)
>   File "/usr/share/vdsm/logUtils.py", line 38, in wrapper
>     res = f(*args, **kwargs)
>   File "/usr/share/vdsm/storage/hsm.py", line 960, in attachStorageDomain
>     pool.attachSD(sdUUID)
>   File "/usr/share/vdsm/storage/securable.py", line 63, in wrapper
>     return f(self, *args, **kwargs)
>   File "/usr/share/vdsm/storage/sp.py", line 919, in attachSD
>     dom.acquireClusterLock(self.id)
>   File "/usr/share/vdsm/storage/sd.py", line 429, in acquireClusterLock
>     self._clusterLock.acquire(hostID)
>   File "/usr/share/vdsm/storage/safelease.py", line 205, in acquire
>     "Cannot acquire cluster lock", str(e))
> AcquireLockFailure: Cannot obtain lock: "id=807c2e91-52c3-4779-9ebb-771b96bf5a4c, rc=28, out=Cannot acquire cluster lock, err=(28, 'Sanlock resource not acquired', 'No space left on device')"
> Thread-2801::DEBUG::2013-01-17 16:55:22,341::task::872::TaskManager.Task::(_run) Task=`fe1ac39e-7681-4043-b2b7-b6e2611e1c10`::Task._run: fe1ac39e-7681-4043-b2b7-b6e2611e1c10 ('807c2e91-52c3-4779-9ebb-771b96bf5a4c', '5849b030-626e-47cb-ad90-3ce782d831b3') {} failed - stopping task
> 
> This seems to indicate the disk is full, which is not the case: this
> is a brand new machine with a fresh install and a 1 TB drive in it.
> The error does point to sanlock, and when I look at the sanlock log
> I see this:
> 
> 2013-01-17 16:55:22-0500 4161 [3187]: r12 cmd_acquire 3,14,3830 invalid lockspace found -1 failed 0 name 807c2e91-52c3-4779-9ebb-771b96bf5a4c
> 
> I googled around for that particular error message but couldn't find
> any suggestions as to the source of the problem.
> 
> Could someone please give me a pointer on where to look, or help me
> debug this issue?
> 
> Thanks,
> Alexander
> 


