[ovirt-users] Importing existing (dirty) storage domains

Doug Ingham dougti at gmail.com
Fri Feb 17 01:16:01 UTC 2017


Well that didn't go so well. I deleted both dom_md/ids & dom_md/leases in
the cloned volume, and I still can't import the storage domain.
The snapshot was also taken some 4 hours before the attempted import, so
I'm surprised the locks haven't expired by themselves...


2017-02-16 21:58:24,630-03 INFO
[org.ovirt.engine.core.bll.storage.connection.AddStorageServerConnectionCommand]
(default task-45) [d59bc8c0-3c53-4a34-9d7c-8c982ee14e14] Lock Acquired to
object
'EngineLock:{exclusiveLocks='[localhost:data-teste2=<STORAGE_CONNECTION,
ACTION_TYPE_FAILED_OBJECT_LOCKED>]', sharedLocks='null'}'
2017-02-16 21:58:24,645-03 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStorageServerVDSCommand]
(default task-45) [d59bc8c0-3c53-4a34-9d7c-8c982ee14e14] START,
ConnectStorageServerVDSCommand(HostName = v5.dc0.example.com,
StorageServerConnectionManagementVDSParameters:{runAsync='true',
hostId='1a3f10f2-e4ce-44b9-9495-06e445cfa0b0',
storagePoolId='00000000-0000-0000-0000-000000000000',
storageType='GLUSTERFS',
connectionList='[StorageServerConnections:{id='null',
connection='localhost:data-teste2', iqn='null', vfsType='glusterfs',
mountOptions='null', nfsVersion='null', nfsRetrans='null', nfsTimeo='null',
iface='null', netIfaceName='null'}]'}), log id: 726df65e
2017-02-16 21:58:26,046-03 INFO
[org.ovirt.engine.core.bll.storage.connection.AddStorageServerConnectionCommand]
(default task-45) [d59bc8c0-3c53-4a34-9d7c-8c982ee14e14] Lock freed to
object 'EngineLock:{exclusiveLocks='[localhost:data
teste2=<STORAGE_CONNECTION, ACTION_TYPE_FAILED_OBJECT_LOCKED>]',
sharedLocks='null'}'
2017-02-16 21:58:26,206-03 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetStorageDomainsListVDSCommand]
(default task-52) [85548427-713f-4ffb-a385-a97a7ee4109d] START,
HSMGetStorageDomainsListVDSCommand(HostName = v5.dc0.example.com,
HSMGetStorageDomainsListVDSCommandParameters:{runAsync='true',
hostId='1a3f10f2-e4ce-44b9-9495-06e445cfa0b0',
storagePoolId='00000000-0000-0000-0000-000000000000', storageType='null',
storageDomainType='Data', path='localhost:data-teste2'}), log id: 79f6cc88
2017-02-16 21:58:27,899-03 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetStorageDomainsListVDSCommand]
(default task-50) [38e87311-a7a5-49a8-bf18-857dd969cd5f] START,
HSMGetStorageDomainsListVDSCommand(HostName = v5.dc0.example.com,
HSMGetStorageDomainsListVDSCommandParameters:{runAsync='true',
hostId='1a3f10f2-e4ce-44b9-9495-06e445cfa0b0',
storagePoolId='00000000-0000-0000-0000-000000000000', storageType='null',
storageDomainType='Data', path='localhost:data-teste2'}), log id: 7280d13
2017-02-16 21:58:29,156-03 INFO
[org.ovirt.engine.core.bll.storage.connection.RemoveStorageServerConnectionCommand]
(default task-56) [1b3826e4-4890-43d4-8854-16f3c573a31f] Lock Acquired to
object
'EngineLock:{exclusiveLocks='[localhost:data-teste2=<STORAGE_CONNECTION,
ACTION_TYPE_FAILED_OBJECT_LOCKED>,
5e5f6610-c759-448b-a53d-9a456f513681=<STORAGE_CONNECTION,
ACTION_TYPE_FAILED_OBJECT_LOCKED>]', sharedLocks='null'}'
2017-02-16 21:58:29,168-03 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.DisconnectStorageServerVDSCommand]
(default task-57) [5e4b20cf-60d2-4ae9-951b-c2693603aa6f] START,
DisconnectStorageServerVDSCommand(HostName = v5.dc0.example.com,
StorageServerConnectionManagementVDSParameters:{runAsync='true',
hostId='1a3f10f2-e4ce-44b9-9495-06e445cfa0b0',
storagePoolId='00000000-0000-0000-0000-000000000000',
storageType='GLUSTERFS',
connectionList='[StorageServerConnections:{id='5e5f6610-c759-448b-a53d-9a456f513681',
connection='localhost:data-teste2', iqn='null', vfsType='glusterfs',
mountOptions='null', nfsVersion='null', nfsRetrans='null', nfsTimeo='null',
iface='null', netIfaceName='null'}]'}), log id: 6042b108
2017-02-16 21:58:29,193-03 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.DisconnectStorageServerVDSCommand]
(default task-56) [1b3826e4-4890-43d4-8854-16f3c573a31f] START,
DisconnectStorageServerVDSCommand(HostName = v5.dc0.example.com,
StorageServerConnectionManagementVDSParameters:{runAsync='true',
hostId='1a3f10f2-e4ce-44b9-9495-06e445cfa0b0',
storagePoolId='00000000-0000-0000-0000-000000000000',
storageType='GLUSTERFS',
connectionList='[StorageServerConnections:{id='5e5f6610-c759-448b-a53d-9a456f513681',
connection='localhost:data-teste2', iqn='null', vfsType='glusterfs',
mountOptions='null', nfsVersion='null', nfsRetrans='null', nfsTimeo='null',
iface='null', netIfaceName='null'}]'}), log id: 4e9421cf
2017-02-16 21:58:31,398-03 INFO
[org.ovirt.engine.core.bll.storage.connection.RemoveStorageServerConnectionCommand]
(default task-56) [1b3826e4-4890-43d4-8854-16f3c573a31f] Lock freed to
object
'EngineLock:{exclusiveLocks='[localhost:data-teste2=<STORAGE_CONNECTION,
ACTION_TYPE_FAILED_OBJECT_LOCKED>,
5e5f6610-c759-448b-a53d-9a456f513681=<STORAGE_CONNECTION,
ACTION_TYPE_FAILED_OBJECT_LOCKED>]', sharedLocks='null'}'

Again, many thanks!
 Doug

On 16 February 2017 at 18:53, Doug Ingham <dougti at gmail.com> wrote:

> Hi Nir,
>
> On 16 February 2017 at 13:55, Nir Soffer <nsoffer at redhat.com> wrote:
>
>> On Mon, Feb 13, 2017 at 3:35 PM, Doug Ingham <dougti at gmail.com> wrote:
>> > Hi Sahina,
>> >
>> > On 13 February 2017 at 05:45, Sahina Bose <sabose at redhat.com> wrote:
>> >>
>> >> Any errors in the gluster mount logs for this gluster volume?
>> >>
>> >> How about "gluster vol heal <volname> info" - does it list any entries
>> to
>> >> heal?
>> >
>> >
>> > After more investigating, I found out that there is a sanlock daemon
>> that
>> > runs with VDSM, independently of the HE, so I'd basically have to bring
>> the
>> > volume down & wait for the leases to expire/delete them* before I can
>> import
>> > the domain.
>> >
>> > *I understand removing /dom_md/leases/ should do the job?
>>
>> No, the issue is probably dom_md/ids accessed by sanlock, but removing
>> files
>> accessed by sanlock will not help, an open file will remain open until
>> sanlock
>> close the file.
>>
>
> I'm testing this with volume snapshots at the moment, so there are no
> processes accessing the new volume.
>
>
> Did you try to reboot the host before installing it again? If you did and
>> you
>> still have these issues, you probably need to remove the previous
>> installation
>> properly before installing again.
>>
>> Adding Simone to help with uninstalling and reinstalling hosted engine.
>>
>
> The Hosted-Engine database had been corrupted and the restore wasn't
> running correctly, so I installed a new engine on a new server - no
> restores or old data. The aim is to import the old storage domain into the
> new Engine & then import the VMs into the new storage domain.
> My only problem with this is that there appear to be some file based
> leases somewhere that, unless I manage to locate & delete them, force me to
> wait for the leases to timeout before I can import the old storage domain.
> To minimise downtime, I'm trying to avoid having to wait for the leases to
> timeout.
>
> Regards,
>  Doug
>
>
>>
>> Nir
>>
>> >
>> >
>> >>
>> >>
>> >> On Thu, Feb 9, 2017 at 11:57 PM, Doug Ingham <dougti at gmail.com> wrote:
>> >>>
>> >>> Some interesting output from the vdsm log...
>> >>>
>> >>>
>> >>> 2017-02-09 15:16:24,051 INFO  (jsonrpc/1) [storage.StorageDomain]
>> >>> Resource namespace 01_img_60455567-ad30-42e3-a9df-62fe86c7fd25
>> already
>> >>> registered (sd:731)
>> >>> 2017-02-09 15:16:24,051 INFO  (jsonrpc/1) [storage.StorageDomain]
>> >>> Resource namespace 02_vol_60455567-ad30-42e3-a9df-62fe86c7fd25
>> already
>> >>> registered (sd:740)
>> >>> 2017-02-09 15:16:24,052 INFO  (jsonrpc/1) [storage.SANLock] Acquiring
>> >>> Lease(name='SDM',
>> >>> path=u'/rhev/data-center/mnt/glusterSD/localhost:data2/60455
>> 567-ad30-42e3-a9df-6
>> >>> 2fe86c7fd25/dom_md/leases', offset=1048576) for host id 1
>> >>> (clusterlock:343)
>> >>> 2017-02-09 15:16:24,057 INFO  (jsonrpc/1) [storage.SANLock] Releasing
>> >>> host id for domain 60455567-ad30-42e3-a9df-62fe86c7fd25 (id: 1)
>> >>> (clusterlock:305)
>> >>> 2017-02-09 15:16:25,149 INFO  (jsonrpc/3) [jsonrpc.JsonRpcServer] RPC
>> >>> call GlusterHost.list succeeded in 0.17 seconds (__init__:515)
>> >>> 2017-02-09 15:16:25,264 INFO  (Reactor thread)
>> >>> [ProtocolDetector.AcceptorImpl] Accepted connection from
>> >>> ::ffff:127.0.0.1:55060 (protocoldetector:72)
>> >>> 2017-02-09 15:16:25,270 INFO  (Reactor thread)
>> >>> [ProtocolDetector.Detector] Detected protocol stomp from
>> >>> ::ffff:127.0.0.1:55060 (protocoldetector:127)
>> >>> 2017-02-09 15:16:25,271 INFO  (Reactor thread) [Broker.StompAdapter]
>> >>> Processing CONNECT request (stompreactor:102)
>> >>> 2017-02-09 15:16:25,271 INFO  (JsonRpc (StompReactor))
>> >>> [Broker.StompAdapter] Subscribe command received (stompreactor:129)
>> >>> 2017-02-09 15:16:25,416 INFO  (jsonrpc/5) [jsonrpc.JsonRpcServer] RPC
>> >>> call Host.getHardwareInfo succeeded in 0.01 seconds (__init__:515)
>> >>> 2017-02-09 15:16:25,419 INFO  (jsonrpc/6) [dispatcher] Run and
>> protect:
>> >>> repoStats(options=None) (logUtils:49)
>> >>> 2017-02-09 15:16:25,419 INFO  (jsonrpc/6) [dispatcher] Run and
>> protect:
>> >>> repoStats, Return response: {u'e8d04da7-ad3d-4227-a45d-b5a29b2f43e5':
>> >>> {'code': 0, 'actual': True
>> >>> , 'version': 4, 'acquired': True, 'delay': '0.000854128', 'lastCheck':
>> >>> '5.1', 'valid': True}, u'a77b8821-ff19-4d17-a3ce-a6c3a69436d5':
>> {'code': 0,
>> >>> 'actual': True, 'vers
>> >>> ion': 4, 'acquired': True, 'delay': '0.000966556', 'lastCheck': '2.6',
>> >>> 'valid': True}} (logUtils:52)
>> >>> 2017-02-09 15:16:25,447 INFO  (jsonrpc/6) [jsonrpc.JsonRpcServer] RPC
>> >>> call Host.getStats succeeded in 0.03 seconds (__init__:515)
>> >>> 2017-02-09 15:16:25,450 ERROR (JsonRpc (StompReactor))
>> [vds.dispatcher]
>> >>> SSL error receiving from <yajsonrpc.betterAsyncore.Dispatcher
>> connected
>> >>> ('::ffff:127.0.0.1', 55060, 0, 0) at 0x7f69c0043cf8>: unexpected eof
>> >>> (betterAsyncore:113)
>> >>> 2017-02-09 15:16:25,812 INFO  (jsonrpc/7) [jsonrpc.JsonRpcServer] RPC
>> >>> call GlusterVolume.list succeeded in 0.10 seconds (__init__:515)
>> >>> 2017-02-09 15:16:25,940 INFO  (Reactor thread)
>> >>> [ProtocolDetector.AcceptorImpl] Accepted connection from
>> >>> ::ffff:127.0.0.1:55062 (protocoldetector:72)
>> >>> 2017-02-09 15:16:25,946 INFO  (Reactor thread)
>> >>> [ProtocolDetector.Detector] Detected protocol stomp from
>> >>> ::ffff:127.0.0.1:55062 (protocoldetector:127)
>> >>> 2017-02-09 15:16:25,947 INFO  (Reactor thread) [Broker.StompAdapter]
>> >>> Processing CONNECT request (stompreactor:102)
>> >>> 2017-02-09 15:16:25,947 INFO  (JsonRpc (StompReactor))
>> >>> [Broker.StompAdapter] Subscribe command received (stompreactor:129)
>> >>> 2017-02-09 15:16:26,058 ERROR (jsonrpc/1) [storage.TaskManager.Task]
>> >>> (Task='02cad901-5fe8-4f2d-895b-14184f67feab') Unexpected error
>> (task:870)
>> >>> Traceback (most recent call last):
>> >>>   File "/usr/share/vdsm/storage/task.py", line 877, in _run
>> >>>     return fn(*args, **kargs)
>> >>>   File "/usr/lib/python2.7/site-packages/vdsm/logUtils.py", line 50,
>> in
>> >>> wrapper
>> >>>     res = f(*args, **kwargs)
>> >>>   File "/usr/share/vdsm/storage/hsm.py", line 812, in
>> >>> forcedDetachStorageDomain
>> >>>     self._deatchStorageDomainFromOldPools(sdUUID)
>> >>>   File "/usr/share/vdsm/storage/hsm.py", line 790, in
>> >>> _deatchStorageDomainFromOldPools
>> >>>     dom.acquireClusterLock(host_id)
>> >>>   File "/usr/share/vdsm/storage/sd.py", line 810, in
>> acquireClusterLock
>> >>>     self._manifest.acquireDomainLock(hostID)
>> >>>   File "/usr/share/vdsm/storage/sd.py", line 499, in
>> acquireDomainLock
>> >>>     self._domainLock.acquire(hostID, self.getDomainLease())
>> >>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/clusterlock.p
>> y",
>> >>> line 362, in acquire
>> >>>     "Cannot acquire %s" % (lease,), str(e))
>> >>> AcquireLockFailure: Cannot obtain lock:
>> >>> u"id=60455567-ad30-42e3-a9df-62fe86c7fd25, rc=5, out=Cannot acquire
>> >>> Lease(name='SDM',
>> >>> path=u'/rhev/data-center/mnt/glusterSD/localhost:data2/60455
>> 567-ad30-42e3-a9df-62fe86c7fd25/dom_md/leases',
>> >>> offset=1048576), err=(5, 'Sanlock resource not acquired',
>> 'Input/output
>> >>> error')"
>> >>> 2017-02-09 15:16:26,058 INFO  (jsonrpc/1) [storage.TaskManager.Task]
>> >>> (Task='02cad901-5fe8-4f2d-895b-14184f67feab') aborting: Task is
>> aborted:
>> >>> 'Cannot obtain lock' - code 651 (task:1175)
>> >>> 2017-02-09 15:16:26,059 ERROR (jsonrpc/1) [storage.Dispatcher]
>> {'status':
>> >>> {'message': 'Cannot obtain lock: u"id=60455567-ad30-42e3-a9df-6
>> 2fe86c7fd25,
>> >>> rc=5, out=Cannot acquire Lease(name=\'SDM\',
>> >>> path=u\'/rhev/data-center/mnt/glusterSD/localhost:data2/6045
>> 5567-ad30-42e3-a9df-62fe86c7fd25/dom_md/leases\',
>> >>> offset=1048576), err=(5, \'Sanlock resource not acquired\',
>> \'Input/output
>> >>> error\')"', 'code': 651}} (dispatcher:77)
>> >>> 2017-02-09 15:16:26,059 INFO  (jsonrpc/1) [jsonrpc.JsonRpcServer] RPC
>> >>> call StorageDomain.detach failed (error 651) in 23.04 seconds
>> (__init__:515)
>> >>>
>> >>> --
>> >>> Doug
>> >>>
>> >>> _______________________________________________
>> >>> Users mailing list
>> >>> Users at ovirt.org
>> >>> http://lists.ovirt.org/mailman/listinfo/users
>> >>>
>> >>
>> >
>> >
>> >
>> > --
>> > Doug
>> >
>> > _______________________________________________
>> > Users mailing list
>> > Users at ovirt.org
>> > http://lists.ovirt.org/mailman/listinfo/users
>> >
>>
>
>
>
> --
> Doug
>



-- 
Doug
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.ovirt.org/pipermail/users/attachments/20170216/8b644ba1/attachment-0001.html>


More information about the Users mailing list