Well, that didn't go so well. I deleted both dom_md/ids & dom_md/leases in
the cloned volume, and I still can't import the storage domain.
The snapshot was also taken some four hours before the attempted import, so
I'm surprised the locks haven't expired by themselves...
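For reference, this is roughly what I've been poking at on the host to see
what sanlock thinks it still holds (just a sketch, run as root; the ids path
is an example from my layout, the UUID is a placeholder, and I'm assuming the
stock sanlock CLI here):

# Quick look at sanlock state: what the daemon holds right now, plus the
# delta leases recorded on disk for the cloned domain.
# IDS_PATH is an example from my layout -- the UUID is a placeholder.
import subprocess

IDS_PATH = ("/rhev/data-center/mnt/glusterSD/localhost:data-teste2/"
            "<DOMAIN_UUID>/dom_md/ids")

# Lockspaces/resources the local sanlock daemon has currently acquired.
print(subprocess.check_output(["sanlock", "client", "status"]).decode())

# On-disk delta leases (host ids + renewal timestamps) for the domain.
print(subprocess.check_output(["sanlock", "direct", "dump", IDS_PATH]).decode())

Relevant engine.log excerpt from the attempt: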
2017-02-16 21:58:24,630-03 INFO
[org.ovirt.engine.core.bll.storage.connection.AddStorageServerConnectionCommand]
(default task-45) [d59bc8c0-3c53-4a34-9d7c-8c982ee14e14] Lock Acquired to
object
'EngineLock:{exclusiveLocks='[localhost:data-teste2=<STORAGE_CONNECTION,
ACTION_TYPE_FAILED_OBJECT_LOCKED>]', sharedLocks='null'}'
2017-02-16 21:58:24,645-03 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStorageServerVDSCommand]
(default task-45) [d59bc8c0-3c53-4a34-9d7c-8c982ee14e14] START,
ConnectStorageServerVDSCommand(HostName =
v5.dc0.example.com,
StorageServerConnectionManagementVDSParameters:{runAsync='true',
hostId='1a3f10f2-e4ce-44b9-9495-06e445cfa0b0',
storagePoolId='00000000-0000-0000-0000-000000000000',
storageType='GLUSTERFS',
connectionList='[StorageServerConnections:{id='null',
connection='localhost:data-teste2', iqn='null',
vfsType='glusterfs',
mountOptions='null', nfsVersion='null', nfsRetrans='null',
nfsTimeo='null',
iface='null', netIfaceName='null'}]'}), log id: 726df65e
2017-02-16 21:58:26,046-03 INFO
[org.ovirt.engine.core.bll.storage.connection.AddStorageServerConnectionCommand]
(default task-45) [d59bc8c0-3c53-4a34-9d7c-8c982ee14e14] Lock freed to
object 'EngineLock:{exclusiveLocks='[localhost:data-teste2=<STORAGE_CONNECTION,
ACTION_TYPE_FAILED_OBJECT_LOCKED>]', sharedLocks='null'}'
2017-02-16 21:58:26,206-03 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetStorageDomainsListVDSCommand]
(default task-52) [85548427-713f-4ffb-a385-a97a7ee4109d] START,
HSMGetStorageDomainsListVDSCommand(HostName =
v5.dc0.example.com,
HSMGetStorageDomainsListVDSCommandParameters:{runAsync='true',
hostId='1a3f10f2-e4ce-44b9-9495-06e445cfa0b0',
storagePoolId='00000000-0000-0000-0000-000000000000', storageType='null',
storageDomainType='Data', path='localhost:data-teste2'}), log id:
79f6cc88
2017-02-16 21:58:27,899-03 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetStorageDomainsListVDSCommand]
(default task-50) [38e87311-a7a5-49a8-bf18-857dd969cd5f] START,
HSMGetStorageDomainsListVDSCommand(HostName =
v5.dc0.example.com,
HSMGetStorageDomainsListVDSCommandParameters:{runAsync='true',
hostId='1a3f10f2-e4ce-44b9-9495-06e445cfa0b0',
storagePoolId='00000000-0000-0000-0000-000000000000', storageType='null',
storageDomainType='Data', path='localhost:data-teste2'}), log id: 7280d13
2017-02-16 21:58:29,156-03 INFO
[org.ovirt.engine.core.bll.storage.connection.RemoveStorageServerConnectionCommand]
(default task-56) [1b3826e4-4890-43d4-8854-16f3c573a31f] Lock Acquired to
object
'EngineLock:{exclusiveLocks='[localhost:data-teste2=<STORAGE_CONNECTION,
ACTION_TYPE_FAILED_OBJECT_LOCKED>,
5e5f6610-c759-448b-a53d-9a456f513681=<STORAGE_CONNECTION,
ACTION_TYPE_FAILED_OBJECT_LOCKED>]', sharedLocks='null'}'
2017-02-16 21:58:29,168-03 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.DisconnectStorageServerVDSCommand]
(default task-57) [5e4b20cf-60d2-4ae9-951b-c2693603aa6f] START,
DisconnectStorageServerVDSCommand(HostName =
v5.dc0.example.com,
StorageServerConnectionManagementVDSParameters:{runAsync='true',
hostId='1a3f10f2-e4ce-44b9-9495-06e445cfa0b0',
storagePoolId='00000000-0000-0000-0000-000000000000',
storageType='GLUSTERFS',
connectionList='[StorageServerConnections:{id='5e5f6610-c759-448b-a53d-9a456f513681',
connection='localhost:data-teste2', iqn='null',
vfsType='glusterfs',
mountOptions='null', nfsVersion='null', nfsRetrans='null',
nfsTimeo='null',
iface='null', netIfaceName='null'}]'}), log id: 6042b108
2017-02-16 21:58:29,193-03 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.DisconnectStorageServerVDSCommand]
(default task-56) [1b3826e4-4890-43d4-8854-16f3c573a31f] START,
DisconnectStorageServerVDSCommand(HostName =
v5.dc0.example.com,
StorageServerConnectionManagementVDSParameters:{runAsync='true',
hostId='1a3f10f2-e4ce-44b9-9495-06e445cfa0b0',
storagePoolId='00000000-0000-0000-0000-000000000000',
storageType='GLUSTERFS',
connectionList='[StorageServerConnections:{id='5e5f6610-c759-448b-a53d-9a456f513681',
connection='localhost:data-teste2', iqn='null',
vfsType='glusterfs',
mountOptions='null', nfsVersion='null', nfsRetrans='null',
nfsTimeo='null',
iface='null', netIfaceName='null'}]'}), log id: 4e9421cf
2017-02-16 21:58:31,398-03 INFO
[org.ovirt.engine.core.bll.storage.connection.RemoveStorageServerConnectionCommand]
(default task-56) [1b3826e4-4890-43d4-8854-16f3c573a31f] Lock freed to
object
'EngineLock:{exclusiveLocks='[localhost:data-teste2=<STORAGE_CONNECTION,
ACTION_TYPE_FAILED_OBJECT_LOCKED>,
5e5f6610-c759-448b-a53d-9a456f513681=<STORAGE_CONNECTION,
ACTION_TYPE_FAILED_OBJECT_LOCKED>]', sharedLocks='null'}'
Again, many thanks!
Doug
On 16 February 2017 at 18:53, Doug Ingham <dougti(a)gmail.com> wrote:
Hi Nir,
On 16 February 2017 at 13:55, Nir Soffer <nsoffer(a)redhat.com> wrote:
> On Mon, Feb 13, 2017 at 3:35 PM, Doug Ingham <dougti(a)gmail.com> wrote:
> > Hi Sahina,
> >
> > On 13 February 2017 at 05:45, Sahina Bose <sabose(a)redhat.com> wrote:
> >>
> >> Any errors in the gluster mount logs for this gluster volume?
> >>
> >> How about "gluster vol heal <volname> info" - does it list any entries
> >> to heal?
> >
> >
> > After more investigation, I found out that there is a sanlock daemon
> > that runs with VDSM, independently of the HE, so I'd basically have to
> > bring the volume down & wait for the leases to expire/delete them*
> > before I can import the domain.
> >
> > *I understand removing /dom_md/leases/ should do the job?
>
> No, the issue is probably dom_md/ids, which is accessed by sanlock, but
> removing files accessed by sanlock will not help: an open file remains
> open until sanlock closes it.
>
I'm testing this with volume snapshots at the moment, so there are no
processes accessing the new volume.
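(As a sanity check on Nir's point about open files, I've also been looking at
which lease files the sanlock daemon still holds open on the host - anything
unlinked while still open shows up with a "(deleted)" suffix in /proc. Rough
sketch, assuming a single process simply named "sanlock", run as root:)

# List dom_md files the sanlock daemon still has open; files deleted while
# open are reported with a "(deleted)" suffix. Linux-only (relies on /proc).
import os
import subprocess

pid = subprocess.check_output(["pidof", "sanlock"]).split()[0].decode()
fd_dir = "/proc/%s/fd" % pid
for fd in os.listdir(fd_dir):
    target = os.readlink(os.path.join(fd_dir, fd))
    if "dom_md" in target:
        print("%s -> %s" % (fd, target))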
> Did you try to reboot the host before installing it again? If you did and
> you still have these issues, you probably need to remove the previous
> installation properly before installing again.
>
> Adding Simone to help with uninstalling and reinstalling hosted engine.
>
The Hosted-Engine database had been corrupted and the restore wasn't
running correctly, so I installed a new engine on a new server - no
restores or old data. The aim is to import the old storage domain into the
new Engine & then import its VMs into the new storage domain.
My only problem with this is that there appear to be some file-based
leases somewhere which, unless I manage to locate & delete them, force me
to wait for them to time out before I can import the old storage domain.
To minimise downtime, I'd like to avoid that wait.
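One option I'm considering (untested, and based only on my reading of the
sanlock(8) man page, so treat it as a sketch rather than a known-good
procedure) is to re-initialise the lockspace area of the cloned domain's
dom_md/ids with "sanlock direct init" instead of deleting the file, which
should wipe the stale delta leases in place. The UUID and mount path below
are placeholders for my setup:

# Untested sketch: clear stale delta leases on the cloned domain by
# re-initialising its sanlock lockspace in place. DOMAIN_UUID and the
# mount path are placeholders for my setup.
import subprocess

DOMAIN_UUID = "<DOMAIN_UUID>"  # storage domain UUID of the clone (placeholder)
IDS_PATH = ("/rhev/data-center/mnt/glusterSD/localhost:data-teste2/"
            "%s/dom_md/ids" % DOMAIN_UUID)

# Lockspace spec is NAME:HOST_ID:PATH:OFFSET; as far as I can tell oVirt uses
# the domain UUID as the lockspace name and offset 0 in the ids file.
subprocess.check_call(
    ["sanlock", "direct", "init", "-s", "%s:0:%s:0" % (DOMAIN_UUID, IDS_PATH)])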
Regards,
Doug
>
> Nir
>
> >
> >
> >>
> >>
> >> On Thu, Feb 9, 2017 at 11:57 PM, Doug Ingham <dougti(a)gmail.com> wrote:
> >>>
> >>> Some interesting output from the vdsm log...
> >>>
> >>>
> >>> 2017-02-09 15:16:24,051 INFO (jsonrpc/1) [storage.StorageDomain]
> >>> Resource namespace 01_img_60455567-ad30-42e3-a9df-62fe86c7fd25 already
> >>> registered (sd:731)
> >>> 2017-02-09 15:16:24,051 INFO (jsonrpc/1) [storage.StorageDomain]
> >>> Resource namespace 02_vol_60455567-ad30-42e3-a9df-62fe86c7fd25 already
> >>> registered (sd:740)
> >>> 2017-02-09 15:16:24,052 INFO (jsonrpc/1) [storage.SANLock] Acquiring
> >>> Lease(name='SDM',
> >>> path=u'/rhev/data-center/mnt/glusterSD/localhost:data2/60455567-ad30-42e3-a9df-62fe86c7fd25/dom_md/leases',
> >>> offset=1048576) for host id 1 (clusterlock:343)
> >>> 2017-02-09 15:16:24,057 INFO (jsonrpc/1) [storage.SANLock] Releasing
> >>> host id for domain 60455567-ad30-42e3-a9df-62fe86c7fd25 (id: 1)
> >>> (clusterlock:305)
> >>> 2017-02-09 15:16:25,149 INFO (jsonrpc/3) [jsonrpc.JsonRpcServer] RPC
> >>> call GlusterHost.list succeeded in 0.17 seconds (__init__:515)
> >>> 2017-02-09 15:16:25,264 INFO (Reactor thread)
> >>> [ProtocolDetector.AcceptorImpl] Accepted connection from
> >>> ::ffff:127.0.0.1:55060 (protocoldetector:72)
> >>> 2017-02-09 15:16:25,270 INFO (Reactor thread)
> >>> [ProtocolDetector.Detector] Detected protocol stomp from
> >>> ::ffff:127.0.0.1:55060 (protocoldetector:127)
> >>> 2017-02-09 15:16:25,271 INFO (Reactor thread) [Broker.StompAdapter]
> >>> Processing CONNECT request (stompreactor:102)
> >>> 2017-02-09 15:16:25,271 INFO (JsonRpc (StompReactor))
> >>> [Broker.StompAdapter] Subscribe command received (stompreactor:129)
> >>> 2017-02-09 15:16:25,416 INFO (jsonrpc/5) [jsonrpc.JsonRpcServer] RPC
> >>> call Host.getHardwareInfo succeeded in 0.01 seconds (__init__:515)
> >>> 2017-02-09 15:16:25,419 INFO (jsonrpc/6) [dispatcher] Run and protect:
> >>> repoStats(options=None) (logUtils:49)
> >>> 2017-02-09 15:16:25,419 INFO (jsonrpc/6) [dispatcher] Run and protect:
> >>> repoStats, Return response: {u'e8d04da7-ad3d-4227-a45d-b5a29b2f43e5':
> >>> {'code': 0, 'actual': True, 'version': 4, 'acquired': True, 'delay':
> >>> '0.000854128', 'lastCheck': '5.1', 'valid': True},
> >>> u'a77b8821-ff19-4d17-a3ce-a6c3a69436d5': {'code': 0, 'actual': True,
> >>> 'version': 4, 'acquired': True, 'delay': '0.000966556', 'lastCheck':
> >>> '2.6', 'valid': True}} (logUtils:52)
> >>> 2017-02-09 15:16:25,447 INFO (jsonrpc/6) [jsonrpc.JsonRpcServer] RPC
> >>> call Host.getStats succeeded in 0.03 seconds (__init__:515)
> >>> 2017-02-09 15:16:25,450 ERROR (JsonRpc (StompReactor)) [vds.dispatcher]
> >>> SSL error receiving from <yajsonrpc.betterAsyncore.Dispatcher connected
> >>> ('::ffff:127.0.0.1', 55060, 0, 0) at 0x7f69c0043cf8>: unexpected eof
> >>> (betterAsyncore:113)
> >>> 2017-02-09 15:16:25,812 INFO (jsonrpc/7) [jsonrpc.JsonRpcServer] RPC
> >>> call GlusterVolume.list succeeded in 0.10 seconds (__init__:515)
> >>> 2017-02-09 15:16:25,940 INFO (Reactor thread)
> >>> [ProtocolDetector.AcceptorImpl] Accepted connection from
> >>> ::ffff:127.0.0.1:55062 (protocoldetector:72)
> >>> 2017-02-09 15:16:25,946 INFO (Reactor thread)
> >>> [ProtocolDetector.Detector] Detected protocol stomp from
> >>> ::ffff:127.0.0.1:55062 (protocoldetector:127)
> >>> 2017-02-09 15:16:25,947 INFO (Reactor thread) [Broker.StompAdapter]
> >>> Processing CONNECT request (stompreactor:102)
> >>> 2017-02-09 15:16:25,947 INFO (JsonRpc (StompReactor))
> >>> [Broker.StompAdapter] Subscribe command received (stompreactor:129)
> >>> 2017-02-09 15:16:26,058 ERROR (jsonrpc/1) [storage.TaskManager.Task]
> >>> (Task='02cad901-5fe8-4f2d-895b-14184f67feab') Unexpected error (task:870)
> >>> Traceback (most recent call last):
> >>>   File "/usr/share/vdsm/storage/task.py", line 877, in _run
> >>>     return fn(*args, **kargs)
> >>>   File "/usr/lib/python2.7/site-packages/vdsm/logUtils.py", line 50, in wrapper
> >>>     res = f(*args, **kwargs)
> >>>   File "/usr/share/vdsm/storage/hsm.py", line 812, in forcedDetachStorageDomain
> >>>     self._deatchStorageDomainFromOldPools(sdUUID)
> >>>   File "/usr/share/vdsm/storage/hsm.py", line 790, in _deatchStorageDomainFromOldPools
> >>>     dom.acquireClusterLock(host_id)
> >>>   File "/usr/share/vdsm/storage/sd.py", line 810, in acquireClusterLock
> >>>     self._manifest.acquireDomainLock(hostID)
> >>>   File "/usr/share/vdsm/storage/sd.py", line 499, in acquireDomainLock
> >>>     self._domainLock.acquire(hostID, self.getDomainLease())
> >>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/clusterlock.py", line 362, in acquire
> >>>     "Cannot acquire %s" % (lease,), str(e))
> >>> AcquireLockFailure: Cannot obtain lock:
> >>> u"id=60455567-ad30-42e3-a9df-62fe86c7fd25, rc=5, out=Cannot acquire
> >>> Lease(name='SDM',
> >>> path=u'/rhev/data-center/mnt/glusterSD/localhost:data2/60455567-ad30-42e3-a9df-62fe86c7fd25/dom_md/leases',
> >>> offset=1048576), err=(5, 'Sanlock resource not acquired', 'Input/output
> >>> error')"
> >>> 2017-02-09 15:16:26,058 INFO (jsonrpc/1) [storage.TaskManager.Task]
> >>> (Task='02cad901-5fe8-4f2d-895b-14184f67feab') aborting: Task is aborted:
> >>> 'Cannot obtain lock' - code 651 (task:1175)
> >>> 2017-02-09 15:16:26,059 ERROR (jsonrpc/1) [storage.Dispatcher] {'status':
> >>> {'message': 'Cannot obtain lock: u"id=60455567-ad30-42e3-a9df-62fe86c7fd25,
> >>> rc=5, out=Cannot acquire Lease(name=\'SDM\',
> >>> path=u\'/rhev/data-center/mnt/glusterSD/localhost:data2/60455567-ad30-42e3-a9df-62fe86c7fd25/dom_md/leases\',
> >>> offset=1048576), err=(5, \'Sanlock resource not acquired\', \'Input/output
> >>> error\')"', 'code': 651}} (dispatcher:77)
> >>> 2017-02-09 15:16:26,059 INFO (jsonrpc/1) [jsonrpc.JsonRpcServer] RPC
> >>> call StorageDomain.detach failed (error 651) in 23.04 seconds (__init__:515)
> >>>
> >>> --
> >>> Doug
> >>>
> >>
> >
> >
> >
> > --
> > Doug
> >
>
--
Doug