On Fri, Jun 22, 2018 at 3:13 PM Enrico Becchetti <
enrico.becchetti(a)pg.infn.it> wrote:
Dear All,
my oVirt 4.2.1.7-1.el7.centos setup has three hypervisors, LVM storage,
and a virtual machine running ovirt-engine. Everything works fine, but
when I try to remove a snapshot from one VM I get this error:
2018-06-22 07:35:48,155+0200 INFO (jsonrpc/5) [vdsm.api] START
prepareMerge(spUUID=u'18d57688-6ed4-43b8-bd7c-0665b55950b7',
subchainInfo={u'img_id': u'c5611862-6504-445e-a6c8-f1e1a95b5df7',
u'sd_id': u'47b7c9aa-ef53-48bc-bb55-4a1a0ba5c8d5', u'top_id':
u'0e6f7512-871d-4645-b9c6-320ba7e3bee7', u'base_id':
u'e156ac2e-09ac-4e1e-a139-17fa374a96d4'}) from=::ffff:10.0.0.46,53304,
flow_id=07011450-2296-4a13-a9ed-5d5d2b91be98,
task_id=87f95d85-cc3d-4f29-9883-a4dbb3808f88 (api:46)
2018-06-22 07:35:48,406+0200 INFO (tasks/3) [storage.merge] Preparing
subchain <SubchainInfo sd_id=47b7c9aa-ef53-48bc-bb55-4a1a0ba5c8d5,
img_id=c5611862-6504-445e-a6c8-f1e1a95b5df7,
top_id=0e6f7512-871d-4645-b9c6-320ba7e3bee7,
base_id=e156ac2e-09ac-4e1e-a139-17fa374a96d4 base_generation=None at
0x7fcf84ae2510> for merge (merge:177)
2018-06-22 07:35:48,614+0200 INFO (tasks/3) [storage.SANLock] Acquiring
Lease(name='e156ac2e-09ac-4e1e-a139-17fa374a96d4',
path='/dev/47b7c9aa-ef53-48bc-bb55-4a1a0ba5c8d5/leases', offset=115343360)
for host id 1 (clusterlock:377)
2018-06-22 07:35:48,634+0200 ERROR (tasks/3) [storage.guarded] Error
acquiring lock <VolumeLease
ns=04_lease_47b7c9aa-ef53-48bc-bb55-4a1a0ba5c8d5,
name=e156ac2e-09ac-4e1e-a139-17fa374a96d4, mode=exclusive at
0x7fcfe09ddf90> (guarded:96)
AcquireLockFailure: Cannot obtain lock:
"id=47b7c9aa-ef53-48bc-bb55-4a1a0ba5c8d5, rc=-227, out=Cannot acquire
Lease(name='e156ac2e-09ac-4e1e-a139-17fa374a96d4',
path='/dev/47b7c9aa-ef53-48bc-bb55-4a1a0ba5c8d5/leases', offset=115343360),
err=(-227, 'Sanlock resource not acquired', 'Lease resource name is
incorrect')"
2018-06-22 07:35:56,881+0200 INFO (jsonrpc/7) [vdsm.api] FINISH
getAllTasksStatuses return={'allTasksStatus':
{'87f95d85-cc3d-4f29-9883-a4dbb3808f88': {'code': 651, 'message': 'Cannot
obtain lock: "id=47b7c9aa-ef53-48bc-bb55-4a1a0ba5c8d5, rc=-227, out=Cannot
acquire Lease(name=\'e156ac2e-09ac-4e1e-a139-17fa374a96d4\',
path=\'/dev/47b7c9aa-ef53-48bc-bb55-4a1a0ba5c8d5/leases\',
offset=115343360), err=(-227, \'Sanlock resource not acquired\', \'Lease
resource name is incorrect\')"', 'taskState': 'finished',
'taskResult': 'cleanSuccess', 'taskID':
'87f95d85-cc3d-4f29-9883-a4dbb3808f88'}}}
from=::ffff:10.0.0.46,53136, task_id=d0e2f4e3-90cb-43c6-aa08-98d1f7efb1bd
(api:52)
The issue is a corrupted lease for this volume:
AcquireLockFailure: Cannot obtain lock:
"id=47b7c9aa-ef53-48bc-bb55-4a1a0ba5c8d5, rc=-227, out=Cannot acquire
Lease(name='e156ac2e-09ac-4e1e-a139-17fa374a96d4',
path='/dev/47b7c9aa-ef53-48bc-bb55-4a1a0ba5c8d5/leases', offset=115343360),
err=(-227, 'Sanlock resource not acquired', 'Lease resource name is
incorrect')"
The root cause is faulty merge code in oVirt < 4.1, which created volume
leases with an incorrect name. These corrupted leases went undetected
until the upgrade to oVirt >= 4.1, because that is when we started to use
volume leases for storage operations.
The fix is to run:

    vdsm-tool check-volume-leases

This will check and repair corrupted leases.
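
If you want to see which leases are affected, a small script can pull
the lease name, path and offset out of the vdsm log. This is only a rough
sketch: the regular expression is modelled on the error text quoted in
this thread, the function name find_bad_leases is made up for
illustration, and the log layout in other vdsm versions may differ.

```python
import re

# Pattern modelled on the AcquireLockFailure text quoted in this thread.
# The optional backslashes cover the escaped form that appears inside the
# getAllTasksStatuses output. Other vdsm versions may log this differently
# (assumption).
LEASE_ERROR = re.compile(
    r"Cannot acquire Lease\(name=\\?'(?P<name>[0-9a-f-]{36})\\?', "
    r"path=\\?'(?P<path>[^'\\]+)\\?', offset=(?P<offset>\d+)\), "
    r"err=\(-227, \\?'Sanlock resource not acquired\\?', "
    r"\\?'Lease resource name is incorrect\\?'\)"
)


def find_bad_leases(log_text):
    """Return sorted, de-duplicated (name, path, offset) tuples for leases
    reported with 'Lease resource name is incorrect'."""
    # Collapse whitespace so hard-wrapped lines (as in an email) still match.
    flat = " ".join(log_text.split())
    found = {
        (m["name"], m["path"], int(m["offset"]))
        for m in LEASE_ERROR.finditer(flat)
    }
    return sorted(found)


def report(log_path):
    """Print one line per affected lease found in the given log file."""
    with open(log_path, errors="replace") as f:
        for name, path, offset in find_bad_leases(f.read()):
            print(f"volume {name}: lease at {path} offset {offset}")
```

Calling report() with the path to a vdsm log before and after running
the tool gives a quick way to confirm that the error no longer shows up
for new operations. It only reads the log; it does not touch the leases.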
Adding Ala to add more info if needed.
Nir