Hi,
We're getting closer to solving it :-)
I'll answer below with my steps; there's one that fails and I don't know
why (I probably missed something).
On 2018-05-17 15:47, Benny Zlotnik wrote:
Sorry, I forgot it's iSCSI, it's a bit different.
In my case it would look something like:
2018-05-17 17:30:12,740+0300 DEBUG (jsonrpc/7) [jsonrpc.JsonRpcServer] Return 'Volume.getInfo' in bridge with
{'status': 'OK', 'domain': '3e541b2d-2a49-4eb8-ae4b-aa9acee228c6', 'voltype': 'INTERNAL',
'description': '{"DiskAlias":"vm_Disk1","DiskDescription":""}', 'parent': '00000000-0000-0000-0000-000000000000',
'format': 'RAW', 'generation': 0, 'image': 'dd6b5ae0-196e-4879-b076-a0a8d8a1dfde', 'ctime': '1526566607',
'disktype': 'DATA', 'legality': 'LEGAL', 'mtime': '0', 'apparentsize': '1073741824', 'children': [], 'pool': '',
'capacity': '1073741824', 'uuid': u'221c45e1-7f65-42c8-afc3-0ccc1d6fc148', 'truesize': '1073741824',
'type': 'PREALLOCATED', 'lease': {'path': '/dev/3e541b2d-2a49-4eb8-ae4b-aa9acee228c6/leases',
'owners': [], 'version': None, 'offset': 109051904}} (__init__:355)
I then look for 221c45e1-7f65-42c8-afc3-0ccc1d6fc148 in sanlock.log:

2018-05-17 17:30:12 20753 [3335]: s10:r14 resource
3e541b2d-2a49-4eb8-ae4b-aa9acee228c6:221c45e1-7f65-42c8-afc3-0ccc1d6fc148:/dev/3e541b2d-2a49-4eb8-ae4b-aa9acee228c6/leases:109051904
for 2,11,31496
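
(Something along these lines should find it on each host; the UUID here is just the one from the example above:)

$ grep 221c45e1-7f65-42c8-afc3-0ccc1d6fc148 /var/log/sanlock.log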
I could only find the entry on one of the hosts. When I grepped the UUID I found:

2018-05-16 12:39:44 4761204 [1023]: s33:r103 resource
1876ab86-216f-4a37-a36b-2b5d99fcaad0:c2cfbb02-9981-4fb7-baea-7257a824145c:/dev/1876ab86-216f-4a37-a36b-2b5d99fcaad0/leases:128974848
for 23,47,9206
So the resource would be:
3e541b2d-2a49-4eb8-ae4b-aa9acee228c6:221c45e1-7f65-42c8-afc3-0ccc1d6fc148:/dev/3e541b2d-2a49-4eb8-ae4b-aa9acee228c6/leases:109051904
and the pid is 31496
Ok, so my resource is
1876ab86-216f-4a37-a36b-2b5d99fcaad0:c2cfbb02-9981-4fb7-baea-7257a824145c:/dev/1876ab86-216f-4a37-a36b-2b5d99fcaad0/leases:128974848
and my PID is 9206.
running

$ sanlock direct dump /dev/3e541b2d-2a49-4eb8-ae4b-aa9acee228c6/leases:109051904
  offset   lockspace   resource   timestamp  own  gen  lver
00000000 3e541b2d-2a49-4eb8-ae4b-aa9acee228c6 221c45e1-7f65-42c8-afc3-0ccc1d6fc148 0000020753 0001 0004 5
...
In my case the output would be:

[...]
00000000 1876ab86-216f-4a37-a36b-2b5d99fcaad0 c2cfbb02-9981-4fb7-baea-7257a824145c 0004918032 0008 0004 2
[...]
If the vdsm pid changed (and it probably did) it will be different,
so I acquire it for the new pid:

$ sanlock client acquire -r 3e541b2d-2a49-4eb8-ae4b-aa9acee228c6:221c45e1-7f65-42c8-afc3-0ccc1d6fc148:/dev/3e541b2d-2a49-4eb8-ae4b-aa9acee228c6/leases:109051904 -p 32265
acquire pid 32265
I checked vdsmd's PID
# systemctl status vdsmd
● vdsmd.service - Virtual Desktop Server Manager
[...]
├─17758 /usr/bin/python2 /usr/share/vdsm/vdsm
So the new PID is 17758.
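
(A quicker way to grab the same PID would probably be something like this, assuming pgrep's -f full-command-line matching picks up the right process:)

# pgrep -f /usr/share/vdsm/vdsm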
# sanlock client acquire -r 1876ab86-216f-4a37-a36b-2b5d99fcaad0:c2cfbb02-9981-4fb7-baea-7257a824145c:/dev/1876ab86-216f-4a37-a36b-2b5d99fcaad0/leases:128974848 -p 17758
acquire pid 17758
acquire done 0
Then I can see the timestamp changed:

$ sanlock direct dump /dev/3e541b2d-2a49-4eb8-ae4b-aa9acee228c6/leases:109051904
  offset   lockspace   resource   timestamp  own  gen  lver
00000000 3e541b2d-2a49-4eb8-ae4b-aa9acee228c6 221c45e1-7f65-42c8-afc3-0ccc1d6fc148 0000021210 0001 0005 6
And then I release it:

$ sanlock client release -r 3e541b2d-2a49-4eb8-ae4b-aa9acee228c6:221c45e1-7f65-42c8-afc3-0ccc1d6fc148:/dev/3e541b2d-2a49-4eb8-ae4b-aa9acee228c6/leases:109051904 -p 32265
release pid 32265
release done 0
Here's where it fails:

# sanlock direct release -r 1876ab86-216f-4a37-a36b-2b5d99fcaad0:c2cfbb02-9981-4fb7-baea-7257a824145c:/dev/1876ab86-216f-4a37-a36b-2b5d99fcaad0/leases:128974848 -p 17758
release done -251

And the resource is still stuck.
Is there something I missed there?
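
(For completeness, the 'client' variant of the release, mirroring the example above but with my resource and PID filled in, would presumably be the following; I haven't tried that form yet:)

# sanlock client release -r 1876ab86-216f-4a37-a36b-2b5d99fcaad0:c2cfbb02-9981-4fb7-baea-7257a824145c:/dev/1876ab86-216f-4a37-a36b-2b5d99fcaad0/leases:128974848 -p 17758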
> $ sanlock direct dump /dev/3e541b2d-2a49-4eb8-ae4b-aa9acee228c6/leases:109051904
>   offset   lockspace   resource   timestamp  own  gen  lver
> 00000000 3e541b2d-2a49-4eb8-ae4b-aa9acee228c6 221c45e1-7f65-42c8-afc3-0ccc1d6fc148 0000000000 0001 0005 6
>
> The timestamp is zeroed and the lease is free
>
> On Thu, May 17, 2018 at 3:38 PM, <nicolas(a)devels.es> wrote:
>
>> This is vdsm 4.19.45. I grepped the disk UUID in
>> /var/log/sanlock.log but unfortunately there's no entry there...
>>
>> On 2018-05-17 13:11, Benny Zlotnik wrote:
>>
>> Which vdsm version are you using?
>>
>> You can try looking for the image uuid in /var/log/sanlock.log
>>
>> On Thu, May 17, 2018 at 2:40 PM, <nicolas(a)devels.es> wrote:
>>
>> Thanks.
>>
>> I've been able to see the line in the log; however, the format
>> differs slightly from yours.
>>
>> 2018-05-17 12:24:44,132+0100 DEBUG (jsonrpc/6) [jsonrpc.JsonRpcServer] Calling 'Volume.getInfo' in bridge with
>> {u'storagepoolID': u'75bf8f48-970f-42bc-8596-f8ab6efb2b63', u'imageID': u'b4013aba-a936-4a54-bb14-670d3a8b7c38',
>> u'volumeID': u'c2cfbb02-9981-4fb7-baea-7257a824145c', u'storagedomainID': u'1876ab86-216f-4a37-a36b-2b5d99fcaad0'} (__init__:556)
>> 2018-05-17 12:24:44,689+0100 DEBUG (jsonrpc/6) [jsonrpc.JsonRpcServer] Return 'Volume.getInfo' in bridge with
>> {'status': 'OK', 'domain': '1876ab86-216f-4a37-a36b-2b5d99fcaad0', 'voltype': 'INTERNAL', 'description': 'None',
>> 'parent': 'ea9a0182-329f-4b8f-abe3-e894de95dac0', 'format': 'COW', 'generation': 1, 'image': 'b4013aba-a936-4a54-bb14-670d3a8b7c38',
>> 'ctime': '1526470759', 'disktype': '2', 'legality': 'LEGAL', 'mtime': '0', 'apparentsize': '1073741824',
>> 'children': [], 'pool': '', 'capacity': '21474836480', 'uuid': u'c2cfbb02-9981-4fb7-baea-7257a824145c',
>> 'truesize': '1073741824', 'type': 'SPARSE', 'lease': {'owners': [8], 'version': 1L}} (__init__:582)
>>
>> As you can see, there's no path field there.
>>
>> How should I proceed?
>>
>> On 2018-05-17 12:01, Benny Zlotnik wrote:
>> vdsm-client replaces vdsClient, take a look here:
>> https://lists.ovirt.org/pipermail/devel/2016-July/013535.html
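>>
>> (For example, fetching the volume info with vdsm-client would look roughly like this; the UUIDs are placeholders and the exact syntax is from memory, so double-check it against the link above:)
>>
>> # vdsm-client Volume getInfo storagepoolID=<pool-uuid> storagedomainID=<domain-uuid> imageID=<image-uuid> volumeID=<volume-uuid>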
>>
>> On Thu, May 17, 2018 at 1:57 PM, <nicolas(a)devels.es> wrote:
>>
>> The issue is present in the logs:
>>
>> 2018-05-17 11:50:44,822+01 INFO
>> [org.ovirt.engine.core.bll.storage.disk.image.VdsmImagePoller]
>> (DefaultQuartzScheduler1) [39755bb7-9082-40d6-ae5e-64b5b2b5f98e]
>> Command CopyData id: '84a49b25-0e37-4338-834e-08bd67c42860': the
>> volume lease is not FREE - the job is running
>>
>> I tried setting the log level to debug but it seems I don't have a
>> vdsm-client command. All I have is a vdsm-tool command. Is it
>> equivalent?
>>
>> Thanks
>>
>> On 2018-05-17 11:49, Benny Zlotnik wrote:
>> By the way, please verify it's the same issue; you should see "the
>> volume lease is not FREE - the job is running" in the engine log.
>>
>> On Thu, May 17, 2018 at 1:21 PM, Benny Zlotnik
>> <bzlotnik(a)redhat.com>
>> wrote:
>>
>> I see it because I'm on debug level; you need to enable it in order to see it:
>>
>> https://www.ovirt.org/develop/developer-guide/vdsm/log-files/
>>
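>> (Roughly, and assuming the stock /etc/vdsm/logger.conf layout that page describes, it comes down to raising the log levels and restarting vdsm; adjust the pattern to whatever levels your file actually contains:)
>>
>> # sed -i 's/level=INFO/level=DEBUG/g' /etc/vdsm/logger.conf   # bump the INFO loggers/handlers to DEBUG
>> # systemctl restart vdsmd
>>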
>> On Thu, 17 May 2018, 13:10 , <nicolas(a)devels.es> wrote:
>>
>> Hi,
>>
>> Thanks. I've checked vdsm logs on all my hosts but the only entry
>> I can
>> find grepping by Volume.getInfo is like this:
>>
>> 2018-05-17 10:14:54,892+0100 INFO (jsonrpc/0)
>> [jsonrpc.JsonRpcServer]
>> RPC call Volume.getInfo succeeded in 0.30 seconds (__init__:539)
>>
>> I cannot find a line like yours... is there any other way to obtain
>> those parameters? This is iSCSI-based storage FWIW (both source and
>> destination of the move).
>>
>> Thanks.
>>
>> On 2018-05-17 10:01, Benny Zlotnik wrote:
>> In the vdsm log you will find the volumeInfo log, which looks like this:
>>
>> 2018-05-17 11:55:03,257+0300 DEBUG (jsonrpc/6) [jsonrpc.JsonRpcServer] Return 'Volume.getInfo' in bridge with
>> {'status': 'OK', 'domain': '5c4d2216-2eb3-4e24-b254-d5f83fde4dbe', 'voltype': 'INTERNAL',
>> 'description': '{"DiskAlias":"vm_Disk1","DiskDescription":""}', 'parent': '00000000-0000-0000-0000-000000000000',
>> 'format': 'RAW', 'generation': 3, 'image': 'b8eb8c82-fddd-4fbc-b80d-6ee04c1255bc', 'ctime': '1526543244',
>> 'disktype': 'DATA', 'legality': 'LEGAL', 'mtime': '0', 'apparentsize': '1073741824', 'children': [], 'pool': '',
>> 'capacity': '1073741824', 'uuid': u'7190913d-320c-4fc9-a5b3-c55b26aa30f4', 'truesize': '0', 'type': 'SPARSE',
>> 'lease': {'path': u'/rhev/data-center/mnt/10.35.0.233:_root_storage__domains_sd1/5c4d2216-2eb3-4e24-b254-d5f83fde4dbe/images/b8eb8c82-fddd-4fbc-b80d-6ee04c1255bc/7190913d-320c-4fc9-a5b3-c55b26aa30f4.lease',
>> 'owners': [1], 'version': 8L, 'offset': 0}} (__init__:355)
>>
>> The lease path in my case is:
>> /rhev/data-center/mnt/10.35.0.233:_root_storage__domains_sd1/5c4d2216-2eb3-4e24-b254-d5f83fde4dbe/images/b8eb8c82-fddd-4fbc-b80d-6ee04c1255bc/7190913d-320c-4fc9-a5b3-c55b26aa30f4.lease
>>
>> Then you can look in /var/log/sanlock.log
>>
>> 2018-05-17 11:35:18 243132 [14847]: s2:r9 resource
>> 5c4d2216-2eb3-4e24-b254-d5f83fde4dbe:7190913d-320c-4fc9-a5b3-c55b26aa30f4:/rhev/data-center/mnt/10.35.0.233:_root_storage__domains_sd1/5c4d2216-2eb3-4e24-b254-d5f83fde4dbe/images/b8eb8c82-fddd-4fbc-b80d-6ee04c1255bc/7190913d-320c-4fc9-a5b3-c55b26aa30f4.lease:0
>> for 2,9,5049
>>
>> Then you can use this command to unlock it; the pid in this case is 5049:
>>
>> sanlock client release -r RESOURCE -p pid
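>>
>> (Filled in with the example above, that would be something like the following; I'm not certain whether the colon inside the NFS mount part of the path needs escaping for sanlock, so adjust it if the command complains:)
>>
>> # sanlock client release -r 5c4d2216-2eb3-4e24-b254-d5f83fde4dbe:7190913d-320c-4fc9-a5b3-c55b26aa30f4:/rhev/data-center/mnt/10.35.0.233:_root_storage__domains_sd1/5c4d2216-2eb3-4e24-b254-d5f83fde4dbe/images/b8eb8c82-fddd-4fbc-b80d-6ee04c1255bc/7190913d-320c-4fc9-a5b3-c55b26aa30f4.lease:0 -p 5049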
>>
>> On Thu, May 17, 2018 at 11:52 AM, Benny Zlotnik
>> <bzlotnik(a)redhat.com>
>> wrote:
>>
>> I believe you've hit this bug:
>> https://bugzilla.redhat.com/show_bug.cgi?id=1565040
>>
>> You can try to release the lease manually using the sanlock client
>> command (there's an example in the comments on the bug); once the
>> lease is free the job will fail and the disk can be unlocked.
>>
>> On Thu, May 17, 2018 at 11:05 AM, <nicolas(a)devels.es> wrote:
>>
>> Hi,
>>
>> We're running oVirt 4.1.9 (I know it's not the recommended version,
>> but we can't upgrade yet) and recently we had an issue with a Storage
>> Domain while a VM was moving a disk. The Storage Domain went down for
>> a few minutes, then it came back.
>>
>> However, the disk's state has been stuck in a 'Migrating: 10%' state
>> (see ss-2.png).
>>
>> I ran the 'unlock_entity.sh' script to try to unlock the disk, with
>> these parameters:
>>
>> # PGPASSWORD=...
>> /usr/share/ovirt-engine/setup/dbutils/unlock_entity.sh -t disk -u
>> engine -v b4013aba-a936-4a54-bb14-670d3a8b7c38
>>
>> The disk's state changed to 'OK', but it still shows as migrating
>> (see ss-1.png).
>>
>> Calling the script with -t all doesn't make a difference either.
>>
>> Currently, the disk is unmanageable: it cannot be deactivated, moved
>> or copied, as it says there's a copy operation already running.
>>
>> Could someone provide a way to unlock this disk? I don't mind
>> modifying a value directly in the database, I just need the copying
>> process cancelled.
>>
>> Thanks.
>> _______________________________________________
>> Users mailing list -- users(a)ovirt.org
>> To unsubscribe send an email to users-leave(a)ovirt.org