Please disregard the last e-mail. I re-ran the command and this time the exit
code was 0, and the migration process is no longer stuck.
Thanks so much for all the help, Benny!
Regards.
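
P.S. For the archives: the lease state can be double-checked afterwards with
the same dump command Benny showed below; with my values that would be

# sanlock direct dump /dev/1876ab86-216f-4a37-a36b-2b5d99fcaad0/leases:128974848

and a zeroed timestamp in the output means the lease is free.
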
On 2018-05-18 08:42, nicolas(a)devels.es wrote:
> Hi,
>
> We're getting closer to solving it :-)
>
> I'll answer below with my steps; there's one that fails and I don't
> know why (I probably missed something).
>
> On 2018-05-17 15:47, Benny Zlotnik wrote:
>> Sorry, I forgot it's iSCSI; it's a bit different.
>>
>> In my case it would look something like:
>>
>> 2018-05-17 17:30:12,740+0300 DEBUG (jsonrpc/7) [jsonrpc.JsonRpcServer]
>> Return 'Volume.getInfo' in bridge with {'status': 'OK', 'domain':
>> '3e541b2d-2a49-4eb8-ae4b-aa9acee228c6', 'voltype': 'INTERNAL',
>> 'description': '{"DiskAlias":"vm_Disk1","DiskDescription":""}',
>> 'parent': '00000000-0000-0000-0000-000000000000', 'format': 'RAW',
>> 'generation': 0, 'image': 'dd6b5ae0-196e-4879-b076-a0a8d8a1dfde',
>> 'ctime': '1526566607', 'disktype': 'DATA', 'legality': 'LEGAL',
>> 'mtime': '0', 'apparentsize': '1073741824', 'children': [], 'pool': '',
>> 'capacity': '1073741824', 'uuid': u'221c45e1-7f65-42c8-afc3-0ccc1d6fc148',
>> 'truesize': '1073741824', 'type': 'PREALLOCATED', 'lease': {'path':
>> '/dev/3e541b2d-2a49-4eb8-ae4b-aa9acee228c6/leases', 'owners': [],
>> 'version': None, 'offset': 109051904}} (__init__:355)
>>
>> I then look for 221c45e1-7f65-42c8-afc3-0ccc1d6fc148 in sanlock.log:
>>
>> 2018-05-17 17:30:12 20753 [3335]: s10:r14 resource
>> 3e541b2d-2a49-4eb8-ae4b-aa9acee228c6:221c45e1-7f65-42c8-afc3-0ccc1d6fc148:/dev/3e541b2d-2a49-4eb8-ae4b-aa9acee228c6/leases:109051904
>> for 2,11,31496
>>
>
> I could only find the entry on one of the hosts. When I grepped for the
> uuid I found:
>
> 2018-05-16 12:39:44 4761204 [1023]: s33:r103 resource
> 1876ab86-216f-4a37-a36b-2b5d99fcaad0:c2cfbb02-9981-4fb7-baea-7257a824145c:/dev/1876ab86-216f-4a37-a36b-2b5d99fcaad0/leases:128974848
> for 23,47,9206
>
>> So the resource would be:
>> 3e541b2d-2a49-4eb8-ae4b-aa9acee228c6:221c45e1-7f65-42c8-afc3-0ccc1d6fc148:/dev/3e541b2d-2a49-4eb8-ae4b-aa9acee228c6/leases:109051904
>> and the pid is 31496
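>>
>> In other words (just a sketch, reusing the values from the getInfo output
>> above), the resource string is lockspace:volume:lease-path:offset:
>>
>> $ SD=3e541b2d-2a49-4eb8-ae4b-aa9acee228c6        # the 'domain' field
>> $ VOL=221c45e1-7f65-42c8-afc3-0ccc1d6fc148       # the 'uuid' field
>> $ RESOURCE="$SD:$VOL:/dev/$SD/leases:109051904"  # lease 'path' and 'offset'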
>>
>
> Ok, so my resource is
> 1876ab86-216f-4a37-a36b-2b5d99fcaad0:c2cfbb02-9981-4fb7-baea-7257a824145c:/dev/1876ab86-216f-4a37-a36b-2b5d99fcaad0/leases:128974848
> and my PID is 9206.
>
>> running
>> $ sanlock direct dump /dev/3e541b2d-2a49-4eb8-ae4b-aa9acee228c6/leases:109051904
>>
>> offset   lockspace                             resource                              timestamp  own gen lver
>> 00000000 3e541b2d-2a49-4eb8-ae4b-aa9acee228c6  221c45e1-7f65-42c8-afc3-0ccc1d6fc148  0000020753 0001 0004 5
>> ...
>
> In my case the output would be:
>
> [...]
> 00000000 1876ab86-216f-4a37-a36b-2b5d99fcaad0  c2cfbb02-9981-4fb7-baea-7257a824145c  0004918032 0008 0004 2
> [...]
>
>>
>> If the vdsm pid changed (and it probably did) it will be different,
>> so I acquire it for the new pid
>> $ sanlock client acquire -r
>> 3e541b2d-2a49-4eb8-ae4b-aa9acee228c6:221c45e1-7f65-42c8-afc3-0ccc1d6fc148:/dev/3e541b2d-2a49-4eb8-ae4b-aa9acee228c6/leases:109051904
>> -p 32265
>> acquire pid 32265
>>
>
> I checked vdsmd's PID
>
> # systemctl status vdsmd
> ● vdsmd.service - Virtual Desktop Server Manager
> [...]
> ├─17758 /usr/bin/python2 /usr/share/vdsm/vdsm
>
> So the new PID is 17758.
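>
> (A quicker way, assuming only the main vdsm process matches the pattern,
> would be something like
>
> # pgrep -f /usr/share/vdsm/vdsm
>
> which should print the same PID.)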
>
> # sanlock client acquire -r
> 1876ab86-216f-4a37-a36b-2b5d99fcaad0:c2cfbb02-9981-4fb7-baea-7257a824145c:/dev/1876ab86-216f-4a37-a36b-2b5d99fcaad0/leases:128974848
> -p 17758
> acquire pid 17758
> acquire done 0
>
>
>> Then I can see the timestamp changed
>>
>> $ sanlock direct dump /dev/3e541b2d-2a49-4eb8-ae4b-aa9acee228c6/leases:109051904
>>
>> offset   lockspace                             resource                              timestamp  own gen lver
>> 00000000 3e541b2d-2a49-4eb8-ae4b-aa9acee228c6  221c45e1-7f65-42c8-afc3-0ccc1d6fc148  0000021210 0001 0005 6
>>
>> And then I release it:
>> $ sanlock client release -r
>> 3e541b2d-2a49-4eb8-ae4b-aa9acee228c6:221c45e1-7f65-42c8-afc3-0ccc1d6fc148:/dev/3e541b2d-2a49-4eb8-ae4b-aa9acee228c6/leases:109051904
>> -p 32265
>>
>> release pid 32265
>> release done 0
>>
>
> That's where it fails:
>
> # sanlock direct release -r
> 1876ab86-216f-4a37-a36b-2b5d99fcaad0:c2cfbb02-9981-4fb7-baea-7257a824145c:/dev/1876ab86-216f-4a37-a36b-2b5d99fcaad0/leases:128974848
> -p 17758
> release done -251
>
> And the resource is still stuck.
>
> Is there something I missed there?
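>
> (For the record, I notice the command above says "sanlock direct release"
> while Benny's example used "sanlock client release"; with my values that
> form would be
>
> # sanlock client release -r
> 1876ab86-216f-4a37-a36b-2b5d99fcaad0:c2cfbb02-9981-4fb7-baea-7257a824145c:/dev/1876ab86-216f-4a37-a36b-2b5d99fcaad0/leases:128974848
> -p 17758
>
> though I haven't confirmed that the subcommand is what makes the
> difference.)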
>
>> $ sanlock direct dump /dev/3e541b2d-2a49-4eb8-ae4b-aa9acee228c6/leases:109051904
>>
>> offset   lockspace                             resource                              timestamp  own gen lver
>> 00000000 3e541b2d-2a49-4eb8-ae4b-aa9acee228c6  221c45e1-7f65-42c8-afc3-0ccc1d6fc148  0000000000 0001 0005 6
>>
>> The timestamp is zeroed and the lease is free
>>
>> On Thu, May 17, 2018 at 3:38 PM, <nicolas(a)devels.es> wrote:
>>
>>> This is vdsm 4.19.45. I grepped for the disk uuid in
>>> /var/log/sanlock.log but unfortunately there's no entry for it there...
>>>
>>> On 2018-05-17 13:11, Benny Zlotnik wrote:
>>>
>>> Which vdsm version are you using?
>>>
>>> You can try looking for the image uuid in /var/log/sanlock.log
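>>>
>>> For example, something along the lines of:
>>>
>>> $ grep <image-uuid> /var/log/sanlock.log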
>>>
>>> On Thu, May 17, 2018 at 2:40 PM, <nicolas(a)devels.es> wrote:
>>>
>>> Thanks.
>>>
>>> I've been able to see the line in the log; however, the format
>>> differs slightly from yours.
>>>
>>> 2018-05-17 12:24:44,132+0100 DEBUG (jsonrpc/6)
>>> [jsonrpc.JsonRpcServer] Calling 'Volume.getInfo' in bridge with
>>> {u'storagepoolID': u'75bf8f48-970f-42bc-8596-f8ab6efb2b63',
>>> u'imageID': u'b4013aba-a936-4a54-bb14-670d3a8b7c38',
>>> u'volumeID': u'c2cfbb02-9981-4fb7-baea-7257a824145c',
>>> u'storagedomainID': u'1876ab86-216f-4a37-a36b-2b5d99fcaad0'} (__init__:556)
>>> 2018-05-17 12:24:44,689+0100 DEBUG (jsonrpc/6)
>>> [jsonrpc.JsonRpcServer] Return 'Volume.getInfo' in bridge with
>>> {'status': 'OK', 'domain': '1876ab86-216f-4a37-a36b-2b5d99fcaad0',
>>> 'voltype': 'INTERNAL', 'description': 'None', 'parent':
>>> 'ea9a0182-329f-4b8f-abe3-e894de95dac0', 'format': 'COW',
>>> 'generation': 1, 'image': 'b4013aba-a936-4a54-bb14-670d3a8b7c38',
>>> 'ctime': '1526470759', 'disktype': '2', 'legality': 'LEGAL',
>>> 'mtime': '0', 'apparentsize': '1073741824', 'children': [], 'pool': '',
>>> 'capacity': '21474836480', 'uuid': u'c2cfbb02-9981-4fb7-baea-7257a824145c',
>>> 'truesize': '1073741824', 'type': 'SPARSE', 'lease': {'owners': [8],
>>> 'version': 1L}} (__init__:582)
>>>
>>> As you can see, there's no path field there.
>>>
>>> How should I proceed?
>>>
>>> On 2018-05-17 12:01, Benny Zlotnik wrote:
>>> vdsm-client replaces vdsClient, take a look here:
>>> https://lists.ovirt.org/pipermail/devel/2016-July/013535.html
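>>>
>>> As a rough sketch (the argument names here are assumptions on my part,
>>> so check the tool's help first), the same getInfo call could be made by
>>> hand with either tool:
>>>
>>> $ vdsm-client Volume getInfo storagepoolID=<sp-uuid> \
>>>   storagedomainID=<sd-uuid> imageID=<img-uuid> volumeID=<vol-uuid>
>>>
>>> or, with the older vdsClient:
>>>
>>> $ vdsClient -s 0 getVolumeInfo <sd-uuid> <sp-uuid> <img-uuid> <vol-uuid>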
>>>
>>> On Thu, May 17, 2018 at 1:57 PM, <nicolas(a)devels.es> wrote:
>>>
>>> The issue is present in the logs:
>>>
>>> 2018-05-17 11:50:44,822+01 INFO
>>> [org.ovirt.engine.core.bll.storage.disk.image.VdsmImagePoller]
>>> (DefaultQuartzScheduler1) [39755bb7-9082-40d6-ae5e-64b5b2b5f98e]
>>> Command CopyData id: '84a49b25-0e37-4338-834e-08bd67c42860': the
>>> volume lease is not FREE - the job is running
>>>
>>> I tried setting the log level to debug but it seems I don't have a
>>> vdsm-client command. All I have is a vdsm-tool command. Is it
>>> equivalent?
>>>
>>> Thanks
>>>
>>> On 2018-05-17 11:49, Benny Zlotnik wrote:
>>> By the way, please verify it's the same issue; you should see "the
>>> volume lease is not FREE - the job is running" in the engine log.
>>>
>>> On Thu, May 17, 2018 at 1:21 PM, Benny Zlotnik
>>> <bzlotnik(a)redhat.com>
>>> wrote:
>>>
>>> I see it because I am on debug level; you need to enable it in order
>>> to see it:
>>>
>>> https://www.ovirt.org/develop/developer-guide/vdsm/log-files/
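>>>
>>> Roughly (a sketch from memory, so double-check it against the page
>>> above): set level=DEBUG in the relevant [logger_*] sections of
>>> /etc/vdsm/logger.conf (e.g. [logger_root] and [logger_vds]) and then
>>> restart the daemon:
>>>
>>> # systemctl restart vdsmd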
>>>
>>> On Thu, 17 May 2018 at 13:10, <nicolas(a)devels.es> wrote:
>>>
>>> Hi,
>>>
>>> Thanks. I've checked the vdsm logs on all my hosts but the only entry
>>> I can find when grepping for Volume.getInfo is like this:
>>>
>>> 2018-05-17 10:14:54,892+0100 INFO (jsonrpc/0)
>>> [jsonrpc.JsonRpcServer]
>>> RPC call Volume.getInfo succeeded in 0.30 seconds (__init__:539)
>>>
>>> I cannot find a line like yours... is there any other way to obtain
>>> those parameters? This is iSCSI-based storage, FWIW (both the source
>>> and the destination of the move).
>>>
>>> Thanks.
>>>
>>> On 2018-05-17 10:01, Benny Zlotnik wrote:
>>> In the vdsm log you will find the volumeInfo log, which looks like
>>> this:
>>>
>>> 2018-05-17 11:55:03,257+0300 DEBUG (jsonrpc/6) [jsonrpc.JsonRpcServer]
>>> Return 'Volume.getInfo' in bridge with {'status': 'OK', 'domain':
>>> '5c4d2216-2eb3-4e24-b254-d5f83fde4dbe', 'voltype': 'INTERNAL',
>>> 'description': '{"DiskAlias":"vm_Disk1","DiskDescription":""}',
>>> 'parent': '00000000-0000-0000-0000-000000000000', 'format': 'RAW',
>>> 'generation': 3, 'image': 'b8eb8c82-fddd-4fbc-b80d-6ee04c1255bc',
>>> 'ctime': '1526543244', 'disktype': 'DATA', 'legality': 'LEGAL',
>>> 'mtime': '0', 'apparentsize': '1073741824', 'children': [], 'pool': '',
>>> 'capacity': '1073741824', 'uuid': u'7190913d-320c-4fc9-a5b3-c55b26aa30f4',
>>> 'truesize': '0', 'type': 'SPARSE', 'lease': {'path':
>>> u'/rhev/data-center/mnt/10.35.0.233:_root_storage__domains_sd1/5c4d2216-2eb3-4e24-b254-d5f83fde4dbe/images/b8eb8c82-fddd-4fbc-b80d-6ee04c1255bc/7190913d-320c-4fc9-a5b3-c55b26aa30f4.lease',
>>> 'owners': [1], 'version': 8L, 'offset': 0}} (__init__:355)
>>>
>>> The lease path in my case is:
>>> /rhev/data-center/mnt/10.35.0.233:_root_storage__domains_sd1/5c4d2216-2eb3-4e24-b254-d5f83fde4dbe/images/b8eb8c82-fddd-4fbc-b80d-6ee04c1255bc/7190913d-320c-4fc9-a5b3-c55b26aa30f4.lease
>>>
>>> Then you can look in /var/log/sanlock.log
>>>
>>> 2018-05-17 11:35:18 243132 [14847]: s2:r9 resource
>>> 5c4d2216-2eb3-4e24-b254-d5f83fde4dbe:7190913d-320c-4fc9-a5b3-c55b26aa30f4:/rhev/data-center/mnt/10.35.0.233:_root_storage__domains_sd1/5c4d2216-2eb3-4e24-b254-d5f83fde4dbe/images/b8eb8c82-fddd-4fbc-b80d-6ee04c1255bc/7190913d-320c-4fc9-a5b3-c55b26aa30f4.lease:0
>>> for 2,9,5049
>>>
>>> Then you can use this command to unlock it; the pid in this case
>>> is 5049:
>>>
>>> sanlock client release -r RESOURCE -p pid
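>>>
>>> With the values from the example above that would be something like:
>>>
>>> $ sanlock client release -r
>>> 5c4d2216-2eb3-4e24-b254-d5f83fde4dbe:7190913d-320c-4fc9-a5b3-c55b26aa30f4:/rhev/data-center/mnt/10.35.0.233:_root_storage__domains_sd1/5c4d2216-2eb3-4e24-b254-d5f83fde4dbe/images/b8eb8c82-fddd-4fbc-b80d-6ee04c1255bc/7190913d-320c-4fc9-a5b3-c55b26aa30f4.lease:0
>>> -p 5049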
>>>
>>> On Thu, May 17, 2018 at 11:52 AM, Benny Zlotnik
>>> <bzlotnik(a)redhat.com>
>>> wrote:
>>>
>>> I believe you've hit this bug:
>>> https://bugzilla.redhat.com/show_bug.cgi?id=1565040
>>>
>>> You can try to release the lease manually using the
>>>
>>> sanlock client
>>>
>>> command (there's an example in the comments on the bug);
>>> once the lease is free the job will fail and the disk can be
>>> unlocked.
>>>
>>> On Thu, May 17, 2018 at 11:05 AM, <nicolas(a)devels.es> wrote:
>>>
>>> Hi,
>>>
>>> We're running oVirt 4.1.9 (I know it's not the recommended
>>> version, but we can't upgrade yet) and recently we had an issue
>>> with a Storage Domain while a VM was moving a disk. The Storage
>>> Domain went down for a few minutes, then it came back.
>>>
>>> However, the disk's state has been stuck in a 'Migrating: 10%' state
>>> (see ss-2.png).
>>>
>>> I ran the 'unlock_entity.sh' script to try to unlock the disk,
>>> with these parameters:
>>>
>>> # PGPASSWORD=...
>>> /usr/share/ovirt-engine/setup/dbutils/unlock_entity.sh -t disk -u
>>> engine -v b4013aba-a936-4a54-bb14-670d3a8b7c38
>>>
>>> The disk's state changed to 'OK', but the actual state still
>>> says it's migrating (see ss-1.png).
>>>
>>> Calling the script with -t all doesn't make a difference either.
>>>
>>> Currently, the disk is unmanageable: it cannot be deactivated, moved
>>> or copied, as it says there's a copying operation already running.
>>>
>>> Could someone provide a way to unlock this disk? I don't mind
>>> modifying a value directly in the database, I just need the
>>> copying process cancelled.
>>>
>>> Thanks.
> _______________________________________________
> Users mailing list -- users(a)ovirt.org
> To unsubscribe send an email to users-leave(a)ovirt.org