Re: VM's disk stuck in migrating state

Thanks. I've been able to see the line in the log; however, the format differs slightly from yours:

2018-05-17 12:24:44,132+0100 DEBUG (jsonrpc/6) [jsonrpc.JsonRpcServer] Calling 'Volume.getInfo' in bridge with {u'storagepoolID': u'75bf8f48-970f-42bc-8596-f8ab6efb2b63', u'imageID': u'b4013aba-a936-4a54-bb14-670d3a8b7c38', u'volumeID': u'c2cfbb02-9981-4fb7-baea-7257a824145c', u'storagedomainID': u'1876ab86-216f-4a37-a36b-2b5d99fcaad0'} (__init__:556)
2018-05-17 12:24:44,689+0100 DEBUG (jsonrpc/6) [jsonrpc.JsonRpcServer] Return 'Volume.getInfo' in bridge with {'status': 'OK', 'domain': '1876ab86-216f-4a37-a36b-2b5d99fcaad0', 'voltype': 'INTERNAL', 'description': 'None', 'parent': 'ea9a0182-329f-4b8f-abe3-e894de95dac0', 'format': 'COW', 'generation': 1, 'image': 'b4013aba-a936-4a54-bb14-670d3a8b7c38', 'ctime': '1526470759', 'disktype': '2', 'legality': 'LEGAL', 'mtime': '0', 'apparentsize': '1073741824', 'children': [], 'pool': '', 'capacity': '21474836480', 'uuid': u'c2cfbb02-9981-4fb7-baea-7257a824145c', 'truesize': '1073741824', 'type': 'SPARSE', 'lease': {'owners': [8], 'version': 1L}} (__init__:582)

As you can see, there's no path field there. How should I proceed?

El 2018-05-17 12:01, Benny Zlotnik escribió:
vdsm-client replaces vdsClient, take a look here: https://lists.ovirt.org/pipermail/devel/2016-July/013535.html
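For reference, once vdsm-client is available on a host, the same call could be made by hand with something like the following (a sketch; the argument names and UUIDs are taken from the 'Calling Volume.getInfo' line above):

$ vdsm-client Volume getInfo storagepoolID=75bf8f48-970f-42bc-8596-f8ab6efb2b63 storagedomainID=1876ab86-216f-4a37-a36b-2b5d99fcaad0 imageID=b4013aba-a936-4a54-bb14-670d3a8b7c38 volumeID=c2cfbb02-9981-4fb7-baea-7257a824145c

The JSON it prints should match the 'Return Volume.getInfo' line, including the 'lease' section.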
On Thu, May 17, 2018 at 1:57 PM, <nicolas@devels.es> wrote:
The issue is present in the logs:
2018-05-17 11:50:44,822+01 INFO [org.ovirt.engine.core.bll.storage.disk.image.VdsmImagePoller] (DefaultQuartzScheduler1) [39755bb7-9082-40d6-ae5e-64b5b2b5f98e] Command CopyData id: '84a49b25-0e37-4338-834e-08bd67c42860': the volume lease is not FREE - the job is running
I tried setting the log level to debug but it seems I don't have a vdsm-client command. All I have is a vdsm-tool command. Is it equivalent?
Thanks
El 2018-05-17 11:49, Benny Zlotnik escribió:
By the way, please verify it's the same issue, you should see "the volume lease is not FREE - the job is running" in the engine log
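A quick way to check that on the engine machine is, for instance:

$ grep 'volume lease is not FREE' /var/log/ovirt-engine/engine.log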
On Thu, May 17, 2018 at 1:21 PM, Benny Zlotnik <bzlotnik@redhat.com> wrote:
I see it because I am on debug level; you need to enable it in order to see it:
https://www.ovirt.org/develop/developer-guide/vdsm/log-files/
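A minimal sketch of enabling it through the config file (an assumption based on the stock /etc/vdsm/logger.conf layout; the exact section names can differ between vdsm versions, so check the file first): edit /etc/vdsm/logger.conf, set the relevant logger(s), e.g. [logger_root], to level=DEBUG, then restart vdsm so the change takes effect:

# systemctl restart vdsmd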
On Thu, 17 May 2018, 13:10 , <nicolas@devels.es> wrote:
Hi,
Thanks. I've checked the vdsm logs on all my hosts, but the only entry I can find when grepping for Volume.getInfo looks like this:
2018-05-17 10:14:54,892+0100 INFO (jsonrpc/0) [jsonrpc.JsonRpcServer] RPC call Volume.getInfo succeeded in 0.30 seconds (__init__:539)
I cannot find a line like yours... is there any other way to obtain those parameters? This is iSCSI-based storage, FWIW (both the source and the destination of the move).
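Once debug logging is enabled on the host doing the copy, a grep along these lines should surface both the Calling and the Return lines for that verb:

$ grep "Volume.getInfo' in bridge" /var/log/vdsm/vdsm.log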
Thanks.
El 2018-05-17 10:01, Benny Zlotnik escribió:
In the vdsm log you will find the volumeInfo log, which looks like this:
2018-05-17 11:55:03,257+0300 DEBUG (jsonrpc/6) [jsonrpc.JsonRpcServer] Return 'Volume.getInfo' in bridge with {'status': 'OK', 'domain': '5c4d2216-2eb3-4e24-b254-d5f83fde4dbe', 'voltype': 'INTERNAL', 'description': '{"DiskAlias":"vm_Disk1","DiskDescription":""}', 'parent': '00000000-0000-0000-0000-000000000000', 'format': 'RAW', 'generation': 3, 'image': 'b8eb8c82-fddd-4fbc-b80d-6ee04c1255bc', 'ctime': '1526543244', 'disktype': 'DATA', 'legality': 'LEGAL', 'mtime': '0', 'apparentsize': '1073741824', 'children': [], 'pool': '', 'capacity': '1073741824', 'uuid': u'7190913d-320c-4fc9-a5b3-c55b26aa30f4', 'truesize': '0', 'type': 'SPARSE', 'lease': {'path': u'/rhev/data-center/mnt/10.35.0.233:_root_storage__domains_sd1/5c4d2216-2eb3-4e24-b254-d5f83fde4dbe/images/b8eb8c82-fddd-4fbc-b80d-6ee04c1255bc/7190913d-320c-4fc9-a5b3-c55b26aa30f4.lease', 'owners': [1], 'version': 8L, 'offset': 0}} (__init__:355)

The lease path in my case is:
/rhev/data-center/mnt/10.35.0.233:_root_storage__domains_sd1/5c4d2216-2eb3-4e24-b254-d5f83fde4dbe/images/b8eb8c82-fddd-4fbc-b80d-6ee04c1255bc/7190913d-320c-4fc9-a5b3-c55b26aa30f4.lease
Then you can look in /var/log/sanlock.log
2018-05-17 11:35:18 243132 [14847]: s2:r9 resource 5c4d2216-2eb3-4e24-b254-d5f83fde4dbe:7190913d-320c-4fc9-a5b3-c55b26aa30f4:/rhev/data-center/mnt/10.35.0.233:_root_storage__domains_sd1/5c4d2216-2eb3-4e24-b254-d5f83fde4dbe/images/b8eb8c82-fddd-4fbc-b80d-6ee04c1255bc/7190913d-320c-4fc9-a5b3-c55b26aa30f4.lease:0 for 2,9,5049
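A quick way to find that entry is to grep sanlock.log for the volume UUID from the Volume.getInfo output, e.g.:

$ grep 7190913d-320c-4fc9-a5b3-c55b26aa30f4 /var/log/sanlock.log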
Then you can use this command to unlock, the pid in this case is 5049
sanlock client release -r RESOURCE -p pid
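Filled in with the resource and pid from the example above, that would be:

$ sanlock client release -r 5c4d2216-2eb3-4e24-b254-d5f83fde4dbe:7190913d-320c-4fc9-a5b3-c55b26aa30f4:/rhev/data-center/mnt/10.35.0.233:_root_storage__domains_sd1/5c4d2216-2eb3-4e24-b254-d5f83fde4dbe/images/b8eb8c82-fddd-4fbc-b80d-6ee04c1255bc/7190913d-320c-4fc9-a5b3-c55b26aa30f4.lease:0 -p 5049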
On Thu, May 17, 2018 at 11:52 AM, Benny Zlotnik <bzlotnik@redhat.com> wrote:
I believe you've hit this bug: https://bugzilla.redhat.com/show_bug.cgi?id=1565040
You can try to release the lease manually using the sanlock client command (there's an example in the comments on the bug); once the lease is free the job will fail and the disk can be unlocked.
On Thu, May 17, 2018 at 11:05 AM, <nicolas@devels.es> wrote:
Hi,
We're running oVirt 4.1.9 (I know it's not the recommended version, but we can't upgrade yet) and recently we had an issue with a Storage Domain while a VM was moving a disk. The Storage Domain went down for a few minutes, then came back up.
However, the disk has been stuck in a 'Migrating: 10%' state ever since (see ss-2.png).
I ran the 'unlock_entity.sh' script to try to unlock the disk, with these parameters:
# PGPASSWORD=... /usr/share/ovirt-engine/setup/dbutils/unlock_entity.sh -t disk -u engine -v b4013aba-a936-4a54-bb14-670d3a8b7c38
The disk's state changed to 'OK', but it still shows as migrating (see ss-1.png).
Calling the script with -t all doesn't make a difference either.
Currently, the disk is unmanageable: it cannot be deactivated, moved or copied, as it says there's a copying operation already running.
Could someone provide a way to unlock this disk? I don't mind modifying a value directly in the database; I just need the copying process cancelled.
Thanks.

Which vdsm version are you using?
You can try looking for the image uuid in /var/log/sanlock.log.
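For example, using the image UUID from the Volume.getInfo output above:

$ grep b4013aba-a936-4a54-bb14-670d3a8b7c38 /var/log/sanlock.log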

This is vdsm 4.19.45. I grepped the disk uuid in /var/log/sanlock.log but unfortunately there's no entry there...

El 2018-05-17 13:11, Benny Zlotnik escribió:
Which vdsm version are you using?
You can try looking for the image uuid in /var/log/sanlock.log
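Note that in the sanlock.log examples above the resource name is built from the storage domain UUID and the volume UUID rather than the image UUID, so grepping for the volumeID reported by Volume.getInfo may find an entry that the image UUID misses, e.g.:

$ grep c2cfbb02-9981-4fb7-baea-7257a824145c /var/log/sanlock.log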

Sorry, I forgot it's iSCSI, it's a bit different.

In my case it would look something like:

2018-05-17 17:30:12,740+0300 DEBUG (jsonrpc/7) [jsonrpc.JsonRpcServer] Return 'Volume.getInfo' in bridge with {'status': 'OK', 'domain': '3e541b2d-2a49-4eb8-ae4b-aa9acee228c6', 'voltype': 'INTERNAL', 'description': '{"DiskAlias":"vm_Disk1","DiskDescription":""}', 'parent': '00000000-0000-0000-0000-000000000000', 'format': 'RAW', 'generation': 0, 'image': 'dd6b5ae0-196e-4879-b076-a0a8d8a1dfde', 'ctime': '1526566607', 'disktype': 'DATA', 'legality': 'LEGAL', 'mtime': '0', 'apparentsize': '1073741824', 'children': [], 'pool': '', 'capacity': '1073741824', 'uuid': u'221c45e1-7f65-42c8-afc3-0ccc1d6fc148', 'truesize': '1073741824', 'type': 'PREALLOCATED', 'lease': {'path': '/dev/3e541b2d-2a49-4eb8-ae4b-aa9acee228c6/leases', 'owners': [], 'version': None, 'offset': 109051904}} (__init__:355)

I then look for 221c45e1-7f65-42c8-afc3-0ccc1d6fc148 in sanlock.log:

2018-05-17 17:30:12 20753 [3335]: s10:r14 resource 3e541b2d-2a49-4eb8-ae4b-aa9acee228c6:221c45e1-7f65-42c8-afc3-0ccc1d6fc148:/dev/3e541b2d-2a49-4eb8-ae4b-aa9acee228c6/leases:109051904 for 2,11,31496

So the resource would be:
3e541b2d-2a49-4eb8-ae4b-aa9acee228c6:221c45e1-7f65-42c8-afc3-0ccc1d6fc148:/dev/3e541b2d-2a49-4eb8-ae4b-aa9acee228c6/leases:109051904
and the pid is 31496.

Running
$ sanlock direct dump /dev/3e541b2d-2a49-4eb8-ae4b-aa9acee228c6/leases:109051904
offset lockspace resource timestamp own gen lver
00000000 3e541b2d-2a49-4eb8-ae4b-aa9acee228c6 221c45e1-7f65-42c8-afc3-0ccc1d6fc148 0000020753 0001 0004 5
...

If the vdsm pid changed (and it probably did) it will be different, so I acquire it for the new pid:
$ sanlock client acquire -r 3e541b2d-2a49-4eb8-ae4b-aa9acee228c6:221c45e1-7f65-42c8-afc3-0ccc1d6fc148:/dev/3e541b2d-2a49-4eb8-ae4b-aa9acee228c6/leases:109051904 -p 32265
acquire pid 32265

Then I can see the timestamp changed:
$ sanlock direct dump /dev/3e541b2d-2a49-4eb8-ae4b-aa9acee228c6/leases:109051904
offset lockspace resource timestamp own gen lver
00000000 3e541b2d-2a49-4eb8-ae4b-aa9acee228c6 221c45e1-7f65-42c8-afc3-0ccc1d6fc148 0000021210 0001 0005 6

And then I release it:
$ sanlock client release -r 3e541b2d-2a49-4eb8-ae4b-aa9acee228c6:221c45e1-7f65-42c8-afc3-0ccc1d6fc148:/dev/3e541b2d-2a49-4eb8-ae4b-aa9acee228c6/leases:109051904 -p 32265
release pid 32265
release done 0

$ sanlock direct dump /dev/3e541b2d-2a49-4eb8-ae4b-aa9acee228c6/leases:109051904
offset lockspace resource timestamp own gen lver
00000000 3e541b2d-2a49-4eb8-ae4b-aa9acee228c6 221c45e1-7f65-42c8-afc3-0ccc1d6fc148 0000000000 0001 0005 6

The timestamp is zeroed and the lease is free.

Hi,

We're getting closer to solving it :-)

I'll answer below with my steps; there's one that fails and I don't know why (probably I missed something).

El 2018-05-17 15:47, Benny Zlotnik escribió:
Sorry, I forgot it's ISCSI, it's a bit different
In my case it would look something like:
2018-05-17 17:30:12,740+0300 DEBUG (jsonrpc/7) [jsonrpc.JsonRpcServer] Return 'Volume.getInfo' in bridge with {'status': 'OK', 'domain': '3e541b2d-2a49-4eb8-ae4b-aa9acee228c6', 'voltype': 'INTERNAL', 'description': '{"DiskAlias":"vm_Disk1","DiskDescription":""}', 'parent': '00000000-0000-0000-0000-000000000000', 'format': 'RAW', 'generation': 0, 'image': 'dd6b5ae0-196e-4879-b076-a0a8d8a1dfde', 'ctime': '1526566607', 'disktype': 'DATA', 'legality': 'LEGAL', 'mtime': '0', 'apparentsize': '1073741824', 'children': [], 'pool': '', 'capacity': '1073741824', 'uuid': u'221c45e1-7f65-42c8-afc3-0ccc1d6fc148', 'truesize': '1073741824', 'type': 'PREALLOCATED', 'lease': {'path': '/dev/3e541b2d-2a49-4eb8-ae4b-aa9acee228c6/leases', 'owners': [], 'version': None, 'offset': 109051904}} (__init__:355)
I then look for 221c45e1-7f65-42c8-afc3-0ccc1d6fc148 in sanlock.log:
2018-05-17 17:30:12 20753 [3335]: s10:r14 resource 3e541b2d-2a49-4eb8-ae4b-aa9acee228c6:221c45e1-7f65-42c8-afc3-0ccc1d6fc148:/dev/3e541b2d-2a49-4eb8-ae4b-aa9acee228c6/leases:109051904 for 2,11,31496
I could only find the entry on one of the hosts. So when I grepped the uuid I found:

2018-05-16 12:39:44 4761204 [1023]: s33:r103 resource 1876ab86-216f-4a37-a36b-2b5d99fcaad0:c2cfbb02-9981-4fb7-baea-7257a824145c:/dev/1876ab86-216f-4a37-a36b-2b5d99fcaad0/leases:128974848 for 23,47,9206
So the resource would be: 3e541b2d-2a49-4eb8-ae4b-aa9acee228c6:221c45e1-7f65-42c8-afc3-0ccc1d6fc148:/dev/3e541b2d-2a49-4eb8-ae4b-aa9acee228c6/leases:109051904 and the pid is 31496
Ok, so my resource is 1876ab86-216f-4a37-a36b-2b5d99fcaad0:c2cfbb02-9981-4fb7-baea-7257a824145c:/dev/1876ab86-216f-4a37-a36b-2b5d99fcaad0/leases:128974848 and my PID is 9206.
running $ sanlock direct dump /dev/3e541b2d-2a49-4eb8-ae4b-aa9acee228c6/leases:109051904
offset lockspace resource timestamp own gen lver
00000000 3e541b2d-2a49-4eb8-ae4b-aa9acee228c6 221c45e1-7f65-42c8-afc3-0ccc1d6fc148 0000020753 0001 0004 5 ...
In my case the output would be:
[...]
00000000 1876ab86-216f-4a37-a36b-2b5d99fcaad0 c2cfbb02-9981-4fb7-baea-7257a824145c 0004918032 0008 0004 2
[...]
If the vdsm pid changed (and it probably did) it will be different, so I acquire it for the new pid:
$ sanlock client acquire -r 3e541b2d-2a49-4eb8-ae4b-aa9acee228c6:221c45e1-7f65-42c8-afc3-0ccc1d6fc148:/dev/3e541b2d-2a49-4eb8-ae4b-aa9acee228c6/leases:109051904 -p 32265
acquire pid 32265
I checked vdsmd's PID:

# systemctl status vdsmd
● vdsmd.service - Virtual Desktop Server Manager
[...]
├─17758 /usr/bin/python2 /usr/share/vdsm/vdsm

So the new PID is 17758.

# sanlock client acquire -r 1876ab86-216f-4a37-a36b-2b5d99fcaad0:c2cfbb02-9981-4fb7-baea-7257a824145c:/dev/1876ab86-216f-4a37-a36b-2b5d99fcaad0/leases:128974848 -p 17758
acquire pid 17758
acquire done 0
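An alternative way to grab just the current vdsm PID (a sketch, assuming the main process command line contains /usr/share/vdsm/vdsm as in the systemctl output above):

$ pgrep -f '/usr/share/vdsm/vdsm'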
Then I can see the timestamp changed
$ sanlock direct dump /dev/3e541b2d-2a49-4eb8-ae4b-aa9acee228c6/leases:109051904
offset lockspace resource timestamp own gen lver
00000000 3e541b2d-2a49-4eb8-ae4b-aa9acee228c6 221c45e1-7f65-42c8-afc3-0ccc1d6fc148 0000021210 0001 0005 6
And then I release it:
$ sanlock client release -r 3e541b2d-2a49-4eb8-ae4b-aa9acee228c6:221c45e1-7f65-42c8-afc3-0ccc1d6fc148:/dev/3e541b2d-2a49-4eb8-ae4b-aa9acee228c6/leases:109051904 -p 32265
release pid 32265
release done 0
That's where it fails:

# sanlock direct release -r 1876ab86-216f-4a37-a36b-2b5d99fcaad0:c2cfbb02-9981-4fb7-baea-7257a824145c:/dev/1876ab86-216f-4a37-a36b-2b5d99fcaad0/leases:128974848 -p 17758
release done -251

And the resource is still stuck. Is there something I missed there?
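One detail that may matter here: the failing command uses 'sanlock direct release', while the earlier examples use 'sanlock client release' (the direct subcommands operate on storage without going through the sanlock daemon). The client form, filled in with the resource and PID from this host, would be:

# sanlock client release -r 1876ab86-216f-4a37-a36b-2b5d99fcaad0:c2cfbb02-9981-4fb7-baea-7257a824145c:/dev/1876ab86-216f-4a37-a36b-2b5d99fcaad0/leases:128974848 -p 17758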
$ sanlock direct dump /dev/3e541b2d-2a49-4eb8-ae4b-aa9acee228c6/leases:109051904
offset lockspace resource timestamp own gen lver
00000000 3e541b2d-2a49-4eb8-ae4b-aa9acee228c6 221c45e1-7f65-42c8-afc3-0ccc1d6fc148 0000000000 0001 0005 6
The timestamp is zeroed and the lease is free
On Thu, May 17, 2018 at 3:38 PM, <nicolas@devels.es> wrote:
This is vdsm 4.19.45. I grepped the disk uuid in /var/log/sanlock.log but unfortunately no entry there...
El 2018-05-17 13:11, Benny Zlotnik escribió:
Which vdsm version are you using?
You can try looking for the image uuid in /var/log/sanlock.log
On Thu, May 17, 2018 at 2:40 PM, <nicolas@devels.es> wrote:
Thanks.
I've been able to see the line in the log, however, the format differs slightly from yours.
2018-05-17 12:24:44,132+0100 DEBUG (jsonrpc/6) [jsonrpc.JsonRpcServer] Calling 'Volume.getInfo' in bridge with {u'storagepoolID': u'75bf8f48-970f-42bc-8596-f8ab6efb2b63', u'imageID': u'b4013aba-a936-4a54-bb14-670d3a8b7c38', u'volumeID': u'c2cfbb02-9981-4fb7-baea-7257a824145c', u'storagedomainID': u'1876ab86-216f-4a37-a36b-2b5d99fcaad0'} (__init__:556) 2018-05-17 12:24:44,689+0100 DEBUG (jsonrpc/6) [jsonrpc.JsonRpcServer] Return 'Volume.getInfo' in bridge with {'status': 'OK', 'domain': '1876ab86-216f-4a37-a36b-2b5d99fcaad0', 'voltype': 'INTERNAL', 'description': 'None', 'parent': 'ea9a0182-329f-4b8f-abe3-e894de95dac0', 'format': 'COW', 'generation': 1, 'image': 'b4013aba-a936-4a54-bb14-670d3a8b7c38', 'ctime': '1526470759', 'disktype': '2', 'legality': 'LEGAL', 'mtime': '0', 'apparentsize': '1073741824', 'children': [], 'pool': '', 'capacity': '21474836480', 'uuid': u'c2cfbb02-9981-4fb7-baea-7257a824145c', 'truesize': '1073741824', 'type': 'SPARSE', 'lease': {'owners': [8], 'version': 1L}} (__init__:582)
As you can see, there's no path field there.
How should I procceed?
El 2018-05-17 12:01, Benny Zlotnik escribió: vdsm-client replaces vdsClient, take a look here: https://lists.ovirt.org/pipermail/devel/2016-July/013535.html [1] [1] [4]
On Thu, May 17, 2018 at 1:57 PM, <nicolas@devels.es> wrote:
The issue is present in the logs:
2018-05-17 11:50:44,822+01 INFO [org.ovirt.engine.core.bll.storage.disk.image.VdsmImagePoller] (DefaultQuartzScheduler1) [39755bb7-9082-40d6-ae5e-64b5b2b5f98e] Command CopyData id: '84a49b25-0e37-4338-834e-08bd67c42860': the volume lease is not FREE - the job is running
I tried setting the log level to debug but it seems I have not a vdsm-client command. All I have is a vdsm-tool command. Is it equivalent?
Thanks
El 2018-05-17 11:49, Benny Zlotnik escribió: By the way, please verify it's the same issue, you should see "the volume lease is not FREE - the job is running" in the engine log
On Thu, May 17, 2018 at 1:21 PM, Benny Zlotnik <bzlotnik@redhat.com> wrote:
I see because I am on debug level, you need to enable it in order to see
https://www.ovirt.org/develop/developer-guide/vdsm/log-files/ [2] [2]
[1]
[3]
On Thu, 17 May 2018, 13:10 , <nicolas@devels.es> wrote:
Hi,
Thanks. I've checked vdsm logs on all my hosts but the only entry I can find grepping by Volume.getInfo is like this:
2018-05-17 10:14:54,892+0100 INFO (jsonrpc/0) [jsonrpc.JsonRpcServer] RPC call Volume.getInfo succeeded in 0.30 seconds (__init__:539)
I cannot find a line like yours... any other way on how to obtain those parameters. This is an iSCSI based storage FWIW (both source and destination of the movement).
Thanks.
El 2018-05-17 10:01, Benny Zlotnik escribió: In the vdsm log you will find the volumeInfo log which looks like this:
2018-05-17 11:55:03,257+0300 DEBUG (jsonrpc/6) [jsonrpc.JsonRpcServer] Return 'Volume.getInfo' in bridge with {'status': 'OK', 'domain': '5c4d2216- 2eb3-4e24-b254-d5f83fde4dbe', 'voltype': 'INTERNAL', 'description': '{"DiskAlias":"vm_Disk1","DiskDescription":""}', 'parent': '00000000-0000-0000- 0000-000000000000', 'format': 'RAW', 'generation': 3, 'image': 'b8eb8c82-fddd-4fbc-b80d-6ee04c1255bc', 'ctime': '1526543244', 'disktype': 'DATA', ' legality': 'LEGAL', 'mtime': '0', 'apparentsize': '1073741824', 'children': [], 'pool': '', 'capacity': '1073741824', 'uuid': u'7190913d-320c-4fc9- a5b3-c55b26aa30f4', 'truesize': '0', 'type': 'SPARSE', 'lease': {'path':
u'/rhev/data-center/mnt/10.35.0.233:_root_storage__domains_sd1/5c4d2216-2e
b3-4e24-b254-d5f83fde4dbe/images/b8eb8c82-fddd-4fbc-b80d-6ee04c1255bc/7190913d-320c-4fc9-a5b3-c55b26aa30f4.lease',
'owners': [1], 'version': 8L, 'o ffset': 0}} (__init__:355)
The lease path in my case is: /rhev/data-center/mnt/10.35.0. [3] [3] [2]
[1]233:_root_storage__domains_sd1/5c4d2216-2eb3-4e24-b254-d5f83fde4dbe/images/b8eb8c82-fddd-4fbc-b80d-6ee04c1255bc/7190913d-320c-4fc9-a5b3-c55b26aa30f4.lease
Then you can look in /var/log/sanlock.log
2018-05-17 11:35:18 243132 [14847]: s2:r9 resource
5c4d2216-2eb3-4e24-b254-d5f83fde4dbe:7190913d-320c-4fc9-a5b3-c55b26aa30f4:/rhev/data-center/mnt/10.35.0.233:_root_storage__domains_sd1/5c4d2216-2eb3-4e24-b254-d5f83fde4dbe/images/b8eb8c82-fddd-4fbc-b80d-6ee04c1255bc/7190913d-320c-4fc9-a5b3-c55b26aa30f4.lease:0
for 2,9,5049
Then you can use this command to unlock, the pid in this case is 5049
sanlock client release -r RESOURCE -p pid
On Thu, May 17, 2018 at 11:52 AM, Benny Zlotnik <bzlotnik@redhat.com> wrote:
I believe you've hit this bug: https://bugzilla.redhat.com/show_bug.cgi?id=1565040 [4] [4] [3]
[2]
[1]
You can try to release the lease manually using the
sanlock client
command (there's an example in the comments on the bug), once the lease is free the job will fail and the disk can be
unlock
On Thu, May 17, 2018 at 11:05 AM, <nicolas@devels.es> wrote:
Hi,
We're running oVirt 4.1.9 (I know it's not the recommended version, but we can't upgrade yet) and recently we had an issue
with a Storage Domain while a VM was moving a disk. The Storage
Domain went down for a few minutes, then it got back.
However, the disk's state has stuck in a 'Migrating: 10%' state
(see ss-2.png).
I run the 'unlock_entity.sh' script to try to unlock the disk,
with these parameters:
# PGPASSWORD=... /usr/share/ovirt-engine/setup/dbutils/unlock_entity.sh -t disk -u
engine -v b4013aba-a936-4a54-bb14-670d3a8b7c38
The disk's state changed to 'OK', but the actual state still states it's migrating (see ss-1.png).
Calling the script with -t all doesn't make a difference either.
Currently, the disk is unmanageable: cannot be deactivated, moved
or copied, as it says there's a copying operation running already.
Could someone provide a way to unlock this disk? I don't mind modifying a value directly into the database, I just need the copying process cancelled.
Thanks. _______________________________________________ Users mailing list -- users@ovirt.org To unsubscribe send an email to users-leave@ovirt.org
Links: ------ [1] https://bugzilla.redhat.com/show_bug.cgi?id=1565040 [4] [4] [3] [2]
Links: ------ [1] http://10.35.0 [5] [5] [5]. [2] https://bugzilla.redhat.com/show_bug.cgi?id=1565040 [4] [4] [3] [3] https://www.ovirt.org/develop/developer-guide/vdsm/log-files/ [2] [2] [1]
Links: ------ [1] https://www.ovirt.org/develop/developer-guide/vdsm/log-files/ [2] [2] [2] http://10.35.0 [5] [5]. [3] https://bugzilla.redhat.com/show_bug.cgi?id=1565040 [4] [4] [4] https://lists.ovirt.org/pipermail/devel/2016-July/013535.html [1] [1] [5] http://10.35.0 [5] [5]
_______________________________________________ Users mailing list -- users@ovirt.org To unsubscribe send an email to users-leave@ovirt.org
Links: ------ [1] https://lists.ovirt.org/pipermail/devel/2016-July/013535.html [1] [2] https://www.ovirt.org/develop/developer-guide/vdsm/log-files/ [2] [3] http://10.35.0 [5]. [4] https://bugzilla.redhat.com/show_bug.cgi?id=1565040 [4] [5] http://10.35.0 [5]
Links: ------ [1] https://lists.ovirt.org/pipermail/devel/2016-July/013535.html [2] https://www.ovirt.org/develop/developer-guide/vdsm/log-files/ [3] http://10.35.0. [4] https://bugzilla.redhat.com/show_bug.cgi?id=1565040 [5] http://10.35.0

Please disregard the last e-mail. I re-ran the command and now the exit code was 0, and the migration process is not stuck anymore.

Thanks so much for all the help, Benny!

Regards.
Users mailing list -- users@ovirt.org To unsubscribe send an email to users-leave@ovirt.org
participants (2)
- Benny Zlotnik
- nicolas@devels.es