
This is a multi-part message in MIME format. --------------8298F33A8974EF14E7F6DFE3 Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit Hi Fred, Unfortunately we reinstalled the cluster a while ago, so I don't know the exact version we were running at the time. It would have been the latest version in the 4.1 production repos though if that helps. On 2017-07-02 11:32 AM, Fred Rolland wrote:
Seems you hit [1]. Can you tell me what is the exact version of the engine ?
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1395793
On Thu, May 11, 2017 at 10:29 PM, Charles Tassell <charles@islandadmin.ca <mailto:charles@islandadmin.ca>> wrote:
Sure, it's pretty big so I've put it online for download at http://krissy.islandadmin.ca/public/engine.log.txt <http://krissy.islandadmin.ca/public/engine.log.txt>
On 2017-05-11 04:08 PM, Fred Rolland wrote:
The locking is on the engine side and restarting the vdsm will not affect it . Can you send the whole engine log ? Which exact version are you using ?
On Thu, May 11, 2017 at 9:30 PM, Charles Tassell <charles@islandadmin.ca <mailto:charles@islandadmin.ca>> wrote:
Just as an update, I created a new VM and had the same issue: the disk remains locked. So I then added a new data store (this one iSCSI not NFS) and create a new VM on that. Again, the disk remains locked. So the problem seems to be that any action that sets to modify a disk image on my cluster locks the disk and keeps it locked permanently.
I tried restarting the vdsm daemon, but that didn't make a difference. I'm seeing this in my sanlock.log file though, which worries me:
017-05-07 07:51:41-0300 1738538 [13575]: s2 renewal error -202 delta_length 10 last_success 1738508 2017-05-07 07:51:41-0300 1738538 [11513]: s1 renewal error -202 delta_length 10 last_success 1738508
Here's the last 20 lines: 2017-05-07 07:51:41-0300 1738538 [13580]: s3 renewal error -202 delta_length 10 last_success 1738508 2017-05-07 07:51:41-0300 1738538 [13575]: 20423d5e aio timeout RD 0x7fe1440008c0:0x7fe1440008d0:0x7fe160255000 ioto 10 to_count 67 2017-05-07 07:51:41-0300 1738538 [13575]: s2 delta_renew read timeout 10 sec offset 0 /rhev/data-center/mnt/192.168.130.217:_media_ovirt/20423d5e-188c-4e10-9893-588ceb81b354/dom_md/ids 2017-05-07 07:51:41-0300 1738538 [13575]: s2 renewal error -202 delta_length 10 last_success 1738508 2017-05-07 07:51:41-0300 1738538 [11513]: hosted-e aio timeout RD 0x7fe1480008c0:0x7fe1480008d0:0x7fe14e6fc000 ioto 10 to_count 65 2017-05-07 07:51:41-0300 1738538 [11513]: s1 delta_renew read timeout 10 sec offset 0 /var/run/vdsm/storage/5dccd07d-a923-4d4b-9cb1-3b51ebfdca4d/5a9c284f-0faa-4a25-94ce-c9efdae07484/ab2443f1-95ed-475d-886c-c1653257cf04 2017-05-07 07:51:41-0300 1738538 [11513]: s1 renewal error -202 delta_length 10 last_success 1738508 2017-05-07 07:51:47-0300 1738544 [13575]: 20423d5e aio collect RD 0x7fe1440008c0:0x7fe1440008d0:0x7fe160255000 result 1048576:0 match reap 2017-05-07 07:51:47-0300 1738544 [13580]: 5dccd07d aio collect RD 0x7fe13c0008c0:0x7fe13c0008d0:0x7fe14e5fa000 result 1048576:0 match reap 2017-05-07 07:51:47-0300 1738544 [11513]: hosted-e aio collect RD 0x7fe1480008c0:0x7fe1480008d0:0x7fe14e6fc000 result 1048576:0 match reap 2017-05-07 07:53:57-0300 1738674 [13590]: s2:r15 resource 20423d5e-188c-4e10-9893-588ceb81b354:SDM:/rhev/data-center/mnt/192.168.130.217:_media_ovirt/20423d5e-188c-4e10-9893-588ceb81b354/dom_md/leases:1048576 for 7,21,78395 2017-05-07 07:59:49-0300 1739027 [13575]: s2 delta_renew long write time 10 sec 2017-05-09 08:38:34-0300 1914151 [13590]: s2:r16 resource 20423d5e-188c-4e10-9893-588ceb81b354:SDM:/rhev/data-center/mnt/192.168.130.217:_media_ovirt/20423d5e-188c-4e10-9893-588ceb81b354/dom_md/leases:1048576 for 7,21,78395 2017-05-11 15:07:45-0300 2110302 [13590]: s2:r17 resource 20423d5e-188c-4e10-9893-588ceb81b354:SDM:/rhev/data-center/mnt/192.168.130.217:_media_ovirt/20423d5e-188c-4e10-9893-588ceb81b354/dom_md/leases:1048576 for 7,21,112346 2017-05-11 15:17:24-0300 2110881 [13590]: s4 lockspace b010093e-1924-46e1-bd57-2cf2b2445087:1:/dev/b010093e-1924-46e1-bd57-2cf2b2445087/ids:0 2017-05-11 15:17:45-0300 2110902 [1395]: s4 host 1 1 2110881 44ae07eb-3371-4750-8728-ab3b049dbae2.ovirt730-0 2017-05-11 15:17:45-0300 2110902 [1400]: s4:r18 resource b010093e-1924-46e1-bd57-2cf2b2445087:SDM:/dev/b010093e-1924-46e1-bd57-2cf2b2445087/leases:1048576 for 7,21,112346 2017-05-11 15:17:52-0300 2110909 [1399]: s5 lockspace b010093e-1924-46e1-bd57-2cf2b2445087:1:/dev/b010093e-1924-46e1-bd57-2cf2b2445087/ids:0 2017-05-11 15:18:13-0300 2110930 [1395]: s5 host 1 2 2110909 44ae07eb-3371-4750-8728-ab3b049dbae2.ovirt730-0 2017-05-11 15:18:13-0300 2110930 [1395]: s5 host 2 1 2110065 4d31313f-b2dd-4368-bf31-d39835e10afb.ovirt730-0
On 2017-05-11 10:09 AM, Charles Tassell wrote:
Hi Freddy,
Sure, thanks for looking into this. Here you go:
2017-05-10 11:35:30,249-03 INFO [org.ovirt.engine.core.bll.aaa.SessionDataContainer] (DefaultQuartzScheduler8) [1c84abac] Not removing session 'vZyqrcCljPC7hQtcILsk4uDug3QsiinZBOyoGDiQKkYYT2znGyWe4fasrPbjYxdjbfyR3DBnp+UZ9/k20dGsMA==', session has running commands for user 'admin@internal-authz'.[snip]
On 2017-05-11 04:30 AM, Fred Rolland wrote:
Hi,
Can you provide the engine log ?
Thanks,
Freddy
On Wed, May 10, 2017 at 5:57 PM, Charles Tassell <charles@islandadmin.ca <mailto:charles@islandadmin.ca>> wrote:
Hi Everyone,
I'm having some issues with my oVirt 4.1 (fully updated to latest release as of yesterday) cluster. When I clone a VM the disks of both the original and the clone stay in the locked state, and the only way I can resolve it is to go into the database on the engine and run "update images set imagestatus=1 where imagestatus=2;"
I'm using NFS4 as a datastore and the disks seem to copy fine (file sizes match and everything), but the locking worries me. To clone the VM I just shut the source VM down and then right click on it and select "Clone"
I've attached the full VDSM log from my last attempt, but here is the excerpt of the lines just referencing the two disks (d73206ed-89ba-48a9-82ff-c107c1af60f0 is the original VMs disk and 670a7b20-fecd-45c6-af5c-3c7b98258224 is the clone.)
[snip] _______________________________________________ Users mailing list Users@ovirt.org <mailto:Users@ovirt.org> http://lists.ovirt.org/mailman/listinfo/users <http://lists.ovirt.org/mailman/listinfo/users>
--------------8298F33A8974EF14E7F6DFE3 Content-Type: text/html; charset=utf-8 Content-Transfer-Encoding: 8bit <html> <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> </head> <body text="#000000" bgcolor="#FFFFFF"> <p>Hi Fred,</p> <p> Unfortunately we reinstalled the cluster a while ago, so I don't know the exact version we were running at the time. It would have been the latest version in the 4.1 production repos though if that helps.<br> </p> <br> <div class="moz-cite-prefix">On 2017-07-02 11:32 AM, Fred Rolland wrote:<br> </div> <blockquote type="cite" cite="mid:CAF_B0vF6BZFHXxbWaAGx1ya-8os5QfXe0xswfa-iYhH51O09Xg@mail.gmail.com"> <div dir="ltr"> <div>Seems you hit [1].<br> </div> Can you tell me what is the exact version of the engine ?<br> <div><br> [1] <a href="https://bugzilla.redhat.com/show_bug.cgi?id=1395793" moz-do-not-send="true">https://bugzilla.redhat.com/show_bug.cgi?id=1395793</a><br> </div> </div> <div class="gmail_extra"><br> <div class="gmail_quote">On Thu, May 11, 2017 at 10:29 PM, Charles Tassell <span dir="ltr"><<a href="mailto:charles@islandadmin.ca" target="_blank" moz-do-not-send="true">charles@islandadmin.ca</a>></span> wrote:<br> <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> <div text="#000000" bgcolor="#FFFFFF"> Sure, it's pretty big so I've put it online for download at <a class="m_4218111262618390950moz-txt-link-freetext" href="http://krissy.islandadmin.ca/public/engine.log.txt" target="_blank" moz-do-not-send="true">http://krissy.islandadmin.ca/<wbr>public/engine.log.txt</a> <div> <div class="h5"><br> <br> <div class="m_4218111262618390950moz-cite-prefix">On 2017-05-11 04:08 PM, Fred Rolland wrote:<br> </div> <blockquote type="cite"> <div dir="ltr"> <div> <div>The locking is on the engine side and restarting the vdsm will not affect it .<br> </div> Can you send the whole engine log ?<br> </div> Which exact version are you using ?<br> <div> <div><br> </div> </div> </div> <div class="gmail_extra"><br> <div class="gmail_quote">On Thu, May 11, 2017 at 9:30 PM, Charles Tassell <span dir="ltr"><<a href="mailto:charles@islandadmin.ca" target="_blank" moz-do-not-send="true">charles@islandadmin.ca</a>></span> wrote:<br> <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> <div text="#000000" bgcolor="#FFFFFF"> <p>Just as an update, I created a new VM and had the same issue: the disk remains locked. So I then added a new data store (this one iSCSI not NFS) and create a new VM on that. Again, the disk remains locked. So the problem seems to be that any action that sets to modify a disk image on my cluster locks the disk and keeps it locked permanently.</p> <p>I tried restarting the vdsm daemon, but that didn't make a difference. I'm seeing this in my sanlock.log file though, which worries me:</p> <p><br> </p> 017-05-07 07:51:41-0300 1738538 [13575]: s2 renewal error -202 delta_length 10 last_success 1738508<br> 2017-05-07 07:51:41-0300 1738538 [11513]: s1 renewal error -202 delta_length 10 last_success 1738508<br> <br> Here's the last 20 lines:<br> 2017-05-07 07:51:41-0300 1738538 [13580]: s3 renewal error -202 delta_length 10 last_success 1738508<br> 2017-05-07 07:51:41-0300 1738538 [13575]: 20423d5e aio timeout RD 0x7fe1440008c0:0x7fe1440008d0:<wbr>0x7fe160255000 ioto 10 to_count 67<br> 2017-05-07 07:51:41-0300 1738538 [13575]: s2 delta_renew read timeout 10 sec offset 0 /rhev/data-center/mnt/192.168.<wbr>130.217:_media_ovirt/20423d5e-<wbr>188c-4e10-9893-588ceb81b354/do<wbr>m_md/ids<br> 2017-05-07 07:51:41-0300 1738538 [13575]: s2 renewal error -202 delta_length 10 last_success 1738508<br> 2017-05-07 07:51:41-0300 1738538 [11513]: hosted-e aio timeout RD 0x7fe1480008c0:0x7fe1480008d0:<wbr>0x7fe14e6fc000 ioto 10 to_count 65<br> 2017-05-07 07:51:41-0300 1738538 [11513]: s1 delta_renew read timeout 10 sec offset 0 /var/run/vdsm/storage/5dccd07d<wbr>-a923-4d4b-9cb1-3b51ebfdca4d/<wbr>5a9c284f-0faa-4a25-94ce-<wbr>c9efdae07484/ab2443f1-95ed-<wbr>475d-886c-c1653257cf04<br> 2017-05-07 07:51:41-0300 1738538 [11513]: s1 renewal error -202 delta_length 10 last_success 1738508<br> 2017-05-07 07:51:47-0300 1738544 [13575]: 20423d5e aio collect RD 0x7fe1440008c0:0x7fe1440008d0:<wbr>0x7fe160255000 result 1048576:0 match reap<br> 2017-05-07 07:51:47-0300 1738544 [13580]: 5dccd07d aio collect RD 0x7fe13c0008c0:0x7fe13c0008d0:<wbr>0x7fe14e5fa000 result 1048576:0 match reap<br> 2017-05-07 07:51:47-0300 1738544 [11513]: hosted-e aio collect RD 0x7fe1480008c0:0x7fe1480008d0:<wbr>0x7fe14e6fc000 result 1048576:0 match reap<br> 2017-05-07 07:53:57-0300 1738674 [13590]: s2:r15 resource 20423d5e-188c-4e10-9893-588ceb<wbr>81b354:SDM:/rhev/data-center/<wbr>mnt/192.168.130.217:_media_<wbr>ovirt/20423d5e-188c-4e10-9893-<wbr>588ceb81b354/dom_md/leases:<wbr>1048576 for 7,21,78395<br> 2017-05-07 07:59:49-0300 1739027 [13575]: s2 delta_renew long write time 10 sec<br> 2017-05-09 08:38:34-0300 1914151 [13590]: s2:r16 resource 20423d5e-188c-4e10-9893-588ceb<wbr>81b354:SDM:/rhev/data-center/<wbr>mnt/192.168.130.217:_media_<wbr>ovirt/20423d5e-188c-4e10-9893-<wbr>588ceb81b354/dom_md/leases:<wbr>1048576 for 7,21,78395<br> 2017-05-11 15:07:45-0300 2110302 [13590]: s2:r17 resource 20423d5e-188c-4e10-9893-588ceb<wbr>81b354:SDM:/rhev/data-center/<wbr>mnt/192.168.130.217:_media_<wbr>ovirt/20423d5e-188c-4e10-9893-<wbr>588ceb81b354/dom_md/leases:<wbr>1048576 for 7,21,112346<br> 2017-05-11 15:17:24-0300 2110881 [13590]: s4 lockspace b010093e-1924-46e1-bd57-2cf2b2<wbr>445087:1:/dev/b010093e-1924-<wbr>46e1-bd57-2cf2b2445087/ids:0<br> 2017-05-11 15:17:45-0300 2110902 [1395]: s4 host 1 1 2110881 44ae07eb-3371-4750-8728-ab3b04<wbr>9dbae2.ovirt730-0<br> 2017-05-11 15:17:45-0300 2110902 [1400]: s4:r18 resource b010093e-1924-46e1-bd57-2cf2b2<wbr>445087:SDM:/dev/b010093e-1924-<wbr>46e1-bd57-2cf2b2445087/leases:<wbr>1048576 for 7,21,112346<br> 2017-05-11 15:17:52-0300 2110909 [1399]: s5 lockspace b010093e-1924-46e1-bd57-2cf2b2<wbr>445087:1:/dev/b010093e-1924-<wbr>46e1-bd57-2cf2b2445087/ids:0<br> 2017-05-11 15:18:13-0300 2110930 [1395]: s5 host 1 2 2110909 44ae07eb-3371-4750-8728-ab3b04<wbr>9dbae2.ovirt730-0<br> 2017-05-11 15:18:13-0300 2110930 [1395]: s5 host 2 1 2110065 4d31313f-b2dd-4368-bf31-d39835<wbr>e10afb.ovirt730-0<span><br> <br> <br> <div class="m_4218111262618390950m_2097369334319903642moz-cite-prefix">On 2017-05-11 10:09 AM, Charles Tassell wrote:<br> </div> </span> <blockquote type="cite"><span> <p>Hi Freddy,</p> <p> Sure, thanks for looking into this. Here you go:</p> </span> <p>2017-05-10 11:35:30,249-03 INFO [org.ovirt.engine.core.bll.aaa<wbr>.SessionDataContainer] (DefaultQuartzScheduler8) [1c84abac] Not removing session 'vZyqrcCljPC7hQtcILsk4uDug3Qsi<wbr>inZBOyoGDiQKkYYT2znGyWe4fasrPb<wbr>jYxdjbfyR3DBnp+UZ9/k20dGsMA=='<wbr>, session has running commands for user 'admin@internal-authz'.[snip]<br> </p> <span> <div class="m_4218111262618390950m_2097369334319903642moz-cite-prefix">On 2017-05-11 04:30 AM, Fred Rolland wrote:<br> </div> </span> <blockquote type="cite"><span> <div dir="ltr"> <div> <div> <div>Hi,<br> <br> </div> Can you provide the engine log ?<br> <br> </div> Thanks,<br> <br> </div> Freddy<br> </div> </span> <div class="gmail_extra"><br> <div class="gmail_quote"><span>On Wed, May 10, 2017 at 5:57 PM, Charles Tassell <span dir="ltr"><<a href="mailto:charles@islandadmin.ca" target="_blank" moz-do-not-send="true">charles@islandadmin.ca</a>></span> wrote:<br> </span> <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span>Hi Everyone,<br> <br> I'm having some issues with my oVirt 4.1 (fully updated to latest release as of yesterday) cluster. When I clone a VM the disks of both the original and the clone stay in the locked state, and the only way I can resolve it is to go into the database on the engine and run "update images set imagestatus=1 where imagestatus=2;"<br> <br> I'm using NFS4 as a datastore and the disks seem to copy fine (file sizes match and everything), but the locking worries me. To clone the VM I just shut the source VM down and then right click on it and select "Clone"<br> <br> I've attached the full VDSM log from my last attempt, but here is the excerpt of the lines just referencing the two disks (d73206ed-89ba-48a9-82ff-c107c<wbr>1af60f0 is the original VMs disk and 670a7b20-fecd-45c6-af5c-3c7b98<wbr>258224 is the clone.)<br> <br> </span> [snip]<span><br> ______________________________<wbr>_________________<br> Users mailing list<br> <a href="mailto:Users@ovirt.org" target="_blank" moz-do-not-send="true">Users@ovirt.org</a><br> <a href="http://lists.ovirt.org/mailman/listinfo/users" rel="noreferrer" target="_blank" moz-do-not-send="true">http://lists.ovirt.org/mailman<wbr>/listinfo/users</a><br> <br> </span></blockquote> </div> <br> </div> </blockquote> <br> </blockquote> <br> </div> </blockquote> </div> <br> </div> </blockquote> <br> </div> </div> </div> </blockquote> </div> <br> </div> </blockquote> <br> </body> </html> --------------8298F33A8974EF14E7F6DFE3--