From soeren.malchow at mcon.net Sun May 31 19:35:39 2015
From: Soeren Malchow
To: users at ovirt.org
Subject: Re: [ovirt-users] Bug in Snapshot Removing
Date: Sun, 31 May 2015 23:35:36 +0000

Small addition again:

This error shows up in the log while removing snapshots WITHOUT rendering the VMs unresponsive:

Jun 01 01:33:45 mc-dc3ham-compute-02-live.mc.mcon.net libvirtd[1657]: Timed out during operation: cannot acquire state change lock
Jun 01 01:33:45 mc-dc3ham-compute-02-live.mc.mcon.net vdsm[6839]: vdsm vm.Vm ERROR vmId=`56848f4a-cd73-4eda-bf79-7eb80ae569a9`::Error getting block job info
                                     Traceback (most recent call last):
                                       File "/usr/share/vdsm/virt/vm.py", line 5759, in queryBlockJobs…


From: Soeren Malchow <soeren.malchow(a)mcon.net>
Date: Monday 1 June 2015 00:56
To: "libvirt-users(a)redhat.com" <libvirt-users(a)redhat.com>, users <users(a)ovirt.org>
Subject: [ovirt-users] Bug in Snapshot Removing

Dear all

I am not sure if the mail just did not get any attention between all the mails and this time it is also going to the libvirt mailing list.

I am experiencing a problem with VM becoming unresponsive when removing Snapshots (Live Merge) and i think there is a serious problem.

Here are the previous mails,

http://lists.ovirt.org/pipermail/users/2015-May/033083.html

The problem is on a system with everything on the latest version, CentOS 7.1 and ovirt 3.5.2.1 all upgrades applied.

This Problem did NOT exist before upgrading to CentOS 7.1 with an environment running ovirt 3.5.0 and 3.5.1 and Fedora 20 with the libvirt-preview repo activated.

I think this is a bug in libvirt, not ovirt itself, but i am not sure. The actual file throwing the exception is in VDSM (/usr/share/vdsm/virt/vm.py, line 697).

We are very willing to help, test and supply log files in anyway we can.

Regards
Soeren
From soeren.malchow at mcon.net Sun May 31 19:39:28 2015
From: Soeren Malchow
To: users at ovirt.org
Subject: Re: [ovirt-users] Bug in Snapshot Removing
Date: Sun, 31 May 2015 23:39:24 +0000
In-Reply-To: D1916735.D978%soeren.malchow@mcon.net

And sorry, another update: it does kill the VM partly. It was still pingable when I wrote the last mail, but no SSH and no SPICE console possible.

From: Soeren Malchow <soeren.malchow(a)mcon.net>
Date: Monday 1 June 2015 01:35
To: Soeren Malchow <soeren.malchow(a)mcon.net>, "libvirt-users(a)redhat.com" <libvirt-users(a)redhat.com>, users <users(a)ovirt.org>
Subject: Re: [ovirt-users] Bug in Snapshot Removing

Small addition again:

This error shows up in the log while removing snapshots WITHOUT rendering the Vms unresponsive

Jun 01 01:33:45 mc-dc3ham-compute-02-live.mc.mcon.net libvirtd[1657]: Timed out during operation: cannot acquire state change lock
Jun 01 01:33:45 mc-dc3ham-compute-02-live.mc.mcon.net vdsm[6839]: vdsm vm.Vm ERROR vmId=`56848f4a-cd73-4eda-bf79-7eb80ae569a9`::Error getting block job info
                                     Traceback (most recent call last):
                                       File "/usr/share/vdsm/virt/vm.py", line 5759, in queryBlockJobs…


From: Soeren Malchow <soeren.malchow(a)mcon.net>
Date: Monday 1 June 2015 00:56
To: "libvirt-users(a)redhat.com" <libvirt-users(a)redhat.com>, users <users(a)ovirt.org>
Subject: [ovirt-users] Bug in Snapshot Removing

Dear all

I am not sure if the mail just did not get any attention between all the mails and this time it is also going to the libvirt mailing list.

I am experiencing a problem with VM becoming unresponsive when removing Snapshots (Live Merge) and i think there is a serious problem.

Here are the previous mails,

http://lists.ovirt.org/pipermail/users/2015-May/033083.html

The problem is on a system with everything on the latest version, CentOS 7.1 and ovirt 3.5.2.1 all upgrades applied.

This Problem did NOT exist before upgrading to CentOS 7.1 with an environment running ovirt 3.5.0 and 3.5.1 and Fedora 20 with the libvirt-preview repo activated.

I think this is a bug in libvirt, not ovirt itself, but i am not sure. The actual file throwing the exception is in VDSM (/usr/share/vdsm/virt/vm.py, line 697).

We are very willing to help, test and supply log files in anyway we can.

Regards
Soeren
From amureini at redhat.com Tue Jun 2 07:47:56 2015
From: Allon Mureinik
To: users at ovirt.org
Subject: Re: [ovirt-users] Bug in Snapshot Removing
Date: Tue, 02 Jun 2015 07:47:51 -0400
In-Reply-To: D1916815.D97C%soeren.malchow@mcon.net

Adam, can you take a look at this please?

Thanks!

----- Original Message -----

> From: "Soeren Malchow" <soeren.malchow(a)mcon.net>
> To: "Soeren Malchow" <soeren.malchow(a)mcon.net>, libvirt-users(a)redhat.com,
> "users" <users(a)ovirt.org>
> Sent: Monday, June 1, 2015 2:39:24 AM
> Subject: Re: [ovirt-users] Bug in Snapshot Removing

> And sorry, another update, it does kill the VM partly, it was still pingable
> when i wrote the last mail, but no ssh and no spice console possible

> From: Soeren Malchow < soeren.malchow(a)mcon.net >
> Date: Monday 1 June 2015 01:35
> To: Soeren Malchow < soeren.malchow(a)mcon.net >, " libvirt-users(a)redhat.com "
> < libvirt-users(a)redhat.com >, users < users(a)ovirt.org >
> Subject: Re: [ovirt-users] Bug in Snapshot Removing

> Small addition again:

> This error shows up in the log while removing snapshots WITHOUT rendering the
> Vms unresponsive

> Jun 01 01:33:45 mc-dc3ham-compute-02-live.mc.mcon.net libvirtd[1657]: Timed
> out during operation: cannot acquire state change lock
> Jun 01 01:33:45 mc-dc3ham-compute-02-live.mc.mcon.net vdsm[6839]: vdsm vm.Vm
> ERROR vmId=`56848f4a-cd73-4eda-bf79-7eb80ae569a9`::Error getting block job
> info
> Traceback (most recent call last):
> File "/usr/share/vdsm/virt/vm.py", line 5759, in queryBlockJobs…

> From: Soeren Malchow < soeren.malchow(a)mcon.net >
> Date: Monday 1 June 2015 00:56
> To: " libvirt-users(a)redhat.com " < libvirt-users(a)redhat.com >, users <
> users(a)ovirt.org >
> Subject: [ovirt-users] Bug in Snapshot Removing

> Dear all

> I am not sure if the mail just did not get any attention between all the
> mails and this time it is also going to the libvirt mailing list.

> I am experiencing a problem with VM becoming unresponsive when removing
> Snapshots (Live Merge) and i think there is a serious problem.

> Here are the previous mails,

> http://lists.ovirt.org/pipermail/users/2015-May/033083.html

> The problem is on a system with everything on the latest version, CentOS 7.1
> and ovirt 3.5.2.1 all upgrades applied.

> This Problem did NOT exist before upgrading to CentOS 7.1 with an environment
> running ovirt 3.5.0 and 3.5.1 and Fedora 20 with the libvirt-preview repo
> activated.

> I think this is a bug in libvirt, not ovirt itself, but i am not sure. The
> actual file throwing the exception is in VDSM (/usr/share/vdsm/virt/vm.py,
> line 697).

> We are very willing to help, test and supply log files in anyway we can.

> Regards
> Soeren

> _______________________________________________
> Users mailing list
> Users(a)ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
From alitke at redhat.com Tue Jun 2 12:53:41 2015
From: Adam Litke
To: users at ovirt.org
Subject: Re: [ovirt-users] Bug in Snapshot Removing
Date: Tue, 02 Jun 2015 12:53:38 -0400
In-Reply-To: D1916815.D97C%soeren.malchow@mcon.net

Hello Soeren.  I've started to look at this issue and I'd agree that
at first glance it looks like a libvirt issue.  The 'cannot acquire
state change lock' messages suggest a locking bug or severe contention
at least.  To help me better understand the problem I have a few
questions about your setup.

From your earlier report it appears that you have 15 VMs running on
the failing host.  Are you attempting to remove snapshots from all VMs
at the same time?  Have you tried with fewer concurrent operations?
I'd be curious to understand if the problem is connected to the
number of VMs running or the number of active block jobs.

Have you tried RHEL-7.1 as a hypervisor host?

Rather than rebooting the host, does restarting libvirtd cause the VMs
to become responsive again?  Note that this operation may cause the
host to move to Unresponsive state in the UI for a short period of
time.

Thanks for your report.
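(A side note on the "number of active block jobs" question: one way to see which block jobs libvirt itself thinks are running on a host is a small libvirt-python loop like the sketch below. This is only a rough illustration, not VDSM or oVirt engine code; the qemu:///system URI and the idea of walking every disk of every running domain are assumptions, and blockJobInfo() is the same query that times out in the traceback above, so it may itself hang while the state change lock is held.)

import libvirt
import xml.etree.ElementTree as ET

conn = libvirt.open("qemu:///system")   # assumes local root access on the host
for dom in conn.listAllDomains(libvirt.VIR_CONNECT_LIST_DOMAINS_ACTIVE):
    # disk target names (vda, sda, ...) are taken from the domain XML
    tree = ET.fromstring(dom.XMLDesc(0))
    for target in tree.findall("./devices/disk/target"):
        dev = target.get("dev")
        try:
            info = dom.blockJobInfo(dev, 0)   # empty dict when no job is active
        except libvirt.libvirtError as e:
            print("%s %s: query failed: %s" % (dom.name(), dev, e))
            continue
        if info:
            print("%s %s: active block job type %s, progress %s/%s" % (
                dom.name(), dev, info.get("type"),
                info.get("cur"), info.get("end")))
conn.close()

If that loop blocks on one particular domain, that would point at the same stuck state change lock the libvirtd log complains about.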
On 31/05/15 23:39 +0000, Soeren Malchow wrote:
>And sorry, another update, it does kill the VM partly, it was still pingable when i wrote the last mail, but no ssh and no spice console possible
>
>From: Soeren Malchow <soeren.malchow(a)mcon.net>
>Date: Monday 1 June 2015 01:35
>To: Soeren Malchow <soeren.malchow(a)mcon.net>, "libvirt-users(a)redhat.com" <libvirt-users(a)redhat.com>, users <users(a)ovirt.org>
>Subject: Re: [ovirt-users] Bug in Snapshot Removing
>
>Small addition again:
>
>This error shows up in the log while removing snapshots WITHOUT rendering the Vms unresponsive
>
>Jun 01 01:33:45 mc-dc3ham-compute-02-live.mc.mcon.net libvirtd[1657]: Timed out during operation: cannot acquire state change lock
>Jun 01 01:33:45 mc-dc3ham-compute-02-live.mc.mcon.net vdsm[6839]: vdsm vm.Vm ERROR vmId=`56848f4a-cd73-4eda-bf79-7eb80ae569a9`::Error getting block job info
>                                     Traceback (most recent call last):
>                                       File "/usr/share/vdsm/virt/vm.py", line 5759, in queryBlockJobs…
>
>From: Soeren Malchow <soeren.malchow(a)mcon.net>
>Date: Monday 1 June 2015 00:56
>To: "libvirt-users(a)redhat.com" <libvirt-users(a)redhat.com>, users <users(a)ovirt.org>
>Subject: [ovirt-users] Bug in Snapshot Removing
>
>Dear all
>
>I am not sure if the mail just did not get any attention between all the mails and this time it is also going to the libvirt mailing list.
>
>I am experiencing a problem with VM becoming unresponsive when removing Snapshots (Live Merge) and i think there is a serious problem.
>
>Here are the previous mails,
>
>http://lists.ovirt.org/pipermail/users/2015-May/033083.html
>
>The problem is on a system with everything on the latest version, CentOS 7.1 and ovirt 3.5.2.1 all upgrades applied.
>
>This Problem did NOT exist before upgrading to CentOS 7.1 with an environment running ovirt 3.5.0 and 3.5.1 and Fedora 20 with the libvirt-preview repo activated.
>
>I think this is a bug in libvirt, not ovirt itself, but i am not sure. The actual file throwing the exception is in VDSM (/usr/share/vdsm/virt/vm.py, line 697).
>
>We are very willing to help, test and supply log files in anyway we can.
>
>Regards
>Soeren
>
>_______________________________________________
>Users mailing list
>Users(a)ovirt.org
>http://lists.ovirt.org/mailman/listinfo/users

--
Adam Litke


From soeren.malchow at mcon.net Wed Jun 3 03:36:04 2015
From: Soeren Malchow
To: users at ovirt.org
Subject: Re: [ovirt-users] Bug in Snapshot Removing
Date: Wed, 03 Jun 2015 07:36:01 +0000
In-Reply-To: 20150602165338.GB12507@redhat.com

Dear Adam

First we were using a python script that was working on 4 threads and
therefore removing 4 snapshots at a time throughout the cluster; that
still caused problems.

Now I took the snapshot removal out of the threaded part and I am just
looping through each snapshot on each VM one after another, even with
"sleeps" in between, but the problem remains.

But I am getting the impression that it is a problem with the amount of
snapshots that are deleted in a certain time: if I delete manually and one
after another (meaning every 10 min or so) I do not have problems; if I
delete manually and do several at once, and on one VM the next one just
after one finished, the risk seems to increase.
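(For illustration, a rough sketch of what such a 4-thread removal loop could look like is given below. It is a hypothetical reconstruction, not the original script: it borrows the ovirtsdk "api" object, the Connect() helper and the "locked"-status polling from the sequential script quoted later in this thread, and it does not verify whether a single ovirtsdk connection is actually safe to share between threads.)

import time
import threading
import Queue  # Python 2, matching the sequential script later in the thread

def remove_snapshots(work):
    while True:
        try:
            vm_name, snapshot = work.get_nowait()
        except Queue.Empty:
            return
        try:
            snapshot.delete()
            # poll until the engine no longer reports the snapshot as locked
            while api.vms.get(name=vm_name).snapshots.get(
                    id=snapshot.id).snapshot_status == "locked":
                time.sleep(60)
        except Exception as e:
            # the lookup fails once the snapshot is gone
            print("%s: %s" % (vm_name, e))

Connect()  # assumed helper that creates the global ovirtsdk "api" object
work = Queue.Queue()
for vm in api.vms.list():
    for snapshot in vm.snapshots.list():
        if snapshot.description != "Active VM":
            work.put((vm.name, snapshot))

workers = [threading.Thread(target=remove_snapshots, args=(work,))
           for _ in range(4)]
for t in workers:
    t.start()
for t in workers:
    t.join()
api.disconnect()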
I do not think it is the number of VMs, because we had this on hosts with
only 3 or 4 VMs running.

I will try restarting libvirt and see what happens.

We are not using RHEL 7.1, only CentOS 7.1.

Is there anything else we can look at when this happens again?

Regards
Soeren


On 02/06/15 18:53, "Adam Litke" wrote:

>Hello Soeren.
>
>I've started to look at this issue and I'd agree that at first glance
>it looks like a libvirt issue.  The 'cannot acquire state change lock'
>messages suggest a locking bug or severe contention at least.  To help
>me better understand the problem I have a few questions about your
>setup.
>
>From your earlier report it appears that you have 15 VMs running on
>the failing host.  Are you attempting to remove snapshots from all VMs
>at the same time?  Have you tried with fewer concurrent operations?
>I'd be curious to understand if the problem is connected to the
>number of VMs running or the number of active block jobs.
>
>Have you tried RHEL-7.1 as a hypervisor host?
>
>Rather than rebooting the host, does restarting libvirtd cause the VMs
>to become responsive again?  Note that this operation may cause the
>host to move to Unresponsive state in the UI for a short period of
>time.
>
>Thanks for your report.
>
>On 31/05/15 23:39 +0000, Soeren Malchow wrote:
>>And sorry, another update, it does kill the VM partly, it was still
>>pingable when i wrote the last mail, but no ssh and no spice console
>>possible
>>
>>From: Soeren Malchow <soeren.malchow(a)mcon.net>
>>Date: Monday 1 June 2015 01:35
>>To: Soeren Malchow <soeren.malchow(a)mcon.net>,
>>"libvirt-users(a)redhat.com" <libvirt-users(a)redhat.com>, users
>><users(a)ovirt.org>
>>Subject: Re: [ovirt-users] Bug in Snapshot Removing
>>
>>Small addition again:
>>
>>This error shows up in the log while removing snapshots WITHOUT
>>rendering the Vms unresponsive
>>
>>Jun 01 01:33:45 mc-dc3ham-compute-02-live.mc.mcon.net libvirtd[1657]:
>>Timed out during operation: cannot acquire state change lock
>>Jun 01 01:33:45 mc-dc3ham-compute-02-live.mc.mcon.net vdsm[6839]: vdsm
>>vm.Vm ERROR vmId=`56848f4a-cd73-4eda-bf79-7eb80ae569a9`::Error getting
>>block job info
>>
>>Traceback (most recent call last):
>>  File "/usr/share/vdsm/virt/vm.py", line 5759, in queryBlockJobs…
>>
>>From: Soeren Malchow <soeren.malchow(a)mcon.net>
>>Date: Monday 1 June 2015 00:56
>>To: "libvirt-users(a)redhat.com" <libvirt-users(a)redhat.com>, users
>><users(a)ovirt.org>
>>Subject: [ovirt-users] Bug in Snapshot Removing
>>
>>Dear all
>>
>>I am not sure if the mail just did not get any attention between all the
>>mails and this time it is also going to the libvirt mailing list.
>>
>>I am experiencing a problem with VM becoming unresponsive when removing
>>Snapshots (Live Merge) and i think there is a serious problem.
>>
>>Here are the previous mails,
>>
>>http://lists.ovirt.org/pipermail/users/2015-May/033083.html
>>
>>The problem is on a system with everything on the latest version, CentOS
>>7.1 and ovirt 3.5.2.1 all upgrades applied.
>>
>>This Problem did NOT exist before upgrading to CentOS 7.1 with an
>>environment running ovirt 3.5.0 and 3.5.1 and Fedora 20 with the
>>libvirt-preview repo activated.
>>
>>I think this is a bug in libvirt, not ovirt itself, but i am not sure.
>>The actual file throwing the exception is in VDSM
>>(/usr/share/vdsm/virt/vm.py, line 697).
>>
>>We are very willing to help, test and supply log files in anyway we can.
>>
>>Regards
>>Soeren
>>
>
>>_______________________________________________
>>Users mailing list
>>Users(a)ovirt.org
>>http://lists.ovirt.org/mailman/listinfo/users
>
>
>--
>Adam Litke


From alitke at redhat.com Wed Jun 3 09:20:32 2015
From: Adam Litke
To: users at ovirt.org
Subject: Re: [ovirt-users] Bug in Snapshot Removing
Date: Wed, 03 Jun 2015 09:20:23 -0400
In-Reply-To: D19479B5.DC9A%soeren.malchow@mcon.net

On 03/06/15 07:36 +0000, Soeren Malchow wrote:
>Dear Adam
>
>First we were using a python script that was working on 4 threads and
>therefore removing 4 snapshots at the time throughout the cluster, that
>still caused problems.
>
>Now i took the snapshot removing out of the threaded part an i am just
>looping through each snapshot on each VM one after another, even with
>"sleeps" inbetween, but the problem remains.
>But i am getting the impression that it is a problem with the amount of
>snapshots that are deleted in a certain time, if i delete manually and one
>after another (meaning every 10 min or so) i do not have problems, if i
>delete manually and do several at once and on one VM the next one just
>after one finished, the risk seems to increase.

Hmm.  In our lab we extensively tested removing a snapshot for a VM
with 4 disks.  This means 4 block jobs running simultaneously.  Less
than 10 minutes later (closer to 1 minute) we would remove a second
snapshot for the same VM (again involving 4 block jobs).  I guess we
should rerun this flow on a fully updated CentOS 7.1 host to see about
local reproduction.  Seems your case is much simpler than this though.
Is this happening every time or intermittently?

>I do not think it is the number of VMS because we had this on hosts with
>only 3 or 4 Vms running
>
>I will try restarting the libvirt and see what happens.
>
>We are not using RHEL 7.1 only CentOS 7.1
>
>Is there anything else we can look at when this happens again ?

I'll defer to Eric Blake for the libvirt side of this.  Eric, would
enabling debug logging in libvirtd help to shine some light on the
problem?

--
Adam Litke


From soeren.malchow at mcon.net Wed Jun 3 12:07:18 2015
From: Soeren Malchow
To: users at ovirt.org
Subject: Re: [ovirt-users] Bug in Snapshot Removing
Date: Wed, 03 Jun 2015 16:07:13 +0000
In-Reply-To: 20150603132023.GF12507@redhat.com

Hi,

This is not happening every time. The last time I had this, a script was
running, and something like the 9th VM and the 23rd VM had a problem; it is
not always the same VMs, and it is not about the OS (it happens for Windows
and Linux alike).

And as I said, it also happened when I tried to remove the snapshots
sequentially. Here is the code (I know it is probably not the most elegant
way, but I am not a developer); the code actually has correct indentation.
From soeren.malchow at mcon.net Wed Jun 3 12:07:18 2015
Content-Type: multipart/mixed; boundary="===============3909841681985779641=="
MIME-Version: 1.0
From: Soeren Malchow
To: users at ovirt.org
Subject: Re: [ovirt-users] Bug in Snapshot Removing
Date: Wed, 03 Jun 2015 16:07:13 +0000
Message-ID:
In-Reply-To: 20150603132023.GF12507@redhat.com

--===============3909841681985779641==
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable

Hi,

This is not happening every time. The last time I had this, a script was
running and something like the 9th VM and the 23rd VM had a problem. It is
not always the same VMs, and it is not about the guest OS (it happens for
Windows and Linux alike).

And as I said, it also happened when I tried to remove the snapshots
sequentially. Here is the code (I know it is probably not the most elegant
way, but I am not a developer), and the code actually has correct indentation.

<― snip ―>

print "Snapshot deletion"
try:
    time.sleep(300)
    Connect()
    vms = api.vms.list()
    for vm in vms:
        print ("Deleting snapshots for %s ") % vm.name
        snapshotlist = vm.snapshots.list()
        for snapshot in snapshotlist:
            if snapshot.description != "Active VM":
                time.sleep(30)
                snapshot.delete()
                try:
                    while api.vms.get(name=vm.name).snapshots.get(id=snapshot.id).snapshot_status == "locked":
                        print("Waiting for snapshot %s on %s deletion to finish") % (snapshot.description, vm.name)
                        time.sleep(60)
                except Exception as e:
                    print ("Snapshot %s does not exist anymore") % snapshot.description
        print ("Snapshot deletion for %s done") % vm.name
    print ("Deletion of snapshots done")
    api.disconnect()
except Exception as e:
    print ("Something went wrong when deleting the snapshots\n%s") % str(e)

<― snip ―>

Cheers
Soeren

--===============3909841681985779641==--
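One note on the script above: the inner while loop can spin forever if a merge stays in
the "locked" state, which can happen when a live merge gets stuck as described in this
thread. A minimal sketch of the same loop with an overall timeout is below; it assumes
the same ovirtsdk objects (api, vm, snapshot) as the script above, and MERGE_TIMEOUT is
an illustrative name, not part of the SDK:

import time

MERGE_TIMEOUT = 1800  # seconds to wait before treating the merge as stuck (illustrative value)
deadline = time.time() + MERGE_TIMEOUT

while time.time() < deadline:
    try:
        status = api.vms.get(name=vm.name).snapshots.get(id=snapshot.id).snapshot_status
    except Exception:
        # the snapshot is gone, so the delete finished
        break
    if status != "locked":
        break
    print("Waiting for snapshot %s on %s deletion to finish" % (snapshot.description, vm.name))
    time.sleep(60)
else:
    print("Giving up on snapshot %s of %s after %d seconds" % (snapshot.description, vm.name, MERGE_TIMEOUT))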
From soeren.malchow at mcon.net Thu Jun 4 09:08:11 2015
Content-Type: multipart/mixed; boundary="===============6804437006520735161=="
MIME-Version: 1.0
From: Soeren Malchow
To: users at ovirt.org
Subject: Re: [ovirt-users] Bug in Snapshot Removing
Date: Thu, 04 Jun 2015 13:08:07 +0000
Message-ID:
In-Reply-To: D194F1C9.DEC1%soeren.malchow@mcon.net

--===============6804437006520735161==
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable

Hi Adam, hi Eric,

We had this issue again a few minutes ago.

One machine went down in exactly the same way as described. The machine had
only one snapshot, and it was the only snapshot that was removed; before
that, in the same script run, we deleted the snapshots of 15 other VMs,
some without, some with 1 and some with several snapshots.

Can I provide anything from the logs that helps?

Regards
Soeren

--===============6804437006520735161==--
From alitke at redhat.com Thu Jun 4 10:17:29 2015
Content-Type: multipart/mixed; boundary="===============7443175364875275208=="
MIME-Version: 1.0
From: Adam Litke
To: users at ovirt.org
Subject: Re: [ovirt-users] Bug in Snapshot Removing
Date: Thu, 04 Jun 2015 10:17:10 -0400
Message-ID: <20150604141709.GJ12507@redhat.com>
In-Reply-To: D1961922.E12A%soeren.malchow@mcon.net

--===============7443175364875275208==
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable

On 04/06/15 13:08 +0000, Soeren Malchow wrote:
>Hi Adam, hi Eric,
>
>We had this issue again a few minutes ago.
>
>One machine went down in exactly the same way as described. The machine had
>only one snapshot, and it was the only snapshot that was removed; before
>that, in the same script run, we deleted the snapshots of 15 other VMs,
>some without, some with 1 and some with several snapshots.
>
>Can I provide anything from the logs that helps?

Let's start with the libvirtd.log on that host. It might be rather
large, so we may need to find a creative place to host it.

--
Adam Litke

--===============7443175364875275208==--
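On CentOS 7 libvirtd usually logs to the systemd journal unless log_outputs points it
at a file, so "the libvirtd.log" will often have to be exported from journald, as noted
in the next message. A sketch of how that export might be done for sharing; paths and
file names are just examples:

journalctl -u libvirtd -b > libvirtd-journal.txt   # everything libvirtd logged since the last boot
xz libvirtd-journal.txt                            # compress for upload
ls -lh /var/log/vdsm/vdsm.log                      # vdsm keeps its own plain-file log on oVirt hosts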
From soeren.malchow at mcon.net Thu Jun 4 11:08:08 2015
Content-Type: multipart/mixed; boundary="===============7289858131248700629=="
MIME-Version: 1.0
From: Soeren Malchow
To: users at ovirt.org
Subject: Re: [ovirt-users] Bug in Snapshot Removing
Date: Thu, 04 Jun 2015 15:08:04 +0000
Message-ID:
In-Reply-To: 20150604141709.GJ12507@redhat.com

--===============7289858131248700629==
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable

Hi,

I would send those, but unfortunately we did not think about the journals
getting deleted after a reboot.

I have just made the journals persistent on the servers, and we are trying
to trigger the error again; last time we only got halfway through the VMs
when removing the snapshots, so we have a good chance that it comes up again.

Also, libvirt logs to the journal, not to libvirtd.log. I would send the
journal directly to you and Eric via our data exchange servers.

Soeren

On 04/06/15 16:17, "Adam Litke" wrote:

>Let's start with the libvirtd.log on that host. It might be rather
>large, so we may need to find a creative place to host it.

--===============7289858131248700629==--
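For reference, making the journal survive reboots is a small change on CentOS 7. A
sketch of one common way to do it, using the stock systemd paths (either the directory
or the Storage= setting is enough on its own):

mkdir -p /var/log/journal          # with the default Storage=auto this enables persistence
# or set it explicitly in /etc/systemd/journald.conf:
#   [Journal]
#   Storage=persistent
systemctl restart systemd-journald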
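When a VM hangs during a live merge like the ones discussed here, it can also be worth
checking from the hypervisor whether the underlying block job is still making progress.
A rough sketch with virsh is below; GUESTNAME and vda are placeholders, and on a
vdsm-managed host plain virsh may ask for SASL credentials while the read-only -r
connection usually does not:

virsh -r list                          # find the domain name
virsh -r domblklist GUESTNAME          # list the disk targets (vda, sda, ...)
virsh blockjob GUESTNAME vda --info    # progress of an active block job on that disk, if any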
From soeren.malchow at mcon.net Thu Jun 11 07:00:35 2015
Content-Type: multipart/mixed; boundary="===============4993239337777239299=="
MIME-Version: 1.0
From: Soeren Malchow
To: users at ovirt.org
Subject: Re: [ovirt-users] Bug in Snapshot Removing
Date: Thu, 11 Jun 2015 11:00:31 +0000
Message-ID:
In-Reply-To: D1962CFC.E16E%soeren.malchow@mcon.net

--===============4993239337777239299==
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable

We are still having this problem and we cannot figure out what to do. I
sent the logs already as a download; can I do anything else to help?

On 04/06/15 17:08, "Soeren Malchow" wrote:

>I would send those, but unfortunately we did not think about the journals
>getting deleted after a reboot.
>
>I have just made the journals persistent on the servers, and we are trying
>to trigger the error again.

--===============4993239337777239299==--
From alitke at redhat.com Thu Jun 11 16:39:45 2015
Content-Type: multipart/mixed; boundary="===============4965266544152104197=="
MIME-Version: 1.0
From: Adam Litke
To: users at ovirt.org
Subject: Re: [ovirt-users] Bug in Snapshot Removing
Date: Thu, 11 Jun 2015 16:39:41 -0400
Message-ID: <20150611203941.GA22860@redhat.com>
In-Reply-To: D19F36A4.F068%soeren.malchow@mcon.net

--===============4965266544152104197==
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable

On 11/06/15 11:00 +0000, Soeren Malchow wrote:
>We are still having this problem and we cannot figure out what to do. I
>sent the logs already as a download; can I do anything else to help?

Hi. I'm sorry, but I don't have any new information for you yet. One
thing you could do is create a new bug for this issue so we can track
it better. Please try to include as much information as possible from
this discussion (including relevant log files) in your report. So far
you are the only one reporting these issues, so we'll want to work to
narrow down the specific scenario that is causing this problem and get
the right people working on the solution.

--
Adam Litke
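When filing such a bug, component versions and logs are usually the first things asked
for. A sketch of collecting them on one of the affected CentOS 7.1 hosts; the package
name patterns and file names are only examples:

rpm -qa | egrep 'libvirt|vdsm|qemu' | sort         # component versions on the host
journalctl -u libvirtd -b > libvirtd-journal.txt
cp /var/log/vdsm/vdsm.log vdsm-$(hostname).log
tar czf snapshot-bug-logs.tar.gz libvirtd-journal.txt vdsm-*.log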
--===============4965266544152104197==--