From kripper at imatronix.cl Mon Aug 3 20:58:16 2015
From: Christopher Pereira
To: devel at ovirt.org
Subject: [ovirt-devel] Failed Snapshot as 'Current' after failed Live Storage Migration
Date: Mon, 03 Aug 2015 21:58:10 -0300
Message-ID: <55C00E22.2030404@imatronix.cl>

Hi,

A live storage migration task failed due to a network error:

    2015-08-03 21:23:16,437 WARN [org.ovirt.engine.core.bll.CreateAllSnapshotsFromVmCommand] (org.ovirt.thread.pool-12-thread-45) [] Could not perform live snapshot due to error, VM will still be configured to the new created snapshot: VdcBLLException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException: VDSGenericException: VDSNetworkException: Message timeout which can be caused by communication issues (Failed with error VDS_NETWORK_ERROR and code 5022)
    2015-08-03 21:23:16,450 WARN [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-12-thread-45) [] Correlation ID: null, Call Stack: org.ovirt.engine.core.common.errors.VdcBLLException: VdcBLLException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException: VDSGenericException: VDSNetworkException: Message timeout which can be caused by communication issues (Failed with error VDS_NETWORK_ERROR and code 5022)
        at org.ovirt.engine.core.bll.VdsHandler.handleVdsResult(VdsHandler.java:117)
        at org.ovirt.engine.core.bll.VDSBrokerFrontendImpl.RunVdsCommand(VDSBrokerFrontendImpl.java:33)
        at org.ovirt.engine.core.bll.CommandBase.runVdsCommand(CommandBase.java:2029)
        at org.ovirt.engine.core.bll.CreateAllSnapshotsFromVmCommand$2.runInTransaction(CreateAllSnapshotsFromVmCommand.java:400)
        ...

As a consequence, Engine is showing a failed snapshot as "Current", while libvirt is still reporting the previous correct snapshot.
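
For anyone trying to reproduce this, the mismatch can be checked on the
host with plain libvirt/qemu tooling (a sketch; 'MyVM' and the volume
path are placeholders for the real VM name and image path):

    # List the disks libvirt currently has attached to the VM
    virsh -r domblklist MyVM

    # Walk the backing chain of the active volume to see which
    # snapshot volume is really on top
    qemu-img info --backing-chain /rhev/data-center/.../<volume-uuid>

If the top image is still the pre-snapshot volume, only Engine believes
the VM was switched to the new snapshot.
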
I guess next time Engine will probably try to resume the failed snapshot and the VM won't start anymore.
What is the correct way to solve this issue?

BZ:
https://bugzilla.redhat.com/show_bug.cgi?id=1018867

From nsoffer at redhat.com Tue Aug 4 04:35:33 2015
From: Nir Soffer
To: devel at ovirt.org
Subject: Re: [ovirt-devel] Failed Snapshot as 'Current' after failed Live Storage Migration
Date: Tue, 04 Aug 2015 04:35:21 -0400
Message-ID: <1947101418.2986015.1438677321479.JavaMail.zimbra@redhat.com>
In-Reply-To: 55C00E22.2030404@imatronix.cl

----- Original Message -----
> From: "Christopher Pereira"
> To: devel(a)ovirt.org
> Sent: Tuesday, August 4, 2015 3:58:10 AM
> Subject: [ovirt-devel] Failed Snapshot as 'Current' after failed Live
> Storage Migration
>
> Hi,
>
> A live storage migration task failed due to a network error:
>
> [... engine log excerpt snipped ...]
>
> As a consequence, Engine is showing a failed snapshot as "Current", while
> libvirt is still reporting the previous correct snapshot.
> I guess next time Engine will probably try to resume the failed snapshot
> and the VM won't start anymore.

Why guess? Did you try this?

> What is the correct way to solve this issue?

I would restart the engine; it may have a bad cache.
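
For example, on a systemd-based engine host (a sketch; older el6 setups
use the sysvinit wrapper instead):

    # Restart the engine service to drop any stale in-memory state
    systemctl restart ovirt-engine

    # el6 / sysvinit equivalent:
    # service ovirt-engine restart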

> BZ:
> https://bugzilla.redhat.com/show_bug.cgi?id=1018867

Thanks, we will look into this.

Nir

From kripper at imatronix.cl Wed Aug 5 00:24:34 2015
From: Christopher Pereira
To: devel at ovirt.org
Subject: Re: [ovirt-devel] Failed Snapshot as 'Current' after failed Live Storage Migration
Date: Wed, 05 Aug 2015 01:24:32 -0300
Message-ID: <55C19000.1050606@imatronix.cl>
In-Reply-To: 1947101418.2986015.1438677321479.JavaMail.zimbra@redhat.com

On 04-08-2015 5:35, Nir Soffer wrote:
>> What is the correct way to solve this issue?
> I would restart the engine; it may have a bad cache.

Thanks Nir,

After restarting the Engine, the 'Snapshots' list is still incorrect.
I will continue reporting on BZ [1] to explore whether oVirt can recover
from these problems without restarting the VM. In general, restarting
VMs is not an option.

[1] : https://bugzilla.redhat.com/show_bug.cgi?id=1018867
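
PS: If it helps others hitting this, the snapshot state Engine works
from can also be inspected directly in its PostgreSQL database. A
read-only sketch, assuming the default 'engine' database and a 3.x-era
'snapshots' table with 'vm_id', 'snapshot_type' and 'status' columns
(verify the names against your own schema before relying on them):

    # On the engine host, open a psql shell as the postgres user
    su - postgres -c "psql engine"

    -- then, inside psql ('<vm-uuid>' is a placeholder for the VM id):
    SELECT snapshot_id, snapshot_type, status, description
    FROM snapshots WHERE vm_id = '<vm-uuid>';

Comparing these rows with what virsh reports on the host should show
the same mismatch described above.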