From fernando.frediani at upx.com Tue Nov 28 11:48:15 2017
From: FERNANDO FREDIANI
To: users at ovirt.org
Subject: [ovirt-users] Hosts been evacuated unnecessarily
Date: Tue, 28 Nov 2017 09:48:09 -0200
Message-ID: <6ec3e9ae-25e8-9a83-c406-7585b5905c64@upx.com>

Hello folks.

Our oVirt (4.1.7.3-1.el7.centos), which runs in one Datacenter and controls Nodes both locally and remotely, lost communication with the remote Nodes in another Datacenter.
Up to this point nothing is wrong, as the Nodes can continue working as expected and running their Virtual Machines without depending on the oVirt Engine.

What happened is that when communication between the Engine and the Hosts came back, the Hosts in the remote Datacenter got confused and initiated a Live Migration of ALL VMs from one of the hosts to another. I also had to restart the vdsmd agent on all Hosts to bring my environment back to a sane state.

What makes this scenario even stranger is that one of the Hosts that needed a VDSM restart doesn't belong to the same Cluster as the others, yet it also had to have vdsmd restarted.

I understand that the Hosts can survive without the Engine online, with reduced functionality, and can still communicate among themselves, but this should not affect the VMs or require what happened in this scenario.

Am I wrong about any of these assumptions?

Fernando