This is a multi-part message in MIME format.
--------------7A7A94A488D7D26C7D1B2EB5
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Hello.
Yesterday I had a pretty strange problem in one of our architectures. My
oVirt Engine, which runs in one Datacenter and controls Nodes both locally
and remotely, lost communication with the remote Nodes in another Datacenter.
Up to this point nothing was wrong, as the Nodes can continue working as
expected and running their Virtual Machines, each without depending on
the oVirt Engine.
What happened at some point is that, when communication between the
Engine and the Hosts came back, the Hosts got confused and initiated a Live
Migration of ALL VMs from one to the other. I also had to restart the vdsmd
agent on all Hosts in order to bring my environment back to a sane state.
What adds even more strangeness to this scenario is that one of the
affected Hosts doesn't belong to the same Cluster as the others, and it
also had to have vdsmd restarted.
I understand the Hosts can survive without the Engine online, with
reduced functionality, and can still communicate between themselves, but
without affecting the VMs or needing to do what happened in this scenario.
Am I wrong about any of these assumptions?
Fernando
--------------7A7A94A488D7D26C7D1B2EB5
Content-Type: text/html; charset=utf-8
Content-Transfer-Encoding: 7bit
<html>
<head>
<meta http-equiv="content-type" content="text/html;
charset=utf-8">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<font face="arial, helvetica, sans-serif">Hello.<br>
<br>
Yesterday I had a pretty strange problem in one of our
architectures. My oVirt Engine, which runs in one Datacenter and
controls Nodes both locally and remotely, lost communication with the
remote Nodes in another Datacenter.<br>
Up to this point nothing was wrong, as the Nodes can continue working
as expected and running their Virtual Machines, each without
depending on the oVirt Engine.<br>
<br>
What happened at some point is that, when communication between the
Engine and the Hosts came back, the Hosts got confused and initiated a
Live Migration of ALL VMs from one to the other. I also had to restart
the vdsmd agent on all Hosts in order to bring my environment back to a
sane state.<br>
What adds even more strangeness to this scenario is that one of
the affected Hosts doesn't belong to the same Cluster as the
others, and it also had to have vdsmd restarted.<br>
<br>
I understand the Hosts can survive without the Engine online, with
reduced functionality, and can still communicate between themselves,
but without affecting the VMs or needing to do what happened in
this scenario.<br>
<br>
Am I wrong about any of these assumptions?<br>
<br>
Fernando<br>
</font>
</body>
</html>
--------------7A7A94A488D7D26C7D1B2EB5--