[ovirt-users] Communication Problems between Engine and Hosts

Piotr Kliczewski piotr.kliczewski at gmail.com
Wed Aug 16 14:01:36 UTC 2017


Fernando,

Do you know how long the connection issues between the data centers
lasted? Please collect the logs if it happens again.

Thanks,
Piotr

On Wed, Aug 16, 2017 at 3:20 PM, FERNANDO FREDIANI
<fernando.frediani at upx.com> wrote:
> Hello Piotr. Thanks for your reply.
>
> I was running version 4.1.1, but since that day I have upgraded the Engine
> to 4.1.5 (the hosts remain on 4.1.1). I am not sure the logs still exist
> (how long are they normally kept?).
>
> Just to clarify, the hosts didn't become unresponsive, but the communication
> between the Engine and the Hosts in question (each in a different Datacenter)
> was interrupted; locally the hosts were fine and accessible. What was
> strange was that, since the Hosts could not talk to the Engine, they seem to
> have got 'confused' and started several VM live migrations, which was not
> expected. As a note, I don't have any Fencing policy enabled.
>
> Regards
> Fernando
>
>
>
> On 16/08/2017 07:00, Piotr Kliczewski wrote:
>>
>> Fernando,
>>
>> Which oVirt version are you running? Please share the logs so I can
>> check what caused the hosts to become unresponsive.
>>
>> Thanks,
>> Piotr
>>
>> On Wed, Aug 2, 2017 at 5:11 PM, FERNANDO FREDIANI
>> <fernando.frediani at upx.com> wrote:
>>>
>>> Hello.
>>>
>>> Yesterday I had a pretty strange problem in one of our architectures. My
>>> oVirt Engine, which runs in one Datacenter and controls Nodes both locally
>>> and remotely, lost communication with the remote Nodes in another
>>> Datacenter. Up to this point nothing was wrong, as the Nodes can continue
>>> working as expected and running their Virtual Machines without depending
>>> on the oVirt Engine.
>>>
>>> What happened at some point is that, when the communication between the
>>> Engine and the Hosts came back, the Hosts got confused and initiated a Live
>>> Migration of ALL VMs from one to the other. I also had to restart the
>>> vdsmd agent on all Hosts in order to bring my environment back to sanity.
>>> What makes this scenario even stranger is that one of the affected Hosts
>>> doesn't belong to the same Cluster as the others and still had to have
>>> vdsmd restarted.
>>>
>>> I understand that the Hosts can survive without the Engine online, with
>>> reduced capabilities, and can still communicate between themselves, but
>>> without affecting the VMs or needing to do what happened in this scenario.
>>>
>>> Am I wrong in any of these assumptions?
>>>
>>> Fernando
>>>
>>>
>
