[ovirt-users] Data Center becomes Non Responsive when I reboot a host

Artyom Lukianov alukiano at redhat.com
Tue Aug 4 17:10:43 UTC 2015


I may be mistaken, but from the logs it looks like the host where you stopped the network is also the SPM host, so can you try right-clicking the host and choosing 'Confirm Host has been Rebooted'?
If that helps, maybe someone from the devs can help us with this error from the engine log:
2015-08-03 10:54:29,044 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStatusVDSCommand] (DefaultQuartzScheduler_Worker-35) Command SpmStatusVDSCommand(HostName = ovhv00.mytld, HostId = 0759994d-a704-4374-b6fa-d8c62f46760a, storagePoolId = 00000002-0002-0002-0002-0000000001ac) execution failed. Exception: VDSErrorException: VDSGenericException: VDSErrorException: Failed to SpmStatusVDS, error = (107, 'Sanlock resource read failure', 'Transport endpoint is not connected'), code = 100
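
If it is useful for checking how often this happens, here is a minimal Python sketch that pulls the host name and the Sanlock error tuple out of lines like the one above. The regular expression is written against this single sample line only, and the function and field names are my own, so it is an assumption that other engine.log entries follow the same layout.

```python
import re

# Sample line copied from the engine log quoted above.
SAMPLE = (
    "2015-08-03 10:54:29,044 ERROR "
    "[org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStatusVDSCommand] "
    "(DefaultQuartzScheduler_Worker-35) Command SpmStatusVDSCommand("
    "HostName = ovhv00.mytld, HostId = 0759994d-a704-4374-b6fa-d8c62f46760a, "
    "storagePoolId = 00000002-0002-0002-0002-0000000001ac) execution failed. "
    "Exception: VDSErrorException: VDSGenericException: VDSErrorException: "
    "Failed to SpmStatusVDS, error = (107, 'Sanlock resource read failure', "
    "'Transport endpoint is not connected'), code = 100"
)

# Pattern derived from the sample line only (assumption: same layout elsewhere).
LINE_RE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) ERROR "
    r".*?Command (?P<command>\w+)\(HostName = (?P<host>[^,]+),"
    r".*?error = \((?P<errno>\d+), '(?P<summary>[^']*)', '(?P<detail>[^']*)'\)"
)


def parse_spm_failure(line):
    """Return the parsed fields as a dict, or None if the line does not match."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["errno"] = int(fields["errno"])
    return fields


if __name__ == "__main__":
    print(parse_spm_failure(SAMPLE))
```

Running it over /var/log/ovirt-engine/engine.log line by line and collecting the non-None results would show which hosts report the failure and when.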
Thanks

----- Original Message -----
From: "Konstantinos Christidis" <kochrist at ekt.gr>
To: "Artyom Lukianov" <alukiano at redhat.com>
Cc: users at ovirt.org
Sent: Monday, August 3, 2015 11:20:21 AM
Subject: Re: [ovirt-users] Data Center becomes Non Responsive when I reboot a host

Hello,

Sorry for my late response. I reproduced the error in a lab environment
(oVirt 3.5/CentOS 7.1) with 2 hosts (ovhv00, ovhv01) and a replicated
glusterfs.
I put host ovhv01 into maintenance mode and then stopped
network.service (instead of rebooting).
The result is always the same. Data Center becomes "Non Responsive", my
storage becomes red and inactive, and most VMs become "paused due to
unknown storage error".

This is the engine log
https://paste.fedoraproject.org/250877/58925314/raw/

Thanks,

K.



On 07/29/2015 12:21 PM, Artyom Lukianov wrote:
> Can you please provide the engine log (/var/log/ovirt-engine/engine.log)?
>
> ----- Original Message -----
> From: "Konstantinos Christidis" <kochrist at ekt.gr>
> To: users at ovirt.org
> Cc: "Artyom Lukianov" <alukiano at redhat.com>
> Sent: Wednesday, July 29, 2015 9:40:26 AM
> Subject: Re: [ovirt-users] Data Center becomes Non Responsive when I reboot a host
>
> Maintenance mode is already enabled. All VMs finish migration successfully.
> Now I stop glusterd service on this host (systemctl stop
> glusterd.service) and nothing bad happens, which means that distributed
> replica glusterfs works fine.
> Then I stop vdsmd service (systemctl stop vdsmd.service) and everything
> works fine.
> When I administratively set ovirtmgmt network down or reboot this host,
> my Data Center becomes "Non Responsive", my storage becomes red and
> inactive, and most VMs become "paused due to unknown storage error".
>
> K.
>
>
>
>
> On 07/28/2015 06:09 PM, Artyom Lukianov wrote:
>> Just put the host into maintenance mode; if it has VMs, it will migrate them automatically to another host.
>>
>> ----- Original Message -----
>> From: "Konstantinos Christidis" <kochrist at ekt.gr>
>> To: users at ovirt.org
>> Sent: Tuesday, July 28, 2015 1:15:15 PM
>> Subject: [ovirt-users] Data Center becomes Non Responsive when I reboot a host
>>
>> Hello ovirt users,
>>
>> I have 4 hosts with a distributed replicated 2x2 GlusterFS storage.
>> (oVirt3.5/CentOS7)
>>
>> When I reboot a host (in maintenance mode and not my SPM host) my Data
>> Center becomes "Non Responsive", my storage becomes red and inactive,
>> and many VMs become "paused due to unknown storage error". The same
>> happens if I administratively set ovirtmgmt network down (to a host in
>> maintenance mode and not my SPM host) with ifconfig ovirtmgmt down.
>> I know that management network (ovirtmgmt) is required by default and is
>> part of oVirt monitoring process but is there anything I can do in order
>> to reboot a host without causing this mess?
>>
>> Thanks,
>>
>> K.
>> _______________________________________________
>> Users mailing list
>> Users at ovirt.org
>> http://lists.ovirt.org/mailman/listinfo/users



