I may be mistaken, but from the logs it looks like the host where you stopped the network
is also the SPM host, so can you try right-clicking the host and choosing 'Confirm Host has been Rebooted'?
If that helps, maybe one of the devs can then help us with this error from the engine log:
2015-08-03 10:54:29,044 ERROR
[org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStatusVDSCommand]
(DefaultQuartzScheduler_Worker-35) Command SpmStatusVDSCommand(HostName = ovhv00.mytld,
HostId = 0759994d-a704-4374-b6fa-d8c62f46760a, storagePoolId =
00000002-0002-0002-0002-0000000001ac) execution failed. Exception: VDSErrorException:
VDSGenericException: VDSErrorException: Failed to SpmStatusVDS, error = (107, 'Sanlock
resource read failure', 'Transport endpoint is not connected'), code = 100
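In case it helps to find every occurrence of this failure in a longer engine.log, here is a minimal Python sketch. The regular expression is inferred from the single entry above, so the field layout is an assumption rather than a documented log format, and find_spm_failures is a hypothetical helper name, not part of oVirt. It also assumes each entry sits on one line in the real file (the entry above is only wrapped by the mail client).

```python
import re

# Pattern inferred from the one SpmStatusVDSCommand entry quoted above.
# Assumption: other failures of this command follow the same layout.
SPM_FAILURE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) ERROR\s+"
    r"\[[^\]]*SpmStatusVDSCommand\].*?"
    r"HostName = (?P<host>[^,]+),.*?"
    r"error = \((?P<errno>\d+), '(?P<message>[^']*)', '(?P<detail>[^']*)'\)"
)


def find_spm_failures(lines):
    """Yield one dict per SpmStatusVDSCommand failure found in `lines`."""
    for line in lines:
        match = SPM_FAILURE.search(line)
        if match:
            yield match.groupdict()


# The entry from this thread, joined back into a single line.
SAMPLE = (
    "2015-08-03 10:54:29,044 ERROR "
    "[org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStatusVDSCommand] "
    "(DefaultQuartzScheduler_Worker-35) Command SpmStatusVDSCommand("
    "HostName = ovhv00.mytld, "
    "HostId = 0759994d-a704-4374-b6fa-d8c62f46760a, "
    "storagePoolId = 00000002-0002-0002-0002-0000000001ac) "
    "execution failed. Exception: VDSErrorException: "
    "VDSGenericException: VDSErrorException: Failed to SpmStatusVDS, "
    "error = (107, 'Sanlock resource read failure', "
    "'Transport endpoint is not connected'), code = 100"
)

if __name__ == "__main__":
    for failure in find_spm_failures([SAMPLE]):
        print(failure)
```

To scan the real log, pass an open file instead of the sample list, for example find_spm_failures(open("/var/log/ovirt-engine/engine.log")).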
Thanks
----- Original Message -----
From: "Konstantinos Christidis" <kochrist(a)ekt.gr>
To: "Artyom Lukianov" <alukiano(a)redhat.com>
Cc: users(a)ovirt.org
Sent: Monday, August 3, 2015 11:20:21 AM
Subject: Re: [ovirt-users] Data Center becomes Non Responsive when I reboot a host
Hello,
Sorry for my late response. I reproduced the error in a lab environment
(oVirt3.5/CentOS7.1) with 2 hosts (ovhv00 ovhv01) and a replicated
glusterfs.
I put host ovhv01 into maintenance mode and then stopped
network.service (instead of rebooting).
The result is always the same. Data Center becomes "Non Responsive", my
storage becomes red and inactive, and most VMs become "paused due to
unknown storage error".
This is the engine log
https://paste.fedoraproject.org/250877/58925314/raw/
Thanks,
K.
On 07/29/2015 12:21 PM, Artyom Lukianov wrote:
Can you please provide the engine log (/var/log/ovirt-engine/engine.log)?
----- Original Message -----
From: "Konstantinos Christidis" <kochrist(a)ekt.gr>
To: users(a)ovirt.org
Cc: "Artyom Lukianov" <alukiano(a)redhat.com>
Sent: Wednesday, July 29, 2015 9:40:26 AM
Subject: Re: [ovirt-users] Data Center becomes Non Responsive when I reboot a host
Maintenance mode is already enabled. All VMs finish migration successfully.
Now I stop the glusterd service on this host (systemctl stop
glusterd.service) and nothing bad happens, which means that the distributed
replicated glusterfs works fine.
Then I stop the vdsmd service (systemctl stop vdsmd.service) and everything
still works fine.
When I administratively set the ovirtmgmt network down or reboot this host,
my Data Center becomes "Non Responsive", my storage becomes red and
inactive, and most VMs become "paused due to unknown storage error".
K.
On 07/28/2015 06:09 PM, Artyom Lukianov wrote:
> Just put the host into maintenance mode; if it has VMs, they will be migrated automatically
> to another host.
>
> ----- Original Message -----
> From: "Konstantinos Christidis" <kochrist(a)ekt.gr>
> To: users(a)ovirt.org
> Sent: Tuesday, July 28, 2015 1:15:15 PM
> Subject: [ovirt-users] Data Center becomes Non Responsive when I reboot a host
>
> Hello ovirt users,
>
> I have 4 hosts with a distributed replicated 2x2 GlusterFS storage.
> (oVirt3.5/CentOS7)
>
> When I reboot a host (in maintenance mode and not my SPM host) my Data
> Center becomes "Non Responsive", my storage becomes red and inactive,
> and many VMs become "paused due to unknown storage error". The same
> happens if I administratively bring the ovirtmgmt network down (on a host in
> maintenance mode that is not my SPM host) with ifconfig ovirtmgmt down.
> I know that the management network (ovirtmgmt) is required by default and is
> part of the oVirt monitoring process, but is there anything I can do in order
> to reboot a host without causing this mess?
>
> Thanks,
>
> K.