Data Center becomes Non Responsive when I reboot a host

Konstantinos Christidis

28 Jul 2015

12:15 p.m.

Hello oVirt users,

I have 4 hosts with distributed-replicated 2x2 GlusterFS storage (oVirt 3.5 / CentOS 7).

When I reboot a host (in maintenance mode, and not my SPM host), my Data Center becomes "Non Responsive", my storage becomes red and inactive, and many VMs become "paused due to unknown storage error". The same happens if I administratively bring the ovirtmgmt network down with ifconfig ovirtmgmt down (again on a host that is in maintenance mode and is not my SPM host).

I know that the management network (ovirtmgmt) is required by default and is part of the oVirt monitoring process, but is there anything I can do to reboot a host without causing this mess?

Thanks,
K.

Artyom Lukianov

28 Jul 2015

5:09 p.m.

Just put the host into maintenance mode; if it has VMs, it will migrate them automatically to another host.

Konstantinos Christidis

29 Jul 2015

8:40 a.m.

Maintenance mode is already enabled, and all VMs finish migrating successfully.

At that point I stop the glusterd service on this host (systemctl stop glusterd.service) and nothing bad happens, which means that the distributed-replicated GlusterFS volume works fine. Then I stop the vdsmd service (systemctl stop vdsmd.service) and everything still works fine.

But when I administratively bring the ovirtmgmt network down or reboot this host, my Data Center becomes "Non Responsive", my storage becomes red and inactive, and most VMs become "paused due to unknown storage error".

K.
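The isolation test described in this reply (stop glusterd, then vdsmd, then take the network down) is easier to compare across runs if the state of each unit is recorded before every step. The following is only a minimal sketch under stated assumptions: it relies on `systemctl is-active`, which prints one state per unit, and the helper names `unit_states` and `_run_systemctl` are hypothetical, not part of oVirt or VDSM.

```python
import subprocess

# Units named in the reply above; extend the tuple as needed.
UNITS = ("glusterd.service", "vdsmd.service")


def _run_systemctl(units):
    """Return the raw output of `systemctl is-active` for the given units.

    The command exits non-zero when none of the units is active, so the
    exit status is deliberately not checked here.
    """
    proc = subprocess.run(
        ["systemctl", "is-active", *units],
        capture_output=True,
        text=True,
        check=False,
    )
    return proc.stdout


def unit_states(units=UNITS, runner=_run_systemctl):
    """Map each unit name to the state word reported for it.

    `runner` is injectable so the parsing can be exercised without
    systemd being present (an assumption made for testability).
    """
    states = runner(list(units)).split()
    return dict(zip(units, states))
```

Calling `unit_states()` on the host before and after each `systemctl stop` gives a small record of which services were still running when the Data Center changed state. It says nothing about the GlusterFS brick processes themselves, which would need to be checked separately.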

Artyom Lukianov

11:21 a.m.

Can you please provide the engine log (/var/log/ovirt-engine/engine.log)?

Konstantinos Christidis

3 Aug 2015

10:20 a.m.

Hello,

Sorry for my late response. I reproduced the error in a lab environment (oVirt 3.5 / CentOS 7.1) with 2 hosts (ovhv00 and ovhv01) and a replicated GlusterFS volume.

I put host ovhv01 into maintenance mode and then stopped network.service on it (instead of rebooting). The result is always the same: the Data Center becomes "Non Responsive", my storage becomes red and inactive, and most VMs become "paused due to unknown storage error".

This is the engine log: https://paste.fedoraproject.org/250877/58925314/raw/

Thanks,
K.

Artyom Lukianov

4 Aug 2015

7:10 p.m.

I may be mistaken, but from the logs it looks like the host where you stopped the network is also the SPM host, so can you try right-clicking on the host and choosing 'Confirm Host has been Rebooted'?

If it helps, maybe someone from the devs can help us with this error from the engine log:

2015-08-03 10:54:29,044 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStatusVDSCommand] (DefaultQuartzScheduler_Worker-35) Command SpmStatusVDSCommand(HostName = ovhv00.mytld, HostId = 0759994d-a704-4374-b6fa-d8c62f46760a, storagePoolId = 00000002-0002-0002-0002-0000000001ac) execution failed. Exception: VDSErrorException: VDSGenericException: VDSErrorException: Failed to SpmStatusVDS, error = (107, 'Sanlock resource read failure', 'Transport endpoint is not connected'), code = 100

Thanks
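To check which host the engine reports the SPM status failure for, the error line quoted in this thread can be picked apart with a short script. This is a sketch only: the regular expression is fitted to the single log line shown above, so other oVirt versions or error types may be formatted differently, and the names `SPM_ERROR` and `parse_spm_error` are hypothetical.

```python
import re

# Pattern fitted to the one SpmStatusVDSCommand error quoted in this thread.
SPM_ERROR = re.compile(
    r"Command SpmStatusVDSCommand\(HostName = (?P<host>[^,]+), "
    r"HostId = (?P<host_id>[0-9a-f-]+), "
    r"storagePoolId = (?P<pool_id>[0-9a-f-]+)\) execution failed\."
    r".*?error = \((?P<errno>\d+), '(?P<reason>[^']*)', '(?P<detail>[^']*)'\)"
)

# The log line from the thread, used here as sample input.
SAMPLE = (
    "2015-08-03 10:54:29,044 ERROR "
    "[org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStatusVDSCommand] "
    "(DefaultQuartzScheduler_Worker-35) Command SpmStatusVDSCommand("
    "HostName = ovhv00.mytld, "
    "HostId = 0759994d-a704-4374-b6fa-d8c62f46760a, "
    "storagePoolId = 00000002-0002-0002-0002-0000000001ac) execution failed. "
    "Exception: VDSErrorException: VDSGenericException: VDSErrorException: "
    "Failed to SpmStatusVDS, error = (107, 'Sanlock resource read failure', "
    "'Transport endpoint is not connected'), code = 100"
)


def parse_spm_error(line):
    """Return the host, pool and error fields of an SPM status failure.

    Returns None when the line does not match the pattern above.
    """
    match = SPM_ERROR.search(line)
    return match.groupdict() if match else None
```

For the sample line, `parse_spm_error(SAMPLE)["host"]` is `ovhv00.mytld`, which can then be compared with the host whose network was stopped in the reproduction (ovhv01). Error number 107 is ENOTCONN on Linux, matching the 'Transport endpoint is not connected' text in the same tuple.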
