Update:
I identified that the ovirtmgt network configuration was out of whack on
the non-functioning host.
Problem 1: The GUI will not allow me to set the network back to its proper
state.
Problem 2: I accidentally changed the IP address of the ovirtmgt network on
the host that was working, so both hosts are now down. Even though I set the
IP address back to its original value, the cluster manager is unable to see
the host that was working.
Until I rebooted, the host could ping out, but didn't respond to pings.
I am able to see the host console.
At this point I have no functioning hosts, so there is no access to the
storage domain.
I suspect the solution to either issue will be the solution to both issues.
Thank you.
On Sat, Jun 11, 2022 at 5:42 PM David Johnson <djohnson(a)maxistechnology.com>
wrote:
Good afternoon all,
oVirt version: 4.14.4.10.7-1.el8
CentOS version: Linux version 4.18.0-365.el8.x86_64 (
mockbuild(a)kbuilder.bsys.centos.org) (gcc version 8.5.0 20210514 (Red Hat
8.5.0-10) (GCC)) #1 SMP Thu Feb 10 16:11:23 UTC 2022
Background:
We had a motherboard fail in our storage device. I was able to
migrate the storage domain to the backup device before it failed
completely, and have been running on the backup device for several weeks
while we purchased a replacement main storage device.
Today I shut everything down cleanly, replaced the main storage, and
restarted the cluster. We did disconnect and reconnect the network on all
of the devices as we shuffled equipment in the rack.
One of the hosts in the cluster refuses to come back up. I am able to
connect to the host via PuTTY.
The oVirt GUI reports:
Setting Host ovirt-host-03.maxisinc.net to Non-Operational mode.
Completed: Jun 11, 2022, 4:59:57 PM
Activating Host ovirt-host-03.maxisinc.net
Completed: Jun 11, 2022, 4:59:57 PM
Invoking Activate Host ovirt-host-03.maxisinc.net
Completed: Jun 11, 2022, 4:57:40 PM
Installing Host ovirt-host-03.maxisinc.net
The log from the host is:
5:09 PM
GetManagedObjects() failed: org.freedesktop.DBus.Error.NoReply: Did not
receive a reply. Possible causes include: the remote application did not
send a reply, the message bus security policy blocked the reply, the reply
timeout expired, or the network connection was broken.
pulseaudio
4:55 PM
bondscan-DGwC1l: option lacp_active: mode dependency failed, not supported
in mode balance-alb(6)
kernel
4:55 PM
bondscan-DGwC1l: option arp_all_targets: invalid value (2)
kernel
4:55 PM
bondscan-DGwC1l: option fail_over_mac: invalid value (3)
kernel
4:55 PM
bondscan-DGwC1l: option primary_reselect: invalid value (3)
kernel
4:55 PM
bondscan-DGwC1l: option ad_select: invalid value (3)
kernel
4:55 PM