Hi,
we managed to get a bit further, but we still face issues.
2018-06-05 09:38:42,556+02 INFO  [org.ovirt.engine.core.bll.host.HostConnectivityChecker] (EE-ManagedThreadFactory-engine-Thread-1) [2617aebd] Engine managed to communicate with VDSM agent on host 'host01.redacted' with address 'host01.redacted' ('8af21ab3-ce7a-49a5-a526-94b65aa3da29')
2018-06-05 09:38:47,488+02 WARN  [org.ovirt.engine.core.bll.network.NetworkConfigurator] (EE-ManagedThreadFactory-engine-Thread-1) [2617aebd] Failed to find a valid interface for the management network of host host01.redacted. If the interface ovirtmgmt is a bridge, it should be torn-down manually.
2018-06-05 09:38:47,488+02 ERROR [org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand] (EE-ManagedThreadFactory-engine-Thread-1) [2617aebd] Exception: org.ovirt.engine.core.bll.network.NetworkConfigurator$NetworkConfiguratorException: Interface ovirtmgmt is invalid for management network
Our network configuration (bond0.1111 is bridged into ovirtmgmt):
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
[…]
11: ovirtmgmt: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 5c:f3:fc:da:b6:18 brd ff:ff:ff:ff:ff:ff
    inet 1.2.3.42/24 brd 1.2.3.255 scope global noprefixroute ovirtmgmt
       valid_lft forever preferred_lft forever
    inet6 fe80::e8dd:fff:fe33:4bba/64 scope link
       valid_lft forever preferred_lft forever
12: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 5c:f3:fc:da:b6:18 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::5ef3:fcff:feda:b618/64 scope link
       valid_lft forever preferred_lft forever
13: bond0.3019@bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br0.3019 state UP group default qlen 1000
    link/ether 5c:f3:fc:da:b6:18 brd ff:ff:ff:ff:ff:ff
14: bond0.1111@bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master ovirtmgmt state UP group default qlen 1000
    link/ether 5c:f3:fc:da:b6:18 brd ff:ff:ff:ff:ff:ff
15: br0.3019: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 5c:f3:fc:da:b6:18 brd ff:ff:ff:ff:ff:ff
    inet 19.2.3.22/16 brd 192.168.255.255 scope global noprefixroute br0.3019
       valid_lft forever preferred_lft forever
    inet6 fe80::5ef3:fcff:feda:b618/64 scope link
       valid_lft forever preferred_lft forever
31: ;vdsmdummy;: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether da:aa:73:7e:d7:93 brd ff:ff:ff:ff:ff:ff
32: virbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 52:54:00:a6:75:67 brd ff:ff:ff:ff:ff:ff
    inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0
       valid_lft forever preferred_lft forever
33: virbr0-nic: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast master virbr0 state DOWN group default qlen 1000
    link/ether 52:54:00:a6:75:67 brd ff:ff:ff:ff:ff:ff
40: vnet0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master virbr0 state UNKNOWN group default qlen 1000
    link/ether fe:16:3e:2d:0d:55 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::fc16:3eff:fe2d:d55/64 scope link
       valid_lft forever preferred_lft forever
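Since the engine warns that a pre-existing ovirtmgmt bridge "should be torn-down manually", the fix we are attempting looks roughly like the sketch below. This is only a hypothetical outline, not oVirt-documented procedure: the device names (ovirtmgmt, bond0.1111) and the address 1.2.3.42/24 are taken from the redacted `ip a` output above, and DRY_RUN is our own safety wrapper that prints the commands instead of executing them. Run with DRY_RUN=0 as root only after double-checking the names, ideally from a console, since this interrupts management connectivity.

```shell
# Tear down a pre-existing ovirtmgmt bridge so that host-deploy can
# create and own it itself. DRY_RUN=1 (the default) only prints each
# command; set DRY_RUN=0 to actually execute them as root.
DRY_RUN=${DRY_RUN:-1}
run() { if [ "$DRY_RUN" = 1 ]; then echo "$*"; else "$@"; fi; }

run ip link set ovirtmgmt down
run ip link delete ovirtmgmt type bridge        # remove the bridge device
run ip addr add 1.2.3.42/24 dev bond0.1111      # move the mgmt IP back to the VLAN
run ip link set bond0.1111 up
```

The host then needs a default route and DNS reachable via bond0.1111 again before retrying host-deploy; persisting this in the ifcfg files is left out of the sketch.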
On Mon, 2018-05-28 at 12:57 +0200, Simone Tiraboschi wrote:
On Mon, May 28, 2018 at 11:44 AM, Mariusz Kozakowski <mariusz.kozakowski@dsg.dk> wrote:
On Fri, 2018-05-25 at 11:21 +0200, Simone Tiraboschi wrote:
On Fri, May 25, 2018 at 9:20 AM, Mariusz Kozakowski <mariusz.kozakowski@dsg.dk> wrote:
On Thu, 2018-05-24 at 14:11 +0200, Simone Tiraboschi wrote:
To better understand what is happening you have to check the host-deploy logs; they are available under /var/log/ovirt-engine/host-deploy/ on your engine VM.
Unfortunately there are no logs under that directory; it's empty.
So it probably failed to reach the host due to a name resolution issue or something like that. Can you please double-check it in /var/log/ovirt-engine/engine.log on the engine VM?
Thanks - it helped a bit. At least now we have logs for host-deploy, but still no success.
A few parts I found in the engine log:
2018-05-28 11:07:39,473+02 ERROR [org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand] (EE-ManagedThreadFactory-engine-Thread-1) [1a4cf85e] Exception: org.ovirt.engine.core.common.errors.EngineException: EngineException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException: VDSGenericException: VDSNetworkException: Message timeout which can be caused by communication issues (Failed with error VDS_NETWORK_ERROR and code 5022)
2018-05-28 11:07:39,485+02 ERROR [org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand] (EE-ManagedThreadFactory-engine-Thread-1) [1a4cf85e] Host installation failed for host '098c3c99-921d-46f0-bdba-86370a2dc895', 'host01.redacted': Failed to configure management network on the host
The issue is in the network configuration: you have to check /var/log/vdsm/vdsm.log and /var/log/vdsm/supervdsm.log to understand why it failed.
2018-05-28 11:20:04,705+02 INFO  [org.ovirt.engine.core.bll.SetNonOperationalVdsCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-37) [5ba0ae45] Running command: SetNonOperationalVdsCommand internal: true. Entities affected : ID: 098c3c99-921d-46f0-bdba-86370a2dc895 Type: VDS
2018-05-28 11:20:04,711+02 INFO  [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-37) [5ba0ae45] START, SetVdsStatusVDSCommand(HostName = host01.redacted, SetVdsStatusVDSCommandParameters:{hostId='098c3c99-921d-46f0-bdba-86370a2dc895', status='NonOperational', nonOperationalReason='NETWORK_UNREACHABLE', stopSpmFailureLogged='false', maintenanceReason='null'}), log id: 11ebbdeb
2018-05-28 11:20:04,715+02 INFO  [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-37) [5ba0ae45] FINISH, SetVdsStatusVDSCommand, log id: 11ebbdeb
2018-05-28 11:20:04,769+02 ERROR [org.ovirt.engine.core.bll.SetNonOperationalVdsCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-37) [5ba0ae45] Host 'host01.redacted' is set to Non-Operational, it is missing the following networks: 'ovirtmgmt'
2018-05-28 11:20:04,786+02 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-37) [5ba0ae45] EVENT_ID: VDS_SET_NONOPERATIONAL_NETWORK(519), Host host01.redacted does not comply with the cluster Default networks, the following networks are missing on host: 'ovirtmgmt'
2018-05-28 11:20:04,807+02 INFO  [org.ovirt.engine.core.bll.HandleVdsCpuFlagsOrClusterChangedCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-37) [7937fb47] Running command: HandleVdsCpuFlagsOrClusterChangedCommand internal: true. Entities affected : ID: 098c3c99-921d-46f0-bdba-86370a2dc895 Type: VDS
2018-05-28 11:20:04,814+02 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-37) [7937fb47] EVENT_ID: VDS_DETECTED(13), Status of host host01.redacted was set to NonOperational.
2018-05-28 11:20:04,833+02 INFO  [org.ovirt.engine.core.bll.HandleVdsVersionCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-37) [4c10675c] Running command: HandleVdsVersionCommand internal: true. Entities affected : ID: 098c3c99-921d-46f0-bdba-86370a2dc895 Type: VDS
2018-05-28 11:20:04,837+02 INFO  [org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring] (EE-ManagedThreadFactory-engineScheduled-Thread-37) [4c10675c] Host 'host01.redacted'(098c3c99-921d-46f0-bdba-86370a2dc895) is already in NonOperational status for reason 'NETWORK_UNREACHABLE'. SetNonOperationalVds command is skipped.
Full log attached.
--
Best regards/Pozdrawiam/MfG
Mariusz Kozakowski
Site Reliability Engineer
Dansk Supermarked Group
Baltic Business Park
ul. 1 Maja 38-39
71-627 Szczecin
dansksupermarked.com