On Tue, Jun 5, 2018 at 10:05 AM, Mariusz Kozakowski <mariusz.kozakowski@sallinggroup.com> wrote:
Hi,

We managed to make some progress, but we are still facing issues.

2018-06-05 09:38:42,556+02 INFO  [org.ovirt.engine.core.bll.host.HostConnectivityChecker] (EE-ManagedThreadFactory-engine-Thread-1) [2617aebd] Engine managed to communicate with VDSM agent on host 'host01.redacted' with address 'host01.redacted' ('8af21ab3-ce7a-49a5-a526-94b65aa3da29')
2018-06-05 09:38:47,488+02 WARN  [org.ovirt.engine.core.bll.network.NetworkConfigurator] (EE-ManagedThreadFactory-engine-Thread-1) [2617aebd] Failed to find a valid interface for the management network of host host01.redacted. If the interface ovirtmgmt is a bridge, it should be torn-down manually.
2018-06-05 09:38:47,488+02 ERROR [org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand] (EE-ManagedThreadFactory-engine-Thread-1) [2617aebd] Exception: org.ovirt.engine.core.bll.network.NetworkConfigurator$NetworkConfiguratorException: Interface ovirtmgmt is invalid for management network

Our network configuration is below; bond0.1111 is bridged into ovirtmgmt:

But did you create the bridge manually, or did the engine create it for you, triggered by hosted-engine-setup?
 

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
[…]
11: ovirtmgmt: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 5c:f3:fc:da:b6:18 brd ff:ff:ff:ff:ff:ff
    inet 1.2.3.42/24 brd 1.2.3.255 scope global noprefixroute ovirtmgmt
       valid_lft forever preferred_lft forever
    inet6 fe80::e8dd:fff:fe33:4bba/64 scope link 
       valid_lft forever preferred_lft forever
12: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 5c:f3:fc:da:b6:18 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::5ef3:fcff:feda:b618/64 scope link 
       valid_lft forever preferred_lft forever
13: bond0.3019@bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br0.3019 state UP group default qlen 1000
    link/ether 5c:f3:fc:da:b6:18 brd ff:ff:ff:ff:ff:ff
14: bond0.1111@bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master ovirtmgmt state UP group default qlen 1000
    link/ether 5c:f3:fc:da:b6:18 brd ff:ff:ff:ff:ff:ff
15: br0.3019: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 5c:f3:fc:da:b6:18 brd ff:ff:ff:ff:ff:ff
    inet 19.2.3.22/16 brd 192.168.255.255 scope global noprefixroute br0.3019
       valid_lft forever preferred_lft forever
    inet6 fe80::5ef3:fcff:feda:b618/64 scope link 
       valid_lft forever preferred_lft forever
31: ;vdsmdummy;: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether da:aa:73:7e:d7:93 brd ff:ff:ff:ff:ff:ff
32: virbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 52:54:00:a6:75:67 brd ff:ff:ff:ff:ff:ff
    inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0
       valid_lft forever preferred_lft forever
33: virbr0-nic: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast master virbr0 state DOWN group default qlen 1000
    link/ether 52:54:00:a6:75:67 brd ff:ff:ff:ff:ff:ff
40: vnet0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master virbr0 state UNKNOWN group default qlen 1000
    link/ether fe:16:3e:2d:0d:55 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::fc16:3eff:fe2d:d55/64 scope link 
       valid_lft forever preferred_lft forever
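
To confirm from output like the above which interfaces are attached to a given bridge, a small parser can help. This is only a sketch: it assumes the text comes from `ip addr` in the format pasted above, and the names `bridge_ports` and `SAMPLE` are made up for illustration. It only reports what the host shows; it does not tell you what the engine will accept.

```python
import re

# Matches the first line of each interface stanza in `ip addr` output, e.g.
# "14: bond0.1111@bond0: <...> mtu 1500 qdisc noqueue master ovirtmgmt state UP ..."
IFACE_RE = re.compile(r"^\d+:\s+(?P<name>\S+?):\s+<[^>]*>(?P<rest>.*)$")
MASTER_RE = re.compile(r"\bmaster\s+(?P<master>\S+)")


def bridge_ports(ip_addr_output, bridge):
    """Return the interfaces whose 'master' is the given bridge (hypothetical helper)."""
    ports = []
    for line in ip_addr_output.splitlines():
        iface = IFACE_RE.match(line)
        if not iface:
            continue  # indented link/inet lines do not start a stanza
        master = MASTER_RE.search(iface.group("rest"))
        if master and master.group("master") == bridge:
            # Drop the "@parent" suffix that ip shows for VLAN devices.
            ports.append(iface.group("name").split("@", 1)[0])
    return ports


# A few stanza header lines copied from the output above.
SAMPLE = """\
11: ovirtmgmt: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
13: bond0.3019@bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br0.3019 state UP group default qlen 1000
14: bond0.1111@bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master ovirtmgmt state UP group default qlen 1000
33: virbr0-nic: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast master virbr0 state DOWN group default qlen 1000
"""

print(bridge_ports(SAMPLE, "ovirtmgmt"))  # ['bond0.1111']
```

On this host it shows bond0.1111 as the only port of ovirtmgmt, so the bridge already exists before the engine tries to configure the management network.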

On Mon, 2018-05-28 at 12:57 +0200, Simone Tiraboschi wrote:


On Mon, May 28, 2018 at 11:44 AM, Mariusz Kozakowski <mariusz.kozakowski@dsg.dk> wrote:
On Fri, 2018-05-25 at 11:21 +0200, Simone Tiraboschi wrote:



On Fri, May 25, 2018 at 9:20 AM, Mariusz Kozakowski <mariusz.kozakowski@dsg.dk> wrote:
On Thu, 2018-05-24 at 14:11 +0200, Simone Tiraboschi wrote:
To better understand what is happening, you have to check the host-deploy logs; they are available under /var/log/ovirt-engine/host-deploy/ on your engine VM.

Unfortunately, there are no logs under that directory. It's empty.


So it probably failed to reach the host due to a name resolution issue or something similar.
Can you please double-check this in /var/log/ovirt-engine/engine.log on the engine VM?
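
A quick way to rule name resolution in or out is to ask the resolver directly from the engine VM. The following is a minimal sketch using Python's standard `socket.getaddrinfo`; the helper name `resolve_host` is hypothetical, and the host name to test is whatever FQDN you gave the engine for the host.

```python
import socket


def resolve_host(name):
    """Return the sorted, de-duplicated addresses `name` resolves to, or [] on failure."""
    try:
        infos = socket.getaddrinfo(name, None)
    except socket.gaierror:
        # The resolver could not map the name to any address.
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address.
    return sorted({info[4][0] for info in infos})
```

Calling `resolve_host("host01.redacted")` with the real FQDN in place of the redacted name should return the address the engine is expected to use; an empty list would point at DNS or /etc/hosts rather than at the host itself.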
 

Thanks - it helped a bit. At least now we have logs for host-deploy, but still no success.

A few parts I found in the engine log:

2018-05-28 11:07:39,473+02 ERROR [org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand] (EE-ManagedThreadFactory-engine-Thread-1) [1a4cf85e] Exception: org.ovirt.engine.core.common.errors.EngineException: EngineException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException: VDSGenericException: VDSNetworkException: Message timeout which can be caused by communication issues (Failed with error VDS_NETWORK_ERROR and code 5022)


2018-05-28 11:07:39,485+02 ERROR [org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand] (EE-ManagedThreadFactory-engine-Thread-1) [1a4cf85e] Host installation failed for host '098c3c99-921d-46f0-bdba-86370a2dc895', 'host01.redacted': Failed to configure management network on the host



The issue is in the network configuration:
you have to check /var/log/vdsm/vdsm.log and /var/log/vdsm/supervdsm.log to understand why it failed.

 

2018-05-28 11:20:04,705+02 INFO  [org.ovirt.engine.core.bll.SetNonOperationalVdsCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-37) [5ba0ae45] Running command: SetNonOperationalVdsCommand internal: true. Entities affected :  ID: 098c3c99-921d-46f0-bdba-86370a2dc895 Type: VDS
2018-05-28 11:20:04,711+02 INFO  [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-37) [5ba0ae45] START, SetVdsStatusVDSCommand(HostName = host01.redacted, SetVdsStatusVDSCommandParameters:{hostId='098c3c99-921d-46f0-bdba-86370a2dc895', status='NonOperational', nonOperationalReason='NETWORK_UNREACHABLE', stopSpmFailureLogged='false', maintenanceReason='null'}), log id: 11ebbdeb
2018-05-28 11:20:04,715+02 INFO  [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-37) [5ba0ae45] FINISH, SetVdsStatusVDSCommand, log id: 11ebbdeb
2018-05-28 11:20:04,769+02 ERROR [org.ovirt.engine.core.bll.SetNonOperationalVdsCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-37) [5ba0ae45] Host 'host01.redacted' is set to Non-Operational, it is missing the following networks: 'ovirtmgmt'
2018-05-28 11:20:04,786+02 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-37) [5ba0ae45] EVENT_ID: VDS_SET_NONOPERATIONAL_NETWORK(519), Host host01.redacted does not comply with the cluster Default networks, the following networks are missing on host: 'ovirtmgmt'
2018-05-28 11:20:04,807+02 INFO  [org.ovirt.engine.core.bll.HandleVdsCpuFlagsOrClusterChangedCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-37) [7937fb47] Running command: HandleVdsCpuFlagsOrClusterChangedCommand internal: true. Entities affected :  ID: 098c3c99-921d-46f0-bdba-86370a2dc895 Type: VDS
2018-05-28 11:20:04,814+02 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-37) [7937fb47] EVENT_ID: VDS_DETECTED(13), Status of host host01.redacted was set to NonOperational.
2018-05-28 11:20:04,833+02 INFO  [org.ovirt.engine.core.bll.HandleVdsVersionCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-37) [4c10675c] Running command: HandleVdsVersionCommand internal: true. Entities affected :  ID: 098c3c99-921d-46f0-bdba-86370a2dc895 Type: VDS
2018-05-28 11:20:04,837+02 INFO  [org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring] (EE-ManagedThreadFactory-engineScheduled-Thread-37) [4c10675c] Host 'host01.redacted'(098c3c99-921d-46f0-bdba-86370a2dc895) is already in NonOperational status for reason 'NETWORK_UNREACHABLE'. SetNonOperationalVds command is skipped.
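
To pull only the relevant lines out of a long engine.log, something like the following can be used. It is a sketch that assumes every line follows the layout seen in the excerpts above (timestamp, level, logger in brackets, thread in parentheses, a bracketed ID, then the message). Treating the bracketed ID as a correlation ID for one flow is an assumption, and `filter_log` is a hypothetical helper name.

```python
import re

# Layout assumed from the excerpts above:
# "<timestamp> <LEVEL> [<logger>] (<thread>) [<id>] <message>"
LINE_RE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}[+-]\d{2})\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"\[(?P<logger>[^\]]+)\]\s+"
    r"\((?P<thread>[^)]+)\)\s+"
    r"\[(?P<correlation_id>[^\]]*)\]\s+"
    r"(?P<message>.*)$"
)


def filter_log(lines, levels=("WARN", "ERROR"), correlation_id=None):
    """Return parsed entries matching the given levels and, optionally, one bracketed ID."""
    entries = []
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if not match:
            continue  # continuation lines such as stack traces are skipped
        entry = match.groupdict()
        if entry["level"] not in levels:
            continue
        if correlation_id is not None and entry["correlation_id"] != correlation_id:
            continue
        entries.append(entry)
    return entries
```

For example, `filter_log(open("/var/log/ovirt-engine/engine.log"), correlation_id="5ba0ae45")` would keep only the WARN and ERROR lines that carry that ID, which in the excerpt above are the two lines about the missing 'ovirtmgmt' network.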

The full log is attached.


-- 
Best regards/Pozdrawiam/MfG

Mariusz Kozakowski

Site Reliability Engineer

Dansk Supermarked Group
Baltic Business Park
ul. 1 Maja 38-39
71-627 Szczecin
dansksupermarked.com