oVirt hosted-engine-setup issues with getting host facts

Hello,

We've been trying to set up an oVirt environment for a few days, but we have an issue with hosted-engine-setup (Ansible script). We managed to fix a few small things and have them merged upstream, but right now the installation process fails on getting host facts. It looks like it cannot proceed because it fails when connecting to the ovirt-engine API of the bootstrap VM. The oVirt API / web panel is working; I tested it via a browser and I can log in without issues using the admin password chosen earlier in the process.

2018-05-18 15:26:47,800+0200 INFO otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:100 TASK [Wait for the host to be up]
2018-05-18 15:39:14,025+0200 DEBUG otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:94 {u'_ansible_parsed': True, u'_ansible_no_log': False, u'changed': False, u'attempts': 120, u'invocation': {u'module_args': {u'pattern': u'name=host01.redacted', u'fetch_nested': False, u'nested_attributes': []}}, u'ansible_facts': {u'ovirt_hosts': []}}
2018-05-18 15:39:14,127+0200 ERROR otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:98 fatal: [localhost]: FAILED! => {"ansible_facts": {"ovirt_hosts": []}, "attempts": 120, "changed": false}

May 18 13:34:34 host01 python: ansible-ovirt_hosts_facts Invoked with pattern=name=host01.redacted fetch_nested=False nested_attributes=[] auth={'timeout': 0, 'url': 'https://ovirt-dev.redacted/ovirt-engine/api', 'insecure': True, 'kerberos': False, 'compress': True, 'headers': None, 'token': 'R--token-redacted', 'ca_file': None}

Do you have any idea what the issue is and how to fix it?
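For context, the failing "Wait for the host to be up" task keeps re-running ovirt_hosts_facts until the engine reports the freshly added host. This is only a rough sketch of the condition it retries on (the helper name is hypothetical, not the actual playbook code); it mirrors the DEBUG line above, where after 120 attempts ovirt_hosts was still empty:

```python
def host_is_up(ansible_facts):
    """Return True once the engine reports at least one host, all with status 'up'.

    This is an editor's sketch of the retry condition, not oVirt code.
    """
    hosts = ansible_facts.get("ovirt_hosts", [])
    return bool(hosts) and all(h.get("status") == "up" for h in hosts)

# The failing case from the setup log: the engine never listed the host at all.
failing_result = {"ovirt_hosts": []}
print(host_is_up(failing_result))  # False: this is why the task kept retrying
```

An empty `ovirt_hosts` list therefore means the engine itself never registered the host, rather than the facts call failing.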

On Wed, May 23, 2018 at 8:35 AM, Mariusz Kozakowski < mariusz.kozakowski@dsg.dk> wrote:
[...]
Hi, it correctly gets host facts from the engine API; the issue now is simply that no host appears in the reported facts, which means that the engine, for some reason, failed to configure the host where you are running hosted-engine-setup. To better understand what is happening you have to check the host-deploy logs; they are available under /var/log/ovirt-engine/host-deploy/ on your engine VM.
_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-leave@ovirt.org

On Thu, 2018-05-24 at 14:11 +0200, Simone Tiraboschi wrote:
To better understand what is happening you have to check host-deploy logs; they are available under /var/log/ovirt-engine/host-deploy/ on your engine VM.

Unfortunately there are no logs under that directory. It's empty.

--
Best regards/Pozdrawiam/MfG
Mariusz Kozakowski
Site Reliability Engineer
Dansk Supermarked Group
Baltic Business Park
ul. 1 Maja 38-39
71-627 Szczecin
dansksupermarked.com

On Fri, May 25, 2018 at 9:20 AM, Mariusz Kozakowski < mariusz.kozakowski@dsg.dk> wrote:
Unfortunately there are no logs under that directory. It's empty.
So it probably failed to reach the host due to a name resolution issue or something like that. Can you please double-check in /var/log/ovirt-engine/engine.log on the engine VM?
--
Best regards/Pozdrawiam/MfG
*Mariusz Kozakowski*
Site Reliability Engineer
Dansk Supermarked Group
Baltic Business Park
ul. 1 Maja 38-39
71-627 Szczecin
dansksupermarked.com
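When scanning engine.log for a deploy failure like this, filtering on the severity field narrows things down quickly. A small standalone helper (an editor's sketch, not part of oVirt, assuming the standard engine.log layout where severity is the third whitespace-separated field):

```python
def interesting_lines(log_text, levels=("ERROR", "WARN")):
    """Yield engine.log lines whose severity matches one of `levels`.

    engine.log lines look like:
    2018-05-28 11:20:04,769+02 ERROR [org.ovirt...] (thread) [corr-id] message
    so the severity is the third whitespace-separated field.
    """
    for line in log_text.splitlines():
        parts = line.split()
        if len(parts) > 2 and parts[2] in levels:
            yield line

# Two lines taken from the log excerpts quoted in this thread:
sample = (
    "2018-05-28 11:20:04,715+02 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] FINISH\n"
    "2018-05-28 11:20:04,769+02 ERROR [org.ovirt.engine.core.bll.SetNonOperationalVdsCommand] "
    "Host 'host01.redacted' is set to Non-Operational"
)
hits = list(interesting_lines(sample))
print(len(hits))  # 1
```

In practice the same filtering is just `grep -E 'ERROR|WARN' engine.log` around the deploy timestamp; the helper only makes the field layout explicit.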

On Fri, 2018-05-25 at 11:21 +0200, Simone Tiraboschi wrote:
So it probably failed to reach the host due to a name resolution issue or something like that. Can you please double-check in /var/log/ovirt-engine/engine.log on the engine VM?

Thanks - it helped a bit. At least now we have logs for host-deploy, but still no success. A few parts I found in the engine log:

2018-05-28 11:07:39,473+02 ERROR [org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand] (EE-ManagedThreadFactory-engine-Thread-1) [1a4cf85e] Exception: org.ovirt.engine.core.common.errors.EngineException: EngineException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException: VDSGenericException: VDSNetworkException: Message timeout which can be caused by communication issues (Failed with error VDS_NETWORK_ERROR and code 5022)
2018-05-28 11:07:39,485+02 ERROR [org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand] (EE-ManagedThreadFactory-engine-Thread-1) [1a4cf85e] Host installation failed for host '098c3c99-921d-46f0-bdba-86370a2dc895', 'host01.redacted': Failed to configure management network on the host
2018-05-28 11:20:04,705+02 INFO [org.ovirt.engine.core.bll.SetNonOperationalVdsCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-37) [5ba0ae45] Running command: SetNonOperationalVdsCommand internal: true. Entities affected : ID: 098c3c99-921d-46f0-bdba-86370a2dc895 Type: VDS
2018-05-28 11:20:04,711+02 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-37) [5ba0ae45] START, SetVdsStatusVDSCommand(HostName = host01.redacted, SetVdsStatusVDSCommandParameters:{hostId='098c3c99-921d-46f0-bdba-86370a2dc895', status='NonOperational', nonOperationalReason='NETWORK_UNREACHABLE', stopSpmFailureLogged='false', maintenanceReason='null'}), log id: 11ebbdeb
2018-05-28 11:20:04,715+02 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-37) [5ba0ae45] FINISH, SetVdsStatusVDSCommand, log id: 11ebbdeb
2018-05-28 11:20:04,769+02 ERROR [org.ovirt.engine.core.bll.SetNonOperationalVdsCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-37) [5ba0ae45] Host 'host01.redacted' is set to Non-Operational, it is missing the following networks: 'ovirtmgmt'
2018-05-28 11:20:04,786+02 WARN [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-37) [5ba0ae45] EVENT_ID: VDS_SET_NONOPERATIONAL_NETWORK(519), Host host01.redacted does not comply with the cluster Default networks, the following networks are missing on host: 'ovirtmgmt'
2018-05-28 11:20:04,807+02 INFO [org.ovirt.engine.core.bll.HandleVdsCpuFlagsOrClusterChangedCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-37) [7937fb47] Running command: HandleVdsCpuFlagsOrClusterChangedCommand internal: true. Entities affected : ID: 098c3c99-921d-46f0-bdba-86370a2dc895 Type: VDS
2018-05-28 11:20:04,814+02 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-37) [7937fb47] EVENT_ID: VDS_DETECTED(13), Status of host host01.redacted was set to NonOperational.
2018-05-28 11:20:04,833+02 INFO [org.ovirt.engine.core.bll.HandleVdsVersionCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-37) [4c10675c] Running command: HandleVdsVersionCommand internal: true. Entities affected : ID: 098c3c99-921d-46f0-bdba-86370a2dc895 Type: VDS
2018-05-28 11:20:04,837+02 INFO [org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring] (EE-ManagedThreadFactory-engineScheduled-Thread-37) [4c10675c] Host 'host01.redacted'(098c3c99-921d-46f0-bdba-86370a2dc895) is already in NonOperational status for reason 'NETWORK_UNREACHABLE'. SetNonOperationalVds command is skipped.

On Mon, May 28, 2018 at 11:44 AM, Mariusz Kozakowski < mariusz.kozakowski@dsg.dk> wrote:
Thanks - it helped a bit. At least now we have logs for host-deploy, but still no success.
A few parts I found in the engine log:
2018-05-28 11:07:39,473+02 ERROR [org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand] (EE-ManagedThreadFactory-engine-Thread-1) [1a4cf85e] Exception: org.ovirt.engine.core.common.errors.EngineException: EngineException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException: VDSGenericException: VDSNetworkException: Message timeout which can be caused by communication issues (Failed with error VDS_NETWORK_ERROR and code 5022)
2018-05-28 11:07:39,485+02 ERROR [org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand] (EE-ManagedThreadFactory-engine-Thread-1) [1a4cf85e] Host installation failed for host '098c3c99-921d-46f0-bdba-86370a2dc895', 'host01.redacted': Failed to configure management network on the host
The issue is in the network configuration: you have to check /var/log/vdsm/vdsm.log and /var/log/vdsm/supervdsm.log to understand why it failed.

On Mon, 2018-05-28 at 12:57 +0200, Simone Tiraboschi wrote:
The issue is in the network configuration: you have to check /var/log/vdsm/vdsm.log and /var/log/vdsm/supervdsm.log to understand why it failed.

From the same time frame, vdsm.log. Can this be related?

2018-05-28 11:07:34,481+0200 INFO (jsonrpc/1) [api.host] START getAllVmStats() from=::1,45816 (api:46)
2018-05-28 11:07:34,482+0200 INFO (jsonrpc/1) [api.host] FINISH getAllVmStats return={'status': {'message': 'Done', 'code': 0}, 'statsList': (suppressed)} from=::1,45816 (api:52)
2018-05-28 11:07:34,483+0200 INFO (jsonrpc/1) [jsonrpc.JsonRpcServer] RPC call Host.getAllVmStats succeeded in 0.01 seconds (__init__:573)
2018-05-28 11:07:34,489+0200 INFO (jsonrpc/2) [api.host] START getAllVmIoTunePolicies() from=::1,45816 (api:46)
2018-05-28 11:07:34,489+0200 INFO (jsonrpc/2) [api.host] FINISH getAllVmIoTunePolicies return={'status': {'message': 'Done', 'code': 0}, 'io_tune_policies_dict': {'405f8ec0-03f9-43cb-a7e1-343a4c30453f': {'policy': [], 'current_values': []}}} from=::1,45816 (api:52)
2018-05-28 11:07:34,490+0200 INFO (jsonrpc/2) [jsonrpc.JsonRpcServer] RPC call Host.getAllVmIoTunePolicies succeeded in 0.00 seconds (__init__:573)
2018-05-28 11:07:35,555+0200 INFO (vmrecovery) [vdsm.api] START getConnectedStoragePoolsList(options=None) from=internal, task_id=6f517c47-a9f3-4913-bf9d-661355262c38 (api:46)
2018-05-28 11:07:35,555+0200 INFO (vmrecovery) [vdsm.api] FINISH getConnectedStoragePoolsList return={'poollist': []} from=internal, task_id=6f517c47-a9f3-4913-bf9d-661355262c38 (api:52)
2018-05-28 11:07:35,555+0200 INFO (vmrecovery) [vds] recovery: waiting for storage pool to go up (clientIF:707)
2018-05-28 11:07:38,982+0200 WARN (vdsm.Scheduler) [Executor] Worker blocked: <Worker name=jsonrpc/7 running <Task <JsonRpcTask {'params': {}, 'jsonrpc': '2.0', 'method': u'Host.getCapabilities', 'id': u'b5990e16-65c9-4137-aeed-271c415f9df5'} at 0x7eff50088910> timeout=60, duration=180 at 0x2fc3650> task#=1 at 0x3590710>, traceback:
File: "/usr/lib64/python2.7/threading.py", line 785, in __bootstrap self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 812, in __bootstrap_inner self.run()
File: "/usr/lib64/python2.7/threading.py", line 765, in run self.__target(*self.__args, **self.__kwargs)
File: "/usr/lib/python2.7/site-packages/vdsm/common/concurrent.py", line 194, in run ret = func(*args, **kwargs)
File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 301, in _run self._execute_task()
File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 315, in _execute_task task()
File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 391, in __call__ self._callable()
File: "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 523, in __call__ self._handler(self._ctx, self._req)
File: "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 566, in _serveRequest response = self._handle_request(req, ctx)
File: "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 606, in _handle_request res = method(**params)
File: "/usr/lib/python2.7/site-packages/vdsm/rpc/Bridge.py", line 201, in _dynamicMethod result = fn(*methodArgs)
File: "<string>", line 2, in getCapabilities
File: "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 48, in method ret = func(*args, **kwargs)
File: "/usr/lib/python2.7/site-packages/vdsm/API.py", line 1339, in getCapabilities c = caps.get()
File: "/usr/lib/python2.7/site-packages/vdsm/host/caps.py", line 201, in get liveSnapSupported = _getLiveSnapshotSupport(cpuarch.effective())
File: "/usr/lib/python2.7/site-packages/vdsm/common/cache.py", line 41, in __call__ value = self.func(*args)
File: "/usr/lib/python2.7/site-packages/vdsm/host/caps.py", line 92, in _getLiveSnapshotSupport capabilities = _getCapsXMLStr()
File: "/usr/lib/python2.7/site-packages/vdsm/common/cache.py", line 41, in __call__ value = self.func(*args)
File: "/usr/lib/python2.7/site-packages/vdsm/host/caps.py", line 60, in _getCapsXMLStr return _getFreshCapsXMLStr()
File: "/usr/lib/python2.7/site-packages/vdsm/host/caps.py", line 55, in _getFreshCapsXMLStr return libvirtconnection.get().getCapabilities()
File: "/usr/lib/python2.7/site-packages/vdsm/common/libvirtconnection.py", line 130, in wrapper ret = f(*args, **kwargs)
File: "/usr/lib/python2.7/site-packages/vdsm/common/function.py", line 92, in wrapper return func(inst, *args, **kwargs)
File: "/usr/lib64/python2.7/site-packages/libvirt.py", line 3669, in getCapabilities ret = libvirtmod.virConnectGetCapabilities(self._o) (executor:363)
2018-05-28 11:07:40,561+0200 INFO (vmrecovery) [vdsm.api] START getConnectedStoragePoolsList(options=None) from=internal, task_id=2ef7c0a3-4cf9-436e-ad6b-ee04e2a0cf3a (api:46)
2018-05-28 11:07:40,561+0200 INFO (vmrecovery) [vdsm.api] FINISH getConnectedStoragePoolsList return={'poollist': []} from=internal, task_id=2ef7c0a3-4cf9-436e-ad6b-ee04e2a0cf3a (api:52)
2018-05-28 11:07:40,561+0200 INFO (vmrecovery) [vds] recovery: waiting for storage pool to go up (clientIF:707)

supervdsm has nothing at 11:07:39.
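The key signal in the vdsm.log excerpt above is the "Worker blocked" warning: a Host.getCapabilities call sat for 180 seconds (against a 60-second timeout) inside libvirt's virConnectGetCapabilities, which points at an unresponsive libvirtd rather than vdsm itself and would explain the engine-side message timeout. A small standalone helper (an editor's sketch, not oVirt code) that pulls such warnings out of a vdsm.log:

```python
import re

def blocked_workers(log_text):
    """Extract (method, duration) pairs from vdsm 'Worker blocked' warnings.

    Matches the pattern seen in the log above, e.g.
    ... Worker blocked: <Worker ... 'method': u'Host.getCapabilities' ... duration=180 ...
    """
    pattern = re.compile(r"Worker blocked:.*?'method': u?'([^']+)'.*?duration=(\d+)")
    return [(m.group(1), int(m.group(2))) for m in pattern.finditer(log_text)]

# The warning line quoted in this thread (abbreviated id, same shape):
sample = ("2018-05-28 11:07:38,982+0200 WARN (vdsm.Scheduler) [Executor] Worker blocked: "
          "<Worker name=jsonrpc/7 running <Task <JsonRpcTask {'params': {}, 'jsonrpc': '2.0', "
          "'method': u'Host.getCapabilities', 'id': u'b5990e16'} at 0x7eff50088910> "
          "timeout=60, duration=180 at 0x2fc3650> task#=1 at 0x3590710>")
print(blocked_workers(sample))  # [('Host.getCapabilities', 180)]
```

A duration far beyond the timeout on a libvirt-backed method is usually worth checking from the libvirt side (e.g. whether `virsh capabilities` also hangs on the host).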

On Mon, 2018-05-28 at 14:30 +0200, Mariusz Kozakowski wrote:
[...]
supervdsm has nothing at 11:07:39.
Found another one, maybe related:

2018-05-28 11:02:44,371+0200 ERROR (vm/405f8ec0) [virt.vm] (vmId='405f8ec0-03f9-43cb-a7e1-343a4c30453f') Failed to connect to guest agent channel (vm:2500)
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2498, in _vmDependentInit self.guestAgent.start()
File "/usr/lib/python2.7/site-packages/vdsm/virt/guestagent.py", line 249, in start self._prepare_socket()
File "/usr/lib/python2.7/site-packages/vdsm/virt/guestagent.py", line 291, in _prepare_socket supervdsm.getProxy().prepareVmChannel(self._socketName)
File "/usr/lib/python2.7/site-packages/vdsm/common/supervdsm.py", line 55, in __call__ return callMethod()
File "/usr/lib/python2.7/site-packages/vdsm/common/supervdsm.py", line 53, in <lambda> **kwargs)
File "<string>", line 2, in prepareVmChannel
File "/usr/lib64/python2.7/multiprocessing/managers.py", line 773, in _callmethod raise convert_to_error(kind, result)
OSError: [Errno 2] No such file or directory: '/var/lib/libvirt/qemu/channels/405f8ec0-03f9-43cb-a7e1-343a4c30453f.com.redhat.rhevm.vdsm'

The interesting part: there is no file ending in com.redhat.rhevm.vdsm, but there is `/var/lib/libvirt/qemu/channels/405f8ec0-03f9-43cb-a7e1-343a4c30453f.org.qemu.guest_agent.0`.
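A quick way to confirm that mismatch is to compare the socket names libvirt actually created for the VM against the suffix vdsm's prepareVmChannel expects. A standalone sketch (the filenames are taken from the error above; the helper itself is hypothetical, not oVirt code):

```python
def missing_channels(existing, vm_id, required_suffixes=("com.redhat.rhevm.vdsm",)):
    """Given filenames from /var/lib/libvirt/qemu/channels/, report which
    required per-VM channel sockets are absent. Editor's sketch only."""
    return [s for s in required_suffixes
            if "{}.{}".format(vm_id, s) not in existing]

# Reproducing the situation from the log: only the qemu guest agent socket exists.
vm_id = "405f8ec0-03f9-43cb-a7e1-343a4c30453f"
present = ["{}.org.qemu.guest_agent.0".format(vm_id)]
print(missing_channels(present, vm_id))  # ['com.redhat.rhevm.vdsm']
```

On a real host, `existing` would come from listing the channels directory (e.g. `os.listdir("/var/lib/libvirt/qemu/channels")`).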

Hi, we managed to get a bit forward, but we still face issues.

2018-06-05 09:38:42,556+02 INFO [org.ovirt.engine.core.bll.host.HostConnectivityChecker] (EE-ManagedThreadFactory-engine-Thread-1) [2617aebd] Engine managed to communicate with VDSM agent on host 'host01.redacted' with address 'host01.redacted' ('8af21ab3-ce7a-49a5-a526-94b65aa3da29')
2018-06-05 09:38:47,488+02 WARN [org.ovirt.engine.core.bll.network.NetworkConfigurator] (EE-ManagedThreadFactory-engine-Thread-1) [2617aebd] Failed to find a valid interface for the management network of host host01.redacted. If the interface ovirtmgmt is a bridge, it should be torn-down manually.
2018-06-05 09:38:47,488+02 ERROR [org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand] (EE-ManagedThreadFactory-engine-Thread-1) [2617aebd] Exception: org.ovirt.engine.core.bll.network.NetworkConfigurator$NetworkConfiguratorException: Interface ovirtmgmt is invalid for management network

Our network configuration, bond0.1111 is bridged into ovirtmgmt:

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
[…]
11: ovirtmgmt: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 5c:f3:fc:da:b6:18 brd ff:ff:ff:ff:ff:ff
    inet 1.2.3.42/24 brd 1.2.3.255 scope global noprefixroute ovirtmgmt
       valid_lft forever preferred_lft forever
    inet6 fe80::e8dd:fff:fe33:4bba/64 scope link
       valid_lft forever preferred_lft forever
12: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 5c:f3:fc:da:b6:18 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::5ef3:fcff:feda:b618/64 scope link
       valid_lft forever preferred_lft forever
13: bond0.3019@bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br0.3019 state UP group default qlen 1000
    link/ether 5c:f3:fc:da:b6:18 brd ff:ff:ff:ff:ff:ff
14: bond0.1111@bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master ovirtmgmt state UP group default qlen 1000
    link/ether 5c:f3:fc:da:b6:18 brd ff:ff:ff:ff:ff:ff
15: br0.3019: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 5c:f3:fc:da:b6:18 brd ff:ff:ff:ff:ff:ff
    inet 19.2.3.22/16 brd 192.168.255.255 scope global noprefixroute br0.3019
       valid_lft forever preferred_lft forever
    inet6 fe80::5ef3:fcff:feda:b618/64 scope link
       valid_lft forever preferred_lft forever
31: ;vdsmdummy;: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether da:aa:73:7e:d7:93 brd ff:ff:ff:ff:ff:ff
32: virbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 52:54:00:a6:75:67 brd ff:ff:ff:ff:ff:ff
    inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0
       valid_lft forever preferred_lft forever
33: virbr0-nic: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast master virbr0 state DOWN group default qlen 1000
    link/ether 52:54:00:a6:75:67 brd ff:ff:ff:ff:ff:ff
40: vnet0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master virbr0 state UNKNOWN group default qlen 1000
    link/ether fe:16:3e:2d:0d:55 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::fc16:3eff:fe2d:d55/64 scope link
       valid_lft forever preferred_lft forever

On Mon, 2018-05-28 at 12:57 +0200, Simone Tiraboschi wrote:
[...]

--
Best regards/Pozdrawiam/MfG
Mariusz Kozakowski
Site Reliability Engineer
Dansk Supermarked Group
Baltic Business Park
ul. 1 Maja 38-39
71-627 Szczecin
dansksupermarked.com

On Tue, Jun 5, 2018 at 10:05 AM, Mariusz Kozakowski <mariusz.kozakowski@sallinggroup.com> wrote:
Hi,
We managed to get a bit further, but we still face issues.
2018-06-05 09:38:42,556+02 INFO [org.ovirt.engine.core.bll.host.HostConnectivityChecker] (EE-ManagedThreadFactory-engine-Thread-1) [2617aebd] Engine managed to communicate with VDSM agent on host 'host01.redacted' with address 'host01.redacted' ('8af21ab3-ce7a-49a5-a526-94b65aa3da29')
2018-06-05 09:38:47,488+02 WARN [org.ovirt.engine.core.bll.network.NetworkConfigurator] (EE-ManagedThreadFactory-engine-Thread-1) [2617aebd] Failed to find a valid interface for the management network of host host01.redacted. If the interface ovirtmgmt is a bridge, it should be torn-down manually.
2018-06-05 09:38:47,488+02 ERROR [org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand] (EE-ManagedThreadFactory-engine-Thread-1) [2617aebd] Exception: org.ovirt.engine.core.bll.network.NetworkConfigurator$NetworkConfiguratorException: Interface ovirtmgmt is invalid for management network
Our network configuration: bond0.1111 is bridged into ovirtmgmt:
But did you create the bridge manually, or did the engine create it for you, triggered by hosted-engine-setup?
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
[…]
11: ovirtmgmt: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 5c:f3:fc:da:b6:18 brd ff:ff:ff:ff:ff:ff
    inet 1.2.3.42/24 brd 1.2.3.255 scope global noprefixroute ovirtmgmt
       valid_lft forever preferred_lft forever
    inet6 fe80::e8dd:fff:fe33:4bba/64 scope link
       valid_lft forever preferred_lft forever
12: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 5c:f3:fc:da:b6:18 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::5ef3:fcff:feda:b618/64 scope link
       valid_lft forever preferred_lft forever
13: bond0.3019@bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br0.3019 state UP group default qlen 1000
    link/ether 5c:f3:fc:da:b6:18 brd ff:ff:ff:ff:ff:ff
14: bond0.1111@bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master ovirtmgmt state UP group default qlen 1000
    link/ether 5c:f3:fc:da:b6:18 brd ff:ff:ff:ff:ff:ff
15: br0.3019: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 5c:f3:fc:da:b6:18 brd ff:ff:ff:ff:ff:ff
    inet 19.2.3.22/16 brd 192.168.255.255 scope global noprefixroute br0.3019
       valid_lft forever preferred_lft forever
    inet6 fe80::5ef3:fcff:feda:b618/64 scope link
       valid_lft forever preferred_lft forever
31: ;vdsmdummy;: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether da:aa:73:7e:d7:93 brd ff:ff:ff:ff:ff:ff
32: virbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 52:54:00:a6:75:67 brd ff:ff:ff:ff:ff:ff
    inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0
       valid_lft forever preferred_lft forever
33: virbr0-nic: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast master virbr0 state DOWN group default qlen 1000
    link/ether 52:54:00:a6:75:67 brd ff:ff:ff:ff:ff:ff
40: vnet0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master virbr0 state UNKNOWN group default qlen 1000
    link/ether fe:16:3e:2d:0d:55 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::fc16:3eff:fe2d:d55/64 scope link
       valid_lft forever preferred_lft forever
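To double-check which interfaces are actually enslaved to the management bridge, a small filter over `ip -o link` output can help. A sketch; the bridge name comes from the output above, everything else is illustrative:

```shell
# Print the interfaces enslaved to a given bridge, reading 'ip -o link'
# output on stdin. On the host above, 'bridge_ports ovirtmgmt' should
# report bond0.1111.
bridge_ports() {
  # Field 2 of each line is the device name ("bond0.1111@bond0:");
  # strip the "@parent" suffix and the trailing colon.
  awk -v br="$1" '$0 ~ ("master " br " ") { sub(/[@:].*/, "", $2); print $2 }'
}

ip -o link show 2>/dev/null | bridge_ports ovirtmgmt
```

The same filter works on a saved copy of the `ip -o link` output, which is handy when comparing a failed host against a working one.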
On Mon, 2018-05-28 at 12:57 +0200, Simone Tiraboschi wrote:
On Mon, May 28, 2018 at 11:44 AM, Mariusz Kozakowski <mariusz.kozakowski@dsg.dk> wrote:
On Fri, 2018-05-25 at 11:21 +0200, Simone Tiraboschi wrote:
On Fri, May 25, 2018 at 9:20 AM, Mariusz Kozakowski <mariusz.kozakowski@dsg.dk> wrote:
On Thu, 2018-05-24 at 14:11 +0200, Simone Tiraboschi wrote:
To better understand what is happening, you have to check the host-deploy logs; they are available under /var/log/ovirt-engine/host-deploy/ on your engine VM.
Unfortunately there are no logs under that directory. It's empty.
So it probably failed to reach the host due to a name resolution issue or something like that. Can you please double-check it in /var/log/ovirt-engine/engine.log on the engine VM?
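Since empty host-deploy logs often mean the engine never reached the host at all, a quick resolution check from the engine VM can rule that out. A sketch; the FQDN is a placeholder for your host's name:

```shell
# Check that a name resolves through the normal NSS lookup path
# (DNS and /etc/hosts), which is also what the engine sees.
resolves() {
  getent hosts "$1" > /dev/null
}

if resolves "host01.example.com"; then   # placeholder FQDN
  echo "host resolves"
else
  echo "host does NOT resolve -- fix DNS or /etc/hosts before retrying"
fi
```

Running the same check in the other direction (host resolving the engine FQDN) is worth a moment too.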
Thanks - it helped a bit. At least now we have logs for host-deploy, but still no success.
A few parts I found in the engine log:
2018-05-28 11:07:39,473+02 ERROR [org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand] (EE-ManagedThreadFactory-engine-Thread-1) [1a4cf85e] Exception: org.ovirt.engine.core.common.errors.EngineException: EngineException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException: VDSGenericException: VDSNetworkException: Message timeout which can be caused by communication issues (Failed with error VDS_NETWORK_ERROR and code 5022)
2018-05-28 11:07:39,485+02 ERROR [org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand] (EE-ManagedThreadFactory-engine-Thread-1) [1a4cf85e] Host installation failed for host '098c3c99-921d-46f0-bdba-86370a2dc895', 'host01.redacted': Failed to configure management network on the host
The issue is in the network configuration: you have to check /var/log/vdsm/vdsm.log and /var/log/vdsm/supervdsm.log to understand why it failed.
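To narrow the vdsm side down quickly, filtering the two logs mentioned above for error-level lines is usually enough to find the failed network call. A sketch; the paths are the defaults from the thread:

```shell
# Print only the ERROR/WARN lines from the given vdsm logs; the level
# field is surrounded by spaces, which keeps false positives out.
vdsm_errors() {
  grep -hE ' (ERROR|WARN) ' "$@"
}

# Show the most recent 50 hits from the default vdsm log locations.
vdsm_errors /var/log/vdsm/vdsm.log /var/log/vdsm/supervdsm.log 2>/dev/null | tail -n 50
```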
2018-05-28 11:20:04,705+02 INFO [org.ovirt.engine.core.bll.SetNonOperationalVdsCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-37) [5ba0ae45] Running command: SetNonOperationalVdsCommand internal: true. Entities affected : ID: 098c3c99-921d-46f0-bdba-86370a2dc895 Type: VDS
2018-05-28 11:20:04,711+02 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-37) [5ba0ae45] START, SetVdsStatusVDSCommand(HostName = host01.redacted, SetVdsStatusVDSCommandParameters:{hostId='098c3c99-921d-46f0-bdba-86370a2dc895', status='NonOperational', nonOperationalReason='NETWORK_UNREACHABLE', stopSpmFailureLogged='false', maintenanceReason='null'}), log id: 11ebbdeb
2018-05-28 11:20:04,715+02 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-37) [5ba0ae45] FINISH, SetVdsStatusVDSCommand, log id: 11ebbdeb
2018-05-28 11:20:04,769+02 ERROR [org.ovirt.engine.core.bll.SetNonOperationalVdsCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-37) [5ba0ae45] Host 'host01.redacted' is set to Non-Operational, it is missing the following networks: 'ovirtmgmt'
2018-05-28 11:20:04,786+02 WARN [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-37) [5ba0ae45] EVENT_ID: VDS_SET_NONOPERATIONAL_NETWORK(519), Host host01.redacted does not comply with the cluster Default networks, the following networks are missing on host: 'ovirtmgmt'
2018-05-28 11:20:04,807+02 INFO [org.ovirt.engine.core.bll.HandleVdsCpuFlagsOrClusterChangedCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-37) [7937fb47] Running command: HandleVdsCpuFlagsOrClusterChangedCommand internal: true. Entities affected : ID: 098c3c99-921d-46f0-bdba-86370a2dc895 Type: VDS
2018-05-28 11:20:04,814+02 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-37) [7937fb47] EVENT_ID: VDS_DETECTED(13), Status of host host01.redacted was set to NonOperational.
2018-05-28 11:20:04,833+02 INFO [org.ovirt.engine.core.bll.HandleVdsVersionCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-37) [4c10675c] Running command: HandleVdsVersionCommand internal: true. Entities affected : ID: 098c3c99-921d-46f0-bdba-86370a2dc895 Type: VDS
2018-05-28 11:20:04,837+02 INFO [org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring] (EE-ManagedThreadFactory-engineScheduled-Thread-37) [4c10675c] Host 'host01.redacted'(098c3c99-921d-46f0-bdba-86370a2dc895) is already in NonOperational status for reason 'NETWORK_UNREACHABLE'. SetNonOperationalVds command is skipped.
Full log as attachment.
--
Best regards/Pozdrawiam/MfG
*Mariusz Kozakowski*
Site Reliability Engineer
Dansk Supermarked Group Baltic Business Park ul. 1 Maja 38-39 71-627 Szczecin dansksupermarked.com


On Tue, Jun 5, 2018 at 10:16 AM, Mariusz Kozakowski <mariusz.kozakowski@sallinggroup.com> wrote:
On Tue, 2018-06-05 at 10:09 +0200, Simone Tiraboschi wrote:
But did you create the bridge manually, or did the engine create it for you, triggered by hosted-engine-setup?
Manually. Before, we had br0.1111. Should we go back to the network configuration with br0.1111, and no ovirtmgmt network created?
Yes, I'm pretty sure that the issue is there; see also https://bugzilla.redhat.com/show_bug.cgi?id=1317125 It will work for sure if the management bridge has been created in the past by the engine, but we had a lot of failure reports from trying to consume a management bridge manually created with wrong options. Letting the engine create it with the right configuration is by far the safest option.
Also, what should we use as answers here?
OVEHOSTED_NETWORK/bridgeIf=str:bond0.1111 OVEHOSTED_NETWORK/bridgeName=str:ovirtmgmt
Yes, this should be fine.
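For reference, those two keys belong in an otopi answer file; a minimal fragment, assuming the file is then passed to the deploy with `hosted-engine --deploy --config-append=<file>` (the filename is your choice):

```
# answers.conf (fragment)
[environment:default]
OVEHOSTED_NETWORK/bridgeIf=str:bond0.1111
OVEHOSTED_NETWORK/bridgeName=str:ovirtmgmt
```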


On Tue, Jun 5, 2018 at 1:40 PM, Mariusz Kozakowski <mariusz.kozakowski@sallinggroup.com> wrote:
OK, I removed the manually created ovirtmgmt network. Now it's complaining about a missing ovirtmgmt network when installing the host:
2018-06-05 13:21:32,091+02 INFO [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor) [2ed0af08] Connecting to host01.recacted/1.2.3.42
2018-06-05 13:21:32,250+02 INFO [org.ovirt.engine.core.bll.host.HostConnectivityChecker] (EE-ManagedThreadFactory-engine-Thread-1) [33333924] Engine managed to communicate with VDSM agent on host 'host01.recacted' with address 'host01.recacted' ('9956cebf-59ab-426b-ace6-25342705445e')
2018-06-05 13:22:09,007+02 INFO [org.ovirt.engine.core.bll.provider.network.SyncNetworkProviderCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-78) [2b3212b5] Lock Acquired to object 'EngineLock:{exclusiveLocks='[5eb1152b-e9fb-4e2a-abea-f2d9823b724a=PROVIDER]', sharedLocks=''}'
2018-06-05 13:22:09,024+02 INFO [org.ovirt.engine.core.bll.provider.network.SyncNetworkProviderCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-78) [2b3212b5] Running command: SyncNetworkProviderCommand internal: true.
2018-06-05 13:22:09,254+02 INFO [org.ovirt.engine.core.sso.utils.AuthenticationUtils] (default task-64) [] User admin@internal successfully logged in with scopes: ovirt-app-api ovirt-ext=token-info:authz-search ovirt-ext=token-info:public-authz-search ovirt-ext=token-info:validate ovirt-ext=token:password-access
2018-06-05 13:22:09,526+02 INFO [org.ovirt.engine.core.bll.provider.network.SyncNetworkProviderCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-78) [2b3212b5] Lock freed to object 'EngineLock:{exclusiveLocks='[5eb1152b-e9fb-4e2a-abea-f2d9823b724a=PROVIDER]', sharedLocks=''}'
2018-06-05 13:24:32,603+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-1) [33333924] EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), VDSM host01.recacted command CollectVdsNetworkDataAfterInstallationVDS failed: Message timeout which can be caused by communication issues
2018-06-05 13:24:32,603+02 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.CollectVdsNetworkDataAfterInstallationVDSCommand] (EE-ManagedThreadFactory-engine-Thread-1) [33333924] Command 'org.ovirt.engine.core.vdsbroker.vdsbroker.CollectVdsNetworkDataAfterInstallationVDSCommand' return value 'org.ovirt.engine.core.vdsbroker.vdsbroker.VDSInfoReturn@7b51b9cb'
2018-06-05 13:24:32,603+02 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.CollectVdsNetworkDataAfterInstallationVDSCommand] (EE-ManagedThreadFactory-engine-Thread-1) [33333924] HostName = host01.recacted
2018-06-05 13:24:32,616+02 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.CollectVdsNetworkDataAfterInstallationVDSCommand] (EE-ManagedThreadFactory-engine-Thread-1) [33333924] Command 'CollectVdsNetworkDataAfterInstallationVDSCommand(HostName = host01.recacted, CollectHostNetworkDataVdsCommandParameters:{hostId='9956cebf-59ab-426b-ace6-25342705445e', vds='Host[host01.recacted,9956cebf-59ab-426b-ace6-25342705445e]'})' execution failed: VDSGenericException: VDSNetworkException: Message timeout which can be caused by communication issues
2018-06-05 13:24:32,616+02 WARN [org.ovirt.engine.core.vdsbroker.VdsManager] (EE-ManagedThreadFactory-engine-Thread-3) [33333924] Host 'host01.recacted' is not responding.
2018-06-05 13:24:32,622+02 WARN [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-3) [33333924] EVENT_ID: VDS_HOST_NOT_RESPONDING(9,027), Host host01.recacted is not responding. Host cannot be fenced automatically because power management for the host is disabled.
2018-06-05 13:24:32,616+02 ERROR [org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand] (EE-ManagedThreadFactory-engine-Thread-1) [33333924] Exception: org.ovirt.engine.core.common.errors.EngineException: EngineException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException: VDSGenericException: VDSNetworkException: Message timeout which can be caused by communication issues (Failed with error VDS_NETWORK_ERROR and code 5022)
2018-06-05 13:24:32,628+02 ERROR [org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand] (EE-ManagedThreadFactory-engine-Thread-1) [33333924] Host installation failed for host '9956cebf-59ab-426b-ace6-25342705445e', 'host01.redacted': Failed to configure management network on the host
2018-06-05 13:24:32,633+02 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-engine-Thread-1) [33333924] START, SetVdsStatusVDSCommand(HostName = host01.redacted, SetVdsStatusVDSCommandParameters:{hostId='9956cebf-59ab-426b-ace6-25342705445e', status='NonOperational', nonOperationalReason='NONE', stopSpmFailureLogged='false', maintenanceReason='null'}), log id: 2fa6c4ab
2018-06-05 13:26:19,101+02 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-45) [374e0c] START, SetVdsStatusVDSCommand(HostName = host01.redacted, SetVdsStatusVDSCommandParameters:{hostId='9956cebf-59ab-426b-ace6-25342705445e', status='NonOperational', nonOperationalReason='NETWORK_UNREACHABLE', stopSpmFailureLogged='false', maintenanceReason='null'}), log id: 460b0642
2018-06-05 13:26:19,106+02 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-45) [374e0c] FINISH, SetVdsStatusVDSCommand, log id: 460b0642
2018-06-05 13:26:19,161+02 ERROR [org.ovirt.engine.core.bll.SetNonOperationalVdsCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-45) [374e0c] Host 'host01.redacted' is set to Non-Operational, it is missing the following networks: 'ovirtmgmt'
2018-06-05 13:26:19,196+02 WARN [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-45) [374e0c] EVENT_ID: VDS_SET_NONOPERATIONAL_NETWORK(519), Host host01.redacted does not comply with the cluster Default networks, the following networks are missing on host: 'ovirtmgmt'
2018-06-05 13:26:19,235+02 INFO [org.ovirt.engine.core.bll.HandleVdsCpuFlagsOrClusterChangedCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-45) [647a0b6c] Running command: HandleVdsCpuFlagsOrClusterChangedCommand internal: true. Entities affected : ID: 9956cebf-59ab-426b-ace6-25342705445e Type: VDS
Answers used for ansible deploy:
OVEHOSTED_NETWORK/bridgeIf=str:br0.1111
OVEHOSTED_NETWORK/bridgeName=str:ovirtmgmt
Do you have any idea why the ovirtmgmt network wasn't created during deploy? Or should we use other network settings for the deploy script?
Can you please share your /var/log/vdsm/vdsm.log and /var/log/vdsm/supervdsm.log ?
On Tue, 2018-06-05 at 11:03 +0200, Simone Tiraboschi wrote:
On Tue, Jun 5, 2018 at 10:16 AM, Mariusz Kozakowski <mariusz.kozakowski@sallinggroup.com> wrote:
On Tue, 2018-06-05 at 10:09 +0200, Simone Tiraboschi wrote:
But did you create the bridge manually, or did the engine create it for you, triggered by hosted-engine-setup?
Manually. Before, we had br0.1111. Should we go back to the network configuration with br0.1111, and no ovirtmgmt network created?
Yes, I'm pretty sure that the issue is there, see also https://bugzilla.redhat.com/show_bug.cgi?id=1317125
It will work for sure if the management bridge has been created in the past by the engine, but we had a lot of failure reports from trying to consume a management bridge manually created with wrong options. Letting the engine create it with the right configuration is by far the safest option.
Also, what should we use as answers here?
OVEHOSTED_NETWORK/bridgeIf=str:bond0.1111 OVEHOSTED_NETWORK/bridgeName=str:ovirtmgmt
Yes, this should be fine.


On Tue, Jun 5, 2018 at 2:22 PM, Mariusz Kozakowski <mariusz.kozakowski@sallinggroup.com> wrote:
On Tue, 2018-06-05 at 13:43 +0200, Simone Tiraboschi wrote:
Can you please share your /var/log/vdsm/vdsm.log and /var/log/vdsm/supervdsm.log ?
Yes, please find the attached.
On a successful setup, in the logs you should see something like 2018-06-05 10:51:53,451+0200 INFO (jsonrpc/7) [api.network] START setupNetworks(networks={u'ovirtmgmt': {u'ipv6autoconf': True, u'nic': u'eth0', u'mtu': 1500, u'switch': u'legacy', u'dhcpv6': False, u'STP': u'no', u'bridged': u'true', u'defaultRoute': True, u'bootproto': u'dhcp'}}, bondings={}, options={u'connectivityCheck': u'true', u'connectivityTimeout': 120}) from=::ffff:192.168.122.37,49716, flow_id=41ca4fc1 (api:46)
but it's completely missing from your logs. Can you please attach also the whole engine.log and the host-deploy logs?

On Tue, Jun 5, 2018 at 4:37 PM, Simone Tiraboschi <stirabos@redhat.com> wrote:
On Tue, Jun 5, 2018 at 2:22 PM, Mariusz Kozakowski <mariusz.kozakowski@sallinggroup.com> wrote:
On Tue, 2018-06-05 at 13:43 +0200, Simone Tiraboschi wrote:
Can you please share your /var/log/vdsm/vdsm.log and /var/log/vdsm/supervdsm.log ?
Yes, please find the attached.
On a successful setup, in the logs you should see something like 2018-06-05 10:51:53,451+0200 INFO (jsonrpc/7) [api.network] START setupNetworks(networks={u'ovirtmgmt': {u'ipv6autoconf': True, u'nic': u'eth0', u'mtu': 1500, u'switch': u'legacy', u'dhcpv6': False, u'STP': u'no', u'bridged': u'true', u'defaultRoute': True, u'bootproto': u'dhcp'}}, bondings={}, options={u'connectivityCheck': u'true', u'connectivityTimeout': 120}) from=::ffff:192.168.122.37,49716, flow_id=41ca4fc1 (api:46)
but it's completely missing from your logs.
Can you please attach also the whole engine.log and host-deploy logs?
I tried to deploy hosted-engine over a VLAN over a bond and everything worked as expected, but I also found a case where it fails: SetupNetworks is going to fail if bond0.123 is correctly configured with an IPv4 address while the untagged bond0 lacks IPv4 configuration. Simply configuring a static IPv4 address from an unused subnet on the untagged bond is a valid workaround. I just opened https://bugzilla.redhat.com/show_bug.cgi?id=1586280
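The workaround above can be made persistent with an initscripts-style config for the untagged bond. A sketch, where 192.0.2.1/24 is a placeholder taken from the reserved TEST-NET range; pick any subnet unused in your environment, and note that your real bond will also need its usual BONDING_OPTS line:

```
# /etc/sysconfig/network-scripts/ifcfg-bond0 (fragment)
DEVICE=bond0
ONBOOT=yes
BOOTPROTO=none
IPADDR=192.0.2.1
PREFIX=24
```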


On Wed, Jun 6, 2018 at 2:49 PM, Mariusz Kozakowski <mariusz.kozakowski@sallinggroup.com> wrote:
On Tue, 2018-06-05 at 23:26 +0200, Simone Tiraboschi wrote:
I tried to deploy hosted-engine over vlan over a bond and everything worked as expected but I also found a case where it fails: SetupNetworks is going to fail if bond0.123 is correctly configured with an IPv4 address while the untagged bond0 lacks IPv4 configuration. Simply configuring a static IPv4 address from an unused subnet on the untagged bond is a valid workaround. I just opened https://bugzilla.redhat.com/show_bug.cgi?id=1586280
OK, I removed all the bridges, so the IPs are configured on bond0.id, added a dummy IP for bond0, and flushed iptables. And it works. So thank you for your help!
Thanks for the report.
One last question: is the bond0 IP still needed, or was it just an issue during install and we can delete it now?
Please keep it, since vdsm could potentially try to roll back your network configuration to its initial status if something is reported to be down for a relevant amount of time.
Cheers

Hi Mariusz,

I'm sorry to hear you're having issues. Could you please provide a link to a pastebin or something similar for the full log file?

Thanks!

-Phillip Bailey

On Wed, May 23, 2018 at 2:35 AM, Mariusz Kozakowski <mariusz.kozakowski@dsg.dk> wrote:
Hello,
We've been trying to set up an oVirt environment for a few days, but we have an issue with hosted-engine-setup (the Ansible script). We managed to fix a few small things and have them merged upstream, but right now the installation process fails on getting host facts. It looks like it cannot proceed because it fails when connecting to the ovirt-engine API of the bootstrap VM.
The oVirt API / web panel is working; I tested it via a browser and can log in without issues using the admin password chosen earlier in the process.
2018-05-18 15:26:47,800+0200 INFO otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:100 TASK [Wait for the host to be up]
2018-05-18 15:39:14,025+0200 DEBUG otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:94 {u'_ansible_parsed': True, u'_ansible_no_log': False, u'changed': False, u'attempts': 120, u'invocation': {u'module_args': {u'pattern': u'name=host01.redacted', u'fetch_nested': False, u'nested_attributes': []}}, u'ansible_facts': {u'ovirt_hosts': []}}
2018-05-18 15:39:14,127+0200 ERROR otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:98 fatal: [localhost]: FAILED! => {"ansible_facts": {"ovirt_hosts": []}, "attempts": 120, "changed": false}
May 18 13:34:34 host01 python: ansible-ovirt_hosts_facts Invoked with pattern=name=host01.redacted fetch_nested=False nested_attributes=[] auth={'timeout': 0, 'url': 'https://ovirt-dev.redacted/ovirt-engine/api', 'insecure': True, 'kerberos': False, 'compress': True, 'headers': None, 'token': 'R--token-redacted', 'ca_file': None}
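For context, the "Wait for the host to be up" task simply polls the engine's host-facts query until it returns a non-empty list, and gives up after 120 attempts, which is exactly the failure in the log above. A minimal stand-alone sketch of that retry loop; `fetch_hosts` here is a hypothetical stand-in for the real `ovirt_hosts_facts` API query (`pattern='name=host01...'`):

```python
import time


def wait_for_host(fetch_hosts, attempts=120, delay=5):
    """Poll fetch_hosts() until it returns a non-empty host list.

    fetch_hosts is a stand-in for the engine API query the Ansible
    module performs. Returns the host list, or raises TimeoutError
    after the given number of attempts, mirroring the task failure
    shown in the log above (attempts=120, ovirt_hosts=[]).
    """
    for attempt in range(1, attempts + 1):
        hosts = fetch_hosts()
        if hosts:
            return hosts
        if attempt < attempts:
            time.sleep(delay)
    raise TimeoutError("host never showed up in engine host facts")


# Example: a host that only appears on the third poll.
polls = iter([[], [], [{'name': 'host01'}]])
print(wait_for_host(lambda: next(polls), attempts=5, delay=0))
# → [{'name': 'host01'}]
```

An empty `ovirt_hosts` list after all 120 attempts therefore usually means the host-add step never completed on the engine side, rather than the API itself being unreachable.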
Do you have any idea what the issue is, where it comes from, and how to fix it?
_______________________________________________ Users mailing list -- users@ovirt.org To unsubscribe send an email to users-leave@ovirt.org
participants (4)
- Mariusz Kozakowski
- Mariusz Kozakowski
- Phillip Bailey
- Simone Tiraboschi