
16 Apr 2020, 1:28 p.m.
OK, to wrap up this thread and give some detail on how it concluded: I now have the engine back up and running.

The final issue was that I needed to re-create the Synology share that the HE was stored on. I got past all my other issues and up to the domain setup stage. Obviously I couldn't install to the same share with the original domain still there, so I deleted that (luckily the other VMs are in a different share), but the install still failed with a storage domain creation error. So I created a new share and the install could then progress.

I have no idea what happened, but I seem to have suffered some sort of failure of a shared folder on my Synology that caused issues with oVirt. I could still mount the folder manually and create/edit/delete files, and the engine was generating hundreds of thousands of .prob-* files, but it was somehow corrupt? I've copied the ansible error at the end of this mail.

The other issue was the IPv6 one, which was strange. I didn't see this when I first installed and set up oVirt, but perhaps something changed somewhere in our setup. As Strahil pointed out, I needed the IPV6ADDR entry in my ifcfg-eno1 for the interface being used for the node. The pain is that this is overwritten by the deployment, so if the deployment fails you have to re-add it. My interface config looks like this now:

# Generated by VDSM version 4.30.40.1
DEVICE=eno1
BRIDGE=ovirtmgmt
ONBOOT=yes
MTU=1500
DEFROUTE=no
NM_CONTROLLED=no
IPV6INIT=no
IPV6ADDR=64:ff9b::c0a8:13d

Then I had the strange firewalld issue that I can't explain. I re-installed the node from scratch to resolve that as I'd lost patience.

So thanks for all the help and I hope I never have to do that again. :-)

Shareef.
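Since the deployment rewrites ifcfg-eno1, the re-add step described above could be scripted. The following is only a minimal sketch and not part of oVirt or VDSM: the helper name ensure_ifcfg_key is hypothetical, and it assumes the file uses the plain KEY=value format shown above. It works on the file's text only, so reading and writing the actual ifcfg file is left to the caller.

```python
def ensure_ifcfg_key(text: str, key: str, value: str) -> str:
    """Return ifcfg-style text with exactly one KEY=value line.

    An existing KEY= line is replaced in place, duplicates are dropped,
    and the line is appended if the key is missing. Commented-out lines
    (starting with '#') are left untouched.
    """
    wanted = f"{key}={value}"
    out = []
    found = False
    for line in text.splitlines():
        if line.strip().startswith(f"{key}="):
            if not found:
                out.append(wanted)
                found = True
            # any further KEY= lines are duplicates and are skipped
        else:
            out.append(line)
    if not found:
        out.append(wanted)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Sample content only; on a real node the text would come from the
    # ifcfg-eno1 file (its location on disk is an assumption, not taken
    # from this thread).
    sample = "DEVICE=eno1\nBRIDGE=ovirtmgmt\nIPV6INIT=no\n"
    print(ensure_ifcfg_key(sample, "IPV6ADDR", "64:ff9b::c0a8:13d"), end="")
```

The function is idempotent, so it can be run after every failed deployment attempt without piling up duplicate IPV6ADDR lines.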
Ansible storage domain error:

2020-04-16 11:11:55,872+0000 INFO otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:109 TASK [ovirt.hosted_engine_setup : Add NFS storage domain]

2020-04-16 11:11:58,777+0000 DEBUG otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:103 {u'invocation': {u'module_args': {u'comment': None, u'warning_low_space': None, u'glusterfs': None, u'localfs': None, u'managed_block_storage': None, u'data_center': u'Default', u'id': None, u'iscsi': None, u'state': u'unattached', u'wipe_after_delete': None, u'destroy': None, u'fcp': None, u'description': None, u'format': None, u'nested_attributes': [], u'host': u'ovirt-node-00.phoelex.com', u'discard_after_delete': None, u'wait': True, u'domain_function': u'data', u'name': u'hosted_storage', u'critical_space_action_blocker': None, u'posixfs': None, u'poll_interval': 3, u'fetch_nested': False, u'nfs': {u'path': u'/volume1/ovirt', u'version': u'auto', u'mount_options': u'', u'address': u'nas-01.phoelex.com'}, u'timeout': 180, u'backup': None}}, u'msg': u'Fault reason is "Operation Failed". Fault detail is "[Error in creating a Storage Domain. The selected storage path is not empty (probably contains another Storage Domain). Either remove the existing Storage Domain from this path, or change the Storage path).]". HTTP response code is 400.', u'exception': u'Traceback (most recent call last):\n File "/tmp/ansible_ovirt_storage_domain_payload_6uM8mE/ansible_ovirt_storage_domain_payload.zip/ansible/modules/cloud/ovirt/ovirt_storage_domain.py", line 792, in main\n File "/tmp/ansible_ovirt_storage_domain_payload_6uM8mE/ansible_ovirt_storage_domain_payload.zip/ansible/module_utils/ovirt.py", line 621, in create\n **kwargs\n File "/usr/lib64/python2.7/site-packages/ovirtsdk4/services.py", line 25168, in add\n return self._internal_add(storage_domain, headers, query, wait)\n File "/usr/lib64/python2.7/site-packages/ovirtsdk4/service.py", line 232, in _internal_add\n return future.wait() if wait else future\n File "/usr/lib64/python2.7/site-packages/ovirtsdk4/service.py", line 55, in wait\n return self._code(response)\n File "/usr/lib64/python2.7/site-packages/ovirtsdk4/service.py", line 229, in callback\n self._check_fault(response)\n File "/usr/lib64/python2.7/site-packages/ovirtsdk4/service.py", line 132, in _check_fault\n self._raise_error(response, body)\n File "/usr/lib64/python2.7/site-packages/ovirtsdk4/service.py", line 118, in _raise_error\n raise error\nError: Fault reason is "Operation Failed". Fault detail is "[Error in creating a Storage Domain. The selected storage path is not empty (probably contains another Storage Domain). Either remove the existing Storage Domain from this path, or change the Storage path).]". HTTP response code is 400.\n', u'changed': False, u'_ansible_no_log': False}

2020-04-16 11:11:58,877+0000 ERROR otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:107 Error: Fault reason is "Operation Failed". Fault detail is "[Error in creating a Storage Domain. The selected storage path is not empty (probably contains another Storage Domain). Either remove the existing Storage Domain from this path, or change the Storage path).]". HTTP response code is 400.

2020-04-16 11:11:58,978+0000 ERROR otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:107 fatal: [localhost]: FAILED! => {"changed": false, "msg": "Fault reason is \"Operation Failed\". Fault detail is \"[Error in creating a Storage Domain. The selected storage path is not empty (probably contains another Storage Domain). Either remove the existing Storage Domain from this path, or change the Storage path).]\". HTTP response code is 400."}

On Thu, Apr 16, 2020 at 12:14 PM Shareef Jalloq <shareef@jalloq.co.uk> wrote:

> Actually, you've just raised a point I hadn't thought about. We have an
> old Xeon server that is being used to host some ESXi VMs that were needed
> while we transitioned to ovirt. Once I have moved those VMs I could
> repurpose that as the engine.
>
> On Thu, Apr 16, 2020 at 11:42 AM Strahil Nikolov <hunter86_bg@yahoo.com> wrote:
>
>> On April 16, 2020 11:25:20 AM GMT+03:00, Shareef Jalloq <shareef@jalloq.co.uk> wrote:
>> >Is this actually production ready? It seems to break at every step.
>> >
>> >On Wed, Apr 15, 2020 at 5:45 PM Strahil Nikolov <hunter86_bg@yahoo.com> wrote:
>> >
>> >> On April 15, 2020 5:59:46 PM GMT+03:00, Shareef Jalloq <shareef@jalloq.co.uk> wrote:
>> >> >Thanks for your help but I've decided to try and reinstall from scratch.
>> >> >This is taking too long.
>> >> >
>> >> >On Wed, Apr 15, 2020 at 3:25 PM Strahil Nikolov <hunter86_bg@yahoo.com> wrote:
>> >> >
>> >> >> On April 15, 2020 2:40:52 PM GMT+03:00, Shareef Jalloq <shareef@jalloq.co.uk> wrote:
>> >> >> >Yes, but there are no zones set up, just ports 22, 6801 and 6900.
>> >> >> >
>> >> >> >On Wed, Apr 15, 2020 at 12:37 PM Strahil Nikolov <hunter86_bg@yahoo.com> wrote:
>> >> >> >
>> >> >> >> On April 15, 2020 2:28:05 PM GMT+03:00, Shareef Jalloq <shareef@jalloq.co.uk> wrote:
>> >> >> >> >Oh this is painful.
>> >> >> >> >It seems to progress if you have both he_force_ipv4 set and run the deployment with the '--4' switch.
>> >> >> >> >
>> >> >> >> >But then I get a failure when the ansible script checks for firewalld-zones and doesn't get anything back. Should the deployment flow not be setting any zones it needs?
>> >> >> >> >
>> >> >> >> >2020-04-15 10:57:25,439+0000 INFO otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:109 TASK [ovirt.hosted_engine_setup : Get active list of active firewalld zones]
>> >> >> >> >
>> >> >> >> >2020-04-15 10:57:26,641+0000 DEBUG otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:103 {u'stderr_lines': [], u'changed': True, u'end': u'2020-04-15 10:57:26.481202', u'_ansible_no_log': False, u'stdout': u'', u'cmd': u'set -euo pipefail && firewall-cmd --get-active-zones | grep -v "^\\s*interfaces"', u'start': u'2020-04-15 10:57:26.050203', u'delta': u'0:00:00.430999', u'stderr': u'', u'rc': 1, u'invocation': {u'module_args': {u'creates': None, u'executable': None, u'_uses_shell': True, u'strip_empty_ends': True, u'_raw_params': u'set -euo pipefail && firewall-cmd --get-active-zones | grep -v "^\\s*interfaces"', u'removes': None, u'argv': None, u'warn': True, u'chdir': None, u'stdin_add_newline': True, u'stdin': None}}, u'stdout_lines': [], u'msg': u'non-zero return code'}
>> >> >> >> >
>> >> >> >> >2020-04-15 10:57:26,741+0000 ERROR otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:107 fatal: [localhost]: FAILED! => {"changed": true, "cmd": "set -euo pipefail && firewall-cmd --get-active-zones | grep -v \"^\\s*interfaces\"", "delta": "0:00:00.430999", "end": "2020-04-15 10:57:26.481202", "msg": "non-zero return code", "rc": 1, "start": "2020-04-15 10:57:26.050203", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}
>> >> >> >> >
>> >> >> >> >On Wed, Apr 15, 2020 at 10:23 AM Shareef Jalloq <shareef@jalloq.co.uk> wrote:
>> >> >> >> >
>> >> >> >> >> Ha, spoke too soon. It's now stuck in a loop and a google points me at
>> >> >> >> >> https://bugzilla.redhat.com/show_bug.cgi?id=1746585
>> >> >> >> >>
>> >> >> >> >> However, forcing ipv4 doesn't seem to have fixed the loop.
>> >> >> >> >>
>> >> >> >> >> On Wed, Apr 15, 2020 at 9:59 AM Shareef Jalloq <shareef@jalloq.co.uk> wrote:
>> >> >> >> >>
>> >> >> >> >>> OK, that seems to have fixed it, thanks. Is this a side effect of redeploying the HE over a first time install? Nothing has changed in our setup and I didn't need to do this when I initially set up our nodes.
>> >> >> >> >>>
>> >> >> >> >>> On Tue, Apr 14, 2020 at 6:55 PM Strahil Nikolov <hunter86_bg@yahoo.com> wrote:
>> >> >> >> >>>
>> >> >> >> >>>> On April 14, 2020 6:17:17 PM GMT+03:00, Shareef Jalloq <shareef@jalloq.co.uk> wrote:
>> >> >> >> >>>> >Hmmm, we're not using ipv6. Is that the issue?
>> >> >> >> >>>> > >> >> >> >> >>>> >On Tue, Apr 14, 2020 at 3:56 PM Strahil Nikolov >> >> >> >> ><hunter86_bg@yahoo.com> >> >> >> >> >>>> >wrote: >> >> >> >> >>>> > >> >> >> >> >>>> >> On April 14, 2020 1:27:24 PM GMT+03:00, Shareef Jalloq >> >< >> >> >> >> >>>> >> shareef@jalloq.co.uk> wrote: >> >> >> >> >>>> >> >Right, I've given up on recovering the HE so want to >> >try >> >> >and >> >> >> >> >>>> >redeploy >> >> >> >> >>>> >> >it. >> >> >> >> >>>> >> >There doesn't seem to be enough information to debug >> >why >> >> >the >> >> >> >> >>>> >> >broker/agent >> >> >> >> >>>> >> >won't start cleanly. >> >> >> >> >>>> >> > >> >> >> >> >>>> >> >In running 'hosted-engine --deploy', I'm seeing the >> >> >> >following >> >> >> >> >error >> >> >> >> >>>> >in >> >> >> >> >>>> >> >the >> >> >> >> >>>> >> >setup validation phase: >> >> >> >> >>>> >> > >> >> >> >> >>>> >> >2020-04-14 09:46:08,922+0000 DEBUG >> >> >> >> >otopi.plugins.otopi.dialog.human >> >> >> >> >>>> >> >dialog.__logString:204 DIALOG:SEND >> >Please >> >> >> >> >provide >> >> >> >> >>>> >the >> >> >> >> >>>> >> >hostname of this host on the management network >> >> >> >> >>>> >> >[ovirt-node-00.phoelex.com]: >> >> >> >> >>>> >> > >> >> >> >> >>>> >> > >> >> >> >> >>>> >> >2020-04-14 09:46:12,831+0000 DEBUG >> >> >> >> >>>> >> >otopi.plugins.gr_he_common.network.bridge >> >> >> >> >>>> >> >hostname.getResolvedAddresses:432 >> >> >> >> >>>> >> >getResolvedAddresses: set(['64:ff9b::c0a8:13d', >> >> >> >> >'192.168.1.61']) >> >> >> >> >>>> >> > >> >> >> >> >>>> >> >2020-04-14 09:46:12,832+0000 DEBUG >> >> >> >> >>>> >> >otopi.plugins.gr_he_common.network.bridge >> >> >> >> >>>> >> >hostname._validateFQDNresolvability:289 >> >> >> >> >ovirt-node-00.phoelex.com >> >> >> >> >>>> >> >resolves >> >> >> >> >>>> >> >to: set(['64:ff9b::c0a8:13d', '192.168.1.61']) >> >> >> >> >>>> >> > >> >> >> >> >>>> >> >2020-04-14 09:46:12,832+0000 DEBUG >> >> >> >> >>>> >> >otopi.plugins.gr_he_common.network.bridge >> >> >> 
>plugin.executeRaw:813 >> >> >> >> >>>> >> >execute: >> >> >> >> >>>> >> >['/usr/bin/dig', '+noall', '+answer', >> >> >> >> >'ovirt-node-00.phoelex.com', >> >> >> >> >>>> >> >'ANY'], >> >> >> >> >>>> >> >executable='None', cwd='None', env=None >> >> >> >> >>>> >> > >> >> >> >> >>>> >> >2020-04-14 09:46:12,871+0000 DEBUG >> >> >> >> >>>> >> >otopi.plugins.gr_he_common.network.bridge >> >> >> >plugin.executeRaw:863 >> >> >> >> >>>> >> >execute-result: ['/usr/bin/dig', '+noall', '+answer', >> >' >> >> >> >> >>>> >> >ovirt-node-00.phoelex.com', 'ANY'], rc=0 >> >> >> >> >>>> >> > >> >> >> >> >>>> >> >2020-04-14 09:46:12,872+0000 DEBUG >> >> >> >> >>>> >> >otopi.plugins.gr_he_common.network.bridge >> >> >plugin.execute:921 >> >> >> >> >>>> >> >execute-output: ['/usr/bin/dig', '+noall', '+answer', >> >' >> >> >> >> >>>> >> >ovirt-node-00.phoelex.com', 'ANY'] stdout: >> >> >> >> >>>> >> > >> >> >> >> >>>> >> >ovirt-node-00.phoelex.com. 86400 IN A >> >> >192.168.1.61 >> >> >> >> >>>> >> > >> >> >> >> >>>> >> > >> >> >> >> >>>> >> >2020-04-14 09:46:12,872+0000 DEBUG >> >> >> >> >>>> >> >otopi.plugins.gr_he_common.network.bridge >> >> >plugin.execute:926 >> >> >> >> >>>> >> >execute-output: ['/usr/bin/dig', '+noall', '+answer', >> >' >> >> >> >> >>>> >> >ovirt-node-00.phoelex.com', 'ANY'] stderr: >> >> >> >> >>>> >> > >> >> >> >> >>>> >> > >> >> >> >> >>>> >> > >> >> >> >> >>>> >> >2020-04-14 09:46:12,872+0000 DEBUG >> >> >> >> >>>> >> >otopi.plugins.gr_he_common.network.bridge >> >> >> >plugin.executeRaw:813 >> >> >> >> >>>> >> >execute: >> >> >> >> >>>> >> >('/usr/sbin/ip', 'addr'), executable='None', >> >cwd='None', >> >> >> >> >env=None >> >> >> >> >>>> >> > >> >> >> >> >>>> >> >2020-04-14 09:46:12,876+0000 DEBUG >> >> >> >> >>>> >> >otopi.plugins.gr_he_common.network.bridge >> >> >> >plugin.executeRaw:863 >> >> >> >> >>>> >> >execute-result: ('/usr/sbin/ip', 'addr'), rc=0 >> >> >> >> >>>> >> > >> >> >> >> >>>> >> >2020-04-14 09:46:12,876+0000 DEBUG >> >> >> >> >>>> >> 
>otopi.plugins.gr_he_common.network.bridge >> >> >plugin.execute:921 >> >> >> >> >>>> >> >execute-output: ('/usr/sbin/ip', 'addr') stdout: >> >> >> >> >>>> >> > >> >> >> >> >>>> >> >1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue >> >> >state >> >> >> >> >UNKNOWN >> >> >> >> >>>> >> >group >> >> >> >> >>>> >> >default qlen 1000 >> >> >> >> >>>> >> > >> >> >> >> >>>> >> > link/loopback 00:00:00:00:00:00 brd >> >00:00:00:00:00:00 >> >> >> >> >>>> >> > >> >> >> >> >>>> >> > inet 127.0.0.1/8 scope host lo >> >> >> >> >>>> >> > >> >> >> >> >>>> >> > valid_lft forever preferred_lft forever >> >> >> >> >>>> >> > >> >> >> >> >>>> >> > inet6 ::1/128 scope host >> >> >> >> >>>> >> > >> >> >> >> >>>> >> > valid_lft forever preferred_lft forever >> >> >> >> >>>> >> > >> >> >> >> >>>> >> >2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 >> >qdisc >> >> >mq >> >> >> >> >master >> >> >> >> >>>> >> >ovirtmgmt state UP group default qlen 1000 >> >> >> >> >>>> >> > >> >> >> >> >>>> >> > link/ether ac:1f:6b:bc:32:6a brd ff:ff:ff:ff:ff:ff >> >> >> >> >>>> >> > >> >> >> >> >>>> >> >3: eno2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 >> >> >qdisc >> >> >> >mq >> >> >> >> >state >> >> >> >> >>>> >> >DOWN >> >> >> >> >>>> >> >group default qlen 1000 >> >> >> >> >>>> >> > >> >> >> >> >>>> >> > link/ether ac:1f:6b:bc:32:6b brd ff:ff:ff:ff:ff:ff >> >> >> >> >>>> >> > >> >> >> >> >>>> >> >4: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc >> >noop >> >> >> >state >> >> >> >> >DOWN >> >> >> >> >>>> >> >group >> >> >> >> >>>> >> >default qlen 1000 >> >> >> >> >>>> >> > >> >> >> >> >>>> >> > link/ether 02:e6:e2:80:93:8d brd ff:ff:ff:ff:ff:ff >> >> >> >> >>>> >> > >> >> >> >> >>>> >> >5: br-int: <BROADCAST,MULTICAST> mtu 1500 qdisc noop >> >> >state >> >> >> >DOWN >> >> >> >> >>>> >group >> >> >> >> >>>> >> >default qlen 1000 >> >> >> >> >>>> >> > >> >> >> >> >>>> >> > link/ether 8a:26:44:50:ee:4a brd ff:ff:ff:ff:ff:ff >> >> >> >> >>>> >> > >> >> >> >> >>>> >> >21: ovirtmgmt: 
<BROADCAST,MULTICAST,UP,LOWER_UP> mtu >> >1500 >> >> >> >qdisc >> >> >> >> >>>> >noqueue >> >> >> >> >>>> >> >state UP group default qlen 1000 >> >> >> >> >>>> >> > >> >> >> >> >>>> >> > link/ether ac:1f:6b:bc:32:6a brd ff:ff:ff:ff:ff:ff >> >> >> >> >>>> >> > >> >> >> >> >>>> >> > inet 192.168.1.61/24 brd 192.168.1.255 scope >> >global >> >> >> >> >ovirtmgmt >> >> >> >> >>>> >> > >> >> >> >> >>>> >> > valid_lft forever preferred_lft forever >> >> >> >> >>>> >> > >> >> >> >> >>>> >> > inet6 fe80::ae1f:6bff:febc:326a/64 scope link >> >> >> >> >>>> >> > >> >> >> >> >>>> >> > valid_lft forever preferred_lft forever >> >> >> >> >>>> >> > >> >> >> >> >>>> >> >22: ;vdsmdummy;: <BROADCAST,MULTICAST> mtu 1500 qdisc >> >> >noop >> >> >> >> >state >> >> >> >> >>>> >DOWN >> >> >> >> >>>> >> >group >> >> >> >> >>>> >> >default qlen 1000 >> >> >> >> >>>> >> > >> >> >> >> >>>> >> > link/ether 3a:02:7b:7d:b3:2a brd ff:ff:ff:ff:ff:ff >> >> >> >> >>>> >> > >> >> >> >> >>>> >> > >> >> >> >> >>>> >> >2020-04-14 09:46:12,876+0000 DEBUG >> >> >> >> >>>> >> >otopi.plugins.gr_he_common.network.bridge >> >> >plugin.execute:926 >> >> >> >> >>>> >> >execute-output: ('/usr/sbin/ip', 'addr') stderr: >> >> >> >> >>>> >> > >> >> >> >> >>>> >> > >> >> >> >> >>>> >> > >> >> >> >> >>>> >> >2020-04-14 09:46:12,877+0000 DEBUG >> >> >> >> >>>> >> >otopi.plugins.gr_he_common.network.bridge >> >> >> >> >>>> >> >hostname.getLocalAddresses:251 >> >> >> >> >>>> >> >addresses: [u'192.168.1.61', >> >> >u'fe80::ae1f:6bff:febc:326a'] >> >> >> >> >>>> >> > >> >> >> >> >>>> >> >2020-04-14 09:46:12,877+0000 DEBUG >> >> >> >> >>>> >> >otopi.plugins.gr_he_common.network.bridge >> >> >> >> >hostname.test_hostname:464 >> >> >> >> >>>> >> >test_hostname exception >> >> >> >> >>>> >> > >> >> >> >> >>>> >> >Traceback (most recent call last): >> >> >> >> >>>> >> > >> >> >> >> >>>> >> >File >> >> >> >> >> >>"/usr/lib/python2.7/site-packages/ovirt_setup_lib/hostname.py", >> >> >> >> >>>> >> >line >> >> >> >> >>>> >> >460, 
in test_hostname >> >> >> >> >>>> >> > >> >> >> >> >>>> >> > not_local_text, >> >> >> >> >>>> >> > >> >> >> >> >>>> >> >File >> >> >> >> >> >>"/usr/lib/python2.7/site-packages/ovirt_setup_lib/hostname.py", >> >> >> >> >>>> >> >line >> >> >> >> >>>> >> >342, in _validateFQDNresolvability >> >> >> >> >>>> >> > >> >> >> >> >>>> >> > addresses=resolvedAddressesAsString >> >> >> >> >>>> >> > >> >> >> >> >>>> >> >RuntimeError: ovirt-node-00.phoelex.com resolves to >> >> >> >> >>>> >64:ff9b::c0a8:13d >> >> >> >> >>>> >> >192.168.1.61 and not all of them can be mapped to non >> >> >> >loopback >> >> >> >> >>>> >devices >> >> >> >> >>>> >> >on >> >> >> >> >>>> >> >this host >> >> >> >> >>>> >> > >> >> >> >> >>>> >> >2020-04-14 09:46:12,884+0000 ERROR >> >> >> >> >>>> >> >otopi.plugins.gr_he_common.network.bridge >> >> >> >> >dialog.queryEnvKey:120 >> >> >> >> >>>> >Host >> >> >> >> >>>> >> >name >> >> >> >> >>>> >> >is not valid: ovirt-node-00.phoelex.com resolves to >> >> >> >> >>>> >64:ff9b::c0a8:13d >> >> >> >> >>>> >> >192.168.1.61 and not all of them can be mapped to non >> >> >> >loopback >> >> >> >> >>>> >devices >> >> >> >> >>>> >> >on >> >> >> >> >>>> >> >this host >> >> >> >> >>>> >> > >> >> >> >> >>>> >> >The node I'm running on has an IP address of .61 and >> >> >> >resolves >> >> >> >> >>>> >> >correctly. >> >> >> >> >>>> >> > >> >> >> >> >>>> >> >On Fri, Apr 10, 2020 at 12:55 PM Shareef Jalloq >> >> >> >> >>>> ><shareef@jalloq.co.uk> >> >> >> >> >>>> >> >wrote: >> >> >> >> >>>> >> > >> >> >> >> >>>> >> >> Where should I be checking if there are any >> >> >files/folder >> >> >> >not >> >> >> >> >owned >> >> >> >> >>>> >by >> >> >> >> >>>> >> >> vdsm:kvm? I checked on the mount the HA sits on and >> >> >it's >> >> >> >> >fine. >> >> >> >> >>>> >> >> >> >> >> >> >>>> >> >> How would I go about checking vdsm can access those >> >> >> >images? 
>> >> >> >> >If I >> >> >> >> >>>> >run >> >> >> >> >>>> >> >> virsh, it lists them and they were running yesterday >> >> >even >> >> >> >> >though >> >> >> >> >>>> >the >> >> >> >> >>>> >> >HA was >> >> >> >> >>>> >> >> down. I've since restarted both hosts but the >> >broker >> >> >is >> >> >> >> >still >> >> >> >> >>>> >> >spitting out >> >> >> >> >>>> >> >> the same error (copied below). How do I find the >> >> >reason >> >> >> >the >> >> >> >> >>>> >broker >> >> >> >> >>>> >> >can't >> >> >> >> >>>> >> >> connect to the storage? The conf file is already at >> >> >DEBUG >> >> >> >> >>>> >verbosity: >> >> >> >> >>>> >> >> >> >> >> >> >>>> >> >> [handler_logfile] >> >> >> >> >>>> >> >> >> >> >> >> >>>> >> >> class=logging.handlers.TimedRotatingFileHandler >> >> >> >> >>>> >> >> >> >> >> >> >>>> >> >> args=('/var/log/ovirt-hosted-engine-ha/broker.log', >> >> >'d', >> >> >> >1, >> >> >> >> >7) >> >> >> >> >>>> >> >> >> >> >> >> >>>> >> >> level=DEBUG >> >> >> >> >>>> >> >> >> >> >> >> >>>> >> >> formatter=long >> >> >> >> >>>> >> >> >> >> >> >> >>>> >> >> And what are all these .prob-<num> files that are >> >being >> >> >> >> >created? >> >> >> >> >>>> >> >There >> >> >> >> >>>> >> >> are over 250K of them now on the mount I'm using for >> >> >the >> >> >> >Data >> >> >> >> >>>> >domain. >> >> >> >> >>>> >> >> They're all of 0 size and of the form, >> >> >> >> >>>> >> >> /rhev/data-center/mnt/nas-01.phoelex.com: >> >> >> >> >>>> >> >> >> >> >> >_volume2_vmstore/.prob-ffa867da-93db-4211-82df-b1b04a625ab9 >> >> >> >> >>>> >> >> >> >> >> >> >>>> >> >> @eevans: The volume I have the Data Domain on has >> >TB's >> >> >> >free. >> >> >> >> > The >> >> >> >> >>>> >HA >> >> >> >> >>>> >> >is >> >> >> >> >>>> >> >> dead so I can't ssh in. No idea what started these >> >> >errors >> >> >> >> >and the >> >> >> >> >>>> >> >other >> >> >> >> >>>> >> >> VMs were still running happily although they're on a >> >> >> >> >different >> >> >> >> >>>> >Data >> >> >> >> >>>> >> >Domain. 
>> >> >> >> >>>> >> >> >> >> >> >> >>>> >> >> Shareef. >> >> >> >> >>>> >> >> >> >> >> >> >>>> >> >> MainThread::INFO::2020-04-10 >> >> >> >> >>>> >> >> >> >> >> >> >>>> >> >> >> >> >> >>>> >> >> >> >> >> >>>> >> >> >> >> >>>> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >>>>>>07:45:00,408::storage_backends::373::ovirt_hosted_engine_ha.lib.storage_backends::(connect) >> >> >> >> >>>> >> >> Connecting the storage >> >> >> >> >>>> >> >> >> >> >> >> >>>> >> >> MainThread::INFO::2020-04-10 >> >> >> >> >>>> >> >> >> >> >> >> >>>> >> >> >> >> >> >>>> >> >> >> >> >> >>>> >> >> >> >> >>>> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >>>>>>07:45:00,408::storage_server::349::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) >> >> >> >> >>>> >> >> Connecting storage server >> >> >> >> >>>> >> >> >> >> >> >> >>>> >> >> MainThread::INFO::2020-04-10 >> >> >> >> >>>> >> >> >> >> >> >> >>>> >> >> >> >> >> >>>> >> >> >> >> >> >>>> >> >> >> >> >>>> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >>>>>>07:45:01,577::storage_server::356::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) >> >> >> >> >>>> >> >> Connecting storage server >> >> >> >> >>>> >> >> >> >> >> >> >>>> >> >> MainThread::INFO::2020-04-10 >> >> >> >> >>>> >> >> >> >> >> >> >>>> >> >> >> >> >> >>>> >> >> >> >> >> >>>> >> >> >> >> >>>> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >>>>>>07:45:02,692::storage_server::413::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) >> >> >> >> >>>> >> >> Refreshing the storage domain >> >> >> >> >>>> >> >> >> >> >> >> >>>> >> >> MainThread::WARNING::2020-04-10 >> >> >> >> >>>> >> >> >> >> >> >> >>>> >> >> >> >> >> >>>> >> >> >> >> >> >>>> >> >> >> >> >>>> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >>>>>>07:45:05,175::storage_broker::97::ovirt_hosted_engine_ha.broker.storage_broker.StorageBroker::(__init__) >> >> 
>> >> >>>> >> >> Can't connect vdsm storage: Command >> >> >StorageDomain.getInfo >> >> >> >> >with >> >> >> >> >>>> >args >> >> >> >> >>>> >> >> {'storagedomainID': >> >> >> >'a6cea67d-dbfb-45cf-a775-b4d0d47b26f2'} >> >> >> >> >>>> >failed: >> >> >> >> >>>> >> >> >> >> >> >> >>>> >> >> (code=350, message=Error in storage domain action: >> >> >> >> >>>> >> >> (u'sdUUID=a6cea67d-dbfb-45cf-a775-b4d0d47b26f2',)) >> >> >> >> >>>> >> >> >> >> >> >> >>>> >> >> On Thu, Apr 9, 2020 at 5:58 PM Strahil Nikolov >> >> >> >> >>>> >> ><hunter86_bg@yahoo.com> >> >> >> >> >>>> >> >> wrote: >> >> >> >> >>>> >> >> >> >> >> >> >>>> >> >>> On April 9, 2020 11:12:30 AM GMT+03:00, Shareef >> >Jalloq >> >> >< >> >> >> >> >>>> >> >>> shareef@jalloq.co.uk> wrote: >> >> >> >> >>>> >> >>> >OK, let's go through this. I'm looking at the >> >node >> >> >that >> >> >> >at >> >> >> >> >>>> >least >> >> >> >> >>>> >> >still >> >> >> >> >>>> >> >>> >has >> >> >> >> >>>> >> >>> >some VMs running. virsh also tells me that the >> >> >> >> >HostedEngine VM >> >> >> >> >>>> >is >> >> >> >> >>>> >> >>> >running >> >> >> >> >>>> >> >>> >but it's unresponsive and I can't shut it down. >> >> >> >> >>>> >> >>> > >> >> >> >> >>>> >> >>> >1. All storage domains exist and are mounted. >> >> >> >> >>>> >> >>> >2. The ha_agent exists: >> >> >> >> >>>> >> >>> > >> >> >> >> >>>> >> >>> >[root@ovirt-node-01 ovirt-hosted-engine-ha]# ls >> >> >> >> >>>> >> >/rhev/data-center/mnt/ >> >> >> >> >>>> >> >>> >nas-01.phoelex.com >> >> >> >> >>>> >> >>> >> >> >\:_volume2_vmstore/a6cea67d-dbfb-45cf-a775-b4d0d47b26f2/ >> >> >> >> >>>> >> >>> > >> >> >> >> >>>> >> >>> >dom_md ha_agent images master >> >> >> >> >>>> >> >>> > >> >> >> >> >>>> >> >>> >3. 
There are two links >> >> >> >> >>>> >> >>> > >> >> >> >> >>>> >> >>> >[root@ovirt-node-01 ovirt-hosted-engine-ha]# ll >> >> >> >> >>>> >> >/rhev/data-center/mnt/ >> >> >> >> >>>> >> >>> >nas-01.phoelex.com >> >> >> >> >>>> >> >>> >> >> >> >> >>>> >> >> >> >> >>>>\:_volume2_vmstore/a6cea67d-dbfb-45cf-a775-b4d0d47b26f2/ha_agent/ >> >> >> >> >>>> >> >>> > >> >> >> >> >>>> >> >>> >total 8 >> >> >> >> >>>> >> >>> > >> >> >> >> >>>> >> >>> >lrwxrwxrwx. 1 vdsm kvm 132 Apr 2 14:50 >> >> >> >> >hosted-engine.lockspace >> >> >> >> >>>> >-> >> >> >> >> >>>> >> >>> >> >> >> >> >>>> >> >>> >> >> >> >> >>>> >> >> >> >> >> >>>> >> >> >> >> >> >>>> >> >> >> >> >>>> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >>>>>>>/var/run/vdsm/storage/a6cea67d-dbfb-45cf-a775-b4d0d47b26f2/ffb90b82-42fe-4253-85d5-aaec8c280aaf/90e68791-0c6f-406a-89ac-e0d86c631604 >> >> >> >> >>>> >> >>> > >> >> >> >> >>>> >> >>> >lrwxrwxrwx. 1 vdsm kvm 132 Apr 2 14:50 >> >> >> >> >hosted-engine.metadata >> >> >> >> >>>> >-> >> >> >> >> >>>> >> >>> >> >> >> >> >>>> >> >>> >> >> >> >> >>>> >> >> >> >> >> >>>> >> >> >> >> >> >>>> >> >> >> >> >>>> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >>>>>>>/var/run/vdsm/storage/a6cea67d-dbfb-45cf-a775-b4d0d47b26f2/2161aed0-7250-4c1d-b667-ac94f60af17e/6b818e33-f80a-48cc-a59c-bba641e027d4 >> >> >> >> >>>> >> >>> > >> >> >> >> >>>> >> >>> >4. 
The services exist but all seem to have some >> >sort >> >> >of >> >> >> >> >warning: >> >> >> >> >>>> >> >>> > >> >> >> >> >>>> >> >>> >a) Apr 08 18:10:55 ovirt-node-01.phoelex.com >> >> >> >sanlock[1728]: >> >> >> >> >>>> >> >*2020-04-08 >> >> >> >> >>>> >> >>> >18:10:55 1744152 [36796]: s16 delta_renew long >> >write >> >> >> >time >> >> >> >> >10 >> >> >> >> >>>> >sec* >> >> >> >> >>>> >> >>> > >> >> >> >> >>>> >> >>> >b) Mar 23 18:02:59 ovirt-node-01.phoelex.com >> >> >> >> >supervdsmd[29409]: >> >> >> >> >>>> >> >*failed >> >> >> >> >>>> >> >>> >to >> >> >> >> >>>> >> >>> >load module nvdimm: libbd_nvdimm.so.2: cannot open >> >> >> >shared >> >> >> >> >object >> >> >> >> >>>> >> >file: >> >> >> >> >>>> >> >>> >No >> >> >> >> >>>> >> >>> >such file or directory* >> >> >> >> >>>> >> >>> > >> >> >> >> >>>> >> >>> >c) Apr 09 08:05:13 ovirt-node-01.phoelex.com >> >> >vdsm[4801]: >> >> >> >> >*ERROR >> >> >> >> >>>> >> >failed >> >> >> >> >>>> >> >>> >to >> >> >> >> >>>> >> >>> >retrieve Hosted Engine HA score '[Errno 2] No such >> >> >file >> >> >> >or >> >> >> >> >>>> >> >directory'Is >> >> >> >> >>>> >> >>> >the >> >> >> >> >>>> >> >>> >Hosted Engine setup finished?* >> >> >> >> >>>> >> >>> > >> >> >> >> >>>> >> >>> >d)Apr 08 22:48:27 ovirt-node-01.phoelex.com >> >> >> >> >libvirtd[29307]: >> >> >> >> >>>> >> >2020-04-08 >> >> >> >> >>>> >> >>> >22:48:27.134+0000: 29309: warning : >> >> >> >qemuGetProcessInfo:1404 >> >> >> >> >: >> >> >> >> >>>> >> >cannot >> >> >> >> >>>> >> >>> >parse >> >> >> >> >>>> >> >>> >process status data >> >> >> >> >>>> >> >>> > >> >> >> >> >>>> >> >>> >Apr 08 22:48:27 ovirt-node-01.phoelex.com >> >> >> >libvirtd[29307]: >> >> >> >> >>>> >> >2020-04-08 >> >> >> >> >>>> >> >>> >22:48:27.134+0000: 29309: error : >> >> >> >> >virNetDevTapInterfaceStats:764 >> >> >> >> >>>> >: >> >> >> >> >>>> >> >>> >internal >> >> >> >> >>>> >> >>> >error: /proc/net/dev: Interface not found >> >> >> >> >>>> >> >>> > >> >> >> >> >>>> >> >>> >Apr 08 
23:09:39 ovirt-node-01.phoelex.com libvirtd[29307]: 2020-04-08 23:09:39.844+0000: 29307: error : virNetSocketReadWire:1806 : End of file while reading data: Input/output error

Apr 09 01:05:26 ovirt-node-01.phoelex.com libvirtd[29307]: 2020-04-09 01:05:26.660+0000: 29307: error : virNetSocketReadWire:1806 : End of file while reading data: Input/output error

5 & 6. The broker log is continually printing this error:

MainThread::INFO::2020-04-09 08:07:31,438::broker::47::ovirt_hosted_engine_ha.broker.broker.Broker::(run) ovirt-hosted-engine-ha broker 2.3.6 started
MainThread::DEBUG::2020-04-09 08:07:31,438::broker::55::ovirt_hosted_engine_ha.broker.broker.Broker::(run) Running broker
MainThread::DEBUG::2020-04-09 08:07:31,438::broker::120::ovirt_hosted_engine_ha.broker.broker.Broker::(_get_monitor) Starting monitor
MainThread::INFO::2020-04-09 08:07:31,438::monitor::40::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Searching for submonitors in /usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/submonitors
MainThread::INFO::2020-04-09 08:07:31,439::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor network
MainThread::INFO::2020-04-09 08:07:31,440::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor cpu-load-no-engine
MainThread::INFO::2020-04-09 08:07:31,441::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mgmt-bridge
MainThread::INFO::2020-04-09 08:07:31,441::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor network
MainThread::INFO::2020-04-09 08:07:31,441::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor cpu-load
MainThread::INFO::2020-04-09 08:07:31,441::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor engine-health
MainThread::INFO::2020-04-09 08:07:31,442::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mgmt-bridge
MainThread::INFO::2020-04-09 08:07:31,442::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor cpu-load-no-engine
MainThread::INFO::2020-04-09 08:07:31,443::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor cpu-load
MainThread::INFO::2020-04-09 08:07:31,443::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mem-free
MainThread::INFO::2020-04-09 08:07:31,443::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor storage-domain
MainThread::INFO::2020-04-09 08:07:31,443::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor storage-domain
MainThread::INFO::2020-04-09 08:07:31,443::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mem-free
MainThread::INFO::2020-04-09 08:07:31,444::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor engine-health
MainThread::INFO::2020-04-09 08:07:31,444::monitor::50::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Finished loading submonitors
MainThread::DEBUG::2020-04-09 08:07:31,444::broker::128::ovirt_hosted_engine_ha.broker.broker.Broker::(_get_storage_broker) Starting storage broker
MainThread::DEBUG::2020-04-09 08:07:31,444::storage_backends::369::ovirt_hosted_engine_ha.lib.storage_backends::(connect) Connecting to VDSM
MainThread::DEBUG::2020-04-09 08:07:31,444::util::384::ovirt_hosted_engine_ha.lib.storage_backends::(__log_debug) Creating a new json-rpc connection to VDSM
Client localhost:54321::DEBUG::2020-04-09 08:07:31,453::concurrent::258::root::(run) START thread <Thread(Client localhost:54321, started daemon 139992488138496)> (func=<bound method Reactor.process_requests of <yajsonrpc.betterAsyncore.Reactor object at 0x7f528acabc90>>, args=(), kwargs={})
Client localhost:54321::DEBUG::2020-04-09 08:07:31,459::stompclient::138::yajsonrpc.protocols.stomp.AsyncClient::(_process_connected) Stomp connection established
MainThread::DEBUG::2020-04-09 08:07:31,467::stompclient::294::jsonrpc.AsyncoreClient::(send) Sending response
MainThread::INFO::2020-04-09 08:07:31,530::storage_backends::373::ovirt_hosted_engine_ha.lib.storage_backends::(connect) Connecting the storage
MainThread::INFO::2020-04-09 08:07:31,531::storage_server::349::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Connecting storage server
MainThread::DEBUG::2020-04-09 08:07:31,531::stompclient::294::jsonrpc.AsyncoreClient::(send) Sending response
MainThread::DEBUG::2020-04-09 08:07:31,534::stompclient::294::jsonrpc.AsyncoreClient::(send) Sending response
MainThread::DEBUG::2020-04-09 08:07:32,199::storage_server::158::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(_validate_pre_connected_path) Storage domain a6cea67d-dbfb-45cf-a775-b4d0d47b26f2 is not available
MainThread::INFO::2020-04-09 08:07:32,199::storage_server::356::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Connecting storage server
MainThread::DEBUG::2020-04-09 08:07:32,199::stompclient::294::jsonrpc.AsyncoreClient::(send) Sending response
MainThread::DEBUG::2020-04-09 08:07:32,814::storage_server::363::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) [{u'status': 0, u'id': u'e29cf818-5ee5-46e1-85c1-8aeefa33e95d'}]
MainThread::INFO::2020-04-09 08:07:32,814::storage_server::413::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Refreshing the storage domain
MainThread::DEBUG::2020-04-09 08:07:32,815::stompclient::294::jsonrpc.AsyncoreClient::(send) Sending response
MainThread::DEBUG::2020-04-09 08:07:33,129::storage_server::420::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Error refreshing storage domain: Command StorageDomain.getStats with args {'storagedomainID': 'a6cea67d-dbfb-45cf-a775-b4d0d47b26f2'} failed:
(code=350, message=Error in storage domain action: (u'sdUUID=a6cea67d-dbfb-45cf-a775-b4d0d47b26f2',))
MainThread::DEBUG::2020-04-09 08:07:33,130::stompclient::294::jsonrpc.AsyncoreClient::(send) Sending response
MainThread::DEBUG::2020-04-09 08:07:33,795::storage_backends::208::ovirt_hosted_engine_ha.lib.storage_backends::(_get_sector_size) Command StorageDomain.getInfo with args {'storagedomainID': 'a6cea67d-dbfb-45cf-a775-b4d0d47b26f2'} failed:
(code=350, message=Error in storage domain action: (u'sdUUID=a6cea67d-dbfb-45cf-a775-b4d0d47b26f2',))
MainThread::WARNING::2020-04-09 08:07:33,795::storage_broker::97::ovirt_hosted_engine_ha.broker.storage_broker.StorageBroker::(__init__) Can't connect vdsm storage: Command StorageDomain.getInfo with args {'storagedomainID': 'a6cea67d-dbfb-45cf-a775-b4d0d47b26f2'} failed:
(code=350, message=Error in storage domain action: (u'sdUUID=a6cea67d-dbfb-45cf-a775-b4d0d47b26f2',))

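With this much startup chatter it is easy to miss the lines that matter. A minimal Python sketch for sifting a broker.log follows. It assumes the header layout seen in this excerpt (thread::LEVEL::timestamp::module::line::logger::(function) message) and that a line without such a header continues the previous entry; the names `HEADER`, `Entry`, `parse` and `problems` are made up for illustration and are not part of oVirt. Since some failures in the excerpt are logged at DEBUG level, the filter also keeps any entry whose message contains "failed".

```python
import re
from typing import Iterable, Iterator, List, NamedTuple

# Header layout inferred from the excerpt in this thread (an assumption):
# thread::LEVEL::YYYY-MM-DD HH:MM:SS,mmm::module::lineno::logger::(function) message
HEADER = re.compile(
    r"^(?P<thread>.+?)::(?P<level>[A-Z]+)::"
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})::"
    r"(?P<module>\w+)::(?P<lineno>\d+)::"
    r"(?P<logger>[\w.]+)::\((?P<function>\w+)\) ?(?P<message>.*)$"
)


class Entry(NamedTuple):
    level: str
    timestamp: str
    function: str
    message: str


def parse(lines: Iterable[str]) -> Iterator[Entry]:
    """Yield one Entry per log record, folding continuation lines into it."""
    current = None
    for raw in lines:
        line = raw.rstrip("\n")
        match = HEADER.match(line)
        if match:
            if current is not None:
                yield current
            current = Entry(match["level"], match["timestamp"],
                            match["function"], match["message"])
        elif current is not None and line.strip():
            # No header: treat the line as a continuation, e.g. "(code=350, ...)".
            current = current._replace(
                message=current.message + " " + line.strip())
    if current is not None:
        yield current


def problems(lines: Iterable[str]) -> List[Entry]:
    """Keep WARNING/ERROR entries plus anything that mentions a failure."""
    return [e for e in parse(lines)
            if e.level in ("WARNING", "ERROR") or "failed" in e.message]


if __name__ == "__main__":
    import sys
    # Usage: python sift_broker_log.py /path/to/broker.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for entry in problems(handle):
            print(entry.timestamp, entry.level, entry.function, entry.message)
```

Run against the excerpt above, this should leave only the getStats/getInfo failures and the final "Can't connect vdsm storage" warning, all of which point at the same storage domain UUID.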
The UUID it is moaning about is indeed the one that the HA sits on and is the one I listed the contents of in step 2 above.

So why can't it see this domain?

Thanks, Shareef.

On Thu, Apr 9, 2020 at 6:12 AM Strahil Nikolov <hunter86_bg@yahoo.com> wrote:

On April 9, 2020 1:51:05 AM GMT+03:00, Shareef Jalloq <shareef@jalloq.co.uk> wrote:

Don't know if this is useful or not, but I just tried to shut down and start another VM on one of the hosts and got the following error:

virsh # start scratch
error: Failed to start domain scratch
error: Network not found: no network with matching name 'vdsm-ovirtmgmt'

Is this not referring to the interface name, as the network is called 'ovirtmgmt'?

On Wed, Apr 8, 2020 at 11:35 PM Shareef Jalloq <shareef@jalloq.co.uk> wrote:

Hmmm, virsh tells me the HE is running but it hasn't come up and the agent.log is full of the same errors.

On Wed, Apr 8, 2020 at 11:31 PM Shareef Jalloq <shareef@jalloq.co.uk> wrote:

Ah hah! Ok, so I've managed to start it using virsh on the second host but my first host is still dead.

First of all, what are these 56,317 .prob- files that get dumped to the NFS mounts?

Secondly, why doesn't the node mount the NFS directories at boot? Is that the issue with this particular node?

On Wed, Apr 8, 2020 at 11:12 PM <eevans@digitaldatatechs.com> wrote:

Did you try virsh list --inactive

Eric Evans
Digital Data Services LLC.
304.660.9080

*From:* Shareef Jalloq <shareef@jalloq.co.uk>
*Sent:* Wednesday, April 8, 2020 5:58 PM
*To:* Strahil Nikolov <hunter86_bg@yahoo.com>
*Cc:* Ovirt Users <users@ovirt.org>
*Subject:* [ovirt-users] Re: ovirt-engine unresponsive - how to rescue?

I've now shut down the VMs on one host and rebooted it but the agent service doesn't start. If I run 'hosted-engine --vm-status' I get:

The hosted engine configuration has not been retrieved from shared storage. Please ensure that ovirt-ha-agent is running and the storage server is reachable.

and indeed if I list the mounts under /rhev/data-center/mnt, only one of the directories is mounted. I have 3 NFS mounts, one ISO Domain and two Data Domains. Only one Data Domain has mounted and this has lots of .prob files in. So why haven't the other NFS exports been mounted?

Manually mounting them doesn't seem to have helped much either. I can start the broker service but the agent service says no. Same error as the one in my last email.

Shareef.

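The "which of these directories is really mounted" check described above can be scripted. This is only a sketch under stated assumptions: it takes /rhev/data-center/mnt as the root because that is the path quoted in the mail, it counts stray files only at the top level of each directory, and it treats any name starting with ".prob" as one of the files mentioned in this thread. The function name `mount_report` is hypothetical. Be aware that a hung NFS mount can make these calls block.

```python
import os
from pathlib import Path
from typing import List, Optional, Tuple


def mount_report(root: str = "/rhev/data-center/mnt",
                 marker: str = ".prob") -> List[Tuple[str, bool, Optional[int]]]:
    """Return (directory name, is it a mount point, count of marker files).

    The count is None when the directory cannot be listed.
    """
    report = []
    for entry in sorted(Path(root).iterdir()):
        if not entry.is_dir():
            continue
        mounted = os.path.ismount(entry)
        try:
            # Top level only; assumes the .prob files sit directly in the export.
            stray = sum(1 for child in entry.iterdir()
                        if child.name.startswith(marker))
        except OSError:
            stray = None
        report.append((entry.name, mounted, stray))
    return report


if __name__ == "__main__":
    for name, mounted, stray in mount_report():
        state = "mounted" if mounted else "NOT mounted"
        print(f"{name}: {state}, {stray} {'.prob'}* files")
```

A directory that shows as not mounted here is just an empty mount point left behind, which would match the symptom of only one Data Domain being available.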
On Wed, Apr 8, 2020 at 9:57 PM Shareef Jalloq <shareef@jalloq.co.uk> wrote:

Right, still down. I've run virsh and it doesn't know anything about the engine vm.

I've restarted the broker and agent services and I still get nothing in virsh->list.

In the logs under /var/log/ovirt-hosted-engine-ha I see lots of errors:

broker.log:

MainThread::INFO::2020-04-08 20:56:20,138::broker::47::ovirt_hosted_engine_ha.broker.broker.Broker::(run) ovirt-hosted-engine-ha broker 2.3.6 started
MainThread::INFO::2020-04-08 20:56:20,138::monitor::40::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Searching for submonitors in /usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/submonitors
MainThread::INFO::2020-04-08 20:56:20,138::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor network
MainThread::INFO::2020-04-08 20:56:20,140::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor cpu-load-no-engine
MainThread::INFO::2020-04-08 20:56:20,140::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mgmt-bridge
MainThread::INFO::2020-04-08 20:56:20,141::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor network
MainThread::INFO::2020-04-08 20:56:20,141::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor cpu-load
MainThread::INFO::2020-04-08 20:56:20,141::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor engine-health
MainThread::INFO::2020-04-08 20:56:20,141::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mgmt-bridge
MainThread::INFO::2020-04-08 20:56:20,142::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor cpu-load-no-engine
MainThread::INFO::2020-04-08 20:56:20,142::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor cpu-load
MainThread::INFO::2020-04-08 20:56:20,142::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mem-free
MainThread::INFO::2020-04-08 20:56:20,143::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor storage-domain
MainThread::INFO::2020-04-08 20:56:20,143::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor storage-domain
MainThread::INFO::2020-04-08 20:56:20,143::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mem-free
MainThread::INFO::2020-04-08 20:56:20,143::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor engine-health
MainThread::INFO::2020-04-08 20:56:20,143::monitor::50::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Finished loading submonitors
MainThread::INFO::2020-04-08 20:56:20,197::storage_backends::373::ovirt_hosted_engine_ha.lib.storage_backends::(connect) Connecting the storage
MainThread::INFO::2020-04-08 20:56:20,197::storage_server::349::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Connecting storage server
MainThread::INFO::2020-04-08 20:56:20,414::storage_server::356::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Connecting storage server
MainThread::INFO::2020-04-08 20:56:20,628::storage_server::413::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Refreshing the storage domain
MainThread::WARNING::2020-04-08 20:56:21,057::storage_broker::97::ovirt_hosted_engine_ha.broker.storage_broker.StorageBroker::(__init__) Can't connect vdsm storage: Command StorageDomain.getInfo with args {'storagedomainID': 'a6cea67d-dbfb-45cf-a775-b4d0d47b26f2'} failed:
(code=350, message=Error in storage domain action: (u'sdUUID=a6cea67d-dbfb-45cf-a775-b4d0d47b26f2',))
MainThread::INFO::2020-04-08
>> >> >> >> >> >>>> >> >>> >> >> >> >> >> >>>> >> >>> >> >> >> >> >>>> >> >>> >> >> >> >> >>>> >> >> >> >> >> >>>> >> >> >> >> >> >>>> >> >> >> >> >>>> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >>>>>>>>20:56:21,901::broker::47::ovirt_hosted_engine_ha.broker.broker.Broker::(run) >> >> >> >> >>>> >> >>> >> >>>> ovirt-hosted-engine-ha broker 2.3.6 started >> >> >> >> >>>> >> >>> >> >>>> >> >> >> >> >>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 >> >> >> >> >>>> >> >>> >> >>>> >> >> >> >> >>>> >> >>> >> >> >> >> >> >>>> >> >>> >> >> >> >> >> >>>> >> >>> >> >> >> >> >>>> >> >>> >> >> >> >> >>>> >> >> >> >> >> >>>> >> >> >> >> >> >>>> >> >> >> >> >>>> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >>>>>>>>20:56:21,901::monitor::40::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) >> >> >> >> >>>> >> >>> >> >>>> Searching for submonitors in >> >> >> >> >>>> >> >>> >> >>>> >> >> >> >> >>>> >> >>> >> >> >> >> >> >>>> >> >>> >> >> >> >> >>>> >> >>> >> >> >> >> >>>> >> >> >> >> >> >>>> >> >> >> >> >> >>>> >> >> >> >> >>>> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >>>>>>>>/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/submonitors >> >> >> >> >>>> >> >>> >> >>>> >> >> >> >> >>>> >> >>> >> >>>> >> >> >> >> >>>> >> >>> >> >>>> >> >> >> >> >>>> >> >>> >> >>>> agent.log: >> >> >> >> >>>> >> >>> >> >>>> >> >> >> >> >>>> >> >>> >> >>>> >> >> >> >> >>>> >> >>> >> >>>> >> >> >> >> >>>> >> >>> >> >>>> MainThread::ERROR::2020-04-08 >> >> >> >> >>>> >> >>> >> >>>> >> >> >> >> >>>> >> >>> >> >> >> >> >> >>>> >> >>> >> >> >> >> >> >>>> >> >>> >> >> >> >> >>>> >> >>> >> >> >> >> >>>> >> >> >> >> >> >>>> >> >> >> >> >> >>>> >> >> >> >> >>>> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >>>>>>>>20:57:00,799::agent::145::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) >> >> >> >> >>>> >> >>> >> >>>> Trying to restart agent >> >> >> >> >>>> >> >>> >> >>>> >> >> >> >> >>>> >> >>> >> >>>> 
MainThread::INFO::2020-04-08 >> >> >> >> >>>> >> >>> >> >>>> >> >> >> >> >>>> >> >>> >> >> >> >> >> >>>> >> >>> >> >> >> >> >>>> >> >> >> >> >> >>>> >> >> >> >> >>>> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >>>>>>>>20:57:00,799::agent::89::ovirt_hosted_engine_ha.agent.agent.Agent::(run) >> >> >> >> >>>> >> >>> >> >>>> Agent shutting down >> >> >> >> >>>> >> >>> >> >>>> >> >> >> >> >>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 >> >> >> >> >>>> >> >>> >> >>>> >> >> >> >> >>>> >> >>> >> >> >> >> >> >>>> >> >>> >> >> >> >> >>>> >> >> >> >> >> >>>> >> >> >> >> >>>> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >>>>>>>>20:57:11,144::agent::67::ovirt_hosted_engine_ha.agent.agent.Agent::(run) >> >> >> >> >>>> >> >>> >> >>>> ovirt-hosted-engine-ha agent 2.3.6 started >> >> >> >> >>>> >> >>> >> >>>> >> >> >> >> >>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 >> >> >> >> >>>> >> >>> >> >>>> >> >> >> >> >>>> >> >>> >> >> >> >> >> >>>> >> >>> >> >> >> >> >> >>>> >> >>> >> >> >> >> >>>> >> >>> >> >> >> >> >>>> >> >> >> >> >> >>>> >> >> >> >> >> >>>> >> >> >> >> >>>> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >>>>>>>>20:57:11,182::hosted_engine::234::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_hostname) >> >> >> >> >>>> >> >>> >> >>>> Found certificate common name: >> >> >> >> >ovirt-node-01.phoelex.com >> >> >> >> >>>> >> >>> >> >>>> >> >> >> >> >>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 >> >> >> >> >>>> >> >>> >> >>>> >> >> >> >> >>>> >> >>> >> >> >> >> >> >>>> >> >>> >> >> >> >> >> >>>> >> >>> >> >> >> >> >>>> >> >>> >> >> >> >> >>>> >> >> >> >> >> >>>> >> >> >> >> >> >>>> >> >> >> >> >>>> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >>>>>>>>20:57:11,294::hosted_engine::543::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_broker) >> >> >> >> >>>> >> >>> >> >>>> Initializing ha-broker connection >> >> >> >> >>>> >> >>> >> >>>> >> >> >> >> >>>> >> >>> >> >>>> 
MainThread::INFO::2020-04-08 >> >> >> >> >>>> >> >>> >> >>>> >> >> >> >> >>>> >> >>> >> >> >> >> >> >>>> >> >>> >> >> >> >> >> >>>> >> >>> >> >> >> >> >>>> >> >>> >> >> >> >> >>>> >> >> >> >> >> >>>> >> >> >> >> >> >>>> >> >> >> >> >>>> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >>>>>>>>20:57:11,296::brokerlink::80::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) >> >> >> >> >>>> >> >>> >> >>>> Starting monitor network, options >> >> >> >{'tcp_t_address': >> >> >> >> >'', >> >> >> >> >>>> >> >>> >> >'network_test': >> >> >> >> >>>> >> >>> >> >>>> 'dns', 'tcp_t_port': '', 'addr': >> >> >'192.168.1.99'} >> >> >> >> >>>> >> >>> >> >>>> >> >> >> >> >>>> >> >>> >> >>>> MainThread::ERROR::2020-04-08 >> >> >> >> >>>> >> >>> >> >>>> >> >> >> >> >>>> >> >>> >> >> >> >> >> >>>> >> >>> >> >> >> >> >> >>>> >> >>> >> >> >> >> >>>> >> >>> >> >> >> >> >>>> >> >> >> >> >> >>>> >> >> >> >> >> >>>> >> >> >> >> >>>> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >>>>>>>>20:57:11,296::hosted_engine::559::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_broker) >> >> >> >> >>>> >> >>> >> >>>> Failed to start necessary monitors >> >> >> >> >>>> >> >>> >> >>>> >> >> >> >> >>>> >> >>> >> >>>> MainThread::ERROR::2020-04-08 >> >> >> >> >>>> >> >>> >> >>>> >> >> >> >> >>>> >> >>> >> >> >> >> >> >>>> >> >>> >> >> >> >> >> >>>> >> >>> >> >> >> >> >>>> >> >>> >> >> >> >> >>>> >> >> >> >> >> >>>> >> >> >> >> >> >>>> >> >> >> >> >>>> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >>>>>>>>20:57:11,297::agent::144::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) >> >> >> >> >>>> >> >>> >> >>>> Traceback (most recent call last): >> >> >> >> >>>> >> >>> >> >>>> >> >> >> >> >>>> >> >>> >> >>>> File >> >> >> >> >>>> >> >>> >> >>>> >> >> >> >> >>>> >> >>> >> >> >> >> >> >>>> >> >>> >> >> >> >> >>>> >> >>> >> >> >> >> >>>> >> >> >> >> >> >>>> >> >> >> >> >> >>>> >> >> >> >> >>>> >> >> >> >> >> >> >> >> >> >> >> >> >> 
>> >> >> >> >> >> >> >>>>>>>>"/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", >> >> >> >> >>>> >> >>> >> >>>> line 131, in _run_agent >> >> >> >> >>>> >> >>> >> >>>> >> >> >> >> >>>> >> >>> >> >>>> return action(he) >> >> >> >> >>>> >> >>> >> >>>> >> >> >> >> >>>> >> >>> >> >>>> File >> >> >> >> >>>> >> >>> >> >>>> >> >> >> >> >>>> >> >>> >> >> >> >> >> >>>> >> >>> >> >> >> >> >>>> >> >>> >> >> >> >> >>>> >> >> >> >> >> >>>> >> >> >> >> >> >>>> >> >> >> >> >>>> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >>>>>>>>"/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", >> >> >> >> >>>> >> >>> >> >>>> line 55, in action_proper >> >> >> >> >>>> >> >>> >> >>>> >> >> >> >> >>>> >> >>> >> >>>> return he.start_monitoring() >> >> >> >> >>>> >> >>> >> >>>> >> >> >> >> >>>> >> >>> >> >>>> File >> >> >> >> >>>> >> >>> >> >>>> >> >> >> >> >>>> >> >>> >> >> >> >> >> >>>> >> >>> >> >> >> >> >> >>>> >> >>> >> >> >> >> >>>> >> >>> >> >> >> >> >>>> >> >> >> >> >> >>>> >> >> >> >> >> >>>> >> >> >> >> >>>> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >>>>>>>>"/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", >> >> >> >> >>>> >> >>> >> >>>> line 432, in start_monitoring >> >> >> >> >>>> >> >>> >> >>>> >> >> >> >> >>>> >> >>> >> >>>> self._initialize_broker() >> >> >> >> >>>> >> >>> >> >>>> >> >> >> >> >>>> >> >>> >> >>>> File >> >> >> >> >>>> >> >>> >> >>>> >> >> >> >> >>>> >> >>> >> >> >> >> >> >>>> >> >>> >> >> >> >> >> >>>> >> >>> >> >> >> >> >>>> >> >>> >> >> >> >> >>>> >> >> >> >> >> >>>> >> >> >> >> >> >>>> >> >> >> >> >>>> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >>>>>>>>"/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", >> >> >> >> >>>> >> >>> >> >>>> line 556, in _initialize_broker >> >> >> >> >>>> >> >>> >> >>>> >> >> >> >> >>>> >> >>> >> >>>> m.get('options', {})) >> >> >> >> >>>> >> >>> >> >>>> >> >> >> >> >>>> >> >>> >> 
>>>> File >> >> >> >> >>>> >> >>> >> >>>> >> >> >> >> >>>> >> >>> >> >> >> >> >> >>>> >> >>> >> >> >> >> >> >>>> >> >>> >> >> >> >> >>>> >> >>> >> >> >> >> >>>> >> >> >> >> >> >>>> >> >> >> >> >> >>>> >> >> >> >> >>>> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >>>>>>>>"/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", >> >> >> >> >>>> >> >>> >> >>>> line 89, in start_monitor >> >> >> >> >>>> >> >>> >> >>>> >> >> >> >> >>>> >> >>> >> >>>> ).format(t=type, o=options, e=e) >> >> >> >> >>>> >> >>> >> >>>> >> >> >> >> >>>> >> >>> >> >>>> RequestError: brokerlink - failed to start >> >> >> >monitor >> >> >> >> >via >> >> >> >> >>>> >> >>> >> >ovirt-ha-broker: >> >> >> >> >>>> >> >>> >> >>>> [Errno 2] No such file or directory, >> >[monitor: >> >> >> >> >'network', >> >> >> >> >>>> >> >>> >options: >> >> >> >> >>>> >> >>> >> >>>> {'tcp_t_address': '', 'network_test': >> >'dns', >> >> >> >> >>>> >'tcp_t_port': >> >> >> >> >>>> >> >'', >> >> >> >> >>>> >> >>> >> >'addr': >> >> >> >> >>>> >> >>> >> >>>> '192.168.1.99'}] >> >> >> >> >>>> >> >>> >> >>>> >> >> >> >> >>>> >> >>> >> >>>> >> >> >> >> >>>> >> >>> >> >>>> >> >> >> >> >>>> >> >>> >> >>>> MainThread::ERROR::2020-04-08 >> >> >> >> >>>> >> >>> >> >>>> >> >> >> >> >>>> >> >>> >> >> >> >> >> >>>> >> >>> >> >> >> >> >> >>>> >> >>> >> >> >> >> >>>> >> >>> >> >> >> >> >>>> >> >> >> >> >> >>>> >> >> >> >> >> >>>> >> >> >> >> >>>> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >>>>>>>>20:57:11,297::agent::145::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) >> >> >> >> >>>> >> >>> >> >>>> Trying to restart agent >> >> >> >> >>>> >> >>> >> >>>> >> >> >> >> >>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 >> >> >> >> >>>> >> >>> >> >>>> >> >> >> >> >>>> >> >>> >> >> >> >> >> >>>> >> >>> >> >> >> >> >>>> >> >> >> >> >> >>>> >> >> >> >> >>>> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> 
>>>>>>>>20:57:11,297::agent::89::ovirt_hosted_engine_ha.agent.agent.Agent::(run) >> >> >> >> >>>> >> >>> >> >>>> Agent shutting down >> >> >> >> >>>> >> >>> >> >>>> >> >> >> >> >>>> >> >>> >> >>>> >> >> >> >> >>>> >> >>> >> >>>> >> >> >> >> >>>> >> >>> >> >>>> On Wed, Apr 8, 2020 at 6:10 PM Strahil >> >Nikolov >> >> >> >> >>>> >> >>> >> ><hunter86_bg@yahoo.com> >> >> >> >> >>>> >> >>> >> >>>> wrote: >> >> >> >> >>>> >> >>> >> >>>> >> >> >> >> >>>> >> >>> >> >>>> On April 8, 2020 7:47:20 PM GMT+03:00, >> >"Maton, >> >> >> >> >Brett" < >> >> >> >> >>>> >> >>> >> >>>> matonb@ltresources.co.uk> wrote: >> >> >> >> >>>> >> >>> >> >>>> >On the host you tried to restart the >> >engine >> >> >on: >> >> >> >> >>>> >> >>> >> >>>> > >> >> >> >> >>>> >> >>> >> >>>> >Add an alias to virsh (authenticates with >> >> >> >> >>>> >virsh_auth.conf) >> >> >> >> >>>> >> >>> >> >>>> > >> >> >> >> >>>> >> >>> >> >>>> >alias virsh='virsh -c >> >> >> >> >>>> >> >>> >> >>>> >> >> >> >> >>>> >> >>> >> >> >> >> >>>> >> >> >> >> >> >> >> >>>>>>qemu:///system?authfile=/etc/ovirt-hosted-engine/virsh_auth.conf' >> >> >> >> >>>> >> >>> >> >>>> > >> >> >> >> >>>> >> >>> >> >>>> >Then run virsh: >> >> >> >> >>>> >> >>> >> >>>> > >> >> >> >> >>>> >> >>> >> >>>> >virsh >> >> >> >> >>>> >> >>> >> >>>> > >> >> >> >> >>>> >> >>> >> >>>> >virsh # list >> >> >> >> >>>> >> >>> >> >>>> > Id Name >> >State >> >> >> >> >>>> >> >>> >> >>>> >> >> >> >> >>---------------------------------------------------- >> >> >> >> >>>> >> >>> >> >>>> > xx HostedEngine >> >Paused >> >> >> >> >>>> >> >>> >> >>>> > xx ********** >> >running >> >> >> >> >>>> >> >>> >> >>>> > ... 
>> >> >> >> >>>> >> >>> >> >>>> > xx ********** >> >> >running >> >> >> >> >>>> >> >>> >> >>>> > >> >> >> >> >>>> >> >>> >> >>>> >HostedEngine should be in the list, try >> >and >> >> >> >resume >> >> >> >> >the >> >> >> >> >>>> >> >engine: >> >> >> >> >>>> >> >>> >> >>>> > >> >> >> >> >>>> >> >>> >> >>>> >virsh # resume HostedEngine >> >> >> >> >>>> >> >>> >> >>>> > >> >> >> >> >>>> >> >>> >> >>>> >On Wed, 8 Apr 2020 at 17:28, Shareef >> >Jalloq >> >> >> >> >>>> >> >>> ><shareef@jalloq.co.uk> >> >> >> >> >>>> >> >>> >> >>>> >wrote: >> >> >> >> >>>> >> >>> >> >>>> > >> >> >> >> >>>> >> >>> >> >>>> >> Thanks! >> >> >> >> >>>> >> >>> >> >>>> >> >> >> >> >> >>>> >> >>> >> >>>> >> The status hangs due to, I guess, the VM >> >> >being >> >> >> >> >>>> >down.... >> >> >> >> >>>> >> >>> >> >>>> >> >> >> >> >> >>>> >> >>> >> >>>> >> [root@ovirt-node-01 ~]# hosted-engine >> >> >> >--vm-start >> >> >> >> >>>> >> >>> >> >>>> >> VM exists and is down, cleaning up and >> >> >> >restarting >> >> >> >> >>>> >> >>> >> >>>> >> VM in WaitForLaunch >> >> >> >> >>>> >> >>> >> >>>> >> >> >> >> >> >>>> >> >>> >> >>>> >> but this doesn't seem to do anything. >> >OK, >> >> >> >after >> >> >> >> >a >> >> >> >> >>>> >while >> >> >> >> >>>> >> >I >> >> >> >> >>>> >> >>> >get a >> >> >> >> >>>> >> >>> >> >>>> >status of >> >> >> >> >>>> >> >>> >> >>>> >> it being barfed... 

--== Host ovirt-node-00.phoelex.com (id: 1) status ==--

conf_on_shared_storage             : True
Status up-to-date                  : False
Hostname                           : ovirt-node-00.phoelex.com
Host ID                            : 1
Engine status                      : unknown stale-data
Score                              : 3400
stopped                            : False
Local maintenance                  : False
crc32                              : 9c4a034b
local_conf_timestamp               : 523362
Host timestamp                     : 523608
Extra metadata (valid at timestamp):
    metadata_parse_version=1
    metadata_feature_version=1
    timestamp=523608 (Wed Apr 8 16:17:11 2020)
    host-id=1
    score=3400
    vm_conf_refresh_time=523362 (Wed Apr 8 16:13:06 2020)
    conf_on_shared_storage=True
    maintenance=False
    state=EngineDown
    stopped=False


--== Host ovirt-node-01.phoelex.com (id: 2) status ==--

conf_on_shared_storage             : True
Status up-to-date                  : True
Hostname                           : ovirt-node-01.phoelex.com
Host ID                            : 2
Engine status                      : {"reason": "bad vm status", "health": "bad", "vm": "down_unexpected", "detail": "Down"}
Score                              : 0
stopped                            : False
Local maintenance                  : False
crc32                              : 5045f2eb
local_conf_timestamp               : 1737037
Host timestamp                     : 1737283
Extra metadata (valid at timestamp):
    metadata_parse_version=1
    metadata_feature_version=1
    timestamp=1737283 (Wed Apr 8 16:16:17 2020)
    host-id=2
    score=0
    vm_conf_refresh_time=1737037 (Wed Apr 8 16:12:11 2020)
    conf_on_shared_storage=True
    maintenance=False
    state=EngineUnexpectedlyDown
    stopped=False

On Wed, Apr 8, 2020 at 5:09 PM Maton, Brett <matonb@ltresources.co.uk> wrote:

First steps, on one of your hosts as root:

To get information:
hosted-engine --vm-status

To start the engine:
hosted-engine --vm-start

On Wed, 8 Apr 2020 at 17:00, Shareef Jalloq <shareef@jalloq.co.uk> wrote:

So my engine has gone down and I can't ssh into it either. If I try to log into the web-ui of the node it is running on, I get redirected because the node can't reach the engine.

What are my next steps?

Shareef.
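
The `hosted-engine --vm-status` output quoted above is long, and the fields that matter for this problem are "Status up-to-date" and "Engine status". As a rough sketch only (the function name `summarize_he_status` is made up for illustration and is not part of oVirt), the text output could be condensed to one line per host like this, assuming the field labels look the way they do in the output pasted in this thread:

```shell
#!/bin/sh
# Hypothetical helper, not an oVirt tool: reads the plain-text output of
# 'hosted-engine --vm-status' on stdin and prints one summary line per host.
summarize_he_status() {
    awk '
        # Return everything after the first colon, with surrounding blanks removed.
        function value(line) {
            sub(/^[^:]*:[ \t]*/, "", line)
            sub(/[ \t]+$/, "", line)
            return line
        }
        /^Status up-to-date/ { fresh = value($0) }
        /^Hostname/          { host  = value($0) }
        # "Engine status" follows the two fields above in each host block,
        # so both values are known by the time this line is seen.
        /^Engine status/     { printf "%s up-to-date=%s engine=%s\n", host, fresh, value($0) }
    '
}

# Usage on a host (needs root):
#   hosted-engine --vm-status | summarize_he_status
```

A host reporting `up-to-date=False` together with `engine=unknown stale-data` is the case discussed in the replies: its HA agent has stopped refreshing its metadata on the shared storage.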
_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-leave@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/W7BP57OCIRSW5CDRQWR5MIKJUH3ISLCQ/

This has to be resolved:

Engine status                      : unknown stale-data

Run 'hosted-engine --vm-status' again. If it remains the same, restart ovirt-ha-broker.service & ovirt-ha-agent.service.

Verify that the engine's storage is available.
Then monitor the broker & agent logs in /var/log/ovirt-hosted-engine-ha

Best Regards,
Strahil Nikolov

Hi Shareef,

The flow of activation in oVirt is more complex than in plain KVM.
Mounting of the domains happens during the activation of the node (the HostedEngine is activating everything needed).

Focus on the HostedEngine VM.
Is it running properly?

If not, try:
1. Verify that the storage domain exists
2. Check if it has an 'ha_agents' directory
3. Check if the links are OK; if not, you can safely remove the links

4. Next check the services are running:
A) sanlock
B) supervdsmd
C) vdsmd
D) libvirtd

5. Increase the log level for the broker and agent services:

cd /etc/ovirt-hosted-engine-ha
vim *-log.conf

systemctl restart ovirt-ha-broker ovirt-ha-agent

6. Check what they are complaining about
Keep in mind that the agent will keep throwing errors until the broker stops doing it (the agent depends on the broker), so the broker must be OK before proceeding with the agent log.

About the manual VM start, you need 2 things:

1. Define the VM network
# cat vdsm-ovirtmgmt.xml
<network>
  <name>vdsm-ovirtmgmt</name>
  <uuid>8ded486e-e681-4754-af4b-5737c2b05405</uuid>
  <forward mode='bridge'/>
  <bridge name='ovirtmgmt'/>
</network>

[root@ovirt1 HostedEngine-RECOVERY]# virsh net-define vdsm-ovirtmgmt.xml

2. Get an xml definition, which can be found in the vdsm log. Every VM has its configuration printed out at start up in the vdsm log on the host it starts on.
Save to file and then:
A) virsh define myvm.xml
B) virsh start myvm

It seems there is/was a problem with your NFS shares.

Best Regards,
Strahil Nikolov

Hey Shareef,

Check if there are any files or folders not owned by vdsm:kvm. Something like this:

find . -not -user 36 -not -group 36 -print

Also check if vdsm can access the images in the '<vol-mount-point>/images' directories.

Best Regards,
Strahil Nikolov

And the IPv6 address '64:ff9b::c0a8:13d'?

I don't see it in the log output.

Best Regards,
Strahil Nikolov

Based on your output, you got a PTR record for IPv4 & IPv6 ... most probably that's the reason.

Set the IPv6 address on the interface and try again.

Best Regards,
Strahil Nikolov

Do you have firewalld up and running on the host?

Best Regards,
Strahil Nikolov

I am guessing, but your interface is not assigned to any zone, right?
Just add the interface to the default zone (usually 'public').

Best Regards,
Strahil Nikolov

Keep in mind that there are a lot of playbooks that can be used to deploy a HostedEngine environment via Ansible.

Keep in mind that if you plan to use oVirt in Prod, you need to know how to debug it (at least on a basic level).

Best Regards,
Strahil Nikolov

It's really interesting that you mention that topic.
The only ways I managed to break my engine were:
A) A bad SELinux rpm, which was solved via a reinstall of the package and a relabel
B) An interrupted patch, as I forgot to use screen

I think it is Prod ready, but it requires knowledge as it is not as dummy-proof as VMware. Yet oVirt is way more flexible, allowing you to run your own scripts before/during/after a certain event (vdsm hooks).

Sadly Ansible (this is what is used for the setup of gluster -> gdeploy, and for the engine) is quite dynamic and sometimes something might break.

If you feel that oVirt breaks too often - just set your engine up on a separate physical or virtual (non-hosted) machine, but do not complain that a free open-source product is not Production ready just because you don't know how to debug it.

You can trial the downstream solutions from Red Hat & Oracle and you will notice the difference. For me oVirt is like Fedora compared to RHEL/OEL/CentOS, but this is just a personal opinion.

Best Regards,
Strahil Nikolov
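
For anyone retracing the checks suggested earlier in this thread (the four services that need to be running, and the ownership of files on the storage domain), here is a rough sketch of how they could be scripted. The function names `check_services` and `find_foreign_owned` are invented for illustration. Note one deliberate difference from the `find` command quoted in the thread: this version joins the two tests with `-o`, so a file is listed if either its owner or its group is not 36 (vdsm:kvm), rather than only when both are wrong.

```shell
#!/bin/sh
# Hypothetical helpers, not part of oVirt.

# Print OK/DOWN for each named systemd service; return 1 if any is not active.
check_services() {
    rc=0
    for svc in "$@"; do
        if systemctl is-active --quiet "$svc"; then
            echo "OK   $svc"
        else
            echo "DOWN $svc"
            rc=1
        fi
    done
    return "$rc"
}

# List everything under directory $1 whose owner is not uid 36 (vdsm)
# or whose group is not gid 36 (kvm).
find_foreign_owned() {
    find "$1" \( ! -user 36 -o ! -group 36 \) -print
}

# Usage on a host (the mount point is a placeholder, not a real path):
#   check_services sanlock supervdsmd vdsmd libvirtd
#   find_foreign_owned '<vol-mount-point>'
```

An empty result from `find_foreign_owned` means nothing under that directory has the wrong owner or group. It says nothing about whether the share itself is healthy.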