Yes, but there are no zones set up, just ports 22, 6801 and 6900.

On Wed, Apr 15, 2020 at 12:37 PM Strahil Nikolov <hunter86_bg@yahoo.com> wrote:
On April 15, 2020 2:28:05 PM GMT+03:00, Shareef Jalloq <shareef@jalloq.co.uk> wrote:
>Oh this is painful.  It seems to progress if you have both he_force_ipv4
>set and run the deployment with the '--4' switch.
>
>But then I get a failure when the ansible script checks for firewalld-zones
>and doesn't get anything back.  Should the deployment flow not be setting
>any zones it needs?
>
>2020-04-15 10:57:25,439+0000 INFO
>otopi.ovirt_hosted_engine_setup.ansible_utils
>ansible_utils._process_output:109 TASK [ovirt.hosted_engine_setup : Get
>active list of active firewalld zones]
>
>2020-04-15 10:57:26,641+0000 DEBUG
>otopi.ovirt_hosted_engine_setup.ansible_utils
>ansible_utils._process_output:103 {u'stderr_lines': [], u'changed':
>True,
>u'end': u'2020-04-15 10:57:26.481202', u'_ansible_no_log': False,
>u'stdout': u'', u'cmd': u'set -euo pipefail && firewall-cmd
>--get-active-zones | grep -v "^\\s*interfaces"', u'start': u'2020-04-15
>10:57:26.050203', u'delta': u'0:00:00.430999', u'stderr': u'', u'rc':
>1,
>u'invocation': {u'module_args': {u'creates': None, u'executable': None,
>u'_uses_shell': True, u'strip_empty_ends': True, u'_raw_params': u'set
>-euo
>pipefail && firewall-cmd --get-active-zones | grep -v
>"^\\s*interfaces"',
>u'removes': None, u'argv': None, u'warn': True, u'chdir': None,
>u'stdin_add_newline': True, u'stdin': None}}, u'stdout_lines': [],
>u'msg':
>u'non-zero return code'}
>
>2020-04-15 10:57:26,741+0000 ERROR
>otopi.ovirt_hosted_engine_setup.ansible_utils
>ansible_utils._process_output:107 fatal: [localhost]: FAILED! =>
>{"changed": true, "cmd": "set -euo pipefail && firewall-cmd
>--get-active-zones | grep -v \"^\\s*interfaces\"", "delta":
>"0:00:00.430999", "end": "2020-04-15 10:57:26.481202", "msg": "non-zero
>return code", "rc": 1, "start": "2020-04-15 10:57:26.050203", "stderr":
>"",
>"stderr_lines": [], "stdout": "", "stdout_lines": []}
>
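
For what it's worth, the rc=1 above looks like it comes from the shell pipeline itself rather than from firewalld reporting an error: grep exits 1 when it selects no lines, and with 'set -o pipefail' that becomes the status of the whole pipeline. With no active zones, firewall-cmd prints nothing, so 'grep -v' has nothing to select. Below is a minimal Python model of that behaviour. The helper names and the sample "public" zone output are made up for illustration and are not part of the oVirt role.

```python
import re

def grep_v_status(text, pattern):
    """Model the exit status of `grep -v PATTERN`: 0 if any line is selected, else 1."""
    selected = [line for line in text.splitlines() if not re.search(pattern, line)]
    return 0 if selected else 1

def pipefail_status(statuses):
    """Model `set -o pipefail`: status of the rightmost command that failed, else 0."""
    failed = [s for s in statuses if s != 0]
    return failed[-1] if failed else 0

PATTERN = r"^\s*interfaces"

# Hypothetical firewall-cmd output on a host that does have an active zone.
with_zone = "public\n  interfaces: ovirtmgmt\n"
# Output as logged above: stdout is empty.
no_zone = ""

# firewall-cmd itself is assumed to exit 0 in both cases.
rc_with_zone = pipefail_status([0, grep_v_status(with_zone, PATTERN)])
rc_no_zone = pipefail_status([0, grep_v_status(no_zone, PATTERN)])
print(rc_with_zone, rc_no_zone)
```

If that reading is right, the task would pass as soon as at least one zone is active on the host, whichever zone that is.
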
>On Wed, Apr 15, 2020 at 10:23 AM Shareef Jalloq <shareef@jalloq.co.uk>
>wrote:
>
>> Ha, spoke too soon. It's now stuck in a loop and a google points me at
>> https://bugzilla.redhat.com/show_bug.cgi?id=1746585
>>
>> However, forcing ipv4 doesn't seem to have fixed the loop.
>>
>> On Wed, Apr 15, 2020 at 9:59 AM Shareef Jalloq <shareef@jalloq.co.uk>
>> wrote:
>>
>>> OK, that seems to have fixed it, thanks.  Is this a side effect of
>>> redeploying the HE over a first time install? Nothing has changed in our
>>> setup and I didn't need to do this when I initially set up our nodes.
>>>
>>>
>>>
>>> On Tue, Apr 14, 2020 at 6:55 PM Strahil Nikolov
><hunter86_bg@yahoo.com>
>>> wrote:
>>>
>>>> On April 14, 2020 6:17:17 PM GMT+03:00, Shareef Jalloq <
>>>> shareef@jalloq.co.uk> wrote:
>>>> >Hmmm, we're not using ipv6.  Is that the issue?
>>>> >
>>>> >On Tue, Apr 14, 2020 at 3:56 PM Strahil Nikolov
><hunter86_bg@yahoo.com>
>>>> >wrote:
>>>> >
>>>> >> On April 14, 2020 1:27:24 PM GMT+03:00, Shareef Jalloq <
>>>> >> shareef@jalloq.co.uk> wrote:
>>>> >> >Right, I've given up on recovering the HE so want to try and redeploy
>>>> >> >it.  There doesn't seem to be enough information to debug why the
>>>> >> >broker/agent won't start cleanly.
>>>> >> >
>>>> >> >In running 'hosted-engine --deploy', I'm seeing the following error in
>>>> >> >the setup validation phase:
>>>> >> >
>>>> >> >2020-04-14 09:46:08,922+0000 DEBUG
>otopi.plugins.otopi.dialog.human
>>>> >> >dialog.__logString:204 DIALOG:SEND                 Please
>provide
>>>> >the
>>>> >> >hostname of this host on the management network
>>>> >> >[ovirt-node-00.phoelex.com]:
>>>> >> >
>>>> >> >
>>>> >> >2020-04-14 09:46:12,831+0000 DEBUG
>>>> >> >otopi.plugins.gr_he_common.network.bridge
>>>> >> >hostname.getResolvedAddresses:432
>>>> >> >getResolvedAddresses: set(['64:ff9b::c0a8:13d',
>'192.168.1.61'])
>>>> >> >
>>>> >> >2020-04-14 09:46:12,832+0000 DEBUG
>>>> >> >otopi.plugins.gr_he_common.network.bridge
>>>> >> >hostname._validateFQDNresolvability:289
>ovirt-node-00.phoelex.com
>>>> >> >resolves
>>>> >> >to: set(['64:ff9b::c0a8:13d', '192.168.1.61'])
>>>> >> >
>>>> >> >2020-04-14 09:46:12,832+0000 DEBUG
>>>> >> >otopi.plugins.gr_he_common.network.bridge plugin.executeRaw:813
>>>> >> >execute:
>>>> >> >['/usr/bin/dig', '+noall', '+answer',
>'ovirt-node-00.phoelex.com',
>>>> >> >'ANY'],
>>>> >> >executable='None', cwd='None', env=None
>>>> >> >
>>>> >> >2020-04-14 09:46:12,871+0000 DEBUG
>>>> >> >otopi.plugins.gr_he_common.network.bridge plugin.executeRaw:863
>>>> >> >execute-result: ['/usr/bin/dig', '+noall', '+answer', '
>>>> >> >ovirt-node-00.phoelex.com', 'ANY'], rc=0
>>>> >> >
>>>> >> >2020-04-14 09:46:12,872+0000 DEBUG
>>>> >> >otopi.plugins.gr_he_common.network.bridge plugin.execute:921
>>>> >> >execute-output: ['/usr/bin/dig', '+noall', '+answer', '
>>>> >> >ovirt-node-00.phoelex.com', 'ANY'] stdout:
>>>> >> >
>>>> >> >ovirt-node-00.phoelex.com. 86400 IN     A       192.168.1.61
>>>> >> >
>>>> >> >
>>>> >> >2020-04-14 09:46:12,872+0000 DEBUG
>>>> >> >otopi.plugins.gr_he_common.network.bridge plugin.execute:926
>>>> >> >execute-output: ['/usr/bin/dig', '+noall', '+answer', '
>>>> >> >ovirt-node-00.phoelex.com', 'ANY'] stderr:
>>>> >> >
>>>> >> >
>>>> >> >
>>>> >> >2020-04-14 09:46:12,872+0000 DEBUG
>>>> >> >otopi.plugins.gr_he_common.network.bridge plugin.executeRaw:813
>>>> >> >execute:
>>>> >> >('/usr/sbin/ip', 'addr'), executable='None', cwd='None',
>env=None
>>>> >> >
>>>> >> >2020-04-14 09:46:12,876+0000 DEBUG
>>>> >> >otopi.plugins.gr_he_common.network.bridge plugin.executeRaw:863
>>>> >> >execute-result: ('/usr/sbin/ip', 'addr'), rc=0
>>>> >> >
>>>> >> >2020-04-14 09:46:12,876+0000 DEBUG
>>>> >> >otopi.plugins.gr_he_common.network.bridge plugin.execute:921
>>>> >> >execute-output: ('/usr/sbin/ip', 'addr') stdout:
>>>> >> >
>>>> >> >1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state
>UNKNOWN
>>>> >> >group
>>>> >> >default qlen 1000
>>>> >> >
>>>> >> >    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
>>>> >> >
>>>> >> >    inet 127.0.0.1/8 scope host lo
>>>> >> >
>>>> >> >       valid_lft forever preferred_lft forever
>>>> >> >
>>>> >> >    inet6 ::1/128 scope host
>>>> >> >
>>>> >> >       valid_lft forever preferred_lft forever
>>>> >> >
>>>> >> >2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq
>master
>>>> >> >ovirtmgmt state UP group default qlen 1000
>>>> >> >
>>>> >> >    link/ether ac:1f:6b:bc:32:6a brd ff:ff:ff:ff:ff:ff
>>>> >> >
>>>> >> >3: eno2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq
>state
>>>> >> >DOWN
>>>> >> >group default qlen 1000
>>>> >> >
>>>> >> >    link/ether ac:1f:6b:bc:32:6b brd ff:ff:ff:ff:ff:ff
>>>> >> >
>>>> >> >4: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state
>DOWN
>>>> >> >group
>>>> >> >default qlen 1000
>>>> >> >
>>>> >> >    link/ether 02:e6:e2:80:93:8d brd ff:ff:ff:ff:ff:ff
>>>> >> >
>>>> >> >5: br-int: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN
>>>> >group
>>>> >> >default qlen 1000
>>>> >> >
>>>> >> >    link/ether 8a:26:44:50:ee:4a brd ff:ff:ff:ff:ff:ff
>>>> >> >
>>>> >> >21: ovirtmgmt: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc
>>>> >noqueue
>>>> >> >state UP group default qlen 1000
>>>> >> >
>>>> >> >    link/ether ac:1f:6b:bc:32:6a brd ff:ff:ff:ff:ff:ff
>>>> >> >
>>>> >> >    inet 192.168.1.61/24 brd 192.168.1.255 scope global
>ovirtmgmt
>>>> >> >
>>>> >> >       valid_lft forever preferred_lft forever
>>>> >> >
>>>> >> >    inet6 fe80::ae1f:6bff:febc:326a/64 scope link
>>>> >> >
>>>> >> >       valid_lft forever preferred_lft forever
>>>> >> >
>>>> >> >22: ;vdsmdummy;: <BROADCAST,MULTICAST> mtu 1500 qdisc noop
>state
>>>> >DOWN
>>>> >> >group
>>>> >> >default qlen 1000
>>>> >> >
>>>> >> >    link/ether 3a:02:7b:7d:b3:2a brd ff:ff:ff:ff:ff:ff
>>>> >> >
>>>> >> >
>>>> >> >2020-04-14 09:46:12,876+0000 DEBUG
>>>> >> >otopi.plugins.gr_he_common.network.bridge plugin.execute:926
>>>> >> >execute-output: ('/usr/sbin/ip', 'addr') stderr:
>>>> >> >
>>>> >> >
>>>> >> >
>>>> >> >2020-04-14 09:46:12,877+0000 DEBUG
>>>> >> >otopi.plugins.gr_he_common.network.bridge
>>>> >> >hostname.getLocalAddresses:251
>>>> >> >addresses: [u'192.168.1.61', u'fe80::ae1f:6bff:febc:326a']
>>>> >> >
>>>> >> >2020-04-14 09:46:12,877+0000 DEBUG
>>>> >> >otopi.plugins.gr_he_common.network.bridge
>hostname.test_hostname:464
>>>> >> >test_hostname exception
>>>> >> >
>>>> >> >Traceback (most recent call last):
>>>> >> >
>>>> >> >File
>"/usr/lib/python2.7/site-packages/ovirt_setup_lib/hostname.py",
>>>> >> >line
>>>> >> >460, in test_hostname
>>>> >> >
>>>> >> >    not_local_text,
>>>> >> >
>>>> >> >File
>"/usr/lib/python2.7/site-packages/ovirt_setup_lib/hostname.py",
>>>> >> >line
>>>> >> >342, in _validateFQDNresolvability
>>>> >> >
>>>> >> >    addresses=resolvedAddressesAsString
>>>> >> >
>>>> >> >RuntimeError: ovirt-node-00.phoelex.com resolves to
>>>> >64:ff9b::c0a8:13d
>>>> >> >192.168.1.61 and not all of them can be mapped to non loopback
>>>> >devices
>>>> >> >on
>>>> >> >this host
>>>> >> >
>>>> >> >2020-04-14 09:46:12,884+0000 ERROR
>>>> >> >otopi.plugins.gr_he_common.network.bridge
>dialog.queryEnvKey:120
>>>> >Host
>>>> >> >name
>>>> >> >is not valid: ovirt-node-00.phoelex.com resolves to
>>>> >64:ff9b::c0a8:13d
>>>> >> >192.168.1.61 and not all of them can be mapped to non loopback
>>>> >devices
>>>> >> >on
>>>> >> >this host
>>>> >> >
>>>> >> >The node I'm running on has an IP address of .61 and resolves
>>>> >> >correctly.
>>>> >> >
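
One observation on the extra address, offered as a guess rather than a diagnosis: 64:ff9b::/96 is the well-known NAT64 prefix (RFC 6052), and the low 32 bits of 64:ff9b::c0a8:13d decode to 192.168.1.61. So the AAAA answer may be synthesised by a DNS64 resolver somewhere upstream rather than configured on the host, which would explain why it cannot be mapped to a local interface. A quick check with Python's ipaddress module, using the two address sets from the log above (embedded_ipv4 is a made-up helper name):

```python
import ipaddress

# Well-known NAT64 prefix from RFC 6052.
NAT64_PREFIX = ipaddress.IPv6Network("64:ff9b::/96")

def embedded_ipv4(addr):
    """Return the IPv4 address embedded in a NAT64 address, or None."""
    v6 = ipaddress.IPv6Address(addr)
    if v6 not in NAT64_PREFIX:
        return None
    # The IPv4 address occupies the low 32 bits under a /96 prefix.
    return ipaddress.IPv4Address(int(v6) & 0xFFFFFFFF)

# Copied from getResolvedAddresses and getLocalAddresses in the log.
resolved = {"64:ff9b::c0a8:13d", "192.168.1.61"}
local = {"192.168.1.61", "fe80::ae1f:6bff:febc:326a"}

# Addresses the name resolves to that are not present on this host.
unmapped = resolved - local
print(unmapped)
print(embedded_ipv4("64:ff9b::c0a8:13d"))
```
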
>>>> >> >On Fri, Apr 10, 2020 at 12:55 PM Shareef Jalloq
>>>> ><shareef@jalloq.co.uk>
>>>> >> >wrote:
>>>> >> >
>>>> >> >> Where should I be checking if there are any files/folders not owned by
>>>> >> >> vdsm:kvm?  I checked on the mount the HA sits on and it's fine.
>>>> >> >>
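
One way to look for entries not owned by vdsm:kvm would be to walk the storage domain mount and compare each entry's uid and gid. A rough sketch follows. find_foreign_owned is a made-up helper, not an oVirt tool, and it assumes the 'vdsm' user and 'kvm' group exist on the host where it runs.

```python
import grp
import os
import pwd

def find_foreign_owned(root, uid, gid):
    """Yield paths below root whose owner or group differs from uid/gid.

    The root directory itself is not checked, and symlinked directories
    are not followed (os.walk default).
    """
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # lstat reports on a symlink itself, not its target
            if st.st_uid != uid or st.st_gid != gid:
                yield path

def vdsm_kvm_ids():
    """Look up the numeric ids for vdsm:kvm (raises KeyError if either is missing)."""
    return pwd.getpwnam("vdsm").pw_uid, grp.getgrnam("kvm").gr_gid

# Example use on a host (path taken from the listings in this thread):
#   uid, gid = vdsm_kvm_ids()
#   for p in find_foreign_owned("/rhev/data-center/mnt", uid, gid):
#       print(p)
```

This only reports ownership. It says nothing about whether vdsm can actually read the images, which would also depend on mode bits and the NFS export options.
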
>>>> >> >> How would I go about checking vdsm can access those images?  If I run
>>>> >> >> virsh, it lists them and they were running yesterday even though the
>>>> >> >> HA was down.  I've since restarted both hosts but the broker is still
>>>> >> >> spitting out the same error (copied below).  How do I find the reason
>>>> >> >> the broker can't connect to the storage?  The conf file is already at
>>>> >> >> DEBUG verbosity:
>>>> >> >>
>>>> >> >> [handler_logfile]
>>>> >> >>
>>>> >> >> class=logging.handlers.TimedRotatingFileHandler
>>>> >> >>
>>>> >> >> args=('/var/log/ovirt-hosted-engine-ha/broker.log', 'd', 1,
>7)
>>>> >> >>
>>>> >> >> level=DEBUG
>>>> >> >>
>>>> >> >> formatter=long
>>>> >> >>
>>>> >> >> And what are all these .prob-<num> files that are being created?  There
>>>> >> >> are over 250K of them now on the mount I'm using for the Data domain.
>>>> >> >> They're all of 0 size and of the form,
>>>> >> >> /rhev/data-center/mnt/nas-01.phoelex.com:_volume2_vmstore/.prob-ffa867da-93db-4211-82df-b1b04a625ab9
>>>> >> >>
>>>> >> >> @eevans:  The volume I have the Data Domain on has TB's free.  The HA is
>>>> >> >> dead so I can't ssh in.  No idea what started these errors and the other
>>>> >> >> VMs were still running happily although they're on a different Data
>>>> >> >> Domain.
>>>> >> >>
>>>> >> >> Shareef.
>>>> >> >>
>>>> >> >> MainThread::INFO::2020-04-10 07:45:00,408::storage_backends::373::ovirt_hosted_engine_ha.lib.storage_backends::(connect) Connecting the storage
>>>> >> >>
>>>> >> >> MainThread::INFO::2020-04-10 07:45:00,408::storage_server::349::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Connecting storage server
>>>> >> >>
>>>> >> >> MainThread::INFO::2020-04-10 07:45:01,577::storage_server::356::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Connecting storage server
>>>> >> >>
>>>> >> >> MainThread::INFO::2020-04-10 07:45:02,692::storage_server::413::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Refreshing the storage domain
>>>> >> >>
>>>> >> >> MainThread::WARNING::2020-04-10 07:45:05,175::storage_broker::97::ovirt_hosted_engine_ha.broker.storage_broker.StorageBroker::(__init__) Can't connect vdsm storage: Command StorageDomain.getInfo with args {'storagedomainID': 'a6cea67d-dbfb-45cf-a775-b4d0d47b26f2'} failed:
>>>> >> >>
>>>> >> >> (code=350, message=Error in storage domain action: (u'sdUUID=a6cea67d-dbfb-45cf-a775-b4d0d47b26f2',))
>>>> >> >>
>>>> >> >> On Thu, Apr 9, 2020 at 5:58 PM Strahil Nikolov
>>>> >> ><hunter86_bg@yahoo.com>
>>>> >> >> wrote:
>>>> >> >>
>>>> >> >>> On April 9, 2020 11:12:30 AM GMT+03:00, Shareef Jalloq <
>>>> >> >>> shareef@jalloq.co.uk> wrote:
>>>> >> >>> >OK, let's go through this.  I'm looking at the node that at least still
>>>> >> >>> >has some VMs running.  virsh also tells me that the HostedEngine VM is
>>>> >> >>> >running but it's unresponsive and I can't shut it down.
>>>> >> >>> >
>>>> >> >>> >1. All storage domains exist and are mounted.
>>>> >> >>> >2. The ha_agent exists:
>>>> >> >>> >
>>>> >> >>> >[root@ovirt-node-01 ovirt-hosted-engine-ha]# ls
>>>> >> >/rhev/data-center/mnt/
>>>> >> >>> >nas-01.phoelex.com
>>>> >> >>> \:_volume2_vmstore/a6cea67d-dbfb-45cf-a775-b4d0d47b26f2/
>>>> >> >>> >
>>>> >> >>> >dom_md  ha_agent  images  master
>>>> >> >>> >
>>>> >> >>> >3.  There are two links
>>>> >> >>> >
>>>> >> >>> >[root@ovirt-node-01 ovirt-hosted-engine-ha]# ll
>>>> >> >/rhev/data-center/mnt/
>>>> >> >>> >nas-01.phoelex.com
>>>> >> >>>
>>>> >>\:_volume2_vmstore/a6cea67d-dbfb-45cf-a775-b4d0d47b26f2/ha_agent/
>>>> >> >>> >
>>>> >> >>> >total 8
>>>> >> >>> >
>>>> >> >>> >lrwxrwxrwx. 1 vdsm kvm 132 Apr  2 14:50
>hosted-engine.lockspace
>>>> >->
>>>> >> >>>
>>>> >> >>>
>>>> >>
>>>> >>
>>>>
>>>>
>>>>/var/run/vdsm/storage/a6cea67d-dbfb-45cf-a775-b4d0d47b26f2/ffb90b82-42fe-4253-85d5-aaec8c280aaf/90e68791-0c6f-406a-89ac-e0d86c631604
>>>> >> >>> >
>>>> >> >>> >lrwxrwxrwx. 1 vdsm kvm 132 Apr  2 14:50
>hosted-engine.metadata
>>>> >->
>>>> >> >>>
>>>> >> >>>
>>>> >>
>>>> >>
>>>>
>>>>
>>>>/var/run/vdsm/storage/a6cea67d-dbfb-45cf-a775-b4d0d47b26f2/2161aed0-7250-4c1d-b667-ac94f60af17e/6b818e33-f80a-48cc-a59c-bba641e027d4
>>>> >> >>> >
>>>> >> >>> >4. The services exist but all seem to have some sort of
>warning:
>>>> >> >>> >
>>>> >> >>> >a) Apr 08 18:10:55 ovirt-node-01.phoelex.com sanlock[1728]:
>>>> >> >*2020-04-08
>>>> >> >>> >18:10:55 1744152 [36796]: s16 delta_renew long write time
>10
>>>> >sec*
>>>> >> >>> >
>>>> >> >>> >b) Mar 23 18:02:59 ovirt-node-01.phoelex.com
>supervdsmd[29409]:
>>>> >> >*failed
>>>> >> >>> >to
>>>> >> >>> >load module nvdimm: libbd_nvdimm.so.2: cannot open shared
>object
>>>> >> >file:
>>>> >> >>> >No
>>>> >> >>> >such file or directory*
>>>> >> >>> >
>>>> >> >>> >c) Apr 09 08:05:13 ovirt-node-01.phoelex.com vdsm[4801]:
>*ERROR
>>>> >> >failed
>>>> >> >>> >to
>>>> >> >>> >retrieve Hosted Engine HA score '[Errno 2] No such file or
>>>> >> >directory'Is
>>>> >> >>> >the
>>>> >> >>> >Hosted Engine setup finished?*
>>>> >> >>> >
>>>> >> >>> >d)Apr 08 22:48:27 ovirt-node-01.phoelex.com
>libvirtd[29307]:
>>>> >> >2020-04-08
>>>> >> >>> >22:48:27.134+0000: 29309: warning : qemuGetProcessInfo:1404
>:
>>>> >> >cannot
>>>> >> >>> >parse
>>>> >> >>> >process status data
>>>> >> >>> >
>>>> >> >>> >Apr 08 22:48:27 ovirt-node-01.phoelex.com libvirtd[29307]:
>>>> >> >2020-04-08
>>>> >> >>> >22:48:27.134+0000: 29309: error :
>virNetDevTapInterfaceStats:764
>>>> >:
>>>> >> >>> >internal
>>>> >> >>> >error: /proc/net/dev: Interface not found
>>>> >> >>> >
>>>> >> >>> >Apr 08 23:09:39 ovirt-node-01.phoelex.com libvirtd[29307]:
>>>> >> >2020-04-08
>>>> >> >>> >23:09:39.844+0000: 29307: error : virNetSocketReadWire:1806
>:
>>>> >End
>>>> >> >of
>>>> >> >>> >file
>>>> >> >>> >while reading data: Input/output error
>>>> >> >>> >
>>>> >> >>> >Apr 09 01:05:26 ovirt-node-01.phoelex.com libvirtd[29307]:
>>>> >> >2020-04-09
>>>> >> >>> >01:05:26.660+0000: 29307: error : virNetSocketReadWire:1806
>:
>>>> >End
>>>> >> >of
>>>> >> >>> >file
>>>> >> >>> >while reading data: Input/output error
>>>> >> >>> >
>>>> >> >>> >5 & 6.  The broker log is continually printing this error:
>>>> >> >>> >
>>>> >> >>> >MainThread::INFO::2020-04-09
>>>> >> >>>
>>>> >> >>>
>>>> >>
>>>> >>
>>>>
>>>>
>>>>08:07:31,438::broker::47::ovirt_hosted_engine_ha.broker.broker.Broker::(run)
>>>> >> >>> >ovirt-hosted-engine-ha broker 2.3.6 started
>>>> >> >>> >
>>>> >> >>> >MainThread::DEBUG::2020-04-09
>>>> >> >>>
>>>> >> >>>
>>>> >>
>>>> >>
>>>>
>>>>
>>>>08:07:31,438::broker::55::ovirt_hosted_engine_ha.broker.broker.Broker::(run)
>>>> >> >>> >Running broker
>>>> >> >>> >
>>>> >> >>> >MainThread::DEBUG::2020-04-09
>>>> >> >>>
>>>> >> >>>
>>>> >>
>>>> >>
>>>>
>>>>
>>>>08:07:31,438::broker::120::ovirt_hosted_engine_ha.broker.broker.Broker::(_get_monitor)
>>>> >> >>> >Starting monitor
>>>> >> >>> >
>>>> >> >>> >MainThread::INFO::2020-04-09
>>>> >> >>>
>>>> >> >>>
>>>> >>
>>>> >>
>>>>
>>>>
>>>>08:07:31,438::monitor::40::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors)
>>>> >> >>> >Searching for submonitors in
>>>> >> >>>
>>/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker
>>>> >> >>> >
>>>> >> >>> >/submonitors
>>>> >> >>> >
>>>> >> >>> >MainThread::INFO::2020-04-09
>>>> >> >>>
>>>> >> >>>
>>>> >>
>>>> >>
>>>>
>>>>
>>>>08:07:31,439::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors)
>>>> >> >>> >Loaded submonitor network
>>>> >> >>> >
>>>> >> >>> >MainThread::INFO::2020-04-09
>>>> >> >>>
>>>> >> >>>
>>>> >>
>>>> >>
>>>>
>>>>
>>>>08:07:31,440::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors)
>>>> >> >>> >Loaded submonitor cpu-load-no-engine
>>>> >> >>> >
>>>> >> >>> >MainThread::INFO::2020-04-09
>>>> >> >>>
>>>> >> >>>
>>>> >>
>>>> >>
>>>>
>>>>
>>>>08:07:31,441::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors)
>>>> >> >>> >Loaded submonitor mgmt-bridge
>>>> >> >>> >
>>>> >> >>> >MainThread::INFO::2020-04-09
>>>> >> >>>
>>>> >> >>>
>>>> >>
>>>> >>
>>>>
>>>>
>>>>08:07:31,441::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors)
>>>> >> >>> >Loaded submonitor network
>>>> >> >>> >
>>>> >> >>> >MainThread::INFO::2020-04-09
>>>> >> >>>
>>>> >> >>>
>>>> >>
>>>> >>
>>>>
>>>>
>>>>08:07:31,441::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors)
>>>> >> >>> >Loaded submonitor cpu-load
>>>> >> >>> >
>>>> >> >>> >MainThread::INFO::2020-04-09
>>>> >> >>>
>>>> >> >>>
>>>> >>
>>>> >>
>>>>
>>>>
>>>>08:07:31,441::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors)
>>>> >> >>> >Loaded submonitor engine-health
>>>> >> >>> >
>>>> >> >>> >MainThread::INFO::2020-04-09
>>>> >> >>>
>>>> >> >>>
>>>> >>
>>>> >>
>>>>
>>>>
>>>>08:07:31,442::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors)
>>>> >> >>> >Loaded submonitor mgmt-bridge
>>>> >> >>> >
>>>> >> >>> >MainThread::INFO::2020-04-09
>>>> >> >>>
>>>> >> >>>
>>>> >>
>>>> >>
>>>>
>>>>
>>>>08:07:31,442::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors)
>>>> >> >>> >Loaded submonitor cpu-load-no-engine
>>>> >> >>> >
>>>> >> >>> >MainThread::INFO::2020-04-09
>>>> >> >>>
>>>> >> >>>
>>>> >>
>>>> >>
>>>>
>>>>
>>>>08:07:31,443::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors)
>>>> >> >>> >Loaded submonitor cpu-load
>>>> >> >>> >
>>>> >> >>> >MainThread::INFO::2020-04-09
>>>> >> >>>
>>>> >> >>>
>>>> >>
>>>> >>
>>>>
>>>>
>>>>08:07:31,443::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors)
>>>> >> >>> >Loaded submonitor mem-free
>>>> >> >>> >
>>>> >> >>> >MainThread::INFO::2020-04-09
>>>> >> >>>
>>>> >> >>>
>>>> >>
>>>> >>
>>>>
>>>>
>>>>08:07:31,443::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors)
>>>> >> >>> >Loaded submonitor storage-domain
>>>> >> >>> >
>>>> >> >>> >MainThread::INFO::2020-04-09
>>>> >> >>>
>>>> >> >>>
>>>> >>
>>>> >>
>>>>
>>>>
>>>>08:07:31,443::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors)
>>>> >> >>> >Loaded submonitor storage-domain
>>>> >> >>> >
>>>> >> >>> >MainThread::INFO::2020-04-09
>>>> >> >>>
>>>> >> >>>
>>>> >>
>>>> >>
>>>>
>>>>
>>>>08:07:31,443::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors)
>>>> >> >>> >Loaded submonitor mem-free
>>>> >> >>> >
>>>> >> >>> >MainThread::INFO::2020-04-09
>>>> >> >>>
>>>> >> >>>
>>>> >>
>>>> >>
>>>>
>>>>
>>>>08:07:31,444::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors)
>>>> >> >>> >Loaded submonitor engine-health
>>>> >> >>> >
>>>> >> >>> >MainThread::INFO::2020-04-09
>>>> >> >>>
>>>> >> >>>
>>>> >>
>>>> >>
>>>>
>>>>
>>>>08:07:31,444::monitor::50::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors)
>>>> >> >>> >Finished loading submonitors
>>>> >> >>> >
>>>> >> >>> >MainThread::DEBUG::2020-04-09
>>>> >> >>>
>>>> >> >>>
>>>> >>
>>>> >>
>>>>
>>>>
>>>>08:07:31,444::broker::128::ovirt_hosted_engine_ha.broker.broker.Broker::(_get_storage_broker)
>>>> >> >>> >Starting storage broker
>>>> >> >>> >
>>>> >> >>> >MainThread::DEBUG::2020-04-09
>>>> >> >>>
>>>> >> >>>
>>>> >>
>>>> >>
>>>>
>>>>
>>>>08:07:31,444::storage_backends::369::ovirt_hosted_engine_ha.lib.storage_backends::(connect)
>>>> >> >>> >Connecting to VDSM
>>>> >> >>> >
>>>> >> >>> >MainThread::DEBUG::2020-04-09
>>>> >> >>>
>>>> >> >>>
>>>> >>
>>>> >>
>>>>
>>>>
>>>>08:07:31,444::util::384::ovirt_hosted_engine_ha.lib.storage_backends::(__log_debug)
>>>> >> >>> >Creating a new json-rpc connection to VDSM
>>>> >> >>> >
>>>> >> >>> >Client localhost:54321::DEBUG::2020-04-09
>>>> >> >>> >08:07:31,453::concurrent::258::root::(run) START thread
>>>> >> ><Thread(Client
>>>> >> >>> >localhost:54321, started daemon 139992488138496)>
>(func=<bound
>>>> >> >method
>>>> >> >>> >Reactor.process_requests of
><yajsonrpc.betterAsyncore.Reactor
>>>> >> >object at
>>>> >> >>> >0x7f528acabc90>>, args=(), kwargs={})
>>>> >> >>> >
>>>> >> >>> >Client localhost:54321::DEBUG::2020-04-09
>>>> >> >>>
>>>> >> >>>
>>>> >>
>>>> >>
>>>>
>>>>
>>>>08:07:31,459::stompclient::138::yajsonrpc.protocols.stomp.AsyncClient::(_process_connected)
>>>> >> >>> >Stomp connection established
>>>> >> >>> >
>>>> >> >>> >MainThread::DEBUG::2020-04-09
>>>> >> >>>
>>08:07:31,467::stompclient::294::jsonrpc.AsyncoreClient::(send)
>>>> >> >Sending
>>>> >> >>> >response
>>>> >> >>> >
>>>> >> >>> >MainThread::INFO::2020-04-09
>>>> >> >>>
>>>> >> >>>
>>>> >>
>>>> >>
>>>>
>>>>
>>>>08:07:31,530::storage_backends::373::ovirt_hosted_engine_ha.lib.storage_backends::(connect)
>>>> >> >>> >Connecting the storage
>>>> >> >>> >
>>>> >> >>> >MainThread::INFO::2020-04-09
>>>> >> >>>
>>>> >> >>>
>>>> >>
>>>> >>
>>>>
>>>>
>>>>08:07:31,531::storage_server::349::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server)
>>>> >> >>> >Connecting storage server
>>>> >> >>> >
>>>> >> >>> >MainThread::DEBUG::2020-04-09
>>>> >> >>>
>>08:07:31,531::stompclient::294::jsonrpc.AsyncoreClient::(send)
>>>> >> >Sending
>>>> >> >>> >response
>>>> >> >>> >
>>>> >> >>> >MainThread::DEBUG::2020-04-09
>>>> >> >>>
>>08:07:31,534::stompclient::294::jsonrpc.AsyncoreClient::(send)
>>>> >> >Sending
>>>> >> >>> >response
>>>> >> >>> >
>>>> >> >>> >MainThread::DEBUG::2020-04-09
>>>> >> >>>
>>>> >> >>>
>>>> >>
>>>> >>
>>>>
>>>>
>>>>08:07:32,199::storage_server::158::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(_validate_pre_connected_path)
>>>> >> >>> >Storage domain a6cea67d-dbfb-45cf-a775-b4d0d47b26f2 is not
>>>> >> >available
>>>> >> >>> >
>>>> >> >>> >MainThread::INFO::2020-04-09
>>>> >> >>>
>>>> >> >>>
>>>> >>
>>>> >>
>>>>
>>>>
>>>>08:07:32,199::storage_server::356::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server)
>>>> >> >>> >Connecting storage server
>>>> >> >>> >
>>>> >> >>> >MainThread::DEBUG::2020-04-09
>>>> >> >>>
>>08:07:32,199::stompclient::294::jsonrpc.AsyncoreClient::(send)
>>>> >> >Sending
>>>> >> >>> >response
>>>> >> >>> >
>>>> >> >>> >MainThread::DEBUG::2020-04-09
>>>> >> >>>
>>>> >> >>>
>>>> >>
>>>> >>
>>>>
>>>>
>>>>08:07:32,814::storage_server::363::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server)
>>>> >> >>> >[{u'status': 0, u'id':
>u'e29cf818-5ee5-46e1-85c1-8aeefa33e95d'}]
>>>> >> >>> >
>>>> >> >>> >MainThread::INFO::2020-04-09
>>>> >> >>>
>>>> >> >>>
>>>> >>
>>>> >>
>>>>
>>>>
>>>>08:07:32,814::storage_server::413::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server)
>>>> >> >>> >Refreshing the storage domain
>>>> >> >>> >
>>>> >> >>> >MainThread::DEBUG::2020-04-09
>>>> >> >>>
>>08:07:32,815::stompclient::294::jsonrpc.AsyncoreClient::(send)
>>>> >> >Sending
>>>> >> >>> >response
>>>> >> >>> >
>>>> >> >>> >MainThread::DEBUG::2020-04-09
>>>> >> >>>
>>>> >> >>>
>>>> >>
>>>> >>
>>>>
>>>>
>>>>08:07:33,129::storage_server::420::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server)
>>>> >> >>> >Error refreshing storage domain: Command
>StorageDomain.getStats
>>>> >> >with
>>>> >> >>> >args
>>>> >> >>> >{'storagedomainID': 'a6cea67d-dbfb-45cf-a775-b4d0d47b26f2'}
>>>> >failed:
>>>> >> >>> >
>>>> >> >>> >(code=350, message=Error in storage domain action:
>>>> >> >>> >(u'sdUUID=a6cea67d-dbfb-45cf-a775-b4d0d47b26f2',))
>>>> >> >>> >
>>>> >> >>> >MainThread::DEBUG::2020-04-09
>>>> >> >>>
>>08:07:33,130::stompclient::294::jsonrpc.AsyncoreClient::(send)
>>>> >> >Sending
>>>> >> >>> >response
>>>> >> >>> >
>>>> >> >>> >MainThread::DEBUG::2020-04-09
>>>> >> >>>
>>>> >> >>>
>>>> >>
>>>> >>
>>>>
>>>>
>>>>08:07:33,795::storage_backends::208::ovirt_hosted_engine_ha.lib.storage_backends::(_get_sector_size)
>>>> >> >>> >Command StorageDomain.getInfo with args {'storagedomainID':
>>>> >> >>> >'a6cea67d-dbfb-45cf-a775-b4d0d47b26f2'} failed:
>>>> >> >>> >
>>>> >> >>> >(code=350, message=Error in storage domain action:
>>>> >> >>> >(u'sdUUID=a6cea67d-dbfb-45cf-a775-b4d0d47b26f2',))
>>>> >> >>> >
>>>> >> >>> >MainThread::WARNING::2020-04-09
>>>> >> >>>
>>>> >> >>>
>>>> >>
>>>> >>
>>>>
>>>>
>>>>08:07:33,795::storage_broker::97::ovirt_hosted_engine_ha.broker.storage_broker.StorageBroker::(__init__)
>>>> >> >>> >Can't connect vdsm storage: Command StorageDomain.getInfo
>with
>>>> >args
>>>> >> >>> >{'storagedomainID': 'a6cea67d-dbfb-45cf-a775-b4d0d47b26f2'}
>>>> >failed:
>>>> >> >>> >
>>>> >> >>> >(code=350, message=Error in storage domain action:
>>>> >> >>> >(u'sdUUID=a6cea67d-dbfb-45cf-a775-b4d0d47b26f2',))
>>>> >> >>> >
>>>> >> >>> >
>>>> >> >>> >The UUID it is moaning about is indeed the one that the HA sits on and
>>>> >> >>> >is the one I listed the contents of in step 2 above.
>>>> >> >>> >
>>>> >> >>> >
>>>> >> >>> >So why can't it see this domain?
>>>> >> >>> >
>>>> >> >>> >
>>>> >> >>> >Thanks, Shareef.
>>>> >> >>> >
>>>> >> >>> >On Thu, Apr 9, 2020 at 6:12 AM Strahil Nikolov
>>>> >> ><hunter86_bg@yahoo.com>
>>>> >> >>> >wrote:
>>>> >> >>> >
>>>> >> >>> >> On April 9, 2020 1:51:05 AM GMT+03:00, Shareef Jalloq <
>>>> >> >>> >> shareef@jalloq.co.uk> wrote:
>>>> >> >>> >> >Don't know if this is useful or not, but I just tried to shut down and
>>>> >> >>> >> >start another VM on one of the hosts and got the following error:
>>>> >> >>> >> >
>>>> >> >>> >> >virsh # start scratch
>>>> >> >>> >> >
>>>> >> >>> >> >error: Failed to start domain scratch
>>>> >> >>> >> >
>>>> >> >>> >> >error: Network not found: no network with matching name
>>>> >> >>> >> >'vdsm-ovirtmgmt'
>>>> >> >>> >> >
>>>> >> >>> >> >Is this not referring to the interface name, as the network is called
>>>> >> >>> >> >'ovirtmgnt'?
>>>> >> >>> >> >
>>>> >> >>> >> >On Wed, Apr 8, 2020 at 11:35 PM Shareef Jalloq
>>>> >> >>> ><shareef@jalloq.co.uk>
>>>> >> >>> >> >wrote:
>>>> >> >>> >> >
>>>> >> >>> >> >> Hmmm, virsh tells me the HE is running but it hasn't
>come
>>>> >up
>>>> >> >and
>>>> >> >>> >the
>>>> >> >>> >> >> agent.log is full of the same errors.
>>>> >> >>> >> >>
>>>> >> >>> >> >> On Wed, Apr 8, 2020 at 11:31 PM Shareef Jalloq
>>>> >> >>> ><shareef@jalloq.co.uk>
>>>> >> >>> >> >> wrote:
>>>> >> >>> >> >>
>>>> >> >>> >> >>> Ah hah!  Ok, so I've managed to start it using virsh on the second host
>>>> >> >>> >> >>> but my first host is still dead.
>>>> >> >>> >> >>>
>>>> >> >>> >> >>> First of all, what are these 56,317 .prob- files that get dumped to the
>>>> >> >>> >> >>> NFS mounts?
>>>> >> >>> >> >>>
>>>> >> >>> >> >>> Secondly, why doesn't the node mount the NFS directories at boot?  Is
>>>> >> >>> >> >>> that the issue with this particular node?
>>>> >> >>> >> >>>
>>>> >> >>> >> >>> On Wed, Apr 8, 2020 at 11:12 PM
>>>> ><eevans@digitaldatatechs.com>
>>>> >> >>> >wrote:
>>>> >> >>> >> >>>
>>>> >> >>> >> >>>> Did you try virsh list --inactive
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>> Eric Evans
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>> Digital Data Services LLC.
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>> 304.660.9080
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>> *From:* Shareef Jalloq <shareef@jalloq.co.uk>
>>>> >> >>> >> >>>> *Sent:* Wednesday, April 8, 2020 5:58 PM
>>>> >> >>> >> >>>> *To:* Strahil Nikolov <hunter86_bg@yahoo.com>
>>>> >> >>> >> >>>> *Cc:* Ovirt Users <users@ovirt.org>
>>>> >> >>> >> >>>> *Subject:* [ovirt-users] Re: ovirt-engine
>unresponsive -
>>>> >how
>>>> >> >to
>>>> >> >>> >> >rescue?
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>> I've now shut down the VMs on one host and rebooted
>it
>>>> >but
>>>> >> >the
>>>> >> >>> >> >agent
>>>> >> >>> >> >>>> service doesn't start.  If I run 'hosted-engine
>>>> >--vm-status'
>>>> >> >I
>>>> >> >>> >get:
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>> The hosted engine configuration has not been
>retrieved
>>>> >from
>>>> >> >>> >shared
>>>> >> >>> >> >>>> storage. Please ensure that ovirt-ha-agent is
>running and
>>>> >> >the
>>>> >> >>> >> >storage
>>>> >> >>> >> >>>> server is reachable.
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>> and indeed if I list the mounts under
>>>> >/rhev/data-center/mnt,
>>>> >> >>> >only
>>>> >> >>> >> >one of
>>>> >> >>> >> >>>> the directories is mounted.  I have 3 NFS mounts,
>one ISO
>>>> >> >Domain
>>>> >> >>> >> >and two
>>>> >> >>> >> >>>> Data Domains.  Only one Data Domain has mounted and
>this
>>>> >has
>>>> >> >>> >lots
>>>> >> >>> >> >of .prob
>>>> >> >>> >> >>>> files in.  So why haven't the other NFS exports been
>>>> >> >mounted?
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>> Manually mounting them doesn't seem to have helped
>much
>>>> >> >either.
>>>> >> >>> >I
>>>> >> >>> >> >can
>>>> >> >>> >> >>>> start the broker service but the agent service says
>no.
>>>> >> >Same
>>>> >> >>> >error
>>>> >> >>> >> >as the
>>>> >> >>> >> >>>> one in my last email.
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>> Shareef.
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>> On Wed, Apr 8, 2020 at 9:57 PM Shareef Jalloq
>>>> >> >>> >> ><shareef@jalloq.co.uk>
>>>> >> >>> >> >>>> wrote:
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>> Right, still down.  I've run virsh and it doesn't know anything about
>>>> >> >>> >> >>>> the engine vm.
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>> I've restarted the broker and agent services and I still get nothing in
>>>> >> >>> >> >>>> virsh->list.
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>> In the logs under /var/log/ovirt-hosted-engine-ha I see lots of errors:
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>> broker.log:
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:56:20,138::broker::47::ovirt_hosted_engine_ha.broker.broker.Broker::(run) ovirt-hosted-engine-ha broker 2.3.6 started
>>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:56:20,138::monitor::40::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Searching for submonitors in /usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/submonitors
>>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:56:20,138::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor network
>>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:56:20,140::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor cpu-load-no-engine
>>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:56:20,140::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mgmt-bridge
>>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:56:20,141::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor network
>>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:56:20,141::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor cpu-load
>>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:56:20,141::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor engine-health
>>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:56:20,141::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mgmt-bridge
>>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:56:20,142::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor cpu-load-no-engine
>>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:56:20,142::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor cpu-load
>>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:56:20,142::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mem-free
>>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:56:20,143::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor storage-domain
>>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:56:20,143::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor storage-domain
>>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:56:20,143::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mem-free
>>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:56:20,143::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor engine-health
>>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:56:20,143::monitor::50::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Finished loading submonitors
>>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:56:20,197::storage_backends::373::ovirt_hosted_engine_ha.lib.storage_backends::(connect) Connecting the storage
>>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:56:20,197::storage_server::349::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Connecting storage server
>>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:56:20,414::storage_server::356::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Connecting storage server
>>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:56:20,628::storage_server::413::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Refreshing the storage domain
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>> MainThread::WARNING::2020-04-08 20:56:21,057::storage_broker::97::ovirt_hosted_engine_ha.broker.storage_broker.StorageBroker::(__init__) Can't connect vdsm storage: Command StorageDomain.getInfo with args {'storagedomainID': 'a6cea67d-dbfb-45cf-a775-b4d0d47b26f2'} failed:
>>>> >> >>> >> >>>> (code=350, message=Error in storage domain action: (u'sdUUID=a6cea67d-dbfb-45cf-a775-b4d0d47b26f2',))
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:56:21,901::broker::47::ovirt_hosted_engine_ha.broker.broker.Broker::(run) ovirt-hosted-engine-ha broker 2.3.6 started
>>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:56:21,901::monitor::40::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Searching for submonitors in /usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/submonitors
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>> agent.log:
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>> MainThread::ERROR::2020-04-08 20:57:00,799::agent::145::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Trying to restart agent
>>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:57:00,799::agent::89::ovirt_hosted_engine_ha.agent.agent.Agent::(run) Agent shutting down
>>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:57:11,144::agent::67::ovirt_hosted_engine_ha.agent.agent.Agent::(run) ovirt-hosted-engine-ha agent 2.3.6 started
>>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:57:11,182::hosted_engine::234::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_hostname) Found certificate common name: ovirt-node-01.phoelex.com
>>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:57:11,294::hosted_engine::543::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_broker) Initializing ha-broker connection
>>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:57:11,296::brokerlink::80::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) Starting monitor network, options {'tcp_t_address': '', 'network_test': 'dns', 'tcp_t_port': '', 'addr': '192.168.1.99'}
>>>> >> >>> >> >>>> MainThread::ERROR::2020-04-08 20:57:11,296::hosted_engine::559::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_broker) Failed to start necessary monitors
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>> MainThread::ERROR::2020-04-08 20:57:11,297::agent::144::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Traceback (most recent call last):
>>>> >> >>> >> >>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 131, in _run_agent
>>>> >> >>> >> >>>>     return action(he)
>>>> >> >>> >> >>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 55, in action_proper
>>>> >> >>> >> >>>>     return he.start_monitoring()
>>>> >> >>> >> >>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 432, in start_monitoring
>>>> >> >>> >> >>>>     self._initialize_broker()
>>>> >> >>> >> >>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 556, in _initialize_broker
>>>> >> >>> >> >>>>     m.get('options', {}))
>>>> >> >>> >> >>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 89, in start_monitor
>>>> >> >>> >> >>>>     ).format(t=type, o=options, e=e)
>>>> >> >>> >> >>>> RequestError: brokerlink - failed to start monitor via ovirt-ha-broker: [Errno 2] No such file or directory, [monitor: 'network', options: {'tcp_t_address': '', 'network_test': 'dns', 'tcp_t_port': '', 'addr': '192.168.1.99'}]
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>> MainThread::ERROR::2020-04-08 20:57:11,297::agent::145::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Trying to restart agent
>>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:57:11,297::agent::89::ovirt_hosted_engine_ha.agent.agent.Agent::(run) Agent shutting down
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>> On Wed, Apr 8, 2020 at 6:10 PM Strahil Nikolov <hunter86_bg@yahoo.com> wrote:
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>> On April 8, 2020 7:47:20 PM GMT+03:00, "Maton, Brett" <matonb@ltresources.co.uk> wrote:
>>>> >> >>> >> >>>> >On the host you tried to restart the engine on:
>>>> >> >>> >> >>>> >
>>>> >> >>> >> >>>> >Add an alias to virsh (authenticates with virsh_auth.conf)
>>>> >> >>> >> >>>> >
>>>> >> >>> >> >>>> >alias virsh='virsh -c qemu:///system?authfile=/etc/ovirt-hosted-engine/virsh_auth.conf'
>>>> >> >>> >> >>>> >
>>>> >> >>> >> >>>> >Then run virsh:
>>>> >> >>> >> >>>> >
>>>> >> >>> >> >>>> >virsh
>>>> >> >>> >> >>>> >
>>>> >> >>> >> >>>> >virsh # list
>>>> >> >>> >> >>>> > Id    Name                           State
>>>> >> >>> >> >>>> >----------------------------------------------------
>>>> >> >>> >> >>>> > xx    HostedEngine                   Paused
>>>> >> >>> >> >>>> > xx    **********                     running
>>>> >> >>> >> >>>> > ...
>>>> >> >>> >> >>>> > xx    **********                     running
>>>> >> >>> >> >>>> >
>>>> >> >>> >> >>>> >HostedEngine should be in the list, try and resume the engine:
>>>> >> >>> >> >>>> >
>>>> >> >>> >> >>>> >virsh # resume HostedEngine
>>>> >> >>> >> >>>> >
>>>> >> >>> >> >>>> >On Wed, 8 Apr 2020 at 17:28, Shareef Jalloq <shareef@jalloq.co.uk> wrote:
>>>> >> >>> >> >>>> >
>>>> >> >>> >> >>>> >> Thanks!
>>>> >> >>> >> >>>> >>
>>>> >> >>> >> >>>> >> The status hangs due to, I guess, the VM being down....
>>>> >> >>> >> >>>> >>
>>>> >> >>> >> >>>> >> [root@ovirt-node-01 ~]# hosted-engine --vm-start
>>>> >> >>> >> >>>> >> VM exists and is down, cleaning up and restarting
>>>> >> >>> >> >>>> >> VM in WaitForLaunch
>>>> >> >>> >> >>>> >>
>>>> >> >>> >> >>>> >> but this doesn't seem to do anything.  OK, after a while I get a
>>>> >> >>> >> >>>> >> status of it being barfed...
>>>> >> >>> >> >>>> >>
>>>> >> >>> >> >>>> >> --== Host ovirt-node-00.phoelex.com (id: 1) status ==--
>>>> >> >>> >> >>>> >>
>>>> >> >>> >> >>>> >> conf_on_shared_storage             : True
>>>> >> >>> >> >>>> >> Status up-to-date                  : False
>>>> >> >>> >> >>>> >> Hostname                           : ovirt-node-00.phoelex.com
>>>> >> >>> >> >>>> >> Host ID                            : 1
>>>> >> >>> >> >>>> >> Engine status                      : unknown stale-data
>>>> >> >>> >> >>>> >> Score                              : 3400
>>>> >> >>> >> >>>> >> stopped                            : False
>>>> >> >>> >> >>>> >> Local maintenance                  : False
>>>> >> >>> >> >>>> >> crc32                              : 9c4a034b
>>>> >> >>> >> >>>> >> local_conf_timestamp               : 523362
>>>> >> >>> >> >>>> >> Host timestamp                     : 523608
>>>> >> >>> >> >>>> >> Extra metadata (valid at timestamp):
>>>> >> >>> >> >>>> >> metadata_parse_version=1
>>>> >> >>> >> >>>> >> metadata_feature_version=1
>>>> >> >>> >> >>>> >> timestamp=523608 (Wed Apr  8 16:17:11 2020)
>>>> >> >>> >> >>>> >> host-id=1
>>>> >> >>> >> >>>> >> score=3400
>>>> >> >>> >> >>>> >> vm_conf_refresh_time=523362 (Wed Apr  8 16:13:06 2020)
>>>> >> >>> >> >>>> >> conf_on_shared_storage=True
>>>> >> >>> >> >>>> >> maintenance=False
>>>> >> >>> >> >>>> >> state=EngineDown
>>>> >> >>> >> >>>> >> stopped=False
>>>> >> >>> >> >>>> >>
>>>> >> >>> >> >>>> >>
>>>> >> >>> >> >>>> >> --== Host ovirt-node-01.phoelex.com (id: 2) status ==--
>>>> >> >>> >> >>>> >>
>>>> >> >>> >> >>>> >> conf_on_shared_storage             : True
>>>> >> >>> >> >>>> >> Status up-to-date                  : True
>>>> >> >>> >> >>>> >> Hostname                           : ovirt-node-01.phoelex.com
>>>> >> >>> >> >>>> >> Host ID                            : 2
>>>> >> >>> >> >>>> >> Engine status                      : {"reason": "bad vm status", "health": "bad", "vm": "down_unexpected", "detail": "Down"}
>>>> >> >>> >> >>>> >> Score                              : 0
>>>> >> >>> >> >>>> >> stopped                            : False
>>>> >> >>> >> >>>> >> Local maintenance                  : False
>>>> >> >>> >> >>>> >> crc32                              : 5045f2eb
>>>> >> >>> >> >>>> >> local_conf_timestamp               : 1737037
>>>> >> >>> >> >>>> >> Host timestamp                     : 1737283
>>>> >> >>> >> >>>> >> Extra metadata (valid at timestamp):
>>>> >> >>> >> >>>> >> metadata_parse_version=1
>>>> >> >>> >> >>>> >> metadata_feature_version=1
>>>> >> >>> >> >>>> >> timestamp=1737283 (Wed Apr  8 16:16:17 2020)
>>>> >> >>> >> >>>> >> host-id=2
>>>> >> >>> >> >>>> >> score=0
>>>> >> >>> >> >>>> >> vm_conf_refresh_time=1737037 (Wed Apr  8 16:12:11 2020)
>>>> >> >>> >> >>>> >> conf_on_shared_storage=True
>>>> >> >>> >> >>>> >> maintenance=False
>>>> >> >>> >> >>>> >> state=EngineUnexpectedlyDown
>>>> >> >>> >> >>>> >> stopped=False
>>>> >> >>> >> >>>> >>
>>>> >> >>> >> >>>> >> On Wed, Apr 8, 2020 at 5:09 PM Maton, Brett <matonb@ltresources.co.uk> wrote:
>>>> >> >>> >> >>>> >>
>>>> >> >>> >> >>>> >>> First steps, on one of your hosts as root:
>>>> >> >>> >> >>>> >>>
>>>> >> >>> >> >>>> >>> To get information:
>>>> >> >>> >> >>>> >>> hosted-engine --vm-status
>>>> >> >>> >> >>>> >>>
>>>> >> >>> >> >>>> >>> To start the engine:
>>>> >> >>> >> >>>> >>> hosted-engine --vm-start
>>>> >> >>> >> >>>> >>>
>>>> >> >>> >> >>>> >>>
>>>> >> >>> >> >>>> >>> On Wed, 8 Apr 2020 at 17:00, Shareef Jalloq <shareef@jalloq.co.uk> wrote:
>>>> >> >>> >> >>>> >>>
>>>> >> >>> >> >>>> >>>> So my engine has gone down and I can't ssh into it either.  If I
>>>> >> >>> >> >>>> >>>> try to log into the web-ui of the node it is running on, I get
>>>> >> >>> >> >>>> >>>> redirected because the node can't reach the engine.
>>>> >> >>> >> >>>> >>>>
>>>> >> >>> >> >>>> >>>> What are my next steps?
>>>> >> >>> >> >>>> >>>>
>>>> >> >>> >> >>>> >>>> Shareef.
>>>> >> >>> >> >>>> >>>> _______________________________________________
>>>> >> >>> >> >>>> >>>> Users mailing list -- users@ovirt.org
>>>> >> >>> >> >>>> >>>> To unsubscribe send an email to users-leave@ovirt.org
>>>> >> >>> >> >>>> >>>> Privacy Statement: https://www.ovirt.org/privacy-policy.html
>>>> >> >>> >> >>>> >>>> oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
>>>> >> >>> >> >>>> >>>> List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/W7BP57OCIRSW5CDRQWR5MIKJUH3ISLCQ/
>>>> >> >>> >> >>>> >>>>
>>>> >> >>> >> >>>> >>>
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>> This has to be resolved:
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>> Engine status                      : unknown stale-data
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>> Run 'hosted-engine --vm-status' again. If it remains the same,
>>>> >> >>> >> >>>> restart ovirt-ha-broker.service & ovirt-ha-agent.service
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>> Verify that the engine's storage is available. Then monitor the
>>>> >> >>> >> >>>> broker & agent logs in /var/log/ovirt-hosted-engine-ha
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>> Best Regards,
>>>> >> >>> >> >>>> Strahil Nikolov
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>>
>>>> >> >>> >> >>>>
>>>> >> >>> >>
>>>> >> >>> >> Hi Shareef,
>>>> >> >>> >>
>>>> >> >>> >> The activation flow in oVirt is more complex than in plain KVM.
>>>> >> >>> >> Mounting of the domains happens during the activation of the node
>>>> >> >>> >> (the HostedEngine activates everything needed).
>>>> >> >>> >>
>>>> >> >>> >> Focus on the HostedEngine VM.
>>>> >> >>> >> Is it running properly ?
>>>> >> >>> >>
>>>> >> >>> >> If not, try:
>>>> >> >>> >> 1. Verify that the storage domain exists
>>>> >> >>> >> 2. Check if it has an 'ha_agents' directory
>>>> >> >>> >> 3. Check if the links are OK; if not, you can safely remove the links
>>>> >> >>> >>
>>>> >> >>> >> 4. Next check the services are running:
>>>> >> >>> >> A) sanlock
>>>> >> >>> >> B) supervdsmd
>>>> >> >>> >> C) vdsmd
>>>> >> >>> >> D) libvirtd
>>>> >> >>> >>
>>>> >> >>> >> 5. Increase the log level for the broker and agent services:
>>>> >> >>> >>
>>>> >> >>> >> cd  /etc/ovirt-hosted-engine-ha
>>>> >> >>> >> vim *-log.conf
>>>> >> >>> >>
>>>> >> >>> >> systemctl restart ovirt-ha-broker ovirt-ha-agent
>>>> >> >>> >>
>>>> >> >>> >> 6. Check what they are complaining about
>>>> >> >>> >> Keep in mind that the agent will keep throwing errors until the
>>>> >> >>> >> broker stops doing so (the agent depends on the broker), so the
>>>> >> >>> >> broker must be OK before proceeding with the agent log.
>>>> >> >>> >>
>>>> >> >>> >> About the manual VM start, you need 2 things:
>>>> >> >>> >>
>>>> >> >>> >> 1. Define the VM network
>>>> >> >>> >> # cat vdsm-ovirtmgmt.xml
>>>> >> >>> >> <network>
>>>> >> >>> >>   <name>vdsm-ovirtmgmt</name>
>>>> >> >>> >>   <uuid>8ded486e-e681-4754-af4b-5737c2b05405</uuid>
>>>> >> >>> >>   <forward mode='bridge'/>
>>>> >> >>> >>   <bridge name='ovirtmgmt'/>
>>>> >> >>> >> </network>
>>>> >> >>> >>
>>>> >> >>> >> [root@ovirt1 HostedEngine-RECOVERY]# virsh net-define vdsm-ovirtmgmt.xml
>>>> >> >>> >>
>>>> >> >>> >> 2. Get an XML definition, which can be found in the vdsm log. Every VM
>>>> >> >>> >> has its configuration printed in the vdsm log at startup, on the
>>>> >> >>> >> host where it starts.
>>>> >> >>> >> Save to file and then:
>>>> >> >>> >> A) virsh define myvm.xml
>>>> >> >>> >> B) virsh start myvm
>>>> >> >>> >>
>>>> >> >>> >> It seems there is/was a problem with your NFS shares.
>>>> >> >>> >>
>>>> >> >>> >>
>>>> >> >>> >> Best Regards,
>>>> >> >>> >> Strahil Nikolov
>>>> >> >>> >>
>>>> >> >>>
>>>> >> >>> Hey Shareef,
>>>> >> >>>
>>>> >> >>> Check if there are any files or folders not owned by vdsm:kvm.
>>>> >> >>> Something like this:
>>>> >> >>>
>>>> >> >>> find . \( -not -user 36 -o -not -group 36 \) -print
>>>> >> >>>
>>>> >> >>> Also check if vdsm can access the images in the
>>>> >> >>> '<vol-mount-point>/images' directories.
>>>> >> >>>
>>>> >> >>> Best Regards,
>>>> >> >>> Strahil Nikolov
>>>> >> >>>
>>>> >> >>
>>>> >>
>>>> >> And the IPv6 address '64:ff9b::c0a8:13d'?
>>>> >>
>>>> >> I don't see it in the log output.
>>>> >>
>>>> >> Best Regards,
>>>> >> Strahil Nikolov
>>>> >>
>>>>
>>>> Based on your output, you got a PTR record for both IPv4 & IPv6 ...
>>>> most probably that's the reason.
>>>>
>>>> Set the IPv6 on the interface and try again.
>>>>
>>>> Best Regards,
>>>> Strahil Nikolov
>>>>
>>>

Do you have firewalld up and running on the host?
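
A rough sketch of how to check this, assuming bash and the stock
firewall-cmd tool. The function names zone_names and check_firewalld are
only illustrative, they are not part of oVirt or firewalld. The failing
deployment task pipes 'firewall-cmd --get-active-zones' through grep with
'set -euo pipefail', so an empty zone list ends with rc 1 because grep
selected no lines:

```shell
#!/usr/bin/env bash
# Illustrative helpers (hypothetical names), not part of oVirt or firewalld.

# Reduce `firewall-cmd --get-active-zones` output to the zone names.
# Zone names start in column 0; the "interfaces:"/"sources:" lines are indented.
zone_names() {
    # grep exits 1 when it selects no lines; do not treat that as a failure here
    grep -v '^[[:space:]]' || true
}

# Report whether firewalld is running and whether it has any active zones.
check_firewalld() {
    if ! command -v firewall-cmd >/dev/null 2>&1; then
        echo "firewall-cmd is not installed"
        return 0
    fi
    # `firewall-cmd --state` exits non-zero when the daemon is not running
    if ! firewall-cmd --state >/dev/null 2>&1; then
        echo "firewalld is not running"
        return 0
    fi
    local zones
    zones=$(firewall-cmd --get-active-zones | zone_names)
    if [ -z "$zones" ]; then
        echo "firewalld is running but has no active zones"
    else
        echo "active zones:"
        echo "$zones"
    fi
}
```

Source the file and run 'check_firewalld' as root on the host. If it
reports no active zones, that would match the empty stdout and rc 1 in
the deployment log.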

Best Regards,
Strahil Nikolov