On April 15, 2020 2:40:52 PM GMT+03:00, Shareef Jalloq <shareef(a)jalloq.co.uk>
wrote:
Yes, but there are no zones set up, just ports 22, 6801 and 6900.
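
For what it's worth, an empty zone list is enough on its own to fail that task. Here is a rough Python model of the exit statuses involved, assuming GNU grep semantics (exit 1 when no line is selected) and bash pipefail, and assuming firewall-cmd itself exited 0 with no output, as the empty stdout in the quoted log suggests. The helper names are made up for illustration and are not part of the ovirt.hosted_engine_setup role:

```python
import re


def grep_v_status(lines, pattern):
    """Model `grep -v PATTERN`: exit 0 if any line does NOT match, else 1."""
    regex = re.compile(pattern)
    selected = [line for line in lines if not regex.search(line)]
    return 0 if selected else 1


def pipefail_status(statuses):
    """Model bash `set -o pipefail`: rightmost non-zero status, or 0."""
    failed = [status for status in statuses if status != 0]
    return failed[-1] if failed else 0


# Assumption: firewall-cmd --get-active-zones printed nothing and exited 0.
no_zones = []
rc_empty = pipefail_status([0, grep_v_status(no_zones, r"^\s*interfaces")])

# Hypothetical output with one active zone bound to an interface.
one_zone = ["public", "  interfaces: ovirtmgmt"]
rc_zone = pipefail_status([0, grep_v_status(one_zone, r"^\s*interfaces")])

print(rc_empty, rc_zone)
```

If that model is right, the task would pass as soon as any interface is assigned to an active zone, since grep would then have at least one line to select.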
On Wed, Apr 15, 2020 at 12:37 PM Strahil Nikolov <hunter86_bg(a)yahoo.com>
wrote:
> On April 15, 2020 2:28:05 PM GMT+03:00, Shareef Jalloq <
> shareef(a)jalloq.co.uk> wrote:
> >Oh this is painful. It seems to progress if you have both he_force_ipv4
> >set and run the deployment with the '--4' switch.
> >
> >But then I get a failure when the ansible script checks for firewalld-zones
> >and doesn't get anything back. Should the deployment flow not be setting
> >any zones it needs?
> >
> >2020-04-15 10:57:25,439+0000 INFO
> >otopi.ovirt_hosted_engine_setup.ansible_utils
> >ansible_utils._process_output:109 TASK [ovirt.hosted_engine_setup : Get
> >active list of active firewalld zones]
> >
> >2020-04-15 10:57:26,641+0000 DEBUG
> >otopi.ovirt_hosted_engine_setup.ansible_utils
> >ansible_utils._process_output:103 {u'stderr_lines': [], u'changed': True,
> >u'end': u'2020-04-15 10:57:26.481202', u'_ansible_no_log': False,
> >u'stdout': u'', u'cmd': u'set -euo pipefail && firewall-cmd
> >--get-active-zones | grep -v "^\\s*interfaces"', u'start':
> >u'2020-04-15 10:57:26.050203', u'delta': u'0:00:00.430999', u'stderr': u'',
> >u'rc': 1, u'invocation': {u'module_args': {u'creates': None,
> >u'executable': None, u'_uses_shell': True, u'strip_empty_ends': True,
> >u'_raw_params': u'set -euo pipefail && firewall-cmd --get-active-zones |
> >grep -v "^\\s*interfaces"', u'removes': None, u'argv': None, u'warn': True,
> >u'chdir': None, u'stdin_add_newline': True, u'stdin': None}},
> >u'stdout_lines': [], u'msg': u'non-zero return code'}
> >
> >2020-04-15 10:57:26,741+0000 ERROR
> >otopi.ovirt_hosted_engine_setup.ansible_utils
> >ansible_utils._process_output:107 fatal: [localhost]: FAILED! =>
> >{"changed": true, "cmd": "set -euo pipefail && firewall-cmd
> >--get-active-zones | grep -v \"^\\s*interfaces\"", "delta":
> >"0:00:00.430999", "end": "2020-04-15 10:57:26.481202", "msg": "non-zero
> >return code", "rc": 1, "start": "2020-04-15 10:57:26.050203", "stderr": "",
> >"stderr_lines": [], "stdout": "", "stdout_lines": []}
> >
> >On Wed, Apr 15, 2020 at 10:23 AM Shareef Jalloq <shareef(a)jalloq.co.uk>
> >wrote:
> >
> >> Ha, spoke too soon. It's now stuck in a loop and a google points me at
> >> https://bugzilla.redhat.com/show_bug.cgi?id=1746585
> >>
> >> However, forcing ipv4 doesn't seem to have fixed the loop.
> >>
> >> On Wed, Apr 15, 2020 at 9:59 AM Shareef Jalloq <shareef(a)jalloq.co.uk>
> >> wrote:
> >>
> >>> OK, that seems to have fixed it, thanks. Is this a side effect of
> >>> redeploying the HE over a first time install? Nothing has changed in our
> >>> setup and I didn't need to do this when I initially set up our nodes.
> >>>
> >>>
> >>>
> >>> On Tue, Apr 14, 2020 at 6:55 PM Strahil Nikolov <hunter86_bg(a)yahoo.com>
> >>> wrote:
> >>>
> >>>> On April 14, 2020 6:17:17 PM GMT+03:00, Shareef Jalloq <
> >>>> shareef(a)jalloq.co.uk> wrote:
> >>>> >Hmmm, we're not using ipv6. Is that the issue?
> >>>> >
> >>>> >On Tue, Apr 14, 2020 at 3:56 PM Strahil Nikolov <hunter86_bg(a)yahoo.com>
> >>>> >wrote:
> >>>> >
> >>>> >> On April 14, 2020 1:27:24 PM GMT+03:00, Shareef Jalloq <
> >>>> >> shareef(a)jalloq.co.uk> wrote:
> >>>> >> >Right, I've given up on recovering the HE so want to try and redeploy
> >>>> >> >it. There doesn't seem to be enough information to debug why the
> >>>> >> >broker/agent won't start cleanly.
> >>>> >> >
> >>>> >> >In running 'hosted-engine --deploy', I'm seeing the following error in
> >>>> >> >the setup validation phase:
> >>>> >> >
> >>>> >> >2020-04-14 09:46:08,922+0000 DEBUG otopi.plugins.otopi.dialog.human
> >>>> >> >dialog.__logString:204 DIALOG:SEND Please provide the
> >>>> >> >hostname of this host on the management network
> >>>> >> >[ovirt-node-00.phoelex.com]:
> >>>> >> >
> >>>> >> >
> >>>> >> >2020-04-14 09:46:12,831+0000 DEBUG
> >>>> >> >otopi.plugins.gr_he_common.network.bridge
> >>>> >> >hostname.getResolvedAddresses:432
> >>>> >> >getResolvedAddresses: set(['64:ff9b::c0a8:13d', '192.168.1.61'])
> >>>> >> >
> >>>> >> >2020-04-14 09:46:12,832+0000 DEBUG
> >>>> >> >otopi.plugins.gr_he_common.network.bridge
> >>>> >> >hostname._validateFQDNresolvability:289 ovirt-node-00.phoelex.com
> >>>> >> >resolves to: set(['64:ff9b::c0a8:13d', '192.168.1.61'])
> >>>> >> >
> >>>> >> >2020-04-14 09:46:12,832+0000 DEBUG
> >>>> >> >otopi.plugins.gr_he_common.network.bridge plugin.executeRaw:813
> >>>> >> >execute: ['/usr/bin/dig', '+noall', '+answer',
> >>>> >> >'ovirt-node-00.phoelex.com', 'ANY'], executable='None', cwd='None',
> >>>> >> >env=None
> >>>> >> >
> >>>> >> >2020-04-14 09:46:12,871+0000 DEBUG
> >>>> >> >otopi.plugins.gr_he_common.network.bridge plugin.executeRaw:863
> >>>> >> >execute-result: ['/usr/bin/dig', '+noall', '+answer',
> >>>> >> >'ovirt-node-00.phoelex.com', 'ANY'], rc=0
> >>>> >> >
> >>>> >> >2020-04-14 09:46:12,872+0000 DEBUG
> >>>> >> >otopi.plugins.gr_he_common.network.bridge plugin.execute:921
> >>>> >> >execute-output: ['/usr/bin/dig', '+noall', '+answer',
> >>>> >> >'ovirt-node-00.phoelex.com', 'ANY'] stdout:
> >>>> >> >
> >>>> >> >ovirt-node-00.phoelex.com. 86400 IN A 192.168.1.61
> >>>> >> >
> >>>> >> >
> >>>> >> >2020-04-14 09:46:12,872+0000 DEBUG
> >>>> >> >otopi.plugins.gr_he_common.network.bridge plugin.execute:926
> >>>> >> >execute-output: ['/usr/bin/dig', '+noall', '+answer',
> >>>> >> >'ovirt-node-00.phoelex.com', 'ANY'] stderr:
> >>>> >> >
> >>>> >> >
> >>>> >> >
> >>>> >> >2020-04-14 09:46:12,872+0000 DEBUG
> >>>> >> >otopi.plugins.gr_he_common.network.bridge plugin.executeRaw:813
> >>>> >> >execute: ('/usr/sbin/ip', 'addr'), executable='None', cwd='None',
> >>>> >> >env=None
> >>>> >> >
> >>>> >> >2020-04-14 09:46:12,876+0000 DEBUG
> >>>> >> >otopi.plugins.gr_he_common.network.bridge plugin.executeRaw:863
> >>>> >> >execute-result: ('/usr/sbin/ip', 'addr'), rc=0
> >>>> >> >
> >>>> >> >2020-04-14 09:46:12,876+0000 DEBUG
> >>>> >> >otopi.plugins.gr_he_common.network.bridge plugin.execute:921
> >>>> >> >execute-output: ('/usr/sbin/ip', 'addr') stdout:
> >>>> >> >
> >>>> >> >1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
> >>>> >> >
> >>>> >> > link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
> >>>> >> >
> >>>> >> > inet 127.0.0.1/8 scope host lo
> >>>> >> >
> >>>> >> > valid_lft forever preferred_lft forever
> >>>> >> >
> >>>> >> > inet6 ::1/128 scope host
> >>>> >> >
> >>>> >> > valid_lft forever preferred_lft forever
> >>>> >> >
> >>>> >> >2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovirtmgmt state UP group default qlen 1000
> >>>> >> >
> >>>> >> > link/ether ac:1f:6b:bc:32:6a brd ff:ff:ff:ff:ff:ff
> >>>> >> >
> >>>> >> >3: eno2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
> >>>> >> >
> >>>> >> > link/ether ac:1f:6b:bc:32:6b brd ff:ff:ff:ff:ff:ff
> >>>> >> >
> >>>> >> >4: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
> >>>> >> >
> >>>> >> > link/ether 02:e6:e2:80:93:8d brd ff:ff:ff:ff:ff:ff
> >>>> >> >
> >>>> >> >5: br-int: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
> >>>> >> >
> >>>> >> > link/ether 8a:26:44:50:ee:4a brd ff:ff:ff:ff:ff:ff
> >>>> >> >
> >>>> >> >21: ovirtmgmt: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
> >>>> >> >
> >>>> >> > link/ether ac:1f:6b:bc:32:6a brd ff:ff:ff:ff:ff:ff
> >>>> >> >
> >>>> >> > inet 192.168.1.61/24 brd 192.168.1.255 scope global ovirtmgmt
> >>>> >> >
> >>>> >> > valid_lft forever preferred_lft forever
> >>>> >> >
> >>>> >> > inet6 fe80::ae1f:6bff:febc:326a/64 scope link
> >>>> >> >
> >>>> >> > valid_lft forever preferred_lft forever
> >>>> >> >
> >>>> >> >22: ;vdsmdummy;: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
> >>>> >> >
> >>>> >> > link/ether 3a:02:7b:7d:b3:2a brd ff:ff:ff:ff:ff:ff
> >>>> >> >
> >>>> >> >
> >>>> >> >2020-04-14 09:46:12,876+0000 DEBUG
> >>>> >> >otopi.plugins.gr_he_common.network.bridge plugin.execute:926
> >>>> >> >execute-output: ('/usr/sbin/ip', 'addr') stderr:
> >>>> >> >
> >>>> >> >
> >>>> >> >
> >>>> >> >2020-04-14 09:46:12,877+0000 DEBUG
> >>>> >> >otopi.plugins.gr_he_common.network.bridge
> >>>> >> >hostname.getLocalAddresses:251
> >>>> >> >addresses: [u'192.168.1.61', u'fe80::ae1f:6bff:febc:326a']
> >>>> >> >
> >>>> >> >2020-04-14 09:46:12,877+0000 DEBUG
> >>>> >> >otopi.plugins.gr_he_common.network.bridge hostname.test_hostname:464
> >>>> >> >test_hostname exception
> >>>> >> >
> >>>> >> >Traceback (most recent call last):
> >>>> >> >
> >>>> >> >File "/usr/lib/python2.7/site-packages/ovirt_setup_lib/hostname.py", line 460, in test_hostname
> >>>> >> >
> >>>> >> > not_local_text,
> >>>> >> >
> >>>> >> >File "/usr/lib/python2.7/site-packages/ovirt_setup_lib/hostname.py", line 342, in _validateFQDNresolvability
> >>>> >> >
> >>>> >> > addresses=resolvedAddressesAsString
> >>>> >> >
> >>>> >> >RuntimeError: ovirt-node-00.phoelex.com resolves to 64:ff9b::c0a8:13d
> >>>> >> >192.168.1.61 and not all of them can be mapped to non loopback devices on
> >>>> >> >this host
> >>>> >> >
> >>>> >> >2020-04-14 09:46:12,884+0000 ERROR
> >>>> >> >otopi.plugins.gr_he_common.network.bridge dialog.queryEnvKey:120 Host name
> >>>> >> >is not valid: ovirt-node-00.phoelex.com resolves to 64:ff9b::c0a8:13d
> >>>> >> >192.168.1.61 and not all of them can be mapped to non loopback devices on
> >>>> >> >this host
> >>>> >> >
> >>>> >> >The node I'm running on has an IP address of .61 and resolves
> >>>> >> >correctly.
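
A note on the extra address in that error: 64:ff9b::/96 is the well-known NAT64/DNS64 prefix (RFC 6052), and the low 32 bits of 64:ff9b::c0a8:13d are c0.a8.01.3d, i.e. 192.168.1.61. That would suggest a DNS64 resolver is synthesising an AAAA record for the host, though this is an assumption and nothing in the thread confirms it. A minimal sketch using only the Python standard library to decode such an address; the function name is made up for illustration:

```python
import ipaddress

# Well-known NAT64 prefix defined by RFC 6052.
NAT64_PREFIX = ipaddress.IPv6Network("64:ff9b::/96")


def embedded_ipv4(addr):
    """Return the IPv4 address embedded in a 64:ff9b::/96 address, else None."""
    ip6 = ipaddress.IPv6Address(addr)
    if ip6 not in NAT64_PREFIX:
        return None
    # The low 32 bits of the IPv6 address carry the IPv4 address.
    return ipaddress.IPv4Address(int(ip6) & 0xFFFFFFFF)


print(embedded_ipv4("64:ff9b::c0a8:13d"))
print(embedded_ipv4("fe80::ae1f:6bff:febc:326a"))
```

If the synthesised record is the cause, it would fit with the deployment only progressing once IPv4 is forced, since the installer cannot map that IPv6 address to any local interface.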
> >>>> >> >
> >>>> >> >On Fri, Apr 10, 2020 at 12:55 PM Shareef Jalloq <shareef(a)jalloq.co.uk>
> >>>> >> >wrote:
> >>>> >> >
> >>>> >> >> Where should I be checking if there are any files/folders not owned by
> >>>> >> >> vdsm:kvm? I checked on the mount the HA sits on and it's fine.
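
A minimal sketch of one way to look for entries not owned by vdsm:kvm, assuming the usual oVirt convention that both map to numeric ID 36 (worth confirming with `id vdsm` on the host). The function name is illustrative only, and the demo runs against a throwaway temporary directory rather than a real storage domain:

```python
import os
import tempfile


def find_wrong_owner(root, uid=36, gid=36):
    """Return paths at or below root whose owner uid or group gid differ."""
    candidates = [root]
    for dirpath, dirnames, filenames in os.walk(root):
        candidates.extend(os.path.join(dirpath, n) for n in dirnames + filenames)
    mismatched = []
    for path in candidates:
        st = os.lstat(path)  # lstat: inspect a symlink itself, not its target
        if st.st_uid != uid or st.st_gid != gid:
            mismatched.append(path)
    return mismatched


# Demo on a temporary directory, compared against that directory's own owner.
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "disk.img"), "w").close()
    owner = os.lstat(tmp)
    print(find_wrong_owner(tmp, owner.st_uid, owner.st_gid))
```

On a host, the same function could be pointed at the storage domain directory under /rhev/data-center/mnt with the default IDs.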
> >>>> >> >>
> >>>> >> >> How would I go about checking vdsm can access those images? If I run
> >>>> >> >> virsh, it lists them and they were running yesterday even though the HA was
> >>>> >> >> down. I've since restarted both hosts but the broker is still spitting out
> >>>> >> >> the same error (copied below). How do I find the reason the broker can't
> >>>> >> >> connect to the storage? The conf file is already at DEBUG verbosity:
> >>>> >> >>
> >>>> >> >> [handler_logfile]
> >>>> >> >>
> >>>> >> >> class=logging.handlers.TimedRotatingFileHandler
> >>>> >> >>
> >>>> >> >> args=('/var/log/ovirt-hosted-engine-ha/broker.log', 'd', 1, 7)
> >>>> >> >>
> >>>> >> >> level=DEBUG
> >>>> >> >>
> >>>> >> >> formatter=long
> >>>> >> >>
> >>>> >> >> And what are all these .prob-<num> files that are being created? There
> >>>> >> >> are over 250K of them now on the mount I'm using for the Data domain.
> >>>> >> >> They're all of 0 size and of the form,
> >>>> >> >> /rhev/data-center/mnt/nas-01.phoelex.com:_volume2_vmstore/.prob-ffa867da-93db-4211-82df-b1b04a625ab9
> >>>> >> >>
> >>>> >> >> @eevans: The volume I have the Data Domain on has TB's free. The HA is
> >>>> >> >> dead so I can't ssh in. No idea what started these errors and the other
> >>>> >> >> VMs were still running happily although they're on a different Data Domain.
> >>>> >> >>
> >>>> >> >> Shareef.
> >>>> >> >>
> >>>> >> >> MainThread::INFO::2020-04-10
> >>>> >> >> 07:45:00,408::storage_backends::373::ovirt_hosted_engine_ha.lib.storage_backends::(connect)
> >>>> >> >> Connecting the storage
> >>>> >> >>
> >>>> >> >> MainThread::INFO::2020-04-10
> >>>> >> >> 07:45:00,408::storage_server::349::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server)
> >>>> >> >> Connecting storage server
> >>>> >> >>
> >>>> >> >> MainThread::INFO::2020-04-10
> >>>> >> >> 07:45:01,577::storage_server::356::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server)
> >>>> >> >> Connecting storage server
> >>>> >> >>
> >>>> >> >> MainThread::INFO::2020-04-10
> >>>> >> >> 07:45:02,692::storage_server::413::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server)
> >>>> >> >> Refreshing the storage domain
> >>>> >> >>
> >>>> >> >> MainThread::WARNING::2020-04-10
> >>>> >> >> 07:45:05,175::storage_broker::97::ovirt_hosted_engine_ha.broker.storage_broker.StorageBroker::(__init__)
> >>>> >> >> Can't connect vdsm storage: Command StorageDomain.getInfo with args
> >>>> >> >> {'storagedomainID': 'a6cea67d-dbfb-45cf-a775-b4d0d47b26f2'} failed:
> >>>> >> >>
> >>>> >> >> (code=350, message=Error in storage domain action:
> >>>> >> >> (u'sdUUID=a6cea67d-dbfb-45cf-a775-b4d0d47b26f2',))
> >>>> >> >>
> >>>> >> >> On Thu, Apr 9, 2020 at 5:58 PM Strahil Nikolov <hunter86_bg(a)yahoo.com>
> >>>> >> >> wrote:
> >>>> >> >>
> >>>> >> >>> On April 9, 2020 11:12:30 AM GMT+03:00, Shareef Jalloq <
> >>>> >> >>> shareef(a)jalloq.co.uk> wrote:
> >>>> >> >>> >OK, let's go through this. I'm looking at the node that at least still
> >>>> >> >>> >has some VMs running. virsh also tells me that the HostedEngine VM is
> >>>> >> >>> >running but it's unresponsive and I can't shut it down.
> >>>> >> >>> >
> >>>> >> >>> >1. All storage domains exist and are mounted.
> >>>> >> >>> >2. The ha_agent exists:
> >>>> >> >>> >
> >>>> >> >>> >[root@ovirt-node-01 ovirt-hosted-engine-ha]# ls /rhev/data-center/mnt/nas-01.phoelex.com\:_volume2_vmstore/a6cea67d-dbfb-45cf-a775-b4d0d47b26f2/
> >>>> >> >>> >
> >>>> >> >>> >dom_md ha_agent images master
> >>>> >> >>> >
> >>>> >> >>> >3. There are two links
> >>>> >> >>> >
> >>>> >> >>> >[root@ovirt-node-01 ovirt-hosted-engine-ha]# ll /rhev/data-center/mnt/nas-01.phoelex.com\:_volume2_vmstore/a6cea67d-dbfb-45cf-a775-b4d0d47b26f2/ha_agent/
> >>>> >> >>> >
> >>>> >> >>> >total 8
> >>>> >> >>> >
> >>>> >> >>> >lrwxrwxrwx. 1 vdsm kvm 132 Apr 2 14:50 hosted-engine.lockspace ->
> >>>> >> >>> >/var/run/vdsm/storage/a6cea67d-dbfb-45cf-a775-b4d0d47b26f2/ffb90b82-42fe-4253-85d5-aaec8c280aaf/90e68791-0c6f-406a-89ac-e0d86c631604
> >>>> >> >>> >
> >>>> >> >>> >lrwxrwxrwx. 1 vdsm kvm 132 Apr 2 14:50 hosted-engine.metadata ->
> >>>> >> >>> >/var/run/vdsm/storage/a6cea67d-dbfb-45cf-a775-b4d0d47b26f2/2161aed0-7250-4c1d-b667-ac94f60af17e/6b818e33-f80a-48cc-a59c-bba641e027d4
> >>>> >> >>> >
> >>>> >> >>> >4. The services exist but all seem to have some sort of warning:
> >>>> >> >>> >
> >>>> >> >>> >a) Apr 08 18:10:55 ovirt-node-01.phoelex.com sanlock[1728]: *2020-04-08
> >>>> >> >>> >18:10:55 1744152 [36796]: s16 delta_renew long write time 10 sec*
> >>>> >> >>> >
> >>>> >> >>> >b) Mar 23 18:02:59 ovirt-node-01.phoelex.com supervdsmd[29409]: *failed to
> >>>> >> >>> >load module nvdimm: libbd_nvdimm.so.2: cannot open shared object file: No
> >>>> >> >>> >such file or directory*
> >>>> >> >>> >
> >>>> >> >>> >c) Apr 09 08:05:13 ovirt-node-01.phoelex.com vdsm[4801]: *ERROR failed to
> >>>> >> >>> >retrieve Hosted Engine HA score '[Errno 2] No such file or directory'Is the
> >>>> >> >>> >Hosted Engine setup finished?*
> >>>> >> >>> >
> >>>> >> >>> >d) Apr 08 22:48:27 ovirt-node-01.phoelex.com libvirtd[29307]: 2020-04-08
> >>>> >> >>> >22:48:27.134+0000: 29309: warning : qemuGetProcessInfo:1404 : cannot parse
> >>>> >> >>> >process status data
> >>>> >> >>> >
> >>>> >> >>> >Apr 08 22:48:27 ovirt-node-01.phoelex.com libvirtd[29307]: 2020-04-08
> >>>> >> >>> >22:48:27.134+0000: 29309: error : virNetDevTapInterfaceStats:764 : internal
> >>>> >> >>> >error: /proc/net/dev: Interface not found
> >>>> >> >>> >
> >>>> >> >>> >Apr 08 23:09:39 ovirt-node-01.phoelex.com libvirtd[29307]: 2020-04-08
> >>>> >> >>> >23:09:39.844+0000: 29307: error : virNetSocketReadWire:1806 : End of file
> >>>> >> >>> >while reading data: Input/output error
> >>>> >> >>> >
> >>>> >> >>> >Apr 09 01:05:26 ovirt-node-01.phoelex.com libvirtd[29307]: 2020-04-09
> >>>> >> >>> >01:05:26.660+0000: 29307: error : virNetSocketReadWire:1806 : End of file
> >>>> >> >>> >while reading data: Input/output error
> >>>> >> >>> >
> >>>> >> >>> >5 & 6. The broker log is continually printing this error:
> >>>> >> >>> >
> >>>> >> >>> >MainThread::INFO::2020-04-09
> >>>> >> >>> >08:07:31,438::broker::47::ovirt_hosted_engine_ha.broker.broker.Broker::(run)
> >>>> >> >>> >ovirt-hosted-engine-ha broker 2.3.6 started
> >>>> >> >>> >
> >>>> >> >>> >MainThread::DEBUG::2020-04-09
> >>>> >> >>> >08:07:31,438::broker::55::ovirt_hosted_engine_ha.broker.broker.Broker::(run)
> >>>> >> >>> >Running broker
> >>>> >> >>> >
> >>>> >> >>> >MainThread::DEBUG::2020-04-09
> >>>> >> >>> >08:07:31,438::broker::120::ovirt_hosted_engine_ha.broker.broker.Broker::(_get_monitor)
> >>>> >> >>> >Starting monitor
> >>>> >> >>> >
> >>>> >> >>> >MainThread::INFO::2020-04-09
> >>>> >> >>> >08:07:31,438::monitor::40::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors)
> >>>> >> >>> >Searching for submonitors in
> >>>> >> >>> >/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/submonitors
> >>>> >> >>> >
> >>>> >> >>> >MainThread::INFO::2020-04-09
> >>>> >> >>> >08:07:31,439::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors)
> >>>> >> >>> >Loaded submonitor network
> >>>> >> >>> >
> >>>> >> >>> >MainThread::INFO::2020-04-09
> >>>> >> >>> >08:07:31,440::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors)
> >>>> >> >>> >Loaded submonitor cpu-load-no-engine
> >>>> >> >>> >
> >>>> >> >>> >MainThread::INFO::2020-04-09
> >>>> >> >>> >08:07:31,441::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors)
> >>>> >> >>> >Loaded submonitor mgmt-bridge
> >>>> >> >>> >
> >>>> >> >>> >MainThread::INFO::2020-04-09
> >>>> >> >>> >08:07:31,441::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors)
> >>>> >> >>> >Loaded submonitor network
> >>>> >> >>> >
> >>>> >> >>> >MainThread::INFO::2020-04-09
> >>>> >> >>> >08:07:31,441::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors)
> >>>> >> >>> >Loaded submonitor cpu-load
> >>>> >> >>> >
> >>>> >> >>> >MainThread::INFO::2020-04-09
> >>>> >> >>> >08:07:31,441::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors)
> >>>> >> >>> >Loaded submonitor engine-health
> >>>> >> >>> >
> >>>> >> >>> >MainThread::INFO::2020-04-09
> >>>> >> >>> >08:07:31,442::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors)
> >>>> >> >>> >Loaded submonitor mgmt-bridge
> >>>> >> >>> >
> >>>> >> >>> >MainThread::INFO::2020-04-09
> >>>> >> >>> >08:07:31,442::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors)
> >>>> >> >>> >Loaded submonitor cpu-load-no-engine
> >>>> >> >>> >
> >>>> >> >>> >MainThread::INFO::2020-04-09
> >>>> >> >>> >08:07:31,443::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors)
> >>>> >> >>> >Loaded submonitor cpu-load
> >>>> >> >>> >
> >>>> >> >>> >MainThread::INFO::2020-04-09
> >>>> >> >>> >08:07:31,443::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors)
> >>>> >> >>> >Loaded submonitor mem-free
> >>>> >> >>> >
> >>>> >> >>> >MainThread::INFO::2020-04-09
> >>>> >> >>> >08:07:31,443::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors)
> >>>> >> >>> >Loaded submonitor storage-domain
> >>>> >> >>> >
> >>>> >> >>> >MainThread::INFO::2020-04-09
> >>>> >> >>> >08:07:31,443::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors)
> >>>> >> >>> >Loaded submonitor storage-domain
> >>>> >> >>> >
> >>>> >> >>> >MainThread::INFO::2020-04-09
> >>>> >> >>> >08:07:31,443::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors)
> >>>> >> >>> >Loaded submonitor mem-free
> >>>> >> >>> >
> >>>> >> >>> >MainThread::INFO::2020-04-09
> >>>> >> >>> >08:07:31,444::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors)
> >>>> >> >>> >Loaded submonitor engine-health
> >>>> >> >>> >
> >>>> >> >>> >MainThread::INFO::2020-04-09
> >>>> >> >>> >08:07:31,444::monitor::50::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors)
> >>>> >> >>> >Finished loading submonitors
> >>>> >> >>> >
> >>>> >> >>> >MainThread::DEBUG::2020-04-09
> >>>> >> >>> >08:07:31,444::broker::128::ovirt_hosted_engine_ha.broker.broker.Broker::(_get_storage_broker)
> >>>> >> >>> >Starting storage broker
> >>>> >> >>> >
> >>>> >> >>> >MainThread::DEBUG::2020-04-09
> >>>> >> >>> >08:07:31,444::storage_backends::369::ovirt_hosted_engine_ha.lib.storage_backends::(connect)
> >>>> >> >>> >Connecting to VDSM
> >>>> >> >>> >
> >>>> >> >>> >MainThread::DEBUG::2020-04-09
> >>>> >> >>> >08:07:31,444::util::384::ovirt_hosted_engine_ha.lib.storage_backends::(__log_debug)
> >>>> >> >>> >Creating a new json-rpc connection to VDSM
> >>>> >> >>> >
> >>>> >> >>> >Client localhost:54321::DEBUG::2020-04-09
> >>>> >> >>> >08:07:31,453::concurrent::258::root::(run) START thread <Thread(Client
> >>>> >> >>> >localhost:54321, started daemon 139992488138496)> (func=<bound method
> >>>> >> >>> >Reactor.process_requests of <yajsonrpc.betterAsyncore.Reactor object at
> >>>> >> >>> >0x7f528acabc90>>, args=(), kwargs={})
> >>>> >> >>> >
> >>>> >> >>> >Client localhost:54321::DEBUG::2020-04-09
> >>>> >> >>> >08:07:31,459::stompclient::138::yajsonrpc.protocols.stomp.AsyncClient::(_process_connected)
> >>>> >> >>> >Stomp connection established
> >>>> >> >>> >
> >>>> >> >>> >MainThread::DEBUG::2020-04-09
> >>>> >> >>> >08:07:31,467::stompclient::294::jsonrpc.AsyncoreClient::(send) Sending
> >>>> >> >>> >response
> >>>> >> >>> >
> >>>> >> >>> >MainThread::INFO::2020-04-09
> >>>> >> >>> >08:07:31,530::storage_backends::373::ovirt_hosted_engine_ha.lib.storage_backends::(connect)
> >>>> >> >>> >Connecting the storage
> >>>> >> >>> >
> >>>> >> >>> >MainThread::INFO::2020-04-09
> >>>> >> >>> >08:07:31,531::storage_server::349::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server)
> >>>> >> >>> >Connecting storage server
> >>>> >> >>> >
> >>>> >> >>> >MainThread::DEBUG::2020-04-09
> >>>> >> >>> >08:07:31,531::stompclient::294::jsonrpc.AsyncoreClient::(send) Sending
> >>>> >> >>> >response
> >>>> >> >>> >
> >>>> >> >>> >MainThread::DEBUG::2020-04-09
> >>>> >> >>> >08:07:31,534::stompclient::294::jsonrpc.AsyncoreClient::(send) Sending
> >>>> >> >>> >response
> >>>> >> >>> >
> >>>> >> >>> >MainThread::DEBUG::2020-04-09
> >>>> >> >>> >08:07:32,199::storage_server::158::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(_validate_pre_connected_path)
> >>>> >> >>> >Storage domain a6cea67d-dbfb-45cf-a775-b4d0d47b26f2 is not available
> >>>> >> >>> >
> >>>> >> >>> >MainThread::INFO::2020-04-09
> >>>> >> >>> >08:07:32,199::storage_server::356::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server)
> >>>> >> >>> >Connecting storage server
> >>>> >> >>> >
> >>>> >> >>> >MainThread::DEBUG::2020-04-09
> >>>> >> >>> >08:07:32,199::stompclient::294::jsonrpc.AsyncoreClient::(send) Sending
> >>>> >> >>> >response
> >>>> >> >>> >
> >>>> >> >>> >MainThread::DEBUG::2020-04-09
> >>>> >> >>> >08:07:32,814::storage_server::363::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server)
> >>>> >> >>> >[{u'status': 0, u'id': u'e29cf818-5ee5-46e1-85c1-8aeefa33e95d'}]
> >>>> >> >>> >
> >>>> >> >>> >MainThread::INFO::2020-04-09
> >>>> >> >>> >08:07:32,814::storage_server::413::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server)
> >>>> >> >>> >Refreshing the storage domain
> >>>> >> >>> >
> >>>> >> >>> >MainThread::DEBUG::2020-04-09
> >>>> >> >>> >08:07:32,815::stompclient::294::jsonrpc.AsyncoreClient::(send) Sending
> >>>> >> >>> >response
> >>>> >> >>> >
> >>>> >> >>> >MainThread::DEBUG::2020-04-09
> >>>> >> >>> >08:07:33,129::storage_server::420::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server)
> >>>> >> >>> >Error refreshing storage domain: Command StorageDomain.getStats with args
> >>>> >> >>> >{'storagedomainID': 'a6cea67d-dbfb-45cf-a775-b4d0d47b26f2'} failed:
> >>>> >> >>> >
> >>>> >> >>> >(code=350, message=Error in storage domain action:
> >>>> >> >>> >(u'sdUUID=a6cea67d-dbfb-45cf-a775-b4d0d47b26f2',))
> >>>> >> >>> >
> >>>> >> >>> >MainThread::DEBUG::2020-04-09
> >>>> >> >>> >08:07:33,130::stompclient::294::jsonrpc.AsyncoreClient::(send) Sending
> >>>> >> >>> >response
> >>>> >> >>> >
> >>>> >> >>> >MainThread::DEBUG::2020-04-09
> >>>> >> >>> >08:07:33,795::storage_backends::208::ovirt_hosted_engine_ha.lib.storage_backends::(_get_sector_size)
> >>>> >> >>> >Command StorageDomain.getInfo with args {'storagedomainID':
> >>>> >> >>> >'a6cea67d-dbfb-45cf-a775-b4d0d47b26f2'} failed:
> >>>> >> >>> >
> >>>> >> >>> >(code=350, message=Error in storage domain action:
> >>>> >> >>> >(u'sdUUID=a6cea67d-dbfb-45cf-a775-b4d0d47b26f2',))
> >>>> >> >>> >
> >>>> >> >>> >MainThread::WARNING::2020-04-09
> >>>> >> >>> >08:07:33,795::storage_broker::97::ovirt_hosted_engine_ha.broker.storage_broker.StorageBroker::(__init__)
> >>>> >> >>> >Can't connect vdsm storage: Command StorageDomain.getInfo with args
> >>>> >> >>> >{'storagedomainID': 'a6cea67d-dbfb-45cf-a775-b4d0d47b26f2'} failed:
> >>>> >> >>> >
> >>>> >> >>> >(code=350, message=Error in storage domain action:
> >>>> >> >>> >(u'sdUUID=a6cea67d-dbfb-45cf-a775-b4d0d47b26f2',))
> >>>> >> >>> >
> >>>> >> >>> >
> >>>> >> >>> >The UUID it is moaning about is indeed the one that the HA sits on and is
> >>>> >> >>> >the one I listed the contents of in step 2 above.
> >>>> >> >>> >
> >>>> >> >>> >
> >>>> >> >>> >So why can't it see this domain?
> >>>> >> >>> >
> >>>> >> >>> >
> >>>> >> >>> >Thanks, Shareef.
> >>>> >> >>> >
> >>>> >> >>> >On Thu, Apr 9, 2020 at 6:12 AM Strahil Nikolov <hunter86_bg(a)yahoo.com>
> >>>> >> >>> >wrote:
> >>>> >> >>> >
> >>>> >> >>> >> On April 9, 2020 1:51:05 AM GMT+03:00, Shareef Jalloq <
> >>>> >> >>> >> shareef(a)jalloq.co.uk> wrote:
> >>>> >> >>> >> >Don't know if this is useful or not, but I just tried to shut down and
> >>>> >> >>> >> >start another VM on one of the hosts and got the following error:
> >>>> >> >>> >> >
> >>>> >> >>> >> >virsh # start scratch
> >>>> >> >>> >> >
> >>>> >> >>> >> >error: Failed to start domain scratch
> >>>> >> >>> >> >
> >>>> >> >>> >> >error: Network not found: no network with matching name
> >>>> >> >>> >> >'vdsm-ovirtmgmt'
> >>>> >> >>> >> >
> >>>> >> >>> >> >Is this not referring to the interface name, as the network is called
> >>>> >> >>> >> >'ovirtmgmt'?
> >>>> >> >>> >> >
> >>>> >> >>> >> >On Wed, Apr 8, 2020 at 11:35 PM Shareef Jalloq <shareef(a)jalloq.co.uk>
> >>>> >> >>> >> >wrote:
> >>>> >> >>> >> >
> >>>> >> >>> >> >> Hmmm, virsh tells me the HE is running but it hasn't come up and the
> >>>> >> >>> >> >> agent.log is full of the same errors.
> >>>> >> >>> >> >>
> >>>> >> >>> >> >> On Wed, Apr 8, 2020 at 11:31 PM Shareef Jalloq <shareef(a)jalloq.co.uk>
> >>>> >> >>> >> >> wrote:
> >>>> >> >>> >> >>
> >>>> >> >>> >> >>> Ah hah! Ok, so I've managed to start it using virsh on the second host
> >>>> >> >>> >> >>> but my first host is still dead.
> >>>> >> >>> >> >>>
> >>>> >> >>> >> >>> First of all, what are these 56,317 .prob- files that get dumped to the
> >>>> >> >>> >> >>> NFS mounts?
> >>>> >> >>> >> >>>
> >>>> >> >>> >> >>> Secondly, why doesn't the node mount the NFS directories at boot? Is
> >>>> >> >>> >> >>> that the issue with this particular node?
> >>>> >> >>> >> >>>
> >>>> >> >>> >> >>> On Wed, Apr 8, 2020 at 11:12 PM <eevans(a)digitaldatatechs.com> wrote:
> >>>> >> >>> >> >>>
> >>>> >> >>> >> >>>> Did you try virsh list --inactive
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> Eric Evans
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> Digital Data Services LLC.
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> 304.660.9080
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> *From:* Shareef Jalloq <shareef(a)jalloq.co.uk>
> >>>> >> >>> >> >>>> *Sent:* Wednesday, April 8, 2020 5:58 PM
> >>>> >> >>> >> >>>> *To:* Strahil Nikolov <hunter86_bg(a)yahoo.com>
> >>>> >> >>> >> >>>> *Cc:* Ovirt Users <users(a)ovirt.org>
> >>>> >> >>> >> >>>> *Subject:* [ovirt-users] Re: ovirt-engine unresponsive - how to rescue?
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> I've now shut down the VMs on one host and rebooted it but the agent
> >>>> >> >>> >> >>>> service doesn't start. If I run 'hosted-engine --vm-status' I get:
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> The hosted engine configuration has not been retrieved from shared
> >>>> >> >>> >> >>>> storage. Please ensure that ovirt-ha-agent is running and the storage
> >>>> >> >>> >> >>>> server is reachable.
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> and indeed if I list the mounts under /rhev/data-center/mnt, only one of
> >>>> >> >>> >> >>>> the directories is mounted. I have 3 NFS mounts, one ISO Domain and two
> >>>> >> >>> >> >>>> Data Domains. Only one Data Domain has mounted and this has lots of .prob
> >>>> >> >>> >> >>>> files in. So why haven't the other NFS exports been mounted?
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> Manually mounting them doesn't seem to have helped much either. I can
> >>>> >> >>> >> >>>> start the broker service but the agent service says no. Same error as the
> >>>> >> >>> >> >>>> one in my last email.
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> Shareef.
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> On Wed, Apr 8, 2020 at 9:57 PM Shareef Jalloq <shareef(a)jalloq.co.uk> wrote:
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> Right, still down. I've run virsh and it doesn't know anything about
> >>>> >> >>> >> >>>> the engine vm.
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> I've restarted the broker and agent services and I still get nothing in
> >>>> >> >>> >> >>>> virsh->list.
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> In the logs under /var/log/ovirt-hosted-engine-ha I see lots of errors:
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> broker.log:
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:56:20,138::broker::47::ovirt_hosted_engine_ha.broker.broker.Broker::(run) ovirt-hosted-engine-ha broker 2.3.6 started
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:56:20,138::monitor::40::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Searching for submonitors in /usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/submonitors
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:56:20,138::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor network
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:56:20,140::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor cpu-load-no-engine
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:56:20,140::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mgmt-bridge
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:56:20,141::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor network
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:56:20,141::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor cpu-load
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:56:20,141::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor engine-health
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:56:20,141::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mgmt-bridge
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:56:20,142::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor cpu-load-no-engine
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:56:20,142::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor cpu-load
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:56:20,142::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mem-free
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:56:20,143::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor storage-domain
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:56:20,143::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor storage-domain
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:56:20,143::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mem-free
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:56:20,143::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor engine-health
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:56:20,143::monitor::50::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Finished loading submonitors
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:56:20,197::storage_backends::373::ovirt_hosted_engine_ha.lib.storage_backends::(connect) Connecting the storage
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:56:20,197::storage_server::349::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Connecting storage server
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:56:20,414::storage_server::356::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Connecting storage server
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:56:20,628::storage_server::413::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Refreshing the storage domain
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> MainThread::WARNING::2020-04-08 20:56:21,057::storage_broker::97::ovirt_hosted_engine_ha.broker.storage_broker.StorageBroker::(__init__) Can't connect vdsm storage: Command StorageDomain.getInfo with args {'storagedomainID': 'a6cea67d-dbfb-45cf-a775-b4d0d47b26f2'} failed:
> >>>> >> >>> >> >>>> (code=350, message=Error in storage domain action: (u'sdUUID=a6cea67d-dbfb-45cf-a775-b4d0d47b26f2',))
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:56:21,901::broker::47::ovirt_hosted_engine_ha.broker.broker.Broker::(run) ovirt-hosted-engine-ha broker 2.3.6 started
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:56:21,901::monitor::40::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Searching for submonitors in /usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/submonitors
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> agent.log:
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> MainThread::ERROR::2020-04-08 20:57:00,799::agent::145::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Trying to restart agent
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:57:00,799::agent::89::ovirt_hosted_engine_ha.agent.agent.Agent::(run) Agent shutting down
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:57:11,144::agent::67::ovirt_hosted_engine_ha.agent.agent.Agent::(run) ovirt-hosted-engine-ha agent 2.3.6 started
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:57:11,182::hosted_engine::234::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_hostname) Found certificate common name: ovirt-node-01.phoelex.com
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:57:11,294::hosted_engine::543::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_broker) Initializing ha-broker connection
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:57:11,296::brokerlink::80::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) Starting monitor network, options {'tcp_t_address': '', 'network_test': 'dns', 'tcp_t_port': '', 'addr': '192.168.1.99'}
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> MainThread::ERROR::2020-04-08 20:57:11,296::hosted_engine::559::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_broker) Failed to start necessary monitors
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> MainThread::ERROR::2020-04-08 20:57:11,297::agent::144::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Traceback (most recent call last):
> >>>> >> >>> >> >>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 131, in _run_agent
> >>>> >> >>> >> >>>>     return action(he)
> >>>> >> >>> >> >>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 55, in action_proper
> >>>> >> >>> >> >>>>     return he.start_monitoring()
> >>>> >> >>> >> >>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 432, in start_monitoring
> >>>> >> >>> >> >>>>     self._initialize_broker()
> >>>> >> >>> >> >>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 556, in _initialize_broker
> >>>> >> >>> >> >>>>     m.get('options', {}))
> >>>> >> >>> >> >>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 89, in start_monitor
> >>>> >> >>> >> >>>>     ).format(t=type, o=options, e=e)
> >>>> >> >>> >> >>>> RequestError: brokerlink - failed to start monitor via ovirt-ha-broker: [Errno 2] No such file or directory, [monitor: 'network', options: {'tcp_t_address': '', 'network_test': 'dns', 'tcp_t_port': '', 'addr': '192.168.1.99'}]
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> MainThread::ERROR::2020-04-08 20:57:11,297::agent::145::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Trying to restart agent
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> MainThread::INFO::2020-04-08 20:57:11,297::agent::89::ovirt_hosted_engine_ha.agent.agent.Agent::(run) Agent shutting down
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> On Wed, Apr 8, 2020 at 6:10 PM Strahil Nikolov <hunter86_bg(a)yahoo.com> wrote:
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> On April 8, 2020 7:47:20 PM GMT+03:00, "Maton, Brett" <matonb(a)ltresources.co.uk> wrote:
> >>>> >> >>> >> >>>> >On the host you tried to restart the engine on:
> >>>> >> >>> >> >>>> >
> >>>> >> >>> >> >>>> >Add an alias to virsh (authenticates with virsh_auth.conf)
> >>>> >> >>> >> >>>> >
> >>>> >> >>> >> >>>> >alias virsh='virsh -c qemu:///system?authfile=/etc/ovirt-hosted-engine/virsh_auth.conf'
> >>>> >> >>> >> >>>> >
> >>>> >> >>> >> >>>> >Then run virsh:
> >>>> >> >>> >> >>>> >
> >>>> >> >>> >> >>>> >virsh
> >>>> >> >>> >> >>>> >
> >>>> >> >>> >> >>>> >virsh # list
> >>>> >> >>> >> >>>> > Id    Name                           State
> >>>> >> >>> >> >>>> >----------------------------------------------------
> >>>> >> >>> >> >>>> > xx    HostedEngine                   Paused
> >>>> >> >>> >> >>>> > xx    **********                     running
> >>>> >> >>> >> >>>> > ...
> >>>> >> >>> >> >>>> > xx    **********                     running
> >>>> >> >>> >> >>>> >
> >>>> >> >>> >> >>>> >HostedEngine should be in the list, try and resume the engine:
> >>>> >> >>> >> >>>> >
> >>>> >> >>> >> >>>> >virsh # resume HostedEngine
> >>>> >> >>> >> >>>> >
> >>>> >> >>> >> >>>> >On Wed, 8 Apr 2020 at 17:28, Shareef Jalloq <shareef(a)jalloq.co.uk> wrote:
> >>>> >> >>> >> >>>> >
> >>>> >> >>> >> >>>> >> Thanks!
> >>>> >> >>> >> >>>> >>
> >>>> >> >>> >> >>>> >> The status hangs due to, I guess, the VM being down....
> >>>> >> >>> >> >>>> >>
> >>>> >> >>> >> >>>> >> [root@ovirt-node-01 ~]# hosted-engine --vm-start
> >>>> >> >>> >> >>>> >> VM exists and is down, cleaning up and restarting
> >>>> >> >>> >> >>>> >> VM in WaitForLaunch
> >>>> >> >>> >> >>>> >>
> >>>> >> >>> >> >>>> >> but this doesn't seem to do anything. OK, after a while I get a status of
> >>>> >> >>> >> >>>> >> it being barfed...
> >>>> >> >>> >> >>>> >>
> >>>> >> >>> >> >>>> >> --== Host ovirt-node-00.phoelex.com (id: 1) status ==--
> >>>> >> >>> >> >>>> >>
> >>>> >> >>> >> >>>> >> conf_on_shared_storage             : True
> >>>> >> >>> >> >>>> >> Status up-to-date                  : False
> >>>> >> >>> >> >>>> >> Hostname                           : ovirt-node-00.phoelex.com
> >>>> >> >>> >> >>>> >> Host ID                            : 1
> >>>> >> >>> >> >>>> >> Engine status                      : unknown stale-data
> >>>> >> >>> >> >>>> >> Score                              : 3400
> >>>> >> >>> >> >>>> >> stopped                            : False
> >>>> >> >>> >> >>>> >> Local maintenance                  : False
> >>>> >> >>> >> >>>> >> crc32                              : 9c4a034b
> >>>> >> >>> >> >>>> >> local_conf_timestamp               : 523362
> >>>> >> >>> >> >>>> >> Host timestamp                     : 523608
> >>>> >> >>> >> >>>> >> Extra metadata (valid at timestamp):
> >>>> >> >>> >> >>>> >>     metadata_parse_version=1
> >>>> >> >>> >> >>>> >>     metadata_feature_version=1
> >>>> >> >>> >> >>>> >>     timestamp=523608 (Wed Apr 8 16:17:11 2020)
> >>>> >> >>> >> >>>> >>     host-id=1
> >>>> >> >>> >> >>>> >>     score=3400
> >>>> >> >>> >> >>>> >>     vm_conf_refresh_time=523362 (Wed Apr 8 16:13:06 2020)
> >>>> >> >>> >> >>>> >>     conf_on_shared_storage=True
> >>>> >> >>> >> >>>> >>     maintenance=False
> >>>> >> >>> >> >>>> >>     state=EngineDown
> >>>> >> >>> >> >>>> >>     stopped=False
> >>>> >> >>> >> >>>> >>
> >>>> >> >>> >> >>>> >>
> >>>> >> >>> >> >>>> >> --== Host ovirt-node-01.phoelex.com (id: 2) status ==--
> >>>> >> >>> >> >>>> >>
> >>>> >> >>> >> >>>> >> conf_on_shared_storage             : True
> >>>> >> >>> >> >>>> >> Status up-to-date                  : True
> >>>> >> >>> >> >>>> >> Hostname                           : ovirt-node-01.phoelex.com
> >>>> >> >>> >> >>>> >> Host ID                            : 2
> >>>> >> >>> >> >>>> >> Engine status                      : {"reason": "bad vm status", "health": "bad", "vm": "down_unexpected", "detail": "Down"}
> >>>> >> >>> >> >>>> >> Score                              : 0
> >>>> >> >>> >> >>>> >> stopped                            : False
> >>>> >> >>> >> >>>> >> Local maintenance                  : False
> >>>> >> >>> >> >>>> >> crc32                              : 5045f2eb
> >>>> >> >>> >> >>>> >> local_conf_timestamp               : 1737037
> >>>> >> >>> >> >>>> >> Host timestamp                     : 1737283
> >>>> >> >>> >> >>>> >> Extra metadata (valid at timestamp):
> >>>> >> >>> >> >>>> >>     metadata_parse_version=1
> >>>> >> >>> >> >>>> >>     metadata_feature_version=1
> >>>> >> >>> >> >>>> >>     timestamp=1737283 (Wed Apr 8 16:16:17 2020)
> >>>> >> >>> >> >>>> >>     host-id=2
> >>>> >> >>> >> >>>> >>     score=0
> >>>> >> >>> >> >>>> >>     vm_conf_refresh_time=1737037 (Wed Apr 8 16:12:11 2020)
> >>>> >> >>> >> >>>> >>     conf_on_shared_storage=True
> >>>> >> >>> >> >>>> >>     maintenance=False
> >>>> >> >>> >> >>>> >>     state=EngineUnexpectedlyDown
> >>>> >> >>> >> >>>> >>     stopped=False
> >>>> >> >>> >> >>>> >>
> >>>> >> >>> >> >>>> >> On Wed, Apr 8, 2020 at 5:09 PM Maton, Brett <matonb(a)ltresources.co.uk> wrote:
> >>>> >> >>> >> >>>> >>
> >>>> >> >>> >> >>>> >>> First steps, on one of your hosts as root:
> >>>> >> >>> >> >>>> >>>
> >>>> >> >>> >> >>>> >>> To get information:
> >>>> >> >>> >> >>>> >>>   hosted-engine --vm-status
> >>>> >> >>> >> >>>> >>>
> >>>> >> >>> >> >>>> >>> To start the engine:
> >>>> >> >>> >> >>>> >>>   hosted-engine --vm-start
> >>>> >> >>> >> >>>> >>>
> >>>> >> >>> >> >>>> >>>
> >>>> >> >>> >> >>>> >>> On Wed, 8 Apr 2020 at 17:00, Shareef Jalloq <shareef(a)jalloq.co.uk> wrote:
> >>>> >> >>> >> >>>> >>>
> >>>> >> >>> >> >>>> >>>> So my engine has gone down and I can't ssh into it either. If I try to
> >>>> >> >>> >> >>>> >>>> log into the web-ui of the node it is running on, I get redirected because
> >>>> >> >>> >> >>>> >>>> the node can't reach the engine.
> >>>> >> >>> >> >>>> >>>>
> >>>> >> >>> >> >>>> >>>> What are my next steps?
> >>>> >> >>> >> >>>> >>>>
> >>>> >> >>> >> >>>> >>>> Shareef.
> >>>> >> >>> >> >>>> >>>>
> >>>> >> >>> >> >>>> >>>> _______________________________________________
> >>>> >> >>> >> >>>> >>>> Users mailing list -- users(a)ovirt.org
> >>>> >> >>> >> >>>> >>>> To unsubscribe send an email to users-leave(a)ovirt.org
> >>>> >> >>> >> >>>> >>>> Privacy Statement: https://www.ovirt.org/privacy-policy.html
> >>>> >> >>> >> >>>> >>>> oVirt Code of Conduct:
> >>>> >> >>> >> >>>> >>>> https://www.ovirt.org/community/about/community-guidelines/
> >>>> >> >>> >> >>>> >>>> List Archives:
> >>>> >> >>> >> >>>> >>>> https://lists.ovirt.org/archives/list/users@ovirt.org/message/W7BP57OCIRS...
> >>>> >> >>> >> >>>> >>>>
> >>>> >> >>> >> >>>> >>>
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> This has to be resolved:
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> Engine status : unknown stale-data
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> Run 'hosted-engine --vm-status' again. If it remains the same, restart
> >>>> >> >>> >> >>>> ovirt-ha-broker.service & ovirt-ha-agent.service
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> Verify that the engine's storage is available. Then monitor the broker
> >>>> >> >>> >> >>>> & agent logs in /var/log/ovirt-hosted-engine-ha
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>> Best Regards,
> >>>> >> >>> >> >>>> Strahil Nikolov
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>>
> >>>> >> >>> >> >>>>
> >>>> >> >>> >>
> >>>> >> >>> >> Hi Shareef,
> >>>> >> >>> >>
> >>>> >> >>> >> Activation in oVirt is more complex than in plain KVM.
> >>>> >> >>> >> The storage domains are mounted during activation of the node (the
> >>>> >> >>> >> HostedEngine activates everything that is needed).
> >>>> >> >>> >>
> >>>> >> >>> >> Focus on the HostedEngine VM.
> >>>> >> >>> >> Is it running properly?
> >>>> >> >>> >>
> >>>> >> >>> >> If not, try:
> >>>> >> >>> >> 1. Verify that the storage domain exists
> >>>> >> >>> >> 2. Check if it has an 'ha_agents' directory
> >>>> >> >>> >> 3. Check if the links are OK; if not, you can safely remove the links
> >>>> >> >>> >>
> >>>> >> >>> >> 4. Next, check that the services are running:
> >>>> >> >>> >> A) sanlock
> >>>> >> >>> >> B) supervdsmd
> >>>> >> >>> >> C) vdsmd
> >>>> >> >>> >> D) libvirtd
> >>>> >> >>> >>
> >>>> >> >>> >> 5. Increase the log level for the broker and agent services:
> >>>> >> >>> >>
> >>>> >> >>> >> cd /etc/ovirt-hosted-engine-ha
> >>>> >> >>> >> vim *-log.conf
> >>>> >> >>> >>
> >>>> >> >>> >> systemctl restart ovirt-ha-broker ovirt-ha-agent
> >>>> >> >>> >>
> >>>> >> >>> >> 6. Check what they are complaining about
> >>>> >> >>> >> Keep in mind that the agent will keep throwing errors until the broker
> >>>> >> >>> >> stops doing so (the agent depends on the broker), so the broker must be
> >>>> >> >>> >> OK before proceeding with the agent log.
> >>>> >> >>> >>
> >>>> >> >>> >> About the manual VM start, you need 2 things:
> >>>> >> >>> >> 1. Define the VM network
> >>>> >> >>> >> # cat vdsm-ovirtmgmt.xml
> >>>> >> >>> >> <network>
> >>>> >> >>> >>   <name>vdsm-ovirtmgmt</name>
> >>>> >> >>> >>   <uuid>8ded486e-e681-4754-af4b-5737c2b05405</uuid>
> >>>> >> >>> >>   <forward mode='bridge'/>
> >>>> >> >>> >>   <bridge name='ovirtmgmt'/>
> >>>> >> >>> >> </network>
> >>>> >> >>> >>
> >>>> >> >>> >> [root@ovirt1 HostedEngine-RECOVERY]# virsh net-define vdsm-ovirtmgmt.xml
> >>>> >> >>> >>
> >>>> >> >>> >> 2. Get an XML definition, which can be found in the vdsm log. Every VM's
> >>>> >> >>> >> configuration is printed in the vdsm log on the host where it starts.
> >>>> >> >>> >> Save it to a file and then:
> >>>> >> >>> >> A) virsh define myvm.xml
> >>>> >> >>> >> B) virsh start myvm
> >>>> >> >>> >>
> >>>> >> >>> >> It seems there is/was a problem with your NFS shares.
> >>>> >> >>> >>
> >>>> >> >>> >>
> >>>> >> >>> >> Best Regards,
> >>>> >> >>> >> Strahil Nikolov
> >>>> >> >>> >>
> >>>> >> >>>
> >>>> >> >>> Hey Shareef,
> >>>> >> >>>
> >>>> >> >>> Check if there are any files or folders not owned by vdsm:kvm.
> >>>> >> >>> Something like this:
> >>>> >> >>>
> >>>> >> >>> find . \( -not -user 36 -o -not -group 36 \) -print
> >>>> >> >>>
> >>>> >> >>> Also check if vdsm can access the images in the
> >>>> >> >>> '<vol-mount-point>/images' directories.
> >>>> >> >>>
> >>>> >> >>> Best Regards,
> >>>> >> >>> Strahil Nikolov
> >>>> >> >>>
> >>>> >> >>
> >>>> >>
> >>>> >> And the IPv6 address '64:ff9b::c0a8:13d'?
> >>>> >>
> >>>> >> I don't see it in the log output.
> >>>> >>
> >>>> >> Best Regards,
> >>>> >> Strahil Nikolov
> >>>> >>
> >>>>
> >>>> Based on your output, you got a PTR record for both IPv4 & IPv6;
> >>>> most probably that's the reason.
> >>>>
> >>>> Set the IPv6 address on the interface and try again.
> >>>>
> >>>> Best Regards,
> >>>> Strahil Nikolov
> >>>>
> >>>
>
> Do you have firewalld up and running on the host?
>
> Best Regards,
> Strahil Nikolov
>
I am guessing, but your interface is not assigned to any zone, right?
Just add the interface to the default zone (usually 'public').
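As a rough sketch only: assuming the interface is called 'eth0' (a placeholder, check the real name with 'ip link') and the default zone is 'public', the commands would look something like the ones below. The small run() helper is just for illustration; it prints each command unless APPLY=1 is set, so nothing changes until you have reviewed the output.

```shell
#!/bin/sh
# Sketch only: IFACE and ZONE are placeholders, not values taken from this thread.
IFACE="${IFACE:-eth0}"
ZONE="${ZONE:-public}"

# Print each command by default; set APPLY=1 to really run it (as root).
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "$@"
    fi
}

# Show which zone firewalld treats as the default.
run firewall-cmd --get-default-zone
# Bind the interface to the zone permanently, then reload to apply it.
run firewall-cmd --permanent --zone="$ZONE" --add-interface="$IFACE"
run firewall-cmd --reload
# The zone should now be listed, so the deployment check gets output back.
run firewall-cmd --get-active-zones
```

Once 'firewall-cmd --get-active-zones' returns a zone with the interface in it, re-run the deployment.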
Best Regards,
Strahil Nikolov