[ovirt-users] ha-agent and broker continually crashing after 4.2 update

Martin Sivak msivak at redhat.com
Mon Jan 15 09:27:08 UTC 2018


I actually do not agree with Simone here. The fix he talks about adds
a call to prepareImage, but your log clearly shows that prepareImage
is the call that fails:

Jan 12 16:52:36 cultivar0 journal: vdsm storage.Dispatcher ERROR
FINISH prepareImage error=Volume does not exist:
(u'8582bdfc-ef54-47af-9f1e-f5b7ec1f1cf8',)
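
For anyone who wants to check their own journal for the same failure, here
is a minimal Python sketch that pulls the failing verb and the volume UUID
out of lines like the one above. The helper names (unfold, failed_calls)
and the regular expression are illustrative assumptions based only on the
log format quoted in this thread; they are not part of vdsm or
ovirt-hosted-engine-ha. The #012 sequences in the longer traceback lines
are the octal escape syslog writes for an embedded newline, so the sketch
expands them as well.

```python
import re

# Pattern for the vdsm Dispatcher error line quoted in this thread.
# The layout is inferred from this log only, so treat it as an assumption.
FINISH_ERROR = re.compile(
    r"FINISH\s+(?P<verb>\w+)\s+error=(?P<message>.*?):\s*"
    r"\(u?'(?P<uuid>[0-9a-f-]{36})',\)"
)


def unfold(text):
    """Join wrapped mail lines and expand syslog's #012 newline escape."""
    joined = " ".join(line.strip() for line in text.splitlines())
    return joined.replace("#012", "\n")


def failed_calls(text):
    """Return (verb, message, uuid) for each 'FINISH <verb> error=' entry."""
    return [
        (m.group("verb"), m.group("message"), m.group("uuid"))
        for m in FINISH_ERROR.finditer(unfold(text))
    ]


sample = """Jan 12 16:52:36 cultivar0 journal: vdsm storage.Dispatcher ERROR
FINISH prepareImage error=Volume does not exist:
(u'8582bdfc-ef54-47af-9f1e-f5b7ec1f1cf8',)"""

for verb, message, uuid in failed_calls(sample):
    print(verb, "failed:", message, uuid)
```

Feeding it a saved copy of the journal output instead of the sample string
should list every verb that finished with an error, which makes it easy to
see whether prepareImage is always failing on the same volume.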

I have to ask how old the environment is. Was it by any chance
installed back in the 3.3/3.4 days and upgraded since then?

Martin

On Mon, Jan 15, 2018 at 10:17 AM, Simone Tiraboschi <stirabos at redhat.com> wrote:
>
>
> On Fri, Jan 12, 2018 at 9:54 PM, Jayme <jaymef at gmail.com> wrote:
>>
>> I recently upgraded to 4.2 and had some problems with the engine VM
>> running. I got that cleared up, and my only remaining issue is that
>> ovirt-ha-broker and ovirt-ha-agent are now continually crashing on all
>> three of my hosts. Everything is up and working fine otherwise: all VMs
>> are running, and the hosted engine VM is running along with the
>> interface etc.
>
>
> I think it's due to https://bugzilla.redhat.com/show_bug.cgi?id=1527394,
> which was recently fixed.
> ovirt-hosted-engine-ha-2.2.3 should address it; please let us know if not.
>
>
>>
>>
>> Jan 12 16:52:34 cultivar0 journal: vdsm storage.Dispatcher ERROR FINISH
>> prepareImage error=Volume does not exist:
>> (u'8582bdfc-ef54-47af-9f1e-f5b7ec1f1cf8',)
>> Jan 12 16:52:34 cultivar0 python: detected unhandled Python exception in
>> '/usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker'
>> Jan 12 16:52:34 cultivar0 abrt-server: Not saving repeating crash in
>> '/usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker'
>> Jan 12 16:52:34 cultivar0 systemd: ovirt-ha-broker.service: main process
>> exited, code=exited, status=1/FAILURE
>> Jan 12 16:52:34 cultivar0 systemd: Unit ovirt-ha-broker.service entered
>> failed state.
>> Jan 12 16:52:34 cultivar0 systemd: ovirt-ha-broker.service failed.
>> Jan 12 16:52:34 cultivar0 systemd: ovirt-ha-broker.service holdoff time
>> over, scheduling restart.
>> Jan 12 16:52:34 cultivar0 systemd: Cannot add dependency job for unit
>> lvm2-lvmetad.socket, ignoring: Unit is masked.
>> Jan 12 16:52:34 cultivar0 systemd: Started oVirt Hosted Engine High
>> Availability Communications Broker.
>> Jan 12 16:52:34 cultivar0 systemd: Starting oVirt Hosted Engine High
>> Availability Communications Broker...
>> Jan 12 16:52:36 cultivar0 journal: vdsm storage.TaskManager.Task ERROR
>> (Task='73141dec-9d8f-4164-9c4e-67c43a102eff') Unexpected error#012Traceback
>> (most recent call last):#012  File
>> "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in
>> _run#012    return fn(*args, **kargs)#012  File "<string>", line 2, in
>> prepareImage#012  File
>> "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 48, in
>> method#012    ret = func(*args, **kwargs)#012  File
>> "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 3162, in
>> prepareImage#012    raise
>> se.VolumeDoesNotExist(leafUUID)#012VolumeDoesNotExist: Volume does not
>> exist: (u'8582bdfc-ef54-47af-9f1e-f5b7ec1f1cf8',)
>> Jan 12 16:52:36 cultivar0 journal: vdsm storage.Dispatcher ERROR FINISH
>> prepareImage error=Volume does not exist:
>> (u'8582bdfc-ef54-47af-9f1e-f5b7ec1f1cf8',)
>> Jan 12 16:52:36 cultivar0 python: detected unhandled Python exception in
>> '/usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker'
>> Jan 12 16:52:36 cultivar0 abrt-server: Not saving repeating crash in
>> '/usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker'
>> Jan 12 16:52:36 cultivar0 systemd: ovirt-ha-broker.service: main process
>> exited, code=exited, status=1/FAILURE
>> Jan 12 16:52:36 cultivar0 systemd: Unit ovirt-ha-broker.service entered
>> failed state.
>> Jan 12 16:52:36 cultivar0 systemd: ovirt-ha-broker.service failed.
>>
>> Jan 12 16:52:36 cultivar0 systemd: ovirt-ha-broker.service holdoff time
>> over, scheduling restart.
>> Jan 12 16:52:36 cultivar0 systemd: Cannot add dependency job for unit
>> lvm2-lvmetad.socket, ignoring: Unit is masked.
>> Jan 12 16:52:36 cultivar0 systemd: Started oVirt Hosted Engine High
>> Availability Communications Broker.
>> Jan 12 16:52:36 cultivar0 systemd: Starting oVirt Hosted Engine High
>> Availability Communications Broker...
>> Jan 12 16:52:37 cultivar0 journal: vdsm storage.TaskManager.Task ERROR
>> (Task='bc7af1e2-0ab2-4164-ae88-d2bee03500f9') Unexpected error#012Traceback
>> (most recent call last):#012  File
>> "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in
>> _run#012    return fn(*args, **kargs)#012  File "<string>", line 2, in
>> prepareImage#012  File
>> "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 48, in
>> method#012    ret = func(*args, **kwargs)#012  File
>> "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 3162, in
>> prepareImage#012    raise
>> se.VolumeDoesNotExist(leafUUID)#012VolumeDoesNotExist: Volume does not
>> exist: (u'8582bdfc-ef54-47af-9f1e-f5b7ec1f1cf8',)
>> Jan 12 16:52:37 cultivar0 journal: vdsm storage.Dispatcher ERROR FINISH
>> prepareImage error=Volume does not exist:
>> (u'8582bdfc-ef54-47af-9f1e-f5b7ec1f1cf8',)
>> Jan 12 16:52:37 cultivar0 python: detected unhandled Python exception in
>> '/usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker'
>> Jan 12 16:52:38 cultivar0 abrt-server: Not saving repeating crash in
>> '/usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker'
>> Jan 12 16:52:38 cultivar0 systemd: ovirt-ha-broker.service: main process
>> exited, code=exited, status=1/FAILURE
>> Jan 12 16:52:38 cultivar0 systemd: Unit ovirt-ha-broker.service entered
>> failed state.
>> Jan 12 16:52:38 cultivar0 systemd: ovirt-ha-broker.service failed.
>> Jan 12 16:52:38 cultivar0 systemd: ovirt-ha-broker.service holdoff time
>> over, scheduling restart.
>> Jan 12 16:52:38 cultivar0 systemd: Cannot add dependency job for unit
>> lvm2-lvmetad.socket, ignoring: Unit is masked.
>> Jan 12 16:52:38 cultivar0 systemd: start request repeated too quickly for
>> ovirt-ha-broker.service
>> Jan 12 16:52:38 cultivar0 systemd: Failed to start oVirt Hosted Engine
>> High Availability Communications Broker.
>> Jan 12 16:52:38 cultivar0 systemd: Unit ovirt-ha-broker.service entered
>> failed state.
>> Jan 12 16:52:38 cultivar0 systemd: ovirt-ha-broker.service failed.
>> Jan 12 16:52:40 cultivar0 systemd: ovirt-ha-agent.service holdoff time
>> over, scheduling restart.
>> Jan 12 16:52:40 cultivar0 systemd: Cannot add dependency job for unit
>> lvm2-lvmetad.socket, ignoring: Unit is masked.
>> Jan 12 16:52:40 cultivar0 systemd: Started oVirt Hosted Engine High
>> Availability Communications Broker.
>> Jan 12 16:52:40 cultivar0 systemd: Starting oVirt Hosted Engine High
>> Availability Communications Broker...
>> Jan 12 16:52:40 cultivar0 systemd: Started oVirt Hosted Engine High
>> Availability Monitoring Agent.
>> Jan 12 16:52:40 cultivar0 systemd: Starting oVirt Hosted Engine High
>> Availability Monitoring Agent...
>> Jan 12 16:52:41 cultivar0 journal: ovirt-ha-agent
>> ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Failed to
>> start necessary monitors
>> Jan 12 16:52:41 cultivar0 journal: ovirt-ha-agent
>> ovirt_hosted_engine_ha.agent.agent.Agent ERROR Traceback (most recent call
>> last):#012  File
>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py",
>> line 131, in _run_agent#012    return action(he)#012  File
>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py",
>> line 55, in action_proper#012    return he.start_monitoring()#012  File
>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
>> line 416, in start_monitoring#012    self._initialize_broker()#012  File
>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
>> line 535, in _initialize_broker#012    m.get('options', {}))#012  File
>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py",
>> line 83, in start_monitor#012    .format(type, options, e))#012RequestError:
>> Failed to start monitor ping, options {'addr': '192.168.0.1'}: [Errno 2] No
>> such file or directory
>> Jan 12 16:52:41 cultivar0 journal: ovirt-ha-agent
>> ovirt_hosted_engine_ha.agent.agent.Agent ERROR Trying to restart agent
>> Jan 12 16:52:42 cultivar0 systemd: ovirt-ha-agent.service: main process
>> exited, code=exited, status=157/n/a
>> Jan 12 16:52:42 cultivar0 systemd: Unit ovirt-ha-agent.service entered
>> failed state.
>> Jan 12 16:52:42 cultivar0 systemd: ovirt-ha-agent.service failed.
>>
>> _______________________________________________
>> Users mailing list
>> Users at ovirt.org
>> http://lists.ovirt.org/mailman/listinfo/users
>>
>
>

