an vdsm:
On Tue, Mar 19, 2019 at 1:24 PM ada per <adaper3(a)gmail.com> wrote:
Thank you! Please see the attached files:
On Tue, Mar 19, 2019 at 12:52 PM Simone Tiraboschi <stirabos(a)redhat.com>
wrote:
> Can you please also check/attach
> /var/log/ovirt-hosted-engine-ha/broker.log and /var/log/vdsm/vdsm.log ?
>
> On Tue, Mar 19, 2019 at 11:36 AM ada per <adaper3(a)gmail.com> wrote:
>
>> Hello everyone,
>>
>> For some reason the hosted engine went down and I cannot restart
>> it. I tried restarting it manually without any success. Can you please
>> advise?
>>
>> For all the nodes the engine status is the same as the one below.
>> --== Host nodex. (id: 6) status ==--
>> conf_on_shared_storage : True
>> Status up-to-date : True
>> Hostname : nodex
>> Host ID : 6
>> Engine status : {"reason": "bad vm status", "health": "bad", "vm": "down_unexpected", "detail": "Down"}
>> Score : 3400
>> stopped : False
>> Local maintenance : False
>> crc32 : 323a9f45
>> local_conf_timestamp : 2648874
>> Host timestamp : 2648874
>> Extra metadata (valid at timestamp):
>> metadata_parse_version=1
>> metadata_feature_version=1
>> timestamp=2648874 (Tue Mar 19 12:25:44 2019)
>> host-id=6
>> score=3400
>> vm_conf_refresh_time=2648874 (Tue Mar 19 12:25:44 2019)
>> conf_on_shared_storage=True
>> maintenance=False
>> state=GlobalMaintenance
>> stopped=False
>>
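The "Engine status" field in the status output above is a JSON object, so it can be checked in a script rather than read by eye. What follows is only a minimal sketch: the helper name summarize_engine_status is hypothetical, and treating "good" as the healthy value of "health" is an assumption, since the output quoted here only shows "bad".

```python
import json

# Engine status value copied from the host status output above
# (the wrapped lines rejoined into one string).
raw_status = (
    '{"reason": "bad vm status", "health": "bad", '
    '"vm": "down_unexpected", "detail": "Down"}'
)


def summarize_engine_status(raw):
    """Return (is_healthy, summary) for an 'Engine status' JSON string.

    Hypothetical helper for illustration only. It assumes that a healthy
    engine reports health == "good".
    """
    status = json.loads(raw)
    is_healthy = status.get("health") == "good"
    summary = "vm={0} health={1} detail={2} reason={3}".format(
        status.get("vm", "unknown"),
        status.get("health", "unknown"),
        status.get("detail", "unknown"),
        status.get("reason", "none"),
    )
    return is_healthy, summary


healthy, summary = summarize_engine_status(raw_status)
print(healthy, summary)
```

For the status shown above this reports the engine as not healthy, with vm=down_unexpected.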
>> When I run the command
>> root@node5# hosted-engine --vm-shutdown
>> I get the response:
>> root@node5# Command VM.shutdown with args {'delay': '120', 'message': 'VM is shutting down!', 'vmID': 'a492d2eb-1dfd-470d-a141-3e55d2189275'}
>> failed:(code=1, message=Virtual machine does not exist)
>>
>> But when I run: hosted-engine --vm-start
>> I get the response: VM exists and is down, cleaning up and restarting
>>
>>
>>
>> Below is the output of # journalctl -u ovirt-ha-agent
>>
>> Mar 14 12:04:42 node7. ovirt-ha-agent[4134]: ovirt-ha-agent
>> ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Unhandled
>> monitoring loop exception
>> Traceback (most recent call last):
>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 430, in start_monitoring
>>     self._monitoring_loop()
>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 449, in _monitoring_loop
>>     for old_state, state, delay in self.fsm:
>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/fsm/machine.py", line 127, in next
>>     new_data = self.refresh(self._state.data)
>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/state_machine.py", line 81, in refresh
>>     stats.update(self.hosted_engine.collect_stats())
>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 737, in collect_stats
>>     all_stats = self._broker.get_stats_from_storage()
>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 143, in get_stats_from_storage
>>     result = self._proxy.get_stats()
>>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1233, in __call__
>>     return self.__send(self.__name, args)
>>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1591, in __request
>>     verbose=self.__verbose
>>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1273, in request
>>     return self.single_request(host, handler, request_body, verbose)
>>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1301, in single_request
>>     self.send_content(h, request_body)
>>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1448, in send_content
>>     connection.endheaders(request_body)
>>   File "/usr/lib64/python2.7/httplib.py", line 1037, in endheaders
>>     self._send_output(message_body)
>>   File "/usr/lib64/python2.7/httplib.py", line 881, in _send_output
>>     self.send(msg)
>>   File "/usr/lib64/python2.7/httplib.py", line 843, in send
>>     self.connect()
>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/unixrpc.py", line 52, in connect
>>     self.sock.connect(base64.b16decode(self.host))
>>   File "/usr/lib64/python2.7/socket.py", line 224, in meth
>>     return getattr(self._sock,name)(*args)
>> error: [Errno 2] No such file or directory
>> Mar 14 12:04:42 node7. ovirt-ha-agent[4134]: ovirt-ha-agent
>> ovirt_hosted_engine_ha.agent.agent.Agent ERROR Traceback (most recent call last):
>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 131, in _run_agent
>>     return action(he)
>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 55, in action_proper
>>     return he.start_monitoring()
>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 437, in start_monitoring
>>     self.publish(stopped)
>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 337, in publish
>>     self._push_to_storage(blocks)
>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 708, in _push_to_storage
>>     self._broker.put_stats_on_storage(self.host_id, blocks)
>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 113, in put_stats_on_storage
>>     self._proxy.put_stats(host_id, xmlrpclib.Binary(data))
>>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1233, in __call__
>>     return self.__send(self.__name, args)
>>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1591, in __request
>>     verbose=self.__verbose
>>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1273, in request
>>     return self.single_request(host, handler, request_body, verbose)
>>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1301, in single_request
>>     self.send_content(h, request_body)
>>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1448, in send_content
>>     connection.endheaders(request_body)
>>   File "/usr/lib64/python2.7/httplib.py", line 1037, in endheaders
>>     self._send_output(message_body)
>>   File "/usr/lib64/python2.7/httplib.py", line 881, in _send_output
>>     self.send(msg)
>>   File "/usr/lib64/python2.7/httplib.py", line 843, in send
>>     self.connect()
>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/unixrpc.py", line 52, in connect
>>     self.sock.connect(base64.b16decode(self.host))
>>   File "/usr/lib64/python2.7/socket.py", line 224, in meth
>>     return getattr(self._sock,name)(*args)
>> error: [Errno 2] No such file or directory
>> Mar 14 12:04:42 node7. ovirt-ha-agent[4134]: ovirt-ha-agent
>> ovirt_hosted_engine_ha.agent.agent.Agent ERROR Trying to restart agent
>> Mar 14 12:04:42 node7. systemd[1]: ovirt-ha-agent.service: main process
>> exited, code=exited, status=157/n/a
>> Mar 14 12:04:42 node7. systemd[1]: Unit ovirt-ha-agent.service entered
>> failed state.
>> Mar 14 12:04:42 node7. systemd[1]: ovirt-ha-agent.service failed.
>> Mar 14 12:04:52 node7. systemd[1]: ovirt-ha-agent.service holdoff time
>> over, scheduling restart.
>> Mar 14 12:04:52 node7. systemd[1]: Stopped oVirt Hosted Engine High
>> Availability Monitoring Agent.
>> Mar 14 12:04:52 node7. systemd[1]: Started oVirt Hosted Engine High
>> Availability Monitoring Agent.
>> Mar 14 12:04:52 node7. ovirt-ha-agent[31765]: ovirt-ha-agent
>> ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Failed to
>> start necessary monitors
>> Mar 14 12:04:52 node7. ovirt-ha-agent[31765]: ovirt-ha-agent
>> ovirt_hosted_engine_ha.agent.agent.Agent ERROR Traceback (most recent call last):
>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 131, in _run_agent
>>     return action(he)
>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 55, in action_proper
>>     return he.start_monitoring()
>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 413, in start_monitoring
>>     self._initialize_broker()
>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 537, in _initialize_broker
>>     m.get('options', {}))
>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 86, in start_monitor
>>     ).format(t=type, o=options, e=e)
>> RequestError: brokerlink - failed to start monitor via ovirt-ha-broker: [Errno 2] No such
>> file or directory, [monitor: 'ping', options: {'addr': '19
>> Mar 14 12:04:52 node7. ovirt-ha-agent[31765]: ovirt-ha-agent
>> ovirt_hosted_engine_ha.agent.agent.Agent ERROR Trying to restart agent
>> Mar 14 12:04:52 node7. systemd[1]: ovirt-ha-agent.service: main process
>> exited, code=exited, status=157/n/a
>> Mar 14 12:04:52 node7. systemd[1]: Unit ovirt-ha-agent.service entered
>> failed state.
>> Mar 14 12:04:52 node7. systemd[1]: ovirt-ha-agent.service failed.
>> Mar 14 12:05:02 node7. systemd[1]: ovirt-ha-agent.service holdoff time
>> over, scheduling restart.
>> Mar 14 12:05:02 node7. systemd[1]: Stopped oVirt Hosted Engine High
>> Availability Monitoring Agent.
>> Mar 14 12:05:02 node7. systemd[1]: Started oVirt Hosted Engine High
>> Availability Monitoring Agent.
>> Mar 14 12:06:55 node7. ovirt-ha-agent[31822]: ovirt-ha-agent
>> ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Failed to
>> stop engine vm with /usr/sbin/hosted-engine --vm-poweroff: Co
>> (code=1,
>> message=Virtual machine does not exist: {'vmId':
>> u'a492d2eb-1dfd-470d-a141-3e55d2189275'})
>> Mar 14 12:06:55 node7. ovirt-ha-agent[31822]: ovirt-ha-agent
>> ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Failed to
>> stop engine VM: Command VM.destroy with args {'vmID': 'a492d2
>> (code=1,
>> message=Virtual machine does not exist: {'vmId':
>> u'a492d2eb-1dfd-470d-a141-3e55d2189275'})
>> Mar 15 14:28:16 node7. ovirt-ha-agent[31822]: ovirt-ha-agent
>> ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM
>> stopped on localhost
>> Mar 15 14:28:36 node7. ovirt-ha-agent[31822]: ovirt-ha-agent
>> ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM
>> stopped on localhost
>> Mar 15 14:29:00 node7. ovirt-ha-agent[31822]: ovirt-ha-agent
>> ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM
>> stopped on localhost
>> Mar 15 14:29:22 node7. ovirt-ha-agent[31822]: ovirt-ha-agent
>> ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM
>> stopped on localhost
>> Mar 15 14:29:44 node7. ovirt-ha-agent[31822]: ovirt-ha-agent
>> ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM
>> stopped on localhost
>> Mar 15 14:30:06 node7. ovirt-ha-agent[31822]: ovirt-ha-agent
>> ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM
>> stopped on localhost
>> Mar 15 14:30:28 node7. ovirt-ha-agent[31822]: ovirt-ha-agent
>> ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM
>> stopped on localhost
>> Mar 15 14:30:50 node7. ovirt-ha-agent[31822]: ovirt-ha-agent
>> ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM
>> stopped on localhost
>> Mar 15 14:31:12 node7. ovirt-ha-agent[31822]: ovirt-ha-agent
>> ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM
>> stopped on localhost
>> Mar 15 14:31:33 node7. ovirt-ha-agent[31822]: ovirt-ha-agent
>> ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM
>> stopped on localhost
>> Mar 15 14:31:56 node7. ovirt-ha-agent[31822]: ovirt-ha-agent
>> ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM
>> stopped on localhost
>> Mar 15 14:32:18 node7. ovirt-ha-agent[31822]: ovirt-ha-agent
>> ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM
>> stopped on localhost
>> Mar 15 14:32:40 node7. ovirt-ha-agent[31822]: ovirt-ha-agent
>> ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM
>> stopped on localhost
>> Mar 15 14:33:02 node7. ovirt-ha-agent[31822]: ovirt-ha-agent
>> ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM
>> stopped on localhost
>
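Both tracebacks above end in self.sock.connect(...) inside unixrpc.py with "[Errno 2] No such file or directory". That is the error a Unix-domain socket connect raises when the socket path does not exist, which would be consistent with the broker's socket being absent at that moment, although the logs quoted here do not confirm the cause. A minimal sketch that reproduces the same errno follows. It is written for Python 3 (the logs are from Python 2.7, where the exception is socket.error), and both the helper try_unix_connect and the file name broker.socket are hypothetical, not taken from the ovirt-hosted-engine-ha code.

```python
import errno
import os
import socket
import tempfile


def try_unix_connect(path):
    """Try to connect to a Unix-domain socket at `path`.

    Returns None on success, or the errno of the failure.
    Hypothetical helper for illustration only.
    """
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.connect(path)
        return None
    except OSError as exc:
        return exc.errno
    finally:
        sock.close()


# A path inside a fresh temporary directory, so nothing can be listening
# there. "broker.socket" is a made-up name used only for this demo.
with tempfile.TemporaryDirectory() as tmp:
    missing_path = os.path.join(tmp, "broker.socket")
    result = try_unix_connect(missing_path)

# Print the symbolic name of the errno that connect() reported.
print(errno.errorcode.get(result))
```

The same helper can be pointed at a real socket path on a host to tell a missing socket file (ENOENT) apart from one that exists but has no listener (ECONNREFUSED).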