Thanks for your reply.
Can you please provide step-by-step instructions on how to upgrade vdsm
from a node's command line?
On Tue, Mar 19, 2019 at 2:49 PM Simone Tiraboschi <stirabos(a)redhat.com>
wrote:
Hi Ada,
here is the error:
2019-03-19 14:08:25,833+0200 INFO (jsonrpc/3) [jsonrpc.JsonRpcServer] RPC call Host.getStorageRepoStats succeeded in 0.00 seconds (__init__:312)
2019-03-19 14:08:25,839+0200 INFO (vm/a492d2eb) [vdsm.api] FINISH prepareImage error=Volume does not exist: (u'81685d19-0060-4f5d-a4cd-c5efa24aecfe',) from=internal, task_id=dc8fbf34-8d7e-47a3-8e02-0a5e5cb90257 (api:52)
2019-03-19 14:08:25,839+0200 ERROR (vm/a492d2eb) [storage.TaskManager.Task] (Task='dc8fbf34-8d7e-47a3-8e02-0a5e5cb90257') Unexpected error (task:875)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run
    return fn(*args, **kargs)
  File "<string>", line 2, in prepareImage
  File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 50, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 3199, in prepareImage
    legality = dom.produceVolume(imgUUID, volUUID).getLegality()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sd.py", line 822, in produceVolume
    volUUID)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line 801, in __init__
    self._manifest = self.manifestClass(repoPath, sdUUID, imgUUID, volUUID)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/fileVolume.py", line 71, in __init__
    volUUID)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line 86, in __init__
    self.validate()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line 112, in validate
    self.validateVolumePath()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/fileVolume.py", line 131, in validateVolumePath
    raise se.VolumeDoesNotExist(self.volUUID)
VolumeDoesNotExist: Volume does not exist: (u'81685d19-0060-4f5d-a4cd-c5efa24aecfe',)
2019-03-19 14:08:25,840+0200 INFO (vm/a492d2eb) [storage.TaskManager.Task] (Task='dc8fbf34-8d7e-47a3-8e02-0a5e5cb90257') aborting: Task is aborted: "Volume does not exist: (u'81685d19-0060-4f5d-a4cd-c5efa24aecfe',)" - code 201 (task:1181)
2019-03-19 14:08:25,840+0200 ERROR (vm/a492d2eb) [storage.Dispatcher] FINISH prepareImage error=Volume does not exist: (u'81685d19-0060-4f5d-a4cd-c5efa24aecfe',) (dispatcher:83)
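
If it helps to confirm that every failure points at the same volume, a small script along these lines could pull the UUIDs out of the log. This is only a rough sketch: the function name missing_volumes and the regular expression are my own assumptions, based purely on the message format visible in the excerpt above, and are not part of vdsm.

```python
import re

# Matches the message as it appears in the excerpt above, e.g.
#   Volume does not exist: (u'81685d19-...',)
# \s* also tolerates a line break between the colon and the tuple,
# which is how the mail client wrapped some of the lines.
MISSING_VOLUME = re.compile(
    r"Volume does not exist:\s*\(u?'([0-9a-f-]{36})',\)"
)


def missing_volumes(log_text):
    """Return the distinct volume UUIDs reported missing, in first-seen order."""
    seen = []
    for uuid in MISSING_VOLUME.findall(log_text):
        if uuid not in seen:
            seen.append(uuid)
    return seen
```

You would feed it the contents of /var/log/vdsm/vdsm.log, for example missing_volumes(open("/var/log/vdsm/vdsm.log").read()), and compare the result with the volume UUID in the error above.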
I think it's still https://bugzilla.redhat.com/1666795
Can you please try updating vdsm to vdsm-4.30.10 since the bug is reported
as solved in that version?
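
To check whether a node already has that version, something like the following could be used. It is a minimal sketch under my own assumptions: the helper names version_key and has_fix are hypothetical, and the input is expected to be the bare version string, for instance what `rpm -q --queryformat '%{VERSION}' vdsm` prints on the node.

```python
import re

# Version named above as the one where the bug is reported as solved.
FIXED_VERSION = "4.30.10"


def version_key(version):
    """Turn '4.30.10' into (4, 30, 10) so versions compare numerically."""
    return tuple(int(part) for part in re.findall(r"\d+", version))


def has_fix(installed, fixed=FIXED_VERSION):
    """True if the installed version is at least the fixed version."""
    return version_key(installed) >= version_key(fixed)
```

The comparison is done on integer tuples because a plain string comparison would wrongly rank 4.30.9 above 4.30.10.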
On Tue, Mar 19, 2019 at 12:30 PM ada per <adaper3(a)gmail.com> wrote:
> and vdsm:
>
> On Tue, Mar 19, 2019 at 1:24 PM ada per <adaper3(a)gmail.com> wrote:
>
>> Thank you! Please see the attached files:
>>
>> On Tue, Mar 19, 2019 at 12:52 PM Simone Tiraboschi <stirabos(a)redhat.com>
>> wrote:
>>
>>> Can you please also check/attach
>>> /var/log/ovirt-hosted-engine-ha/broker.log and /var/log/vdsm/vdsm.log?
>>>
>>> On Tue, Mar 19, 2019 at 11:36 AM ada per <adaper3(a)gmail.com> wrote:
>>>
>>>> Hello everyone,
>>>>
>>>> For some strange reason the hosted engine went down and I cannot restart
>>>> it. I tried restarting it manually without any success. Can you please
>>>> advise?
>>>>
>>>> For all the nodes the engine status is the same as the one below.
>>>> --== Host nodex. (id: 6) status ==--
>>>> conf_on_shared_storage : True
>>>> Status up-to-date : True
>>>> Hostname : nodex
>>>> Host ID : 6
>>>> Engine status : {"reason": "bad vm status", "health": "bad", "vm": "down_unexpected", "detail": "Down"}
>>>> Score : 3400
>>>> stopped : False
>>>> Local maintenance : False
>>>> crc32 : 323a9f45
>>>> local_conf_timestamp : 2648874
>>>> Host timestamp : 2648874
>>>> Extra metadata (valid at timestamp):
>>>> metadata_parse_version=1
>>>> metadata_feature_version=1
>>>> timestamp=2648874 (Tue Mar 19 12:25:44 2019)
>>>> host-id=6
>>>> score=3400
>>>> vm_conf_refresh_time=2648874 (Tue Mar 19 12:25:44 2019)
>>>> conf_on_shared_storage=True
>>>> maintenance=False
>>>> state=GlobalMaintenance
>>>> stopped=False
>>>>
>>>> When I try the command
>>>> root@node5# hosted-engine --vm-shutdown
>>>> I get the response:
>>>> root@node5# Command VM.shutdown with args {'delay': '120', 'message': 'VM is shutting down!', 'vmID': 'a492d2eb-1dfd-470d-a141-3e55d2189275'}
>>>> failed:(code=1, message=Virtual machine does not exist)
>>>>
>>>> But when I run : hosted-engine --vm-start
>>>> I get the response: VM exists and is down, cleaning up and restarting
>>>>
>>>>
>>>>
>>>> Below you can see the # journalctl -u ovirt-ha-agent logs
>>>>
>>>> Mar 14 12:04:42 node7. ovirt-ha-agent[4134]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Unhandled monitoring loop exception
>>>> Traceback (most recent call last):
>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 430, in start_monitoring
>>>>     self._monitoring_loop()
>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 449, in _monitoring_loop
>>>>     for old_state, state, delay in self.fsm:
>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/fsm/machine.py", line 127, in next
>>>>     new_data = self.refresh(self._state.data)
>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/state_machine.py", line 81, in refresh
>>>>     stats.update(self.hosted_engine.collect_stats())
>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 737, in collect_stats
>>>>     all_stats = self._broker.get_stats_from_storage()
>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 143, in get_stats_from_storage
>>>>     result = self._proxy.get_stats()
>>>>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1233, in __call__
>>>>     return self.__send(self.__name, args)
>>>>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1591, in __request
>>>>     verbose=self.__verbose
>>>>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1273, in request
>>>>     return self.single_request(host, handler, request_body, verbose)
>>>>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1301, in single_request
>>>>     self.send_content(h, request_body)
>>>>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1448, in send_content
>>>>     connection.endheaders(request_body)
>>>>   File "/usr/lib64/python2.7/httplib.py", line 1037, in endheaders
>>>>     self._send_output(message_body)
>>>>   File "/usr/lib64/python2.7/httplib.py", line 881, in _send_output
>>>>     self.send(msg)
>>>>   File "/usr/lib64/python2.7/httplib.py", line 843, in send
>>>>     self.connect()
>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/unixrpc.py", line 52, in connect
>>>>     self.sock.connect(base64.b16decode(self.host))
>>>>   File "/usr/lib64/python2.7/socket.py", line 224, in meth
>>>>     return getattr(self._sock,name)(*args)
>>>> error: [Errno 2] No such file or directory
>>>> Mar 14 12:04:42 node7. ovirt-ha-agent[4134]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.agent.Agent ERROR Traceback (most recent call last):
>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 131, in _run_agent
>>>>     return action(he)
>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 55, in action_proper
>>>>     return he.start_monitoring()
>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 437, in start_monitoring
>>>>     self.publish(stopped)
>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 337, in publish
>>>>     self._push_to_storage(blocks)
>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 708, in _push_to_storage
>>>>     self._broker.put_stats_on_storage(self.host_id, blocks)
>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 113, in put_stats_on_storage
>>>>     self._proxy.put_stats(host_id, xmlrpclib.Binary(data))
>>>>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1233, in __call__
>>>>     return self.__send(self.__name, args)
>>>>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1591, in __request
>>>>     verbose=self.__verbose
>>>>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1273, in request
>>>>     return self.single_request(host, handler, request_body, verbose)
>>>>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1301, in single_request
>>>>     self.send_content(h, request_body)
>>>>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1448, in send_content
>>>>     connection.endheaders(request_body)
>>>>   File "/usr/lib64/python2.7/httplib.py", line 1037, in endheaders
>>>>     self._send_output(message_body)
>>>>   File "/usr/lib64/python2.7/httplib.py", line 881, in _send_output
>>>>     self.send(msg)
>>>>   File "/usr/lib64/python2.7/httplib.py", line 843, in send
>>>>     self.connect()
>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/unixrpc.py", line 52, in connect
>>>>     self.sock.connect(base64.b16decode(self.host))
>>>>   File "/usr/lib64/python2.7/socket.py", line 224, in meth
>>>>     return getattr(self._sock,name)(*args)
>>>> error: [Errno 2] No such file or directory
>>>> Mar 14 12:04:42 node7. ovirt-ha-agent[4134]: ovirt-ha-agent
>>>> ovirt_hosted_engine_ha.agent.agent.Agent ERROR Trying to restart agent
>>>> Mar 14 12:04:42 node7. systemd[1]: ovirt-ha-agent.service: main
>>>> process exited, code=exited, status=157/n/a
>>>> Mar 14 12:04:42 node7. systemd[1]: Unit ovirt-ha-agent.service entered
>>>> failed state.
>>>> Mar 14 12:04:42 node7. systemd[1]: ovirt-ha-agent.service failed.
>>>> Mar 14 12:04:52 node7. systemd[1]: ovirt-ha-agent.service holdoff time
>>>> over, scheduling restart.
>>>> Mar 14 12:04:52 node7. systemd[1]: Stopped oVirt Hosted Engine High
>>>> Availability Monitoring Agent.
>>>> Mar 14 12:04:52 node7. systemd[1]: Started oVirt Hosted Engine High
>>>> Availability Monitoring Agent.
>>>> Mar 14 12:04:52 node7. ovirt-ha-agent[31765]: ovirt-ha-agent
>>>> ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Failed to
>>>> start necessary monitors
>>>> Mar 14 12:04:52 node7. ovirt-ha-agent[31765]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.agent.Agent ERROR Traceback (most recent call last):
>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 131, in _run_agent
>>>>     return action(he)
>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 55, in action_proper
>>>>     return he.start_monitoring()
>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 413, in start_monitoring
>>>>     self._initialize_broker()
>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 537, in _initialize_broker
>>>>     m.get('options', {}))
>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 86, in start_monitor
>>>>     ).format(t=type, o=options, e=e)
>>>> RequestError: brokerlink - failed to start monitor via ovirt-ha-broker: [Errno 2] No such file or directory, [monitor: 'ping', options: {'addr': '19
>>>> Mar 14 12:04:52 node7. ovirt-ha-agent[31765]: ovirt-ha-agent
>>>> ovirt_hosted_engine_ha.agent.agent.Agent ERROR Trying to restart agent
>>>> Mar 14 12:04:52 node7. systemd[1]: ovirt-ha-agent.service: main
>>>> process exited, code=exited, status=157/n/a
>>>> Mar 14 12:04:52 node7. systemd[1]: Unit ovirt-ha-agent.service entered
>>>> failed state.
>>>> Mar 14 12:04:52 node7. systemd[1]: ovirt-ha-agent.service failed.
>>>> Mar 14 12:05:02 node7. systemd[1]: ovirt-ha-agent.service holdoff time
>>>> over, scheduling restart.
>>>> Mar 14 12:05:02 node7. systemd[1]: Stopped oVirt Hosted Engine High
>>>> Availability Monitoring Agent.
>>>> Mar 14 12:05:02 node7. systemd[1]: Started oVirt Hosted Engine High
>>>> Availability Monitoring Agent.
>>>> Mar 14 12:06:55 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Failed to stop engine vm with /usr/sbin/hosted-engine --vm-poweroff: Co
>>>> (code=1, message=Virtual machine does not exist: {'vmId': u'a492d2eb-1dfd-470d-a141-3e55d2189275'})
>>>> Mar 14 12:06:55 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Failed to stop engine VM: Command VM.destroy with args {'vmID': 'a492d2
>>>> (code=1, message=Virtual machine does not exist: {'vmId': u'a492d2eb-1dfd-470d-a141-3e55d2189275'})
>>>> Mar 15 14:28:16 node7. ovirt-ha-agent[31822]: ovirt-ha-agent
>>>> ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM
>>>> stopped on localhost
>>>> Mar 15 14:28:36 node7. ovirt-ha-agent[31822]: ovirt-ha-agent
>>>> ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM
>>>> stopped on localhost
>>>> Mar 15 14:29:00 node7. ovirt-ha-agent[31822]: ovirt-ha-agent
>>>> ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM
>>>> stopped on localhost
>>>> Mar 15 14:29:22 node7. ovirt-ha-agent[31822]: ovirt-ha-agent
>>>> ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM
>>>> stopped on localhost
>>>> Mar 15 14:29:44 node7. ovirt-ha-agent[31822]: ovirt-ha-agent
>>>> ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM
>>>> stopped on localhost
>>>> Mar 15 14:30:06 node7. ovirt-ha-agent[31822]: ovirt-ha-agent
>>>> ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM
>>>> stopped on localhost
>>>> Mar 15 14:30:28 node7. ovirt-ha-agent[31822]: ovirt-ha-agent
>>>> ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM
>>>> stopped on localhost
>>>> Mar 15 14:30:50 node7. ovirt-ha-agent[31822]: ovirt-ha-agent
>>>> ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM
>>>> stopped on localhost
>>>> Mar 15 14:31:12 node7. ovirt-ha-agent[31822]: ovirt-ha-agent
>>>> ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM
>>>> stopped on localhost
>>>> Mar 15 14:31:33 node7. ovirt-ha-agent[31822]: ovirt-ha-agent
>>>> ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM
>>>> stopped on localhost
>>>> Mar 15 14:31:56 node7. ovirt-ha-agent[31822]: ovirt-ha-agent
>>>> ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM
>>>> stopped on localhost
>>>> Mar 15 14:32:18 node7. ovirt-ha-agent[31822]: ovirt-ha-agent
>>>> ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM
>>>> stopped on localhost
>>>> Mar 15 14:32:40 node7. ovirt-ha-agent[31822]: ovirt-ha-agent
>>>> ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM
>>>> stopped on localhost
>>>> Mar 15 14:33:02 node7. ovirt-ha-agent[31822]: ovirt-ha-agent
>>>> ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM
>>>> stopped on localhost