After changing the ownership, the engine is up!!
Thanks for your help!!! :)
On Tue, Mar 19, 2019 at 3:25 PM Simone Tiraboschi <stirabos(a)redhat.com>
wrote:
On Tue, Mar 19, 2019 at 2:21 PM ada per <adaper3(a)gmail.com> wrote:
> Thanks for your reply.
>
> Can you please provide step by step instructions on how to upgrade the
> vdsm from a node command line?
>
Can you please report the version of vdsm you are using?
Then check the ownership of
/rhev/data-center/00000000-0000-0000-0000-000000000000/05b2b2d5-a80e-4622-9410-8e1e9d362f3f/images/bb890447-f1f7-46af-8e57-543d61f0bd08/81685d19-0060-4f5d-a4cd-c5efa24aecfe
and, if it's not vdsm:kvm, change it and then try again with
hosted-engine --vm-start
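
For anyone who wants to script the ownership check described above, here is a minimal sketch in Python. It only reads the owner and group of a path and compares them with vdsm:kvm; it changes nothing. The helper names owner_and_group and has_expected_owner are made up for this example and are not part of vdsm or hosted-engine. Note that os.stat follows symlinks, so the result describes the file the path finally points to.

```python
import grp
import os
import pwd


def owner_and_group(path):
    """Return (user, group) names for path.

    Falls back to the numeric id (as a string) when the id has no
    entry in the passwd/group database.
    """
    st = os.stat(path)  # follows symlinks
    try:
        user = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        user = str(st.st_uid)
    try:
        group = grp.getgrgid(st.st_gid).gr_name
    except KeyError:
        group = str(st.st_gid)
    return user, group


def has_expected_owner(path, user="vdsm", group="kvm"):
    """True if path is owned by user:group (defaults to vdsm:kvm)."""
    return owner_and_group(path) == (user, group)
```

Calling has_expected_owner() with the volume path from this thread would return False when the ownership is wrong, in which case the ownership has to be changed by hand before retrying hosted-engine --vm-start.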
>
> On Tue, Mar 19, 2019 at 2:49 PM Simone Tiraboschi <stirabos(a)redhat.com>
> wrote:
>
>> Hi Ada,
>> here the error:
>>
>> 2019-03-19 14:08:25,833+0200 INFO (jsonrpc/3) [jsonrpc.JsonRpcServer]
>> RPC call Host.getStorageRepoStats succeeded in 0.00 seconds (__init__:312)
>> 2019-03-19 14:08:25,839+0200 INFO (vm/a492d2eb) [vdsm.api] FINISH
>> prepareImage error=Volume does not exist:
>> (u'81685d19-0060-4f5d-a4cd-c5efa24aecfe',) from=internal,
>> task_id=dc8fbf34-8d7e-47a3-8e02-0a5e5cb90257 (api:52)
>> 2019-03-19 14:08:25,839+0200 ERROR (vm/a492d2eb)
>> [storage.TaskManager.Task] (Task='dc8fbf34-8d7e-47a3-8e02-0a5e5cb90257')
>> Unexpected error (task:875)
>> Traceback (most recent call last):
>> File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line
>> 882, in _run
>> return fn(*args, **kargs)
>> File "<string>", line 2, in prepareImage
>> File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 50,
>> in method
>> ret = func(*args, **kwargs)
>> File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line
>> 3199, in prepareImage
>> legality = dom.produceVolume(imgUUID, volUUID).getLegality()
>> File "/usr/lib/python2.7/site-packages/vdsm/storage/sd.py", line
>> 822, in produceVolume
>> volUUID)
>> File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line
>> 801, in __init__
>> self._manifest = self.manifestClass(repoPath, sdUUID, imgUUID,
>> volUUID)
>> File "/usr/lib/python2.7/site-packages/vdsm/storage/fileVolume.py",
>> line 71, in __init__
>> volUUID)
>> File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line
>> 86, in __init__
>> self.validate()
>> File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line
>> 112, in validate
>> self.validateVolumePath()
>> File "/usr/lib/python2.7/site-packages/vdsm/storage/fileVolume.py",
>> line 131, in validateVolumePath
>> raise se.VolumeDoesNotExist(self.volUUID)
>> VolumeDoesNotExist: Volume does not exist:
>> (u'81685d19-0060-4f5d-a4cd-c5efa24aecfe',)
>> 2019-03-19 14:08:25,840+0200 INFO (vm/a492d2eb)
>> [storage.TaskManager.Task] (Task='dc8fbf34-8d7e-47a3-8e02-0a5e5cb90257')
>> aborting: Task is aborted: "Volume does not exist:
>> (u'81685d19-0060-4f5d-a4cd-c5efa24aecfe',)" - code 201 (task:1181)
>> 2019-03-19 14:08:25,840+0200 ERROR (vm/a492d2eb) [storage.Dispatcher]
>> FINISH prepareImage error=Volume does not exist:
>> (u'81685d19-0060-4f5d-a4cd-c5efa24aecfe',) (dispatcher:83)
>>
>> I think it's still https://bugzilla.redhat.com/1666795
>> <https://bugzilla.redhat.com/show_bug.cgi?id=1666795>
>>
>> Can you please try updating vdsm to vdsm-4.30.10 since the bug is
>> reported as solved in that version?
>>
>>
>>
>>
>> On Tue, Mar 19, 2019 at 12:30 PM ada per <adaper3(a)gmail.com> wrote:
>>
>>> and vdsm:
>>>
>>>
>>>
>>>
>>>
>>> On Tue, Mar 19, 2019 at 1:24 PM ada per <adaper3(a)gmail.com> wrote:
>>>
>>>> Thank you! please see attached files:
>>>>
>>>> On Tue, Mar 19, 2019 at 12:52 PM Simone Tiraboschi <
>>>> stirabos(a)redhat.com> wrote:
>>>>
>>>>> Can you please check/attach also
>>>>> /var/log/ovirt-hosted-engine-ha/broker.log and /var/log/vdsm/vdsm.log?
>>>>>
>>>>> On Tue, Mar 19, 2019 at 11:36 AM ada per <adaper3(a)gmail.com> wrote:
>>>>>
>>>>>> Hello everyone,
>>>>>>
>>>>>> For some strange reason the hosted engine went down and I cannot
>>>>>> restart it. I tried manually restarting it without any success. Can you
>>>>>> please advise?
>>>>>>
>>>>>> For all the nodes the engine status is the same as the one below.
>>>>>> --== Host nodex. (id: 6) status ==--
>>>>>> conf_on_shared_storage : True
>>>>>> Status up-to-date : True
>>>>>> Hostname : nodex
>>>>>> Host ID : 6
>>>>>> Engine status : {"reason": "bad vm status",
>>>>>> "health": "bad", "vm": "down_unexpected", "detail": "Down"}
>>>>>> Score : 3400
>>>>>> stopped : False
>>>>>> Local maintenance : False
>>>>>> crc32 : 323a9f45
>>>>>> local_conf_timestamp : 2648874
>>>>>> Host timestamp : 2648874
>>>>>> Extra metadata (valid at timestamp):
>>>>>> metadata_parse_version=1
>>>>>> metadata_feature_version=1
>>>>>> timestamp=2648874 (Tue Mar 19 12:25:44 2019)
>>>>>> host-id=6
>>>>>> score=3400
>>>>>> vm_conf_refresh_time=2648874 (Tue Mar 19 12:25:44 2019)
>>>>>> conf_on_shared_storage=True
>>>>>> maintenance=False
>>>>>> state=GlobalMaintenance
>>>>>> stopped=False
>>>>>>
>>>>>> When I try the command
>>>>>> root@node5# hosted-engine --vm-shutdown
>>>>>> I get the response:
>>>>>> root@node5# Command VM.shutdown with args {'delay': '120',
>>>>>> 'message': 'VM is shutting down!', 'vmID':
>>>>>> 'a492d2eb-1dfd-470d-a141-3e55d2189275'} failed:(code=1, message=Virtual
>>>>>> machine does not exist)
>>>>>>
>>>>>> But when I run : hosted-engine --vm-start
>>>>>> I get the response: VM exists and is down, cleaning up and restarting
>>>>>>
>>>>>>
>>>>>>
>>>>>> Below you can see the # journalctl -u ovirt-ha-agent logs
>>>>>>
>>>>>> Mar 14 12:04:42 node7. ovirt-ha-agent[4134]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Unhandled monitoring loop exception
>>>>>> Traceback (most recent call last):
>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 430, in start_monitoring
>>>>>>     self._monitoring_loop()
>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 449, in _monitoring_loop
>>>>>>     for old_state, state, delay in self.fsm:
>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/fsm/machine.py", line 127, in next
>>>>>>     new_data = self.refresh(self._state.data)
>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/state_machine.py", line 81, in refresh
>>>>>>     stats.update(self.hosted_engine.collect_stats())
>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 737, in collect_stats
>>>>>>     all_stats = self._broker.get_stats_from_storage()
>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 143, in get_stats_from_storage
>>>>>>     result = self._proxy.get_stats()
>>>>>>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1233, in __call__
>>>>>>     return self.__send(self.__name, args)
>>>>>>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1591, in __request
>>>>>>     verbose=self.__verbose
>>>>>>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1273, in request
>>>>>>     return self.single_request(host, handler, request_body, verbose)
>>>>>>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1301, in single_request
>>>>>>     self.send_content(h, request_body)
>>>>>>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1448, in send_content
>>>>>>     connection.endheaders(request_body)
>>>>>>   File "/usr/lib64/python2.7/httplib.py", line 1037, in endheaders
>>>>>>     self._send_output(message_body)
>>>>>>   File "/usr/lib64/python2.7/httplib.py", line 881, in _send_output
>>>>>>     self.send(msg)
>>>>>>   File "/usr/lib64/python2.7/httplib.py", line 843, in send
>>>>>>     self.connect()
>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/unixrpc.py", line 52, in connect
>>>>>>     self.sock.connect(base64.b16decode(self.host))
>>>>>>   File "/usr/lib64/python2.7/socket.py", line 224, in meth
>>>>>>     return getattr(self._sock,name)(*args)
>>>>>> error: [Errno 2] No such file or directory
>>>>>> Mar 14 12:04:42 node7. ovirt-ha-agent[4134]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.agent.Agent ERROR Traceback (most recent call last):
>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 131, in _run_agent
>>>>>>     return action(he)
>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 55, in action_proper
>>>>>>     return he.start_monitoring()
>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 437, in start_monitoring
>>>>>>     self.publish(stopped)
>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 337, in publish
>>>>>>     self._push_to_storage(blocks)
>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 708, in _push_to_storage
>>>>>>     self._broker.put_stats_on_storage(self.host_id, blocks)
>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 113, in put_stats_on_storage
>>>>>>     self._proxy.put_stats(host_id, xmlrpclib.Binary(data))
>>>>>>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1233, in __call__
>>>>>>     return self.__send(self.__name, args)
>>>>>>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1591, in __request
>>>>>>     verbose=self.__verbose
>>>>>>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1273, in request
>>>>>>     return self.single_request(host, handler, request_body, verbose)
>>>>>>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1301, in single_request
>>>>>>     self.send_content(h, request_body)
>>>>>>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1448, in send_content
>>>>>>     connection.endheaders(request_body)
>>>>>>   File "/usr/lib64/python2.7/httplib.py", line 1037, in endheaders
>>>>>>     self._send_output(message_body)
>>>>>>   File "/usr/lib64/python2.7/httplib.py", line 881, in _send_output
>>>>>>     self.send(msg)
>>>>>>   File "/usr/lib64/python2.7/httplib.py", line 843, in send
>>>>>>     self.connect()
>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/unixrpc.py", line 52, in connect
>>>>>>     self.sock.connect(base64.b16decode(self.host))
>>>>>>   File "/usr/lib64/python2.7/socket.py", line 224, in meth
>>>>>>     return getattr(self._sock,name)(*args)
>>>>>> error: [Errno 2] No such file or directory
>>>>>> Mar 14 12:04:42 node7. ovirt-ha-agent[4134]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.agent.Agent ERROR Trying to restart agent
>>>>>> Mar 14 12:04:42 node7. systemd[1]: ovirt-ha-agent.service: main process exited, code=exited, status=157/n/a
>>>>>> Mar 14 12:04:42 node7. systemd[1]: Unit ovirt-ha-agent.service entered failed state.
>>>>>> Mar 14 12:04:42 node7. systemd[1]: ovirt-ha-agent.service failed.
>>>>>> Mar 14 12:04:52 node7. systemd[1]: ovirt-ha-agent.service holdoff time over, scheduling restart.
>>>>>> Mar 14 12:04:52 node7. systemd[1]: Stopped oVirt Hosted Engine High Availability Monitoring Agent.
>>>>>> Mar 14 12:04:52 node7. systemd[1]: Started oVirt Hosted Engine High Availability Monitoring Agent.
>>>>>> Mar 14 12:04:52 node7. ovirt-ha-agent[31765]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Failed to start necessary monitors
>>>>>> Mar 14 12:04:52 node7. ovirt-ha-agent[31765]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.agent.Agent ERROR Traceback (most recent call last):
>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 131, in _run_agent
>>>>>>     return action(he)
>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 55, in action_proper
>>>>>>     return he.start_monitoring()
>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 413, in start_monitoring
>>>>>>     self._initialize_broker()
>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 537, in _initialize_broker
>>>>>>     m.get('options', {}))
>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 86, in start_monitor
>>>>>>     ).format(t=type, o=options, e=e)
>>>>>> RequestError: brokerlink - failed to start monitor via ovirt-ha-broker: [Errno 2] No such file or directory, [monitor: 'ping', options: {'addr': '19
>>>>>> Mar 14 12:04:52 node7. ovirt-ha-agent[31765]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.agent.Agent ERROR Trying to restart agent
>>>>>> Mar 14 12:04:52 node7. systemd[1]: ovirt-ha-agent.service: main process exited, code=exited, status=157/n/a
>>>>>> Mar 14 12:04:52 node7. systemd[1]: Unit ovirt-ha-agent.service entered failed state.
>>>>>> Mar 14 12:04:52 node7. systemd[1]: ovirt-ha-agent.service failed.
>>>>>> Mar 14 12:05:02 node7. systemd[1]: ovirt-ha-agent.service holdoff time over, scheduling restart.
>>>>>> Mar 14 12:05:02 node7. systemd[1]: Stopped oVirt Hosted Engine High Availability Monitoring Agent.
>>>>>> Mar 14 12:05:02 node7. systemd[1]: Started oVirt Hosted Engine High Availability Monitoring Agent.
>>>>>> Mar 14 12:06:55 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Failed to stop engine vm with /usr/sbin/hosted-engine --vm-poweroff: Co
>>>>>> (code=1, message=Virtual machine does not exist: {'vmId': u'a492d2eb-1dfd-470d-a141-3e55d2189275'})
>>>>>> Mar 14 12:06:55 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Failed to stop engine VM: Command VM.destroy with args {'vmID': 'a492d2
>>>>>> (code=1, message=Virtual machine does not exist: {'vmId': u'a492d2eb-1dfd-470d-a141-3e55d2189275'})
>>>>>> Mar 15 14:28:16 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
>>>>>> Mar 15 14:28:36 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
>>>>>> Mar 15 14:29:00 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
>>>>>> Mar 15 14:29:22 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
>>>>>> Mar 15 14:29:44 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
>>>>>> Mar 15 14:30:06 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
>>>>>> Mar 15 14:30:28 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
>>>>>> Mar 15 14:30:50 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
>>>>>> Mar 15 14:31:12 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
>>>>>> Mar 15 14:31:33 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
>>>>>> Mar 15 14:31:56 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
>>>>>> Mar 15 14:32:18 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
>>>>>> Mar 15 14:32:40 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
>>>>>> Mar 15 14:33:02 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost