[ovirt-users] Reply: Re: Hosted-engine can not_switch

dhy336 at sina.com dhy336 at sina.com
Thu Apr 26 12:04:14 UTC 2018


Hi Martin, here is the engine VM log.
----- Original Message -----
From: Martin Sivak <msivak at redhat.com>
To: dhy336 <dhy336 at sina.com>, Martin Perina <mperina at redhat.com>
Cc: users <users at ovirt.org>
Subject: Re: [ovirt-users] Reply: Re: Hosted-engine can not_switch
Date: 2018-04-26 15:45


Hi,
Martin, can you please take a look? Even though the setup is a bit
weird (the hostname mainly..) it seems to be running, but the health
endpoint returns 404. Is there something maybe in SSO hostname
detection that would cause that? How can we debug this more?
A current summary of the issue at hand:
> hosted-engine.ovirt.com=192.168.122.91 is the engine VM; visiting hosted-engine.ovirt.com shows me the web UI.
> [root at hosted-engine2 ~]# curl http://hosted-engine.ovirt.com/ovirt-engine/services/health
> <html><head><title>Error</title></head><body>404 - Not Found</body></html>
Best regards
Martin Sivak
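
The health check quoted above can also be reproduced without curl. The following is a minimal Python sketch, not part of oVirt: the helper names health_url, probe and describe are made up for illustration, and the reading of each status code is an assumption, not documented agent behavior. It sets an explicit timeout because the curl call reported further down in this thread blocked for about 5 minutes.

```python
import urllib.error
import urllib.request


def health_url(fqdn):
    """Build the health URL named in this thread for a given engine FQDN."""
    return "http://{}/ovirt-engine/services/health".format(fqdn)


def probe(url, timeout=10.0):
    """Return (status, body); status is None when no HTTP answer arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404.
        return err.code, err.read().decode("utf-8", "replace")
    except (urllib.error.URLError, OSError) as err:
        # Name resolution failure, no route, refused connection, or timeout.
        return None, str(err)


def describe(status):
    """Assumed interpretation of the probe result (illustrative only)."""
    if status is None:
        return "no HTTP answer (DNS, routing, firewall, or timeout)"
    if status == 200:
        return "endpoint answered"
    if status == 404:
        return "a web server answered, but the health path is not served there"
    return "unexpected HTTP status {}".format(status)
```

For example, calling probe(health_url("hosted-engine.ovirt.com")) on hosted-engine2 and passing the status to describe would separate "nothing answered" from "something answered with 404", which is the case shown in the curl output above.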
On Thu, Apr 26, 2018 at 9:32 AM, dhy336 <dhy336 at sina.com> wrote:
> Sorry, I forgot to tell you that I used 192.168.122.223 to replace
> 192.168.122.65. hosted-engine.ovirt.com=192.168.122.91 is the engine VM;
> visiting hosted-engine.ovirt.com shows me the web UI.
>
> Sent from the NetEase Mail mobile app
>
> On 2018-04-26 14:52, Martin Sivak wrote:
>
> Hi,
>
>> hosted-engine1 : 192.168.122.66
>> hosted-engine2 : 192.168.122.223
>
> But you said in an earlier email that:
>
>> I have two nodes, A: 192.168.122.65, B: 192.168.122.66
>
> Make sure your names resolve properly. So far it does exactly what it
> is supposed to do - when the engine is unreachable, it tries
> restarting it. Did you really use hosted-engine.ovirt.com as the fqdn?
> Are you sure it resolves to whatever IP the VM has (192.168.122.91)?
>
> Maybe you used /etc/hosts to configure the name on the first host and
> in the VM, but the record is missing on the second host?
>
> What does $(host hosted-engine.ovirt.com) show you?
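
One way to compare the records on both hosts and in the VM is to parse the text of each /etc/hosts and look up the engine name. This is a minimal Python sketch with hypothetical helper names (parse_hosts, check). It assumes the first matching entry wins, which is an assumption about resolver behavior, and it only inspects hosts-file text; it does not query DNS.

```python
def parse_hosts(text):
    """Map each hostname or alias in hosts-file text to the first IP listing it."""
    table = {}
    for line in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        for name in names:
            # Assumption: the first entry for a name takes precedence.
            table.setdefault(name.lower(), ip)
    return table


def check(text, fqdn, expected_ip):
    """Return 'ok', 'missing', or 'mismatch (<ip>)' for fqdn in hosts-file text."""
    found = parse_hosts(text).get(fqdn.lower())
    if found is None:
        return "missing"
    return "ok" if found == expected_ip else "mismatch ({})".format(found)
```

Running check on the contents of /etc/hosts from each host, with "hosted-engine.ovirt.com" and "192.168.122.91", would show whether a record is absent or points elsewhere on one of them.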
>
>> I cannot reach the web UI, but my engine VM is running and I can log in
>> to it. The engine log has some errors:
>>
>>
>> VdsIdVDSCommandParametersBase:{hostId='1b5f799a-125d-4f4e-8aef-cb2ecdd63136'})'
>>  execution failed: java.net.NoRouteToHostException: No route to host
>
> I told you before. This is normal as it is trying to figure out
> whether the host is up.
>
>
> Best regards
>
> Martin Sivak
>
>
> On Thu, Apr 26, 2018 at 4:14 AM,  <dhy336 at sina.com> wrote:
>> engine VM:192.168.122.91
>> hosted-engine1 : 192.168.122.66
>> hosted-engine2 : 192.168.122.223
>>
>> I cannot reach the web UI, but my engine VM is running and I can log in
>> to it. The engine log has some errors:
>>
>>  2018-04-25 18:35:03,401+08 INFO
>>  [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor)
>>  [] Connecting to hosted-engine1/192.168.122.66
>>  2018-04-25 18:35:06,411+08 ERROR
>>  [org.ovirt.engine.core.vdsbroker.vdsbroker.GetAllVmStatsVDSCommand]
>>  (EE-ManagedThreadFactory-engineScheduled-Thread-2) [] Command
>>  'GetAllVmStatsVDSCommand(HostName = hosted-engine1,
>>  VdsIdVDSCommandParametersBase:{hostId='1b5f799a-125d-4f4e-8aef-cb2ecdd63136'})'
>>  execution failed: java.net.NoRouteToHostException: No route to host
>>
>> ----------------------------------------------------------------------------------------------------------------------------------------
>> [root at hosted-engine2 ~]# hosted-engine --check-liveliness
>> Hosted Engine is not up!
>>
>> -----------------------------------------------------------------------------------------------------------------------------------------
>> [root at hosted-engine2 ~]# curl http://hosted-engine.ovirt.com/ovirt-engine/services/health
>> <html><head><title>Error</title></head><body>404 - Not Found</body></html>
>>
>> Note: this command blocks; it takes 5 minutes to return.
>>
>> -----------------------------------------------------------------------------------------------------------------------------------------
>> --== Host 1 status ==--
>>
>> conf_on_shared_storage             : True
>> Status up-to-date                  : False
>> Hostname                           : hosted-engine1
>> Host ID                            : 1
>> Engine status                      : unknown stale-data
>> Score                              : 3400
>> stopped                            : False
>> Local maintenance                  : False
>> crc32                              : 1eae8968
>> local_conf_timestamp               : 48907
>> Host timestamp                     : 48907
>> Extra metadata (valid at timestamp):
>> metadata_parse_version=1
>> metadata_feature_version=1
>> timestamp=48907 (Thu Apr 26 01:57:14 2018)
>> host-id=1
>> score=3400
>> vm_conf_refresh_time=48907 (Thu Apr 26 01:57:15 2018)
>> conf_on_shared_storage=True
>> maintenance=False
>> state=EngineUp
>> stopped=False
>>
>>
>> --== Host 2 status ==--
>>
>> conf_on_shared_storage             : True
>> Status up-to-date                  : True
>> Hostname                           : hosted-engine2
>> Host ID                            : 2
>> Engine status                      : {"reason": "failed liveliness check",
>> "health": "bad", "vm": "up", "detail": "Up"}
>> Score                              : 3000
>> stopped                            : False
>> Local maintenance                  : False
>> crc32                              : 1b92756d
>> local_conf_timestamp               : 44057
>> Host timestamp                     : 44057
>> Extra metadata (valid at timestamp):
>> metadata_parse_version=1
>> metadata_feature_version=1
>> timestamp=44057 (Thu Apr 26 02:00:57 2018)
>> host-id=2
>> score=3000
>> vm_conf_refresh_time=44057 (Thu Apr 26 02:00:57 2018)
>> conf_on_shared_storage=True
>> maintenance=False
>> state=EngineStarting
>> stopped=False
>>
>>
>>
>>
>>
>>
>> ----- Original Message -----
>> From: Martin Sivak <msivak at redhat.com>
>> To: dhy336 <dhy336 at sina.com>
>> Cc: users <users at ovirt.org>
>> Subject: Re: Re: [ovirt-users] Reply: Re: Hosted-engine can not_switch
>> Date: 2018-04-25 20:41
>>
>>
>>> 2018-04-25 18:35:06,411+08 ERROR
>>> [org.ovirt.engine.core.vdsbroker.vdsbroker.GetAllVmStatsVDSCommand]
>>> (EE-ManagedThreadFactory-engineScheduled-Thread-2) [] Command
>>> 'GetAllVmStatsVDSCommand(HostName = hosted-engine1,
>>> VdsIdVDSCommandParametersBase:{hostId='1b5f799a-125d-4f4e-8aef-cb2ecdd63136'})'
>>> execution failed: java.net.NoRouteToHostException: No route to host
>> This is expected and normal. The ovirt-engine service is trying to
>> find out whether host A is still unreachable or not. This is not the
>> issue you are looking for.
>>> 192.168.122.66 has been powered off, and hosted engine VM run in
>>> 192.168.122.223, I think engine should connect to 192.168.122.223,
>> You are mixing the IP of the engine VM and the IP of a host. The
>> engine runs in a VM with the stable address .122.223 (independent of which
>> host the VM runs on) and manages two hosts, .122.65 and .122.66. The engine
>> constantly monitors all its hosts and that means it is trying to
>> connect to them every now and then.
>> Please execute the two following commands on Host B and show us the
>> results (use the proper fqdn):
>> $(hosted-engine --check-liveliness)
>> $(curl http://{fqdn}/ovirt-engine/services/health)
>> Best regards
>> Martin Sivak
>> On Wed, Apr 25, 2018 at 2:34 PM, <dhy336 at sina.com> wrote:
>>> I logged in to the engine VM (# hosted-engine --console) and found the
>>> ovirt-engine process. I also found some errors in
>>> /var/log/ovirt-engine/engine.log.
>>>
>>> 192.168.122.66 has been powered off and the hosted-engine VM runs on
>>> 192.168.122.223, so I think the engine should connect to 192.168.122.223.
>>>
>>>
>>> 2018-04-25 18:35:03,401+08 INFO
>>> [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor)
>>> [] Connecting to hosted-engine1/192.168.122.66
>>> 2018-04-25 18:35:06,411+08 ERROR
>>> [org.ovirt.engine.core.vdsbroker.vdsbroker.GetAllVmStatsVDSCommand]
>>> (EE-ManagedThreadFactory-engineScheduled-Thread-2) [] Command
>>> 'GetAllVmStatsVDSCommand(HostName = hosted-engine1,
>>> VdsIdVDSCommandParametersBase:{hostId='1b5f799a-125d-4f4e-8aef-cb2ecdd63136'})'
>>> execution failed: java.net.NoRouteToHostException: No route to host
>>> 2018-04-25 18:35:06,411+08 INFO
>>> [org.ovirt.engine.core.vdsbroker.monitoring.PollVmStatsRefresher]
>>> (EE-ManagedThreadFactory-engineScheduled-Thread-2) [] Failed to fetch vms
>>> info for host 'hosted-engine1' - skipping VMs monitoring.
>>> 2018-04-25 18:35:21,420+08 INFO
>>> [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor)
>>> [] Connecting to hosted-engine1/192.168.122.66
>>> 2018-04-25 18:35:24,430+08 ERROR
>>> [org.ovirt.engine.core.vdsbroker.vdsbroker.GetAllVmStatsVDSCommand]
>>> (EE-ManagedThreadFactory-engineScheduled-Thread-1) [] Command
>>> 'GetAllVmStatsVDSCommand(HostName = hosted-engine1,
>>> VdsIdVDSCommandParametersBase:{hostId='1b5f799a-125d-4f4e-8aef-cb2ecdd63136'})'
>>> execution failed: java.net.NoRouteToHostException: No route to host
>>> 2018-04-25 18:35:24,431+08 INFO
>>> [org.ovirt.engine.core.vdsbroker.monitoring.PollVmStatsRefresher]
>>> (EE-ManagedThreadFactory-engineScheduled-Thread-1) [] Failed to fetch vms
>>> info for host 'hosted-engine1' - skipping VMs monitoring.
>>> 2018-04-25 18:35:39,438+08 INFO
>>> [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor)
>>> [] Connecting to hosted-engine1/192.168.122.66
>>>
>>>
>>>
>>> ----- Original Message -----
>>> From: Martin Sivak <msivak at redhat.com>
>>> To: dhy336 <dhy336 at sina.com>
>>> Cc: users <users at ovirt.org>
>>> Subject: Re: [ovirt-users] Reply: Re: Hosted-engine can not_switch
>>> Date: 2018-04-25 20:27
>>>
>>>
>>> The engine will try connecting to all registered hosts all the time.
>>> That is normal.
>>> If your host can reach the engine then check whether it can reach
>>> http://{fqdn}/ovirt-engine/services/health as that is what is used to
>>> make sure the engine is alive.
>>> Best regards
>>> Martin Sivak
>>> On Wed, Apr 25, 2018 at 2:15 PM, <dhy336 at sina.com> wrote:
>>>> Hi Martin,
>>>>
>>>> Thank you for the answer.
>>>> My host can reach the engine. I am confused about why the engine connects
>>>> to the other host, which I have powered off.
>>>>
>>>> ----- Original Message -----
>>>> From: Martin Sivak <msivak at redhat.com>
>>>> To: dhy336 <dhy336 at sina.com>, users <users at ovirt.org>
>>>> Subject: Re: Re: Re: Re: Re: Re: [ovirt-users] Reply: Re: Hosted-engine can
>>>> not_switch
>>>> Date: 2018-04-25 19:12
>>>>
>>>> It is as I expected:
>>>> Engine status : {"reason": "failed liveliness check"
>>>> The host can't talk to the ovirt-engine service. Please make sure the
>>>> host can reach the engine fqdn as configured in
>>>> /etc/ovirt-hosted-engine/hosted-engine.conf on the fqdn= line.
>>>> You can check it manually by executing $(hosted-engine
>>>> --check-liveliness) from the host.
>>>> Best regards
>>>> Martin Sivak
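
To avoid typing the FQDN by hand, it can be read from the fqdn= line mentioned above. This is a minimal Python sketch, assuming the file is plain key=value text; read_conf and engine_health_url are hypothetical helper names, not part of the hosted-engine tooling.

```python
def read_conf(text):
    """Parse simple key=value lines, ignoring blank lines and '#' comments."""
    conf = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        conf[key.strip()] = value.strip()
    return conf


def engine_health_url(conf_text):
    """Build the health URL from the fqdn= entry of the given config text."""
    fqdn = read_conf(conf_text).get("fqdn")
    if not fqdn:
        raise KeyError("no fqdn= line found")
    return "http://{}/ovirt-engine/services/health".format(fqdn)
```

On a host, the text would come from reading /etc/ovirt-hosted-engine/hosted-engine.conf; the resulting URL is the one to test from that same host.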
>>>> On Wed, Apr 25, 2018 at 12:51 PM, <dhy336 at sina.com> wrote:
>>>>> Hi,
>>>>>
>>>>> Two nodes:
>>>>> 192.168.122.66 hosted-engine1
>>>>> 192.168.122.223 hosted-engine2
>>>>>
>>>>> I powered off hosted-engine1, so I am not attaching hosted-engine1's log.
>>>>>
>>>>> [root at hosted-engine2 ~]# hosted-engine --vm-status
>>>>>
>>>>> --== Host 1 status ==--
>>>>>
>>>>> conf_on_shared_storage : True
>>>>> Status up-to-date : False
>>>>> Hostname : hosted-engine1
>>>>> Host ID : 1
>>>>> Engine status : unknown stale-data
>>>>> Score : 3400
>>>>> stopped : False
>>>>> Local maintenance : False
>>>>> crc32 : a7af0afa
>>>>> local_conf_timestamp : 11485
>>>>> Host timestamp : 11485
>>>>> Extra metadata (valid at timestamp):
>>>>> metadata_parse_version=1
>>>>> metadata_feature_version=1
>>>>> timestamp=11485 (Wed Apr 25 10:08:34 2018)
>>>>> host-id=1
>>>>> score=3400
>>>>> vm_conf_refresh_time=11485 (Wed Apr 25 10:08:34 2018)
>>>>> conf_on_shared_storage=True
>>>>> maintenance=False
>>>>> state=EngineUp
>>>>> stopped=False
>>>>>
>>>>>
>>>>> --== Host 2 status ==--
>>>>>
>>>>> conf_on_shared_storage : True
>>>>> Status up-to-date : True
>>>>> Hostname : hosted-engine2
>>>>> Host ID : 2
>>>>> Engine status : {"reason": "failed liveliness check",
>>>>> "health": "bad", "vm": "up", "detail": "Up"}
>>>>> Score : 3000
>>>>> stopped : False
>>>>> Local maintenance : False
>>>>> crc32 : a2e82883
>>>>> local_conf_timestamp : 6278
>>>>> Host timestamp : 6278
>>>>> Extra metadata (valid at timestamp):
>>>>> metadata_parse_version=1
>>>>> metadata_feature_version=1
>>>>> timestamp=6278 (Wed Apr 25 10:37:44 2018)
>>>>> host-id=2
>>>>> score=3000
>>>>> vm_conf_refresh_time=6278 (Wed Apr 25 10:37:44 2018)
>>>>> conf_on_shared_storage=True
>>>>> maintenance=False
>>>>> state=EngineStop
>>>>> stopped=False
>>>>> timeout=Thu Jan 1 09:49:38 1970
>>>>>
>>>>>
>>>>>
>>>>> ----- Original Message -----
>>>>> From: Martin Sivak <msivak at redhat.com>
>>>>> To: dhy336 <dhy336 at sina.com>, users <users at ovirt.org>
>>>>> Subject: Re: Re: Re: Re: Re: [ovirt-users] Reply: Re: Hosted-engine can
>>>>> not_switch
>>>>> Date: 2018-04-25 17:41
>>>>>
>>>>>
>>>>> Please attach the output of hosted-engine --vm-status and the
>>>>> /var/log/ovirt-hosted-engine-ha/agent.log file from both hosts.
>>>>> The VM will restart if the ovirt-engine service does not become
>>>>> available within the timeout. That might mean a couple of things: the
>>>>> FQDN of the engine is wrong, the engine needs something that was only
>>>>> available on the dead host (A), like some storage, or host B cannot ping
>>>>> the gateway.
>>>>> Best regards
>>>>> Martin Sivak
>>>>> On Wed, Apr 25, 2018 at 11:33 AM, <dhy336 at sina.com> wrote:
>>>>>> Sorry, I misstated things.
>>>>>>
>>>>>> I have two nodes, A: 192.168.122.65 and B: 192.168.122.66, with
>>>>>> hosted-engine.
>>>>>>
>>>>>> Testing engine HA:
>>>>>>
>>>>>> At first both nodes are up and the hosted-engine VM runs on A. Then I
>>>>>> power off A, and after 3 minutes B starts its hosted-engine VM.
>>>>>> But its ovirt-engine keeps trying to connect to host A for about 10
>>>>>> minutes, and then the hosted-engine VM restarts.
>>>>>> ----- Original Message -----
>>>>>> From: Martin Sivak <msivak at redhat.com>
>>>>>> To: dhy336 <dhy336 at sina.com>
>>>>>> Subject: Re: Re: Re: Re: [ovirt-users] Reply: Re: Hosted-engine can
>>>>>> not_switch
>>>>>> Date: 2018-04-25 17:11
>>>>>>
>>>>>>
>>>>>> Your hosted engine VM has its own address that does not depend on
>>>>>> which host it is currently running. So it should be available on the
>>>>>> same address no matter where the VM is running.
>>>>>> Best regards
>>>>>> Martin Sivak
>>>>>> On Wed, Apr 25, 2018 at 9:07 AM, <dhy336 at sina.com> wrote:
>>>>>>>>> I deployed two nodes for hosted engine. At first the hosted-engine VM
>>>>>>>>> ran on 192.168.122.65. I powered off this host and the hosted-engine VM
>>>>>>>>> switched to the other host, but ovirt-engine still connects to
>>>>>>>>> 192.168.122.65. If I restart the ovirt-engine server, it works.
>>>>>>>
>>>>>>> I think this is an error: the hosted-engine VM has powered up on the
>>>>>>> other host (192.168.122.66), so the hosted engine should connect to
>>>>>>> host (192.168.122.66), not to host (192.168.122.65)?
>>>>>>>
>>>>>>> thanks
>>>>>>>
>>>>>>> ----- Original Message -----
>>>>>>> From: Martin Sivak <msivak at redhat.com>
>>>>>>> To: dhy336 <dhy336 at sina.com>
>>>>>>> Cc: users <users at ovirt.org>
>>>>>>> Subject: Re: Re: Re: [ovirt-users] Reply: Re: Hosted-engine can
>>>>>>> not_switch
>>>>>>> Date: 2018-04-20 18:28
>>>>>>>
>>>>>>>
>>>>>>> Hi,
>>>>>>> No, this is not an error. You killed the host without moving it to
>>>>>>> maintenance first. The engine has no way to distinguish this from
>>>>>>> temporary network failure for example. Give it some time and the host
>>>>>>> will move its status to one of the error states and handle the highly
>>>>>>> available VMs on it (if fencing is properly configured).
>>>>>>> Best regards
>>>>>>> Martin Sivak
>>>>>>> On Fri, Apr 20, 2018 at 12:13 PM, <dhy336 at sina.com> wrote:
>>>>>>>> So this behavior is not an error?
>>>>>>>> ----- Original Message -----
>>>>>>>> From: Martin Sivak <msivak at redhat.com>
>>>>>>>> To: dhy336 <dhy336 at sina.com>
>>>>>>>> Cc: users <users at ovirt.org>
>>>>>>>> Subject: Re: Re: [ovirt-users] Reply: Re: Hosted-engine can not_switch
>>>>>>>> Date: 2018-04-20 18:05
>>>>>>>>
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>> the engine does not know you killed the host. It will notice
>>>>>>>> eventually and handle the situation. Just give it time (5 minutes or
>>>>>>>> so).
>>>>>>>> Best regards
>>>>>>>> --
>>>>>>>> Martin Sivak
>>>>>>>> SLA / oVirt
>>>>>>>> On Fri, Apr 20, 2018 at 12:00 PM, <dhy336 at sina.com> wrote:
>>>>>>>>> Hi, thanks for your feedback. I have another question.
>>>>>>>>>
>>>>>>>>> I deployed two nodes for hosted engine. At first the hosted-engine VM
>>>>>>>>> ran on 192.168.122.65. I powered off this host and the hosted-engine VM
>>>>>>>>> switched to the other host, but ovirt-engine still connects to
>>>>>>>>> 192.168.122.65. If I restart the ovirt-engine server, it works.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 2018-04-20 17:13:04,692+08 ERROR
>>>>>>>>> [org.ovirt.engine.core.vdsbroker.vdsbroker.GetAllVmStatsVDSCommand]
>>>>>>>>> (EE-ManagedThreadFactory-engineScheduled-Thread-98) [] Command
>>>>>>>>> 'GetAllVmStatsVDSCommand(HostName = hosted-engine2,
>>>>>>>>> VdsIdVDSCommandParametersBase:{hostId='a5428ef7-9df6-4a86-91de-7e36fda340fa'})'
>>>>>>>>> execution failed: java.net.NoRouteToHostException: No route to host
>>>>>>>>> 2018-04-20 17:13:04,693+08 INFO
>>>>>>>>> [org.ovirt.engine.core.vdsbroker.monitoring.PollVmStatsRefresher]
>>>>>>>>> (EE-ManagedThreadFactory-engineScheduled-Thread-98) [] Failed to fetch
>>>>>>>>> vms info for host 'hosted-engin2' - skipping VMs monitoring.
>>>>>>>>> 2018-04-20 17:13:19,710+08 INFO
>>>>>>>>> [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor)
>>>>>>>>> [] Connecting to hosted-engine2/192.168.122.65
>>>>>>>>> 2018-04-20 17:13:22,730+08 ERROR
>>>>>>>>> [org.ovirt.engine.core.vdsbroker.vdsbroker.GetAllVmStatsVDSCommand]
>>>>>>>>> (EE-ManagedThreadFactory-engineScheduled-Thread-45) [] Command
>>>>>>>>> 'GetAllVmStatsVDSCommand(HostName = hosted-engine-tchyp2,
>>>>>>>>> VdsIdVDSCommandParametersBase:{hostId='a5428ef7-9df6-4a86-91de-7e36fda340fa'})'
>>>>>>>>> execution failed: java.net.NoRouteToHostException: No route to host
>>>>>>>>> 2018-04-20 17:13:22,732+08 INFO
>>>>>>>>> [org.ovirt.engine.core.vdsbroker.monitoring.PollVmStatsRefresher]
>>>>>>>>> (EE-ManagedThreadFactory-engineScheduled-Thread-45) [] Failed to fetch
>>>>>>>>> vms info for host 'hosted-engine2' - skipping VMs monitoring.
>>>>>>>>>
>>>>>>>>> ----- Original Message -----
>>>>>>>>> From: Martin Sivak <msivak at redhat.com>
>>>>>>>>> To: dhy336 <dhy336 at sina.com>
>>>>>>>>> Cc: users <users at ovirt.org>
>>>>>>>>> Subject: Re: [ovirt-users] Reply: Re: Hosted-engine can not_switch
>>>>>>>>> Date: 2018-04-20 16:40
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>> your ovirt-hosted-engine-ha package is too old. You need at least
>>>>>>>>> 2.1.9 to properly support 4.2 engine. The same applies to vdsm.
>>>>>>>>> Please
>>>>>>>>> upgrade the node.
>>>>>>>>> Best regards
>>>>>>>>> Martin Sivak
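
The version requirement above is easy to get wrong when comparing by eye or as plain strings. This is a minimal Python sketch of a numeric comparison, assuming purely dotted numeric versions (no release suffixes); version_tuple and meets_minimum are hypothetical helper names.

```python
def version_tuple(version):
    """Turn a dotted numeric version such as '2.1.9' into (2, 1, 9)."""
    return tuple(int(part) for part in version.split("."))


def meets_minimum(installed, minimum):
    """True if the installed version is at least the minimum, compared numerically."""
    return version_tuple(installed) >= version_tuple(minimum)
```

With the numbers from this thread, meets_minimum("2.1.5", "2.1.9") is False, so the installed ovirt-hosted-engine-ha 2.1.5 is below the stated minimum. Comparing as tuples also handles cases such as 2.1.10 versus 2.1.9 that a plain string comparison would order incorrectly.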
>>>>>>>>> On Fri, Apr 20, 2018 at 3:58 AM, <dhy336 at sina.com> wrote:
>>>>>>>>>> Hi, I found some error logs in
>>>>>>>>>> /var/log/ovirt-hosted-engine-ha/broker.
>>>>>>>>>>
>>>>>>>>>> [root at hosted-engine2 ~]# ll /rhev/data-center/mnt
>>>>>>>>>> total 0
>>>>>>>>>> drwxr-xr-x. 3 vdsm kvm 76 Apr 18 22:28 192.168.122.218:_exports_data
>>>>>>>>>> drwxr-xr-x. 3 vdsm kvm 76 Apr 18 22:12 192.168.122.218:_exports_hosted-engine-test1
>>>>>>>>>> [root at hosted-engine2 ~]# ll /rhev/data-center/mnt/192.168.122.218\:_exports_hosted-engine-test1/
>>>>>>>>>> total 0
>>>>>>>>>> drwxr-xr-x. 5 vdsm kvm 50 Apr 18 22:14 8a734205-65b7-4801-b7f0-d380eb45dbae
>>>>>>>>>> -rwxr-xr-x. 1 vdsm kvm 0 Apr 20 09:54 __DIRECT_IO_TEST__
>>>>>>>>>>
>>>>>>>>>> The UUID 8a734205-65b7-4801-b7f0-d380eb45dbae is in
>>>>>>>>>> /rhev/data-center/mnt/192.168.122.218\:_exports_hosted-engine-test1/
>>>>>>>>>> but the broker looks for it in /rhev/data-center/mnt. Is my version
>>>>>>>>>> wrong? My ovirt-hosted-engine-ha version is 2.1.5, vdsm is 4.20.5, and
>>>>>>>>>> ovirt-engine is 4.2.
>>>>>>>>>>
>>>>>>>>>> MainThread::INFO::2018-04-19 19:26:31,479::listener::41::ovirt_hosted_engine_ha.broker.listener.Listener::(__init__) Initializing SocketServer
>>>>>>>>>> MainThread::INFO::2018-04-19 19:26:31,480::listener::56::ovirt_hosted_engine_ha.broker.listener.Listener::(__init__) SocketServer ready
>>>>>>>>>> Thread-1::INFO::2018-04-19 19:26:31,558::listener::134::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(setup) Connection established
>>>>>>>>>> Thread-1::ERROR::2018-04-19 19:26:31,559::listener::192::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle) Error handling request, data: 'set-storage-domain FilesystemBackend dom_type=nfs3 sd_uuid=8a734205-65b7-4801-b7f0-d380eb45dbae'
>>>>>>>>>> Traceback (most recent call last):
>>>>>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/listener.py", line 166, in handle
>>>>>>>>>>     data)
>>>>>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/listener.py", line 299, in _dispatch
>>>>>>>>>>     .set_storage_domain(client, sd_type, **options)
>>>>>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py", line 66, in set_storage_domain
>>>>>>>>>>     self._backends[client].connect()
>>>>>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/storage_backends.py", line 462, in connect
>>>>>>>>>>     self._dom_type)
>>>>>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/storage_backends.py", line 107, in get_domain_path
>>>>>>>>>>     " in {1}".format(sd_uuid, parent))
>>>>>>>>>> BackendFailureException: path to storage domain 8a734205-65b7-4801-b7f0-d380eb45dbae not found in /rhev/data-center/mnt
>>>>>>>>>> Thread-1::INFO::2018-04-19 19:26:31,563::listener::186::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle) Connection closed
>>>>>>>>>> Thread-2::INFO::2018-04-19 19:26:44,601::listener::134::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(setup) Connection established
>>>>>>>>>>
>>>>>>>>>> ----- Original Message -----
>>>>>>>>>> From: <dhy336 at sina.com>
>>>>>>>>>> To: "Martin Sivak" <msivak at redhat.com>
>>>>>>>>>> Cc: users <users at ovirt.org>
>>>>>>>>>> Subject: [ovirt-users] Reply: Re: Hosted-engine can not_switch
>>>>>>>>>> Date: 2018-04-20 09:30
>>>>>>>>>>
>>>>>>>>>> libvirt has no error logs. I only found some errors in vdsm.
>>>>>>>>>> The vdsm log is:
>>>>>>>>>> 2018-04-20 09:24:52,610+0800 INFO (jsonrpc/1) [vdsm.api] FINISH getVolumeInfo return={'info': {'status': 'OK', 'domain': '8a734205-65b7-4801-b7f0-d380eb45dbae', 'voltype': 'LEAF', 'description': 'hosted-engine.lockspace', 'parent': '00000000-0000-0000-0000-000000000000', 'format': 'RAW', 'generation': 0, 'image': '611272bd-c2cc-42bc-94e2-9aa52e754c35', 'ctime': '1524032037', 'disktype': '2', 'legality': 'LEGAL', 'mtime': '0', 'apparentsize': '1048576', 'children': [], 'pool': '', 'capacity': '1048576', 'uuid': u'7037aac6-7c8e-4efd-82f7-ca618c953fe6', 'truesize': '1048576', 'type': 'PREALLOCATED', 'lease': {'owners': [], 'version': None}}} from=::1,48306, task_id=03a7938e-8afb-4b16-b8dd-126c2b1f5d52 (api:52)
>>>>>>>>>> 2018-04-20 09:24:52,611+0800 INFO (jsonrpc/1) [jsonrpc.JsonRpcServer] RPC call Volume.getInfo succeeded in 0.03 seconds (__init__:630)
>>>>>>>>>> 2018-04-20 09:24:54,113+0800 ERROR (periodic/3) [virt.periodic.Operation] <vdsm.virt.sampling.VMBulkstatsMonitor object at 0x1e92f90> operation failed (periodic:215)
>>>>>>>>>> Traceback (most recent call last):
>>>>>>>>>>   File "/usr/lib/python2.7/site-packages/vdsm/virt/periodic.py", line 213, in __call__
>>>>>>>>>>     self._func()
>>>>>>>>>>   File "/usr/lib/python2.7/site-packages/vdsm/virt/sampling.py", line 522, in __call__
>>>>>>>>>>     self._send_metrics()
>>>>>>>>>>   File "/usr/lib/python2.7/site-packages/vdsm/virt/sampling.py", line 538, in _send_metrics
>>>>>>>>>>     vm_sample.interval)
>>>>>>>>>>   File "/usr/lib/python2.7/site-packages/vdsm/virt/vmstats.py", line 45, in produce
>>>>>>>>>>     networks(vm, stats, first_sample, last_sample, interval)
>>>>>>>>>>   File "/usr/lib/python2.7/site-packages/vdsm/virt/vmstats.py", line 322, in networks
>>>>>>>>>>     if nic.name.startswith('hostdev'):
>>>>>>>>>> AttributeError: name
>>>>>>>>>> 2018-04-20 09:24:54,800+0800 INFO (Reactor thread) [ProtocolDetector.AcceptorImpl] Accepted connection from ::1:48308 (protocoldetector:61)
>>>>>>>>>> 2018-04-20 09:24:54,810+0800 INFO (Reactor thread) [ProtocolDetector.Detector] Detected protocol stomp from ::1:48308 (protocoldetector:125)
>>>>>>>>>> 2018-04-20 09:24:54,810+0800 INFO (Reactor thread) [Broker.StompAdapter] Processing CONNECT request (stompreactor:103)
>>>>>>>>>> 2018-04-20 09:24:54,818+0800 INFO (JsonRpc (StompReactor)) [Broker.StompAdapter] Subscribe command received (stompreactor:132)
>>>>>>>>>> 2018-04-20 09:24:55,119+0800 INFO (jsonrpc/6) [api.host] START getHardwareInfo() from=::1,48308 (api:46)
>>>>>>>>>>
>>>>>>>>>> ----- Original Message -----
>>>>>>>>>> From: Martin Sivak <msivak at redhat.com>
>>>>>>>>>> To: dhy336 <dhy336 at sina.com>
>>>>>>>>>> Cc: users <users at ovirt.org>
>>>>>>>>>> Subject: Re: [ovirt-users] Hosted-engine can not switch
>>>>>>>>>> Date: 2018-04-19 20:16
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> We need more than just this small log snippet. Please check the
>>>>>>>>>> vdsm
>>>>>>>>>> and libvirt logs as well.
>>>>>>>>>> Best regards
>>>>>>>>>> Martin Sivak
>>>>>>>>>> On Thu, Apr 19, 2018 at 2:05 PM, <dhy336 at sina.com> wrote:
>>>>>>>>>>> Hi,
>>>>>>>>>>> I deployed three nodes with hosted engine. I force shut down the node
>>>>>>>>>>> on which the hosted-engine VM was running, but the hosted-engine VM
>>>>>>>>>>> cannot run on the other nodes.
>>>>>>>>>>>
>>>>>>>>>>> I found some errors in /var/log/ovirt-hosted-engine-ha/agent.log:
>>>>>>>>>>>
>>>>>>>>>>> MainThread::INFO::2018-04-19 19:56:35,787::hosted_engine::1192::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_clean_vdsm_state) Cleaning state for non-running VM
>>>>>>>>>>> MainThread::INFO::2018-04-19 19:56:42,587::hosted_engine::1176::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_clean_vdsm_state) Vdsm state for VM clean
>>>>>>>>>>> MainThread::INFO::2018-04-19 19:56:42,589::hosted_engine::1125::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_start_engine_vm) Starting vm using `/usr/sbin/hosted-engine --vm-start`
>>>>>>>>>>> MainThread::INFO::2018-04-19 19:56:47,599::hosted_engine::1131::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_start_engine_vm) stdout:
>>>>>>>>>>> MainThread::INFO::2018-04-19 19:56:47,600::hosted_engine::1132::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_start_engine_vm) stderr: Virtual machine does not exist: {'vmId': u'08bbd680-a8a7-4267-82e7-89f36e87e930'}
>>>>>>>>>>>
>>>>>>>>>>> MainThread::INFO::2018-04-19 19:56:47,600::hosted_engine::1144::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_start_engine_vm) Engine VM started on localhost
>>>>>>>>>>> MainThread::INFO::2018-04-19 19:56:47,609::brokerlink::111::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Trying: notify time=1524139007.61 type=state_transition detail=EngineStart-EngineStarting hostname='hosted-engine2'
>>>>>>>>>>> MainThread::INFO::2018-04-19 19:56:47,670::brokerlink::121::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Success, was notification of state_transition (EngineStart-EngineStarting) sent? sent
>>>>>>>>>>> MainThread::INFO::2018-04-19 19:56:47,670::hosted_engine::604::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_vdsm) Initializing VDSM
>>>>>>>>>>> MainThread::INFO::2018-04-19 19:56:50,095::hosted_engine::630::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Connecting the storage
>>>>>>>>>>> MainThread::INFO::2018-04-19 19:56:50,096::storage_server::220::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(validate_storage_server) Validating storage server
>>>>>>>>>>> MainThread::INFO::2018-04-19 19:56:52,449::hosted_engine::639::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Storage domain reported as valid and reconnect is not forced.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> Users mailing list
>>>>>>>>>>> Users at ovirt.org
>>>>>>>>>>> http://lists.ovirt.org/mailman/listinfo/users
>>>>>>>>>>>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: server.log
Type: application/octet-stream
Size: 750283 bytes
Desc: not available
URL: <http://lists.ovirt.org/pipermail/users/attachments/20180426/f85d395d/attachment.obj>

