[ovirt-users] hosted-engine unknown stale-data

Kasturi Narra knarra at redhat.com
Mon Jan 22 06:46:15 UTC 2018


Hello Artem,

    Is there any reason why you chose the hosted-engine UNDEPLOY action for the
second host? I see that the cluster is in global maintenance mode; was this
intended?

The command to clear the stale entries from hosted-engine --vm-status is "hosted-engine
--clean-metadata --host-id=<old_host_id> --force-clean".
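
For example, a minimal sequence (assuming host 2 is the stale entry, as in
your output below, and running the commands on the surviving host):

  hosted-engine --set-maintenance --mode=global
  hosted-engine --clean-metadata --host-id=2 --force-clean
  hosted-engine --vm-status

Afterwards hosted-engine --vm-status should list only the remaining host, and
you can then leave global maintenance with "hosted-engine --set-maintenance
--mode=none".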

Hope this helps !!

Thanks
kasturi


On Fri, Jan 19, 2018 at 12:07 AM, Artem Tambovskiy <
artem.tambovskiy at gmail.com> wrote:

> Hi,
>
> OK, I decided to remove the second host from the cluster.
> I reinstalled it from the web UI with the hosted-engine action UNDEPLOY, and
> removed it from the cluster afterwards.
> All VMs are fine and the hosted engine is running OK,
> but hosted-engine --vm-status is still showing 2 hosts.
>
> How can I clean up the traces of the second host in the correct way?
>
>
> --== Host 1 status ==--
>
> conf_on_shared_storage             : True
> Status up-to-date                  : True
> Hostname                           : ovirt1.telia.ru
> Host ID                            : 1
> Engine status                      : {"health": "good", "vm": "up",
> "detail": "up"}
> Score                              : 3400
> stopped                            : False
> Local maintenance                  : False
> crc32                              : 1b1b6f6d
> local_conf_timestamp               : 545385
> Host timestamp                     : 545385
> Extra metadata (valid at timestamp):
>         metadata_parse_version=1
>         metadata_feature_version=1
>         timestamp=545385 (Thu Jan 18 21:34:25 2018)
>         host-id=1
>         score=3400
>         vm_conf_refresh_time=545385 (Thu Jan 18 21:34:25 2018)
>         conf_on_shared_storage=True
>         maintenance=False
>         state=GlobalMaintenance
>         stopped=False
>
>
> --== Host 2 status ==--
>
> conf_on_shared_storage             : True
> Status up-to-date                  : False
> Hostname                           : ovirt1.telia.ru
> Host ID                            : 2
> Engine status                      : unknown stale-data
> Score                              : 0
> stopped                            : True
> Local maintenance                  : False
> crc32                              : c7037c03
> local_conf_timestamp               : 7530
> Host timestamp                     : 7530
> Extra metadata (valid at timestamp):
>         metadata_parse_version=1
>         metadata_feature_version=1
>         timestamp=7530 (Fri Jan 12 16:10:12 2018)
>         host-id=2
>         score=0
>         vm_conf_refresh_time=7530 (Fri Jan 12 16:10:12 2018)
>         conf_on_shared_storage=True
>         maintenance=False
>         state=AgentStopped
>         stopped=True
>
>
> !! Cluster is in GLOBAL MAINTENANCE mode !!
>
> Thank you in advance!
> Regards,
> Artem
>
>
> On Wed, Jan 17, 2018 at 6:47 PM, Artem Tambovskiy <
> artem.tambovskiy at gmail.com> wrote:
>
>> Hello,
>>
>> Any further suggestions on how to fix the issue and get the HA setup
>> working? Would completely removing the second host from the cluster (with
>> complete removal of the oVirt configuration files and packages) and adding
>> it again solve the issue? Or might it completely ruin the cluster?
>>
>> Regards,
>> Artem
>>
>> On 16 Jan 2018 at 17:00, "Artem Tambovskiy" <
>> artem.tambovskiy at gmail.com> wrote:
>>
>> Hi Martin,
>>>
>>> Thanks for feedback.
>>>
>>> All hosts and the hosted engine are running the 4.1.8 release.
>>> The strange thing: I can see that the host ID is set to 1 on both hosts in
>>> the /etc/ovirt-hosted-engine/hosted-engine.conf file.
>>> I have no idea how this happened; the only thing I have changed recently
>>> is the mnt_options, in order to add backup-volfile-servers
>>> by using the hosted-engine --set-shared-config command.
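>>>
>>> For reference, the command was along these lines (the backup servers here
>>> are my other two gluster nodes; exact values may differ):
>>>
>>> hosted-engine --set-shared-config mnt_options backup-volfile-servers=ovirt2.telia.ru:ovirt3.telia.ru --type=he_shared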
>>>
>>> Both the agent and the broker are running on the second host:
>>>
>>> [root at ovirt2 ovirt-hosted-engine-ha]# ps -ef | grep ovirt-ha-
>>> vdsm      42331      1 26 14:40 ?        00:31:35 /usr/bin/python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker --no-daemon
>>> vdsm      42332      1  0 14:40 ?        00:00:16 /usr/bin/python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-agent --no-daemon
>>>
>>> but I saw some tracebacks during the broker start:
>>>
>>> [root at ovirt2 ovirt-hosted-engine-ha]# systemctl status ovirt-ha-broker -l
>>> ● ovirt-ha-broker.service - oVirt Hosted Engine High Availability Communications Broker
>>>    Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-broker.service; enabled; vendor preset: disabled)
>>>    Active: active (running) since Tue 2018-01-16 14:40:15 MSK; 1h 58min ago
>>>  Main PID: 42331 (ovirt-ha-broker)
>>>    CGroup: /system.slice/ovirt-ha-broker.service
>>>            └─42331 /usr/bin/python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker --no-daemon
>>>
>>> Jan 16 14:40:15 ovirt2.telia.ru systemd[1]: Started oVirt Hosted Engine High Availability Communications Broker.
>>> Jan 16 14:40:15 ovirt2.telia.ru systemd[1]: Starting oVirt Hosted Engine High Availability Communications Broker...
>>> Jan 16 14:40:16 ovirt2.telia.ru ovirt-ha-broker[42331]: ovirt-ha-broker ovirt_hosted_engine_ha.broker.listener.ConnectionHandler ERROR Error handling request, data: 'set-storage-domain FilesystemBackend dom_type=glusterfs sd_uuid=4a7f8717-9bb0-4d80-8016-498fa4b88162'
>>> Traceback (most recent call last):
>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/listener.py", line 166, in handle
>>>     data)
>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/listener.py", line 299, in _dispatch
>>>     .set_storage_domain(client, sd_type, **options)
>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py", line 66, in set_storage_domain
>>>     self._backends[client].connect()
>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/storage_backends.py", line 462, in connect
>>>     self._dom_type)
>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/storage_backends.py", line 107, in get_domain_path
>>>     " in {1}".format(sd_uuid, parent))
>>> BackendFailureException: path to storage domain 4a7f8717-9bb0-4d80-8016-498fa4b88162 not found in /rhev/data-center/mnt/glusterSD
>>>
>>>
>>>
>>> I have tried issuing hosted-engine --connect-storage on the second host,
>>> followed by an agent and broker restart,
>>> but there is no visible improvement.
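>>>
>>> For the record, what I ran on the second host was roughly:
>>>
>>> hosted-engine --connect-storage
>>> systemctl restart ovirt-ha-broker ovirt-ha-agent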
>>>
>>> Regards,
>>> Artem
>>>
>>> On Tue, Jan 16, 2018 at 4:18 PM, Martin Sivak <msivak at redhat.com> wrote:
>>>
>>>> Hi everybody,
>>>>
>>>> there are a couple of things to check here.
>>>>
>>>> - what version of the hosted engine agent is this? The logs look like
>>>> they are coming from 4.1
>>>> - what version of engine is used?
>>>> - check the host ID in /etc/ovirt-hosted-engine/hosted-engine.conf on
>>>> both hosts, the numbers must be different
>>>> - it looks like the agent or broker on host 2 is not active (or there
>>>> would be a report)
>>>> - the second host does not see data from the first host (unknown
>>>> stale-data), wait for a minute and check again, then check the storage
>>>> connection
>>>>
>>>> And then the general troubleshooting:
>>>>
>>>> - put hosted engine in global maintenance mode (and check that it is
>>>> visible from the other host using he --vm-status)
>>>> - mount storage domain (hosted-engine --connect-storage)
>>>> - check sanlock client status to see if proper lockspaces are present
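>>>>
>>>> Concretely, that is something like the following (default paths assumed):
>>>>
>>>> grep host_id /etc/ovirt-hosted-engine/hosted-engine.conf   # on each host; must differ
>>>> hosted-engine --set-maintenance --mode=global
>>>> hosted-engine --vm-status                                  # from both hosts
>>>> hosted-engine --connect-storage
>>>> sanlock client status                                      # look for the hosted-engine lockspace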
>>>>
>>>> Best regards
>>>>
>>>> Martin Sivak
>>>>
>>>> On Tue, Jan 16, 2018 at 1:16 PM, Derek Atkins <derek at ihtfp.com> wrote:
>>>> > Why are both hosts reporting as ovirt1?
>>>> > Look at the hostname fields to see what I mean.
>>>> >
>>>> > -derek
>>>> > Sent using my mobile device. Please excuse any typos.
>>>> >
>>>> > On January 16, 2018 7:11:09 AM Artem Tambovskiy <artem.tambovskiy at gmail.com> wrote:
>>>> >>
>>>> >> Hello,
>>>> >>
>>>> >> Yes, I followed exactly the same procedure while reinstalling the hosts
>>>> >> (the only difference is that I have an SSH key configured instead of a
>>>> >> password).
>>>> >>
>>>> >> I just reinstalled the second host one more time; after 20 min the host
>>>> >> still hasn't reached the active score of 3400 (Hosted Engine HA: Not
>>>> >> Active) and I still don't see the crown icon for this host.
>>>> >>
>>>> >> hosted-engine --vm-status from the ovirt1 host:
>>>> >>
>>>> >> [root at ovirt1 ~]# hosted-engine --vm-status
>>>> >>
>>>> >>
>>>> >> --== Host 1 status ==--
>>>> >>
>>>> >> conf_on_shared_storage             : True
>>>> >> Status up-to-date                  : True
>>>> >> Hostname                           : ovirt1.telia.ru
>>>> >> Host ID                            : 1
>>>> >> Engine status                      : {"health": "good", "vm": "up",
>>>> >> "detail": "up"}
>>>> >> Score                              : 3400
>>>> >> stopped                            : False
>>>> >> Local maintenance                  : False
>>>> >> crc32                              : 3f94156a
>>>> >> local_conf_timestamp               : 349144
>>>> >> Host timestamp                     : 349144
>>>> >> Extra metadata (valid at timestamp):
>>>> >>         metadata_parse_version=1
>>>> >>         metadata_feature_version=1
>>>> >>         timestamp=349144 (Tue Jan 16 15:03:45 2018)
>>>> >>         host-id=1
>>>> >>         score=3400
>>>> >>         vm_conf_refresh_time=349144 (Tue Jan 16 15:03:45 2018)
>>>> >>         conf_on_shared_storage=True
>>>> >>         maintenance=False
>>>> >>         state=EngineUp
>>>> >>         stopped=False
>>>> >>
>>>> >>
>>>> >> --== Host 2 status ==--
>>>> >>
>>>> >> conf_on_shared_storage             : True
>>>> >> Status up-to-date                  : False
>>>> >> Hostname                           : ovirt1.telia.ru
>>>> >> Host ID                            : 2
>>>> >> Engine status                      : unknown stale-data
>>>> >> Score                              : 0
>>>> >> stopped                            : True
>>>> >> Local maintenance                  : False
>>>> >> crc32                              : c7037c03
>>>> >> local_conf_timestamp               : 7530
>>>> >> Host timestamp                     : 7530
>>>> >> Extra metadata (valid at timestamp):
>>>> >>         metadata_parse_version=1
>>>> >>         metadata_feature_version=1
>>>> >>         timestamp=7530 (Fri Jan 12 16:10:12 2018)
>>>> >>         host-id=2
>>>> >>         score=0
>>>> >>         vm_conf_refresh_time=7530 (Fri Jan 12 16:10:12 2018)
>>>> >>         conf_on_shared_storage=True
>>>> >>         maintenance=False
>>>> >>         state=AgentStopped
>>>> >>         stopped=True
>>>> >>
>>>> >>
>>>> >> hosted-engine --vm-status output from ovirt2 host
>>>> >>
>>>> >> [root at ovirt2 ovirt-hosted-engine-ha]# hosted-engine --vm-status
>>>> >>
>>>> >>
>>>> >> --== Host 1 status ==--
>>>> >>
>>>> >> conf_on_shared_storage             : True
>>>> >> Status up-to-date                  : False
>>>> >> Hostname                           : ovirt1.telia.ru
>>>> >> Host ID                            : 1
>>>> >> Engine status                      : unknown stale-data
>>>> >> Score                              : 3400
>>>> >> stopped                            : False
>>>> >> Local maintenance                  : False
>>>> >> crc32                              : 6d3606f1
>>>> >> local_conf_timestamp               : 349264
>>>> >> Host timestamp                     : 349264
>>>> >> Extra metadata (valid at timestamp):
>>>> >>         metadata_parse_version=1
>>>> >>         metadata_feature_version=1
>>>> >>         timestamp=349264 (Tue Jan 16 15:05:45 2018)
>>>> >>         host-id=1
>>>> >>         score=3400
>>>> >>         vm_conf_refresh_time=349264 (Tue Jan 16 15:05:45 2018)
>>>> >>         conf_on_shared_storage=True
>>>> >>         maintenance=False
>>>> >>         state=EngineUp
>>>> >>         stopped=False
>>>> >>
>>>> >>
>>>> >> --== Host 2 status ==--
>>>> >>
>>>> >> conf_on_shared_storage             : True
>>>> >> Status up-to-date                  : False
>>>> >> Hostname                           : ovirt1.telia.ru
>>>> >> Host ID                            : 2
>>>> >> Engine status                      : unknown stale-data
>>>> >> Score                              : 0
>>>> >> stopped                            : True
>>>> >> Local maintenance                  : False
>>>> >> crc32                              : c7037c03
>>>> >> local_conf_timestamp               : 7530
>>>> >> Host timestamp                     : 7530
>>>> >> Extra metadata (valid at timestamp):
>>>> >>         metadata_parse_version=1
>>>> >>         metadata_feature_version=1
>>>> >>         timestamp=7530 (Fri Jan 12 16:10:12 2018)
>>>> >>         host-id=2
>>>> >>         score=0
>>>> >>         vm_conf_refresh_time=7530 (Fri Jan 12 16:10:12 2018)
>>>> >>         conf_on_shared_storage=True
>>>> >>         maintenance=False
>>>> >>         state=AgentStopped
>>>> >>         stopped=True
>>>> >>
>>>> >>
>>>> >> Also I saw some log messages in the web GUI about time drift, like
>>>> >> "Host ovirt2.telia.ru has time-drift of 5305 seconds while maximum
>>>> >> configured value is 300 seconds." That is a bit weird, as I haven't
>>>> >> touched any time settings since I installed the cluster.
>>>> >> Both hosts have the same time and timezone (MSK), but the hosted
>>>> >> engine lives in the UTC timezone. Is it mandatory to have everything
>>>> >> in sync and in the same timezone?
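>>>> >>
>>>> >> (I can double-check the clocks with something like "timedatectl" or
>>>> >> "chronyc tracking" on each host if that helps.)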
>>>> >>
>>>> >> Regards,
>>>> >> Artem
>>>> >>
>>>> >> On Tue, Jan 16, 2018 at 2:20 PM, Kasturi Narra <knarra at redhat.com>
>>>> wrote:
>>>> >>>
>>>> >>> Hello,
>>>> >>>
>>>> >>>      I now see that your hosted engine is up and running. Can you
>>>> >>> let me know how you tried reinstalling the host? Below is the
>>>> >>> procedure that is used; I hope you did not miss any step while
>>>> >>> reinstalling. If not, can you try reinstalling again and see if that
>>>> >>> works?
>>>> >>>
>>>> >>> 1) Move the host to maintenance
>>>> >>> 2) click on reinstall
>>>> >>> 3) provide the password
>>>> >>> 4) uncheck 'automatically configure host firewall'
>>>> >>> 5) click on 'Deploy' tab
>>>> >>> 6) click Hosted Engine deployment as 'Deploy'
>>>> >>>
>>>> >>> And once the host installation is done, wait till the active score of
>>>> >>> the host shows 3400 in the General tab, then check hosted-engine
>>>> >>> --vm-status.
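>>>> >>>
>>>> >>> (While waiting, one way to watch the score from the host itself is
>>>> >>> something like: watch -n10 hosted-engine --vm-status)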
>>>> >>>
>>>> >>> Thanks
>>>> >>> kasturi
>>>> >>>
>>>> >>> On Mon, Jan 15, 2018 at 4:57 PM, Artem Tambovskiy
>>>> >>> <artem.tambovskiy at gmail.com> wrote:
>>>> >>>>
>>>> >>>> Hello,
>>>> >>>>
>>>> >>>> I have uploaded two archives with all the relevant logs to shared hosting:
>>>> >>>> files from host 1 (which is currently running all VMs, including the
>>>> >>>> hosted_engine) - https://yadi.sk/d/PttRoYV63RTvhK
>>>> >>>> files from the second host - https://yadi.sk/d/UBducEsV3RTvhc
>>>> >>>>
>>>> >>>> I have tried restarting both ovirt-ha-agent and ovirt-ha-broker, but
>>>> >>>> it has no effect. I have also tried shutting down the hosted_engine
>>>> >>>> VM, stopping the ovirt-ha-agent and ovirt-ha-broker services, and
>>>> >>>> disconnecting the storage and connecting it again - no effect as well.
>>>> >>>> I also tried reinstalling the second host from the web GUI; this led
>>>> >>>> to an interesting situation: now hosted-engine --vm-status shows that
>>>> >>>> both hosts have the same address.
>>>> >>>>
>>>> >>>> [root at ovirt1 ~]# hosted-engine --vm-status
>>>> >>>>
>>>> >>>> --== Host 1 status ==--
>>>> >>>>
>>>> >>>> conf_on_shared_storage             : True
>>>> >>>> Status up-to-date                  : True
>>>> >>>> Hostname                           : ovirt1.telia.ru
>>>> >>>> Host ID                            : 1
>>>> >>>> Engine status                      : {"health": "good", "vm": "up",
>>>> >>>> "detail": "up"}
>>>> >>>> Score                              : 3400
>>>> >>>> stopped                            : False
>>>> >>>> Local maintenance                  : False
>>>> >>>> crc32                              : a7758085
>>>> >>>> local_conf_timestamp               : 259327
>>>> >>>> Host timestamp                     : 259327
>>>> >>>> Extra metadata (valid at timestamp):
>>>> >>>>         metadata_parse_version=1
>>>> >>>>         metadata_feature_version=1
>>>> >>>>         timestamp=259327 (Mon Jan 15 14:06:48 2018)
>>>> >>>>         host-id=1
>>>> >>>>         score=3400
>>>> >>>>         vm_conf_refresh_time=259327 (Mon Jan 15 14:06:48 2018)
>>>> >>>>         conf_on_shared_storage=True
>>>> >>>>         maintenance=False
>>>> >>>>         state=EngineUp
>>>> >>>>         stopped=False
>>>> >>>>
>>>> >>>>
>>>> >>>> --== Host 2 status ==--
>>>> >>>>
>>>> >>>> conf_on_shared_storage             : True
>>>> >>>> Status up-to-date                  : False
>>>> >>>> Hostname                           : ovirt1.telia.ru
>>>> >>>> Host ID                            : 2
>>>> >>>> Engine status                      : unknown stale-data
>>>> >>>> Score                              : 0
>>>> >>>> stopped                            : True
>>>> >>>> Local maintenance                  : False
>>>> >>>> crc32                              : c7037c03
>>>> >>>> local_conf_timestamp               : 7530
>>>> >>>> Host timestamp                     : 7530
>>>> >>>> Extra metadata (valid at timestamp):
>>>> >>>>         metadata_parse_version=1
>>>> >>>>         metadata_feature_version=1
>>>> >>>>         timestamp=7530 (Fri Jan 12 16:10:12 2018)
>>>> >>>>         host-id=2
>>>> >>>>         score=0
>>>> >>>>         vm_conf_refresh_time=7530 (Fri Jan 12 16:10:12 2018)
>>>> >>>>         conf_on_shared_storage=True
>>>> >>>>         maintenance=False
>>>> >>>>         state=AgentStopped
>>>> >>>>         stopped=True
>>>> >>>>
>>>> >>>> Gluster seems to be working fine; all gluster nodes are showing a
>>>> >>>> connected state.
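>>>> >>>>
>>>> >>>> (Checked with "gluster peer status" on each node; every peer shows
>>>> >>>> State: Peer in Cluster (Connected).)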
>>>> >>>>
>>>> >>>> Any advice on how to resolve this situation is highly appreciated!
>>>> >>>>
>>>> >>>> Regards,
>>>> >>>> Artem
>>>> >>>>
>>>> >>>>
>>>> >>>> On Mon, Jan 15, 2018 at 11:45 AM, Kasturi Narra <knarra at redhat.com> wrote:
>>>> >>>>>
>>>> >>>>> Hello Artem,
>>>> >>>>>
>>>> >>>>>         Can you check if the glusterd service is running on host1
>>>> >>>>> and all the peers are in a connected state? If yes, can you restart
>>>> >>>>> the ovirt-ha-agent and broker services and check if things are
>>>> >>>>> working fine?
>>>> >>>>>
>>>> >>>>> Thanks
>>>> >>>>> kasturi
>>>> >>>>>
>>>> >>>>> On Sat, Jan 13, 2018 at 12:33 AM, Artem Tambovskiy
>>>> >>>>> <artem.tambovskiy at gmail.com> wrote:
>>>> >>>>>>
>>>> >>>>>> I explored the logs on both hosts.
>>>> >>>>>> broker.log shows no errors.
>>>> >>>>>>
>>>> >>>>>> agent.log is not looking good.
>>>> >>>>>>
>>>> >>>>>> On host1 (which is running the hosted engine):
>>>> >>>>>> MainThread::ERROR::2018-01-12 21:51:03,883::agent::205::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Traceback (most recent call last):
>>>> >>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 191, in _run_agent
>>>> >>>>>>     return action(he)
>>>> >>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 64, in action_proper
>>>> >>>>>>     return he.start_monitoring()
>>>> >>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 411, in start_monitoring
>>>> >>>>>>     self._initialize_sanlock()
>>>> >>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 749, in _initialize_sanlock
>>>> >>>>>>     "Failed to initialize sanlock, the number of errors has"
>>>> >>>>>> SanlockInitializationError: Failed to initialize sanlock, the number of errors has exceeded the limit
>>>> >>>>>> MainThread::ERROR::2018-01-12 21:51:03,884::agent::206::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Trying to restart agent
>>>> >>>>>> MainThread::WARNING::2018-01-12 21:51:08,889::agent::209::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Restarting agent, attempt '1'
>>>> >>>>>> MainThread::INFO::2018-01-12 21:51:08,919::hosted_engine::242::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_hostname) Found certificate common name: ovirt1.telia.ru
>>>> >>>>>> MainThread::INFO::2018-01-12 21:51:08,921::hosted_engine::604::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_vdsm) Initializing VDSM
>>>> >>>>>> MainThread::INFO::2018-01-12 21:51:11,398::hosted_engine::630::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Connecting the storage
>>>> >>>>>> MainThread::INFO::2018-01-12 21:51:11,399::storage_server::220::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(validate_storage_server) Validating storage server
>>>> >>>>>> MainThread::INFO::2018-01-12 21:51:13,725::storage_server::239::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Connecting storage server
>>>> >>>>>> MainThread::INFO::2018-01-12 21:51:18,390::storage_server::246::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Connecting storage server
>>>> >>>>>> MainThread::INFO::2018-01-12 21:51:18,423::storage_server::253::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Refreshing the storage domain
>>>> >>>>>> MainThread::INFO::2018-01-12 21:51:18,689::hosted_engine::663::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Preparing images
>>>> >>>>>> MainThread::INFO::2018-01-12 21:51:18,690::image::126::ovirt_hosted_engine_ha.lib.image.Image::(prepare_images) Preparing images
>>>> >>>>>> MainThread::INFO::2018-01-12 21:51:21,895::hosted_engine::666::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Refreshing vm.conf
>>>> >>>>>> MainThread::INFO::2018-01-12 21:51:21,895::config::493::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_vm_conf) Reloading vm.conf from the shared storage domain
>>>> >>>>>> MainThread::INFO::2018-01-12 21:51:21,896::config::416::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store) Trying to get a fresher copy of vm configuration from the OVF_STORE
>>>> >>>>>> MainThread::INFO::2018-01-12 21:51:21,896::ovf_store::132::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF) Extracting Engine VM OVF from the OVF_STORE
>>>> >>>>>> MainThread::INFO::2018-01-12 21:51:21,897::ovf_store::134::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF) OVF_STORE volume path: /var/run/vdsm/storage/4a7f8717-9bb0-4d80-8016-498fa4b88162/5cabd8e1-5f4b-469e-becc-227469e03f5c/8048cbd7-77e2-4805-9af4-d109fa36dfcf
>>>> >>>>>> MainThread::INFO::2018-01-12 21:51:21,915::config::435::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store) Found an OVF for HE VM, trying to convert
>>>> >>>>>> MainThread::INFO::2018-01-12 21:51:21,918::config::440::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store) Got vm.conf from OVF_STORE
>>>> >>>>>> MainThread::INFO::2018-01-12 21:51:21,919::hosted_engine::509::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_broker) Initializing ha-broker connection
>>>> >>>>>> MainThread::INFO::2018-01-12 21:51:21,919::brokerlink::130::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) Starting monitor ping, options {'addr': '80.239.162.97'}
>>>> >>>>>> MainThread::INFO::2018-01-12 21:51:21,922::brokerlink::141::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) Success, id 140547104457680
>>>> >>>>>> MainThread::INFO::2018-01-12 21:51:21,922::brokerlink::130::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) Starting monitor mgmt-bridge, options {'use_ssl': 'true', 'bridge_name': 'ovirtmgmt', 'address': '0'}
>>>> >>>>>> MainThread::INFO::2018-01-12 21:51:21,936::brokerlink::141::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) Success, id 140547104458064
>>>> >>>>>> MainThread::INFO::2018-01-12 21:51:21,936::brokerlink::130::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) Starting monitor mem-free, options {'use_ssl': 'true', 'address': '0'}
>>>> >>>>>> MainThread::INFO::2018-01-12 21:51:21,938::brokerlink::141::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) Success, id 140547104458448
>>>> >>>>>> MainThread::INFO::2018-01-12 21:51:21,939::brokerlink::130::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) Starting monitor cpu-load-no-engine, options {'use_ssl': 'true', 'vm_uuid': 'b366e466-b0ea-4a09-866b-d0248d7523a6', 'address': '0'}
>>>> >>>>>> MainThread::INFO::2018-01-12 21:51:21,940::brokerlink::141::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) Success, id 140547104457552
>>>> >>>>>> MainThread::INFO::2018-01-12 21:51:21,941::brokerlink::130::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) Starting monitor engine-health, options {'use_ssl': 'true', 'vm_uuid': 'b366e466-b0ea-4a09-866b-d0248d7523a6', 'address': '0'}
>>>> >>>>>> MainThread::INFO::2018-01-12 21:51:21,942::brokerlink::141::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) Success, id 140547104459792
>>>> >>>>>> MainThread::INFO::2018-01-12 21:51:26,951::brokerlink::179::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(set_storage_domain) Success, id 140546772847056
>>>> >>>>>> MainThread::INFO::2018-01-12 21:51:26,952::hosted_engine::601::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_broker) Broker initialized, all submonitors started
>>>> >>>>>> MainThread::INFO::2018-01-12 21:51:27,049::hosted_engine::704::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_sanlock) Ensuring lease for lockspace hosted-engine, host id 1 is acquired (file: /var/run/vdsm/storage/4a7f8717-9bb0-4d80-8016-498fa4b88162/093faa75-5e33-4559-84fa-1f1f8d48153b/911c7637-b49d-463e-b186-23b404e50769)
>>>> >>>>>> MainThread::INFO::2018-01-12 21:53:48,067::hosted_engine::745::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_sanlock) Failed to acquire the lock. Waiting '5's before the next attempt
>>>> >>>>>> MainThread::INFO::2018-01-12 21:56:14,088::hosted_engine::745::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_sanlock) Failed to acquire the lock. Waiting '5's before the next attempt
>>>> >>>>>> MainThread::INFO::2018-01-12 21:58:40,111::hosted_engine::745::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_sanlock) Failed to acquire the lock. Waiting '5's before the next attempt
>>>> >>>>>> MainThread::INFO::2018-01-12 22:01:06,133::hosted_engine::745::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_sanlock) Failed to acquire the lock. Waiting '5's before the next attempt
>>>> >>>>>>
>>>> >>>>>>
>>>> >>>>>> agent.log from the second host:
>>>> >>>>>>
>>>> >>>>>> MainThread::INFO::2018-01-12 22:01:37,241::hosted_engine::630::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Connecting the storage
>>>> >>>>>> MainThread::INFO::2018-01-12 22:01:37,242::storage_server::220::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(validate_storage_server) Validating storage server
>>>> >>>>>> MainThread::INFO::2018-01-12 22:01:39,540::hosted_engine::639::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Storage domain reported as valid and reconnect is not forced.
>>>> >>>>>> MainThread::INFO::2018-01-12 22:01:41,939::hosted_engine::453::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state EngineUnexpectedlyDown (score: 0)
>>>> >>>>>> MainThread::INFO::2018-01-12 22:01:52,150::config::493::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_vm_conf) Reloading vm.conf from the shared storage domain
>>>> >>>>>> MainThread::INFO::2018-01-12 22:01:52,150::config::416::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store) Trying to get a fresher copy of vm configuration from the OVF_STORE
>>>> >>>>>> MainThread::INFO::2018-01-12 22:01:52,151::ovf_store::132::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF) Extracting Engine VM OVF from the OVF_STORE
>>>> >>>>>> MainThread::INFO::2018-01-12 22:01:52,153::ovf_store::134::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF) OVF_STORE volume path: /var/run/vdsm/storage/4a7f8717-9bb0-4d80-8016-498fa4b88162/5cabd8e1-5f4b-469e-becc-227469e03f5c/8048cbd7-77e2-4805-9af4-d109fa36dfcf
>>>> >>>>>> MainThread::INFO::2018-01-12 22:01:52,174::config::435::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store) Found an OVF for HE VM, trying to convert
>>>> >>>>>> MainThread::INFO::2018-01-12 22:01:52,179::config::440::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store) Got vm.conf from OVF_STORE
>>>> >>>>>> MainThread::INFO::2018-01-12 22:01:52,189::hosted_engine::604::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_vdsm) Initializing VDSM
>>>> >>>>>> MainThread::INFO::2018-01-12 22:01:54,586::hosted_engine::630::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Connecting the storage
>>>> >>>>>> MainThread::INFO::2018-01-12 22:01:54,587::storage_server::220::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(validate_storage_server) Validating storage server
>>>> >>>>>> MainThread::INFO::2018-01-12 22:01:56,903::hosted_engine::639::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Storage domain reported as valid and reconnect is not forced.
>>>> >>>>>> MainThread::INFO::2018-01-12 22:01:59,299::states::682::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(score) Score is 0 due to unexpected vm shutdown at Fri Jan 12 21:57:48 2018
>>>> >>>>>> MainThread::INFO::2018-01-12 22:01:59,299::hosted_engine::453::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state EngineUnexpectedlyDown (score: 0)
>>>> >>>>>> MainThread::INFO::2018-01-12 22:02:09,659::config::493::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_vm_conf) Reloading vm.conf from the shared storage domain
>>>> >>>>>> MainThread::INFO::2018-01-12 22:02:09,659::config::416::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store) Trying to get a fresher copy of vm configuration from the OVF_STORE
>>>> >>>>>> MainThread::INFO::2018-01-12 22:02:09,660::ovf_store::132::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF) Extracting Engine VM OVF from the OVF_STORE
>>>> >>>>>> MainThread::INFO::2018-01-12 22:02:09,663::ovf_store::134::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF) OVF_STORE volume path: /var/run/vdsm/storage/4a7f8717-9bb0-4d80-8016-498fa4b88162/5cabd8e1-5f4b-469e-becc-227469e03f5c/8048cbd7-77e2-4805-9af4-d109fa36dfcf
>>>> >>>>>> MainThread::INFO::2018-01-12 22:02:09,683::config::435::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store) Found an OVF for HE VM, trying to convert
>>>> >>>>>> MainThread::INFO::2018-01-12 22:02:09,688::config::440::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store) Got vm.conf from OVF_STORE
>>>> >>>>>> MainThread::INFO::2018-01-12 22:02:09,698::hosted_engine::604::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_vdsm) Initializing VDSM
>>>> >>>>>> MainThread::INFO::2018-01-12 22:02:12,112::hosted_engine::630::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Connecting the storage
>>>> >>>>>> MainThread::INFO::2018-01-12 22:02:12,113::storage_server::220::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(validate_storage_server) Validating storage server
>>>> >>>>>> MainThread::INFO::2018-01-12 22:02:14,444::hosted_engine::639::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Storage domain reported as valid and reconnect is not forced.
>>>> >>>>>> MainThread::INFO::2018-01-12 22:02:16,859::states::682::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(score) Score is 0 due to unexpected vm shutdown at Fri Jan 12 21:57:47 2018
>>>> >>>>>> MainThread::INFO::2018-01-12 22:02:16,859::hosted_engine::453::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state EngineUnexpectedlyDown (score: 0)
>>>> >>>>>> MainThread::INFO::2018-01-12 22:02:27,100::config::493::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_vm_conf) Reloading vm.conf from the shared storage domain
>>>> >>>>>> MainThread::INFO::2018-01-12 22:02:27,100::config::416::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store) Trying to get a fresher copy of vm configuration from the OVF_STORE
>>>> >>>>>> MainThread::INFO::2018-01-12 22:02:27,101::ovf_store::132::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF) Extracting Engine VM OVF from the OVF_STORE
>>>> >>>>>> MainThread::INFO::2018-01-12 22:02:27,103::ovf_store::134::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF) OVF_STORE volume path: /var/run/vdsm/storage/4a7f8717-9bb0-4d80-8016-498fa4b88162/5cabd8e1-5f4b-469e-becc-227469e03f5c/8048cbd7-77e2-4805-9af4-d109fa36dfcf
>>>> >>>>>> MainThread::INFO::2018-01-12 22:02:27,125::config::435::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store) Found an OVF for HE VM, trying to convert
>>>> >>>>>> MainThread::INFO::2018-01-12 22:02:27,129::config::440::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store) Got vm.conf from OVF_STORE
>>>> >>>>>> MainThread::INFO::2018-01-12 22:02:27,130::states::667::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) Engine down, local host does not have best score
>>>> >>>>>> MainThread::INFO::2018-01-12 22:02:27,139::hosted_engine::604::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_vdsm) Initializing VDSM
>>>> >>>>>> MainThread::INFO::2018-01-12 22:02:29,584::hosted_engine::630::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Connecting the storage
>>>> >>>>>> MainThread::INFO::2018-01-12 22:02:29,586::storage_server::220::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(validate_storage_server) Validating storage server
>>>> >>>>>>
>>>> >>>>>>
>>>> >>>>>> Any suggestions on how to resolve this?
>>>> >>>>>>
>>>> >>>>>> Regards,
>>>> >>>>>> Artem
>>>> >>>>>>
>>>> >>>>>>
>>>> >>>>>> On Fri, Jan 12, 2018 at 7:08 PM, Artem Tambovskiy
>>>> >>>>>> <artem.tambovskiy at gmail.com> wrote:
>>>> >>>>>>>
>>>> >>>>>>> Trying to fix one thing, I broke another :(
>>>> >>>>>>>
>>>> >>>>>>> I fixed the mnt_options for the hosted engine storage domain and
>>>> >>>>>>> installed the latest security patches on my hosts and the hosted
>>>> >>>>>>> engine. All VMs are up and running, but hosted-engine --vm-status
>>>> >>>>>>> reports some issues:
>>>> >>>>>>>
>>>> >>>>>>> [root at ovirt1 ~]# hosted-engine --vm-status
>>>> >>>>>>>
>>>> >>>>>>>
>>>> >>>>>>> --== Host 1 status ==--
>>>> >>>>>>>
>>>> >>>>>>> conf_on_shared_storage             : True
>>>> >>>>>>> Status up-to-date                  : False
>>>> >>>>>>> Hostname                           : ovirt2
>>>> >>>>>>> Host ID                            : 1
>>>> >>>>>>> Engine status                      : unknown stale-data
>>>> >>>>>>> Score                              : 0
>>>> >>>>>>> stopped                            : False
>>>> >>>>>>> Local maintenance                  : False
>>>> >>>>>>> crc32                              : 193164b8
>>>> >>>>>>> local_conf_timestamp               : 8350
>>>> >>>>>>> Host timestamp                     : 8350
>>>> >>>>>>> Extra metadata (valid at timestamp):
>>>> >>>>>>>         metadata_parse_version=1
>>>> >>>>>>>         metadata_feature_version=1
>>>> >>>>>>>         timestamp=8350 (Fri Jan 12 19:03:54 2018)
>>>> >>>>>>>         host-id=1
>>>> >>>>>>>         score=0
>>>> >>>>>>>         vm_conf_refresh_time=8350 (Fri Jan 12 19:03:54 2018)
>>>> >>>>>>>         conf_on_shared_storage=True
>>>> >>>>>>>         maintenance=False
>>>> >>>>>>>         state=EngineUnexpectedlyDown
>>>> >>>>>>>         stopped=False
>>>> >>>>>>>         timeout=Thu Jan  1 05:24:43 1970
>>>> >>>>>>>
>>>> >>>>>>>
>>>> >>>>>>> --== Host 2 status ==--
>>>> >>>>>>>
>>>> >>>>>>> conf_on_shared_storage             : True
>>>> >>>>>>> Status up-to-date                  : False
>>>> >>>>>>> Hostname                           : ovirt1.telia.ru
>>>> >>>>>>> Host ID                            : 2
>>>> >>>>>>> Engine status                      : unknown stale-data
>>>> >>>>>>> Score                              : 0
>>>> >>>>>>> stopped                            : True
>>>> >>>>>>> Local maintenance                  : False
>>>> >>>>>>> crc32                              : c7037c03
>>>> >>>>>>> local_conf_timestamp               : 7530
>>>> >>>>>>> Host timestamp                     : 7530
>>>> >>>>>>> Extra metadata (valid at timestamp):
>>>> >>>>>>>         metadata_parse_version=1
>>>> >>>>>>>         metadata_feature_version=1
>>>> >>>>>>>         timestamp=7530 (Fri Jan 12 16:10:12 2018)
>>>> >>>>>>>         host-id=2
>>>> >>>>>>>         score=0
>>>> >>>>>>>         vm_conf_refresh_time=7530 (Fri Jan 12 16:10:12 2018)
>>>> >>>>>>>         conf_on_shared_storage=True
>>>> >>>>>>>         maintenance=False
>>>> >>>>>>>         state=AgentStopped
>>>> >>>>>>>         stopped=True
>>>> >>>>>>> [root at ovirt1 ~]#
>>>> >>>>>>>
>>>> >>>>>>>
>>>> >>>>>>>
>>>> >>>>>>> From the second host, the situation looks a bit different:
>>>> >>>>>>>
>>>> >>>>>>>
>>>> >>>>>>> [root at ovirt2 ~]# hosted-engine --vm-status
>>>> >>>>>>>
>>>> >>>>>>>
>>>> >>>>>>> --== Host 1 status ==--
>>>> >>>>>>>
>>>> >>>>>>> conf_on_shared_storage             : True
>>>> >>>>>>> Status up-to-date                  : True
>>>> >>>>>>> Hostname                           : ovirt2
>>>> >>>>>>> Host ID                            : 1
>>>> >>>>>>> Engine status                      : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
>>>> >>>>>>> Score                              : 0
>>>> >>>>>>> stopped                            : False
>>>> >>>>>>> Local maintenance                  : False
>>>> >>>>>>> crc32                              : 78eabdb6
>>>> >>>>>>> local_conf_timestamp               : 8403
>>>> >>>>>>> Host timestamp                     : 8402
>>>> >>>>>>> Extra metadata (valid at timestamp):
>>>> >>>>>>>         metadata_parse_version=1
>>>> >>>>>>>         metadata_feature_version=1
>>>> >>>>>>>         timestamp=8402 (Fri Jan 12 19:04:47 2018)
>>>> >>>>>>>         host-id=1
>>>> >>>>>>>         score=0
>>>> >>>>>>>         vm_conf_refresh_time=8403 (Fri Jan 12 19:04:47 2018)
>>>> >>>>>>>         conf_on_shared_storage=True
>>>> >>>>>>>         maintenance=False
>>>> >>>>>>>         state=EngineUnexpectedlyDown
>>>> >>>>>>>         stopped=False
>>>> >>>>>>>         timeout=Thu Jan  1 05:24:43 1970
>>>> >>>>>>>
>>>> >>>>>>>
>>>> >>>>>>> --== Host 2 status ==--
>>>> >>>>>>>
>>>> >>>>>>> conf_on_shared_storage             : True
>>>> >>>>>>> Status up-to-date                  : False
>>>> >>>>>>> Hostname                           : ovirt1.telia.ru
>>>> >>>>>>> Host ID                            : 2
>>>> >>>>>>> Engine status                      : unknown stale-data
>>>> >>>>>>> Score                              : 0
>>>> >>>>>>> stopped                            : True
>>>> >>>>>>> Local maintenance                  : False
>>>> >>>>>>> crc32                              : c7037c03
>>>> >>>>>>> local_conf_timestamp               : 7530
>>>> >>>>>>> Host timestamp                     : 7530
>>>> >>>>>>> Extra metadata (valid at timestamp):
>>>> >>>>>>>         metadata_parse_version=1
>>>> >>>>>>>         metadata_feature_version=1
>>>> >>>>>>>         timestamp=7530 (Fri Jan 12 16:10:12 2018)
>>>> >>>>>>>         host-id=2
>>>> >>>>>>>         score=0
>>>> >>>>>>>         vm_conf_refresh_time=7530 (Fri Jan 12 16:10:12 2018)
>>>> >>>>>>>         conf_on_shared_storage=True
>>>> >>>>>>>         maintenance=False
>>>> >>>>>>>         state=AgentStopped
>>>> >>>>>>>         stopped=True
>>>> >>>>>>>
>>>> >>>>>>>
>>>> >>>>>>> The web GUI shows that the engine is running on host ovirt1.
>>>> >>>>>>> Gluster looks fine:
>>>> >>>>>>> [root at ovirt1 ~]# gluster volume status engine
>>>> >>>>>>> Status of volume: engine
>>>> >>>>>>> Gluster process                             TCP Port  RDMA Port  Online  Pid
>>>> >>>>>>> ------------------------------------------------------------------------------
>>>> >>>>>>> Brick ovirt1.telia.ru:/oVirt/engine         49169     0          Y       3244
>>>> >>>>>>> Brick ovirt2.telia.ru:/oVirt/engine         49179     0          Y       20372
>>>> >>>>>>> Brick ovirt3.telia.ru:/oVirt/engine         49206     0          Y       16609
>>>> >>>>>>> Self-heal Daemon on localhost               N/A       N/A        Y       117868
>>>> >>>>>>> Self-heal Daemon on ovirt2.telia.ru         N/A       N/A        Y       20521
>>>> >>>>>>> Self-heal Daemon on ovirt3                  N/A       N/A        Y       25093
>>>> >>>>>>>
>>>> >>>>>>> Task Status of Volume engine
>>>> >>>>>>> ------------------------------------------------------------------------------
>>>> >>>>>>> There are no active volume tasks
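>>>> >>>>>>>
>>>> >>>>>>> (I can also attach the output of "gluster volume heal engine info"
>>>> >>>>>>> if that is useful; it should confirm whether any heals are pending.)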
>>>> >>>>>>>
>>>> >>>>>>> How can I resolve this issue?
>>>> >>>>>>>
>>>> >>>>>>>
>>>> >>>>>>
>>>> >>>>>>
>>>> >>>>>
>>>> >>>>
>>>> >>>
>>>> >>
>>>> >
>>>>
>>>
>>>
>
>

