[ovirt-users] Engine Down Score is 0

Sandro Bonazzola sbonazzo at redhat.com
Wed May 31 14:19:01 UTC 2017


On Wed, May 31, 2017 at 4:18 PM, Sandro Bonazzola <sbonazzo at redhat.com>
wrote:

> The hosted engine fails to start due to:
>
> Traceback (most recent call last):
>   File "/usr/share/vdsm/virt/vm.py", line 714, in _startUnderlyingVm
>     self._run()
>   File "/usr/share/vdsm/virt/vm.py", line 2026, in _run
>     self._connection.createXML(domxml, flags),
>   File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 123, in wrapper
>     ret = f(*args, **kwargs)
>   File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 917, in wrapper
>     return func(inst, *args, **kwargs)
>   File "/usr/lib64/python2.7/site-packages/libvirt.py", line 3782, in createXML
>     if ret is None:raise libvirtError('virDomainCreateXML() failed', conn=self)
> libvirtError: Cannot get interface MTU on 'None': No such device
>
> The Hosted Engine VM has been configured with:
>                 <interface address="None" type="bridge">
>                         <mac address="00:1a:4a:16:01:75"/>
>                         <model type="virtio"/>
>                         <source bridge="storage"/>
>                         <link state="up"/>
>                 </interface>
>
>

also:
                <interface address="None" type="bridge">
                        <mac address="00:16:3e:21:57:02"/>
                        <model type="virtio"/>
                        <source bridge="None"/>
                        <link state="up"/>
                </interface>
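
One way to spot this kind of misconfiguration is to scan the device XML for interfaces whose source bridge is the literal string "None". The sketch below is illustrative only: the `<devices>` wrapper, the `DEVICES_XML` constant and the `find_unset_bridges` helper are assumptions made up for this example and are not part of vdsm or libvirt. The two interface elements are the ones quoted in this message.

```python
import xml.etree.ElementTree as ET

# The two <interface> elements from this thread, wrapped in a <devices>
# element (an assumption) so that they form one well-formed document.
DEVICES_XML = """
<devices>
  <interface address="None" type="bridge">
    <mac address="00:1a:4a:16:01:75"/>
    <model type="virtio"/>
    <source bridge="storage"/>
    <link state="up"/>
  </interface>
  <interface address="None" type="bridge">
    <mac address="00:16:3e:21:57:02"/>
    <model type="virtio"/>
    <source bridge="None"/>
    <link state="up"/>
  </interface>
</devices>
"""


def find_unset_bridges(xml_text):
    """Return the MAC addresses of interfaces with a missing or 'None' bridge."""
    root = ET.fromstring(xml_text)
    bad = []
    for iface in root.iter("interface"):
        source = iface.find("source")
        bridge = source.get("bridge") if source is not None else None
        # A bridge named "None" is a stringified Python None, not a real device.
        if bridge is None or bridge == "None":
            mac = iface.find("mac")
            bad.append(mac.get("address") if mac is not None else "<no mac>")
    return bad


if __name__ == "__main__":
    for mac in find_unset_bridges(DEVICES_XML):
        print("interface %s has no usable bridge" % mac)
```

For the XML above this flags only the interface with MAC 00:16:3e:21:57:02, which is consistent with the "Cannot get interface MTU on 'None'" error.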




> I'm not sure how you got there.
> Can you share the steps you took before hitting this issue?
>
>
> On Wed, May 31, 2017 at 4:04 PM, Rizwan Qureshi <rqureshi at connexin.co.uk>
> wrote:
>
>> Hi Sandro,
>>
>> Please find attached the sos report and vdsm logs.
>>
>>
>>
>> *From:* Sandro Bonazzola [mailto:sbonazzo at redhat.com]
>> *Sent:* Wednesday, May 31, 2017 11:04 AM
>> *To:* Rizwan Qureshi <rqureshi at connexin.co.uk>; Martin Sivak <
>> msivak at redhat.com>
>> *Cc:* Michal Skrivanek <mskrivan at redhat.com>; users at ovirt.org
>>
>> *Subject:* Re: [ovirt-users] Engine Down Score is 0
>>
>>
>>
>>
>>
>>
>>
>> On Wed, May 31, 2017 at 11:10 AM, Rizwan Qureshi <rqureshi at connexin.co.uk>
>> wrote:
>>
>> Hi Sandro,
>>
>> Thanks for your response.
>>
>>
>>
>> We have tried that already, but the VM shuts down after --vm-start
>> executes.
>>
>>
>>
>> Also, apologies for the wrong information before. We have 3 dedicated hosts,
>> not 1.
>>
>>
>>
>> Please see below the output of the --vm-status command:
>>
>>
>>
>>
>>
>> Can you please share an sos report, or at least the vdsm logs, from the hosts?
>>
>> We need to understand what happened to the VM.
>>
>>
>>
>>
>>
>>
>>
>> --== Host 1 status ==--
>>
>> Status up-to-date                  : True
>> Hostname                           : vmhost1.le1.uk.cxn-network.net
>> Host ID                            : 1
>> Engine status                      : {"reason": "bad vm status", "health": "bad", "vm": "down", "detail": "down"}
>> Score                              : 0
>> stopped                            : False
>> Local maintenance                  : False
>> crc32                              : 37a4ce89
>> Host timestamp                     : 76387
>> Extra metadata (valid at timestamp):
>>         metadata_parse_version=1
>>         metadata_feature_version=1
>>         timestamp=76387 (Wed May 31 10:04:47 2017)
>>         host-id=1
>>         score=0
>>         maintenance=False
>>         state=EngineUnexpectedlyDown
>>         stopped=False
>>         timeout=Thu Jan  1 22:17:00 1970
>>
>>
>>
>>
>>
>> --== Host 2 status ==--
>>
>> Status up-to-date                  : True
>> Hostname                           : vmhost2.le1.uk.cxn-network.net
>> Host ID                            : 2
>> Engine status                      : {"reason": "bad vm status", "health": "bad", "vm": "down", "detail": "down"}
>> Score                              : 0
>> stopped                            : False
>> Local maintenance                  : False
>> crc32                              : 937b5542
>> Host timestamp                     : 76069
>> Extra metadata (valid at timestamp):
>>         metadata_parse_version=1
>>         metadata_feature_version=1
>>         timestamp=76069 (Wed May 31 10:04:51 2017)
>>         host-id=2
>>         score=0
>>         maintenance=False
>>         state=EngineUnexpectedlyDown
>>         stopped=False
>>         timeout=Thu Jan  1 22:17:49 1970
>>
>>
>>
>>
>>
>> --== Host 3 status ==--
>>
>> Status up-to-date                  : True
>> Hostname                           : vmhost3.le1.uk.cxn-network.net
>> Host ID                            : 3
>> Engine status                      : {"reason": "bad vm status", "health": "bad", "vm": "down", "detail": "down"}
>> Score                              : 0
>> stopped                            : False
>> Local maintenance                  : False
>> crc32                              : 8ffac898
>> Host timestamp                     : 76212
>> Extra metadata (valid at timestamp):
>>         metadata_parse_version=1
>>         metadata_feature_version=1
>>         timestamp=76212 (Wed May 31 10:04:55 2017)
>>         host-id=3
>>         score=0
>>         maintenance=False
>>         state=EngineUnexpectedlyDown
>>         stopped=False
>>         timeout=Thu Jan  1 22:16:58 1970
>>
>>
>>
>> *From:* Sandro Bonazzola [mailto:sbonazzo at redhat.com]
>> *Sent:* Wednesday, May 31, 2017 7:52 AM
>> *To:* Rizwan Qureshi <rqureshi at connexin.co.uk>; Michal Skrivanek <
>> mskrivan at redhat.com>
>> *Cc:* users at ovirt.org
>> *Subject:* Re: [ovirt-users] Engine Down Score is 0
>>
>>
>>
>>
>>
>>
>>
>> On Tue, May 30, 2017 at 6:32 PM, Rizwan Qureshi <rqureshi at connexin.co.uk>
>> wrote:
>>
>> Hello Ovirt Users,
>>
>> I am new to ovirt.
>>
>>
>>
>> Hi, welcome to the oVirt community!
>>
>>
>>
>>
>>
>> Just trying to fix the issue with the engine which seems to be down and
>> hence all our VMs which we are very much dependent upon are not working.
>>
>>
>>
>> Other VMs shouldn't be affected by the engine being unavailable; they
>> should keep running if already started.
>>
>>
>>
>>
>>
>> Tried googling the log snippet but to no avail. Hoping to get some help
>> from you guys.
>>
>>
>>
>> I am completely blank and don’t know what’s wrong with it.
>>
>>
>>
>> We have 3 Dell servers. One for Engine and the other two for nodes.
>> Please see the log snippet from the engine server agent.log. Please let me
>> know if more information is needed to debug the issue.
>>
>>
>>
>> So you have only 1 host dedicated to running hosted engine?
>>
>>
>>
>>
>>
>>
>>
>> MainThread::INFO::2017-05-30 17:21:51,649::hosted_engine::612::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_vdsm) Initializing VDSM
>> MainThread::INFO::2017-05-30 17:21:53,820::hosted_engine::639::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Connecting the storage
>> MainThread::INFO::2017-05-30 17:21:53,821::storage_server::219::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Connecting storage server
>> MainThread::INFO::2017-05-30 17:21:58,134::storage_server::226::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Connecting storage server
>> MainThread::INFO::2017-05-30 17:21:58,142::storage_server::233::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Refreshing the storage domain
>> MainThread::INFO::2017-05-30 17:21:58,258::hosted_engine::666::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Preparing images
>> MainThread::INFO::2017-05-30 17:21:58,258::image::126::ovirt_hosted_engine_ha.lib.image.Image::(prepare_images) Preparing images
>> MainThread::INFO::2017-05-30 17:22:00,637::hosted_engine::669::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Reloading vm.conf from the shared storage domain
>> MainThread::INFO::2017-05-30 17:22:00,638::config::206::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_local_conf_file) Trying to get a fresher copy of vm configuration from the OVF_STORE
>> MainThread::INFO::2017-05-30 17:22:02,838::ovf_store::103::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(scan) Found OVF_STORE: imgUUID:f4d55192-a2c8-4c2b-8e6c-46a49c30967c, volUUID:0ae2c43f-5883-4354-a33f-e68e21ae3733
>> MainThread::INFO::2017-05-30 17:22:02,977::ovf_store::103::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(scan) Found OVF_STORE: imgUUID:4ef3bd37-e48d-44a8-b906-5353de1a32cc, volUUID:d63b9f36-c5f1-4ddc-9d4a-e01f22023e73
>> MainThread::INFO::2017-05-30 17:22:03,032::ovf_store::112::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF) Extracting Engine VM OVF from the OVF_STORE
>> MainThread::INFO::2017-05-30 17:22:03,033::ovf_store::119::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF) OVF_STORE volume path: /rhev/data-center/mnt/nas1.le1.uk.cxn-network.net:_oVirt/4f9e46d7-594e-473c-b0c5-1770c5773a2e/images/4ef3bd37-e48d-44a8-b906-5353de1a32cc/d63b9f36-c5f1-4ddc-9d4a-e01f22023e73
>> MainThread::INFO::2017-05-30 17:22:03,044::config::226::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_local_conf_file) Found an OVF for HE VM, trying to convert
>> MainThread::INFO::2017-05-30 17:22:03,046::config::231::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_local_conf_file) Got vm.conf from OVF_STORE
>> MainThread::INFO::2017-05-30 17:22:03,082::states::672::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(score) Score is 0 due to unexpected vm shutdown at Tue May 30 17:18:38 2017
>> MainThread::INFO::2017-05-30 17:22:03,082::hosted_engine::461::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state EngineUnexpectedlyDown (score: 0)
>> MainThread::INFO::2017-05-30 17:22:03,082::hosted_engine::466::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Best remote host vmhost2.le1.uk.cxn-network.net (id: 2, score: 0)
>>
>> MainThread::INFO::2017-05-30 17:22:13,135::hosted_engine::612::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_vdsm) Initializing VDSM
>> MainThread::INFO::2017-05-30 17:22:15,305::hosted_engine::639::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Connecting the storage
>> MainThread::INFO::2017-05-30 17:22:15,306::storage_server::219::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Connecting storage server
>> MainThread::INFO::2017-05-30 17:22:19,618::storage_server::226::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Connecting storage server
>> MainThread::INFO::2017-05-30 17:22:19,626::storage_server::233::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Refreshing the storage domain
>> MainThread::INFO::2017-05-30 17:22:19,742::hosted_engine::666::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Preparing images
>> MainThread::INFO::2017-05-30 17:22:19,742::image::126::ovirt_hosted_engine_ha.lib.image.Image::(prepare_images) Preparing images
>> MainThread::INFO::2017-05-30 17:22:22,140::hosted_engine::669::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Reloading vm.conf from the shared storage domain
>> MainThread::INFO::2017-05-30 17:22:22,140::config::206::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_local_conf_file) Trying to get a fresher copy of vm configuration from the OVF_STORE
>> MainThread::INFO::2017-05-30 17:22:24,333::ovf_store::103::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(scan) Found OVF_STORE: imgUUID:f4d55192-a2c8-4c2b-8e6c-46a49c30967c, volUUID:0ae2c43f-5883-4354-a33f-e68e21ae3733
>> MainThread::INFO::2017-05-30 17:22:24,472::ovf_store::103::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(scan) Found OVF_STORE: imgUUID:4ef3bd37-e48d-44a8-b906-5353de1a32cc, volUUID:d63b9f36-c5f1-4ddc-9d4a-e01f22023e73
>> MainThread::INFO::2017-05-30 17:22:24,519::ovf_store::112::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF) Extracting Engine VM OVF from the OVF_STORE
>> MainThread::INFO::2017-05-30 17:22:24,520::ovf_store::119::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF) OVF_STORE volume path: /rhev/data-center/mnt/nas1.le1.uk.cxn-network.net:_oVirt/4f9e46d7-594e-473c-b0c5-1770c5773a2e/images/4ef3bd37-e48d-44a8-b906-5353de1a32cc/d63b9f36-c5f1-4ddc-9d4a-e01f22023e73
>> MainThread::INFO::2017-05-30 17:22:24,531::config::226::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_local_conf_file) Found an OVF for HE VM, trying to convert
>> MainThread::INFO::2017-05-30 17:22:24,533::config::231::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_local_conf_file) Got vm.conf from OVF_STORE
>> MainThread::INFO::2017-05-30 17:22:24,568::states::672::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(score) Score is 0 due to unexpected vm shutdown at Tue May 30 17:18:39 2017
>> MainThread::INFO::2017-05-30 17:22:24,568::hosted_engine::461::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state EngineUnexpectedlyDown (score: 0)
>> MainThread::INFO::2017-05-30 17:22:24,568::hosted_engine::466::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Best remote host vmhost2.le1.uk.cxn-network.net (id: 2, score: 0)
>>
>>
>>
>>
>>
>> It looks like the hosted engine was turned off at Tue May 30 17:18:38 2017
>> without moving to maintenance mode first.
>>
>> The logs above are timestamped 2017-05-30 17:22:24, so about 4 minutes after
>> the shutdown.
>>
>> The agent should have restarted it within a few minutes of the logs above.
>>
>> Next time, if you need to urgently bring it up again you can use:
>>
>> hosted-engine --vm-start
>>
>> as described in
>> http://www.ovirt.org/documentation/self-hosted/chap-Troubleshooting/
>>
>>
>>
>>
>>
>> --
>>
>> Best Regards,
>>
>> Rizwan Qureshi
>>
>> VoIP Admin
>>
>> Ph: 01482xxxxxxx
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> --
>>
>> *SANDRO BONAZZOLA*
>>
>> ASSOCIATE MANAGER, SOFTWARE ENGINEERING, EMEA ENG VIRTUALIZATION R&D
>>
>> Red Hat EMEA <https://www.redhat.com/>
>>
>> <https://red.ht/sig>
>>
>> *TRIED. TESTED. TRUSTED.* <https://red.ht/sig>
>>
>
>
>
>



-- 

SANDRO BONAZZOLA

ASSOCIATE MANAGER, SOFTWARE ENGINEERING, EMEA ENG VIRTUALIZATION R&D

Red Hat EMEA <https://www.redhat.com/>
<https://red.ht/sig>
TRIED. TESTED. TRUSTED. <https://redhat.com/trusted>