[ovirt-users] Engine Down Score is 0

Sandro Bonazzola sbonazzo at redhat.com
Wed May 31 14:18:27 UTC 2017


The hosted engine fails to start due to:

Traceback (most recent call last):
  File "/usr/share/vdsm/virt/vm.py", line 714, in _startUnderlyingVm
    self._run()
  File "/usr/share/vdsm/virt/vm.py", line 2026, in _run
    self._connection.createXML(domxml, flags),
  File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 123, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 917, in wrapper
    return func(inst, *args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 3782, in createXML
    if ret is None:raise libvirtError('virDomainCreateXML() failed', conn=self)
libvirtError: Cannot get interface MTU on 'None': No such device

The Hosted Engine VM has been configured with:
                <interface address="None" type="bridge">
                        <mac address="00:1a:4a:16:01:75"/>
                        <model type="virtio"/>
                        <source bridge="storage"/>
                        <link state="up"/>
                </interface>

I'm not sure how the configuration ended up in this state.
Can you share the steps you took before this issue appeared?
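
For anyone hitting the same error, the interface definition quoted above can be checked mechanically. The following is only a minimal sketch, not part of VDSM or the hosted-engine tooling: the helper names find_none_attributes and bridge_names are hypothetical, and the XML is simply the fragment from this message. It flags attributes whose value is the literal string "None" and lists the bridge names the interface refers to.

```python
import xml.etree.ElementTree as ET

# Interface definition copied from the message above.
INTERFACE_XML = """
<interface address="None" type="bridge">
    <mac address="00:1a:4a:16:01:75"/>
    <model type="virtio"/>
    <source bridge="storage"/>
    <link state="up"/>
</interface>
"""


def find_none_attributes(xml_text):
    """Return (tag, attribute) pairs whose value is the literal string "None".

    Hypothetical helper: a value of "None" usually means a Python None was
    serialized into the configuration instead of a real value.
    """
    root = ET.fromstring(xml_text)
    suspicious = []
    for element in root.iter():  # iter() also visits the root element
        for name, value in element.attrib.items():
            if value == "None":
                suspicious.append((element.tag, name))
    return suspicious


def bridge_names(xml_text):
    """Return the bridge names referenced by <source bridge="..."/> elements."""
    root = ET.fromstring(xml_text)
    return [src.get("bridge") for src in root.iter("source") if src.get("bridge")]


if __name__ == "__main__":
    print(find_none_attributes(INTERFACE_XML))
    print(bridge_names(INTERFACE_XML))
```

On the affected host, a reasonable follow-up would be to confirm that each bridge reported by bridge_names actually exists as a network device, for example with "ip link show storage".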


On Wed, May 31, 2017 at 4:04 PM, Rizwan Qureshi <rqureshi at connexin.co.uk>
wrote:

> Hi Sandro,
>
> PFA the sos report and vdsm logs.
>
>
>
> *From:* Sandro Bonazzola [mailto:sbonazzo at redhat.com]
> *Sent:* Wednesday, May 31, 2017 11:04 AM
> *To:* Rizwan Qureshi <rqureshi at connexin.co.uk>; Martin Sivak <
> msivak at redhat.com>
> *Cc:* Michal Skrivanek <mskrivan at redhat.com>; users at ovirt.org
>
> *Subject:* Re: [ovirt-users] Engine Down Score is 0
>
>
>
>
>
>
>
> On Wed, May 31, 2017 at 11:10 AM, Rizwan Qureshi <rqureshi at connexin.co.uk>
> wrote:
>
> Hi Sandro,
>
> Thanks for your response.
>
>
>
> We have tried that already, but the VM shuts down after --vm-start
> executes.
>
>
>
> Also, apologies for the wrong information before. We have 3 dedicated hosts,
> not 1.
>
>
>
> Please see below the output of the --vm-status command:
>
>
>
>
>
> Can you please share sos report or at least vdsm logs from the hosts?
>
> Need to understand what happened to the vm.
>
>
>
>
>
>
>
> --== Host 1 status ==--
>
> Status up-to-date                  : True
> Hostname                           : vmhost1.le1.uk.cxn-network.net
> Host ID                            : 1
> Engine status                      : {"reason": "bad vm status", "health": "bad", "vm": "down", "detail": "down"}
> Score                              : 0
> stopped                            : False
> Local maintenance                  : False
> crc32                              : 37a4ce89
> Host timestamp                     : 76387
> Extra metadata (valid at timestamp):
>         metadata_parse_version=1
>         metadata_feature_version=1
>         timestamp=76387 (Wed May 31 10:04:47 2017)
>         host-id=1
>         score=0
>         maintenance=False
>         state=EngineUnexpectedlyDown
>         stopped=False
>         timeout=Thu Jan  1 22:17:00 1970
>
>
>
>
>
> --== Host 2 status ==--
>
> Status up-to-date                  : True
> Hostname                           : vmhost2.le1.uk.cxn-network.net
> Host ID                            : 2
> Engine status                      : {"reason": "bad vm status", "health": "bad", "vm": "down", "detail": "down"}
> Score                              : 0
> stopped                            : False
> Local maintenance                  : False
> crc32                              : 937b5542
> Host timestamp                     : 76069
> Extra metadata (valid at timestamp):
>         metadata_parse_version=1
>         metadata_feature_version=1
>         timestamp=76069 (Wed May 31 10:04:51 2017)
>         host-id=2
>         score=0
>         maintenance=False
>         state=EngineUnexpectedlyDown
>         stopped=False
>         timeout=Thu Jan  1 22:17:49 1970
>
>
>
>
>
> --== Host 3 status ==--
>
> Status up-to-date                  : True
> Hostname                           : vmhost3.le1.uk.cxn-network.net
> Host ID                            : 3
> Engine status                      : {"reason": "bad vm status", "health": "bad", "vm": "down", "detail": "down"}
> Score                              : 0
> stopped                            : False
> Local maintenance                  : False
> crc32                              : 8ffac898
> Host timestamp                     : 76212
> Extra metadata (valid at timestamp):
>         metadata_parse_version=1
>         metadata_feature_version=1
>         timestamp=76212 (Wed May 31 10:04:55 2017)
>         host-id=3
>         score=0
>         maintenance=False
>         state=EngineUnexpectedlyDown
>         stopped=False
>         timeout=Thu Jan  1 22:16:58 1970
>
>
>
> *From:* Sandro Bonazzola [mailto:sbonazzo at redhat.com]
> *Sent:* Wednesday, May 31, 2017 7:52 AM
> *To:* Rizwan Qureshi <rqureshi at connexin.co.uk>; Michal Skrivanek <
> mskrivan at redhat.com>
> *Cc:* users at ovirt.org
> *Subject:* Re: [ovirt-users] Engine Down Score is 0
>
>
>
>
>
>
>
> On Tue, May 30, 2017 at 6:32 PM, Rizwan Qureshi <rqureshi at connexin.co.uk>
> wrote:
>
> Hello Ovirt Users,
>
> I am new to ovirt.
>
>
>
> Hi, welcome to the oVirt community!
>
>
>
>
>
> I am trying to fix an issue with the engine, which seems to be down, and
> as a result all our VMs, which we depend on heavily, are not working.
>
>
>
> Other VMs shouldn't be affected by the engine being unavailable; they
> should keep running if they were already started.
>
>
>
>
>
> Tried googling the log snippet but to no avail. Hoping to get some help
> from you guys.
>
>
>
> I am completely blank and don't know what's wrong with it.
>
>
>
> We have 3 Dell servers. One for Engine and the other two for nodes. Please
> see the log snippet from the engine server agent.log. Please let me know if
> more information is needed to debug the issue.
>
>
>
> So you have only 1 host dedicated to running hosted engine?
>
>
>
>
>
>
>
> MainThread::INFO::2017-05-30 17:21:51,649::hosted_engine::
> 612::ovirt_hosted_engine_ha.agent.hosted_engine.
> HostedEngine::(_initialize_vdsm) Initializing VDSM
>
> MainThread::INFO::2017-05-30 17:21:53,820::hosted_engine::
> 639::ovirt_hosted_engine_ha.agent.hosted_engine.
> HostedEngine::(_initialize_storage_images) Connecting the storage
>
> MainThread::INFO::2017-05-30 17:21:53,821::storage_server::
> 219::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server)
> Connecting storage server
>
> MainThread::INFO::2017-05-30 17:21:58,134::storage_server::
> 226::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server)
> Connecting storage server
>
> MainThread::INFO::2017-05-30 17:21:58,142::storage_server::
> 233::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server)
> Refreshing the storage domain
>
> MainThread::INFO::2017-05-30 17:21:58,258::hosted_engine::
> 666::ovirt_hosted_engine_ha.agent.hosted_engine.
> HostedEngine::(_initialize_storage_images) Preparing images
>
> MainThread::INFO::2017-05-30 17:21:58,258::image::126::
> ovirt_hosted_engine_ha.lib.image.Image::(prepare_images) Preparing images
>
> MainThread::INFO::2017-05-30 17:22:00,637::hosted_engine::
> 669::ovirt_hosted_engine_ha.agent.hosted_engine.
> HostedEngine::(_initialize_storage_images) Reloading vm.conf from the
> shared storage domain
>
> MainThread::INFO::2017-05-30 17:22:00,638::config::206::
> ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.
> config::(refresh_local_conf_file) Trying to get a fresher copy of vm
> configuration from the OVF_STORE
>
> MainThread::INFO::2017-05-30 17:22:02,838::ovf_store::103::
> ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(scan) Found
> OVF_STORE: imgUUID:f4d55192-a2c8-4c2b-8e6c-46a49c30967c,
> volUUID:0ae2c43f-5883-4354-a33f-e68e21ae3733
>
> MainThread::INFO::2017-05-30 17:22:02,977::ovf_store::103::
> ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(scan) Found
> OVF_STORE: imgUUID:4ef3bd37-e48d-44a8-b906-5353de1a32cc,
> volUUID:d63b9f36-c5f1-4ddc-9d4a-e01f22023e73
>
> MainThread::INFO::2017-05-30 17:22:03,032::ovf_store::112::
> ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF)
> Extracting Engine VM OVF from the OVF_STORE
>
> MainThread::INFO::2017-05-30 17:22:03,033::ovf_store::119::
> ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF)
> OVF_STORE volume path: /rhev/data-center/mnt/nas1.le1.uk.cxn-network.net:
> _oVirt/4f9e46d7-594e-473c-b0c5-1770c5773a2e/images/4ef3bd37-
> e48d-44a8-b906-5353de1a32cc/d63b9f36-c5f1-4ddc-9d4a-e01f22023e73
>
> MainThread::INFO::2017-05-30 17:22:03,044::config::226::
> ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.
> config::(refresh_local_conf_file) Found an OVF for HE VM, trying to
> convert
>
> MainThread::INFO::2017-05-30 17:22:03,046::config::231::
> ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.
> config::(refresh_local_conf_file) Got vm.conf from OVF_STORE
>
> MainThread::INFO::2017-05-30 17:22:03,082::states::672::
> ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(score) Score is
> 0 due to unexpected vm shutdown at Tue May 30 17:18:38 2017
>
> MainThread::INFO::2017-05-30 17:22:03,082::hosted_engine::
> 461::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineUnexpectedlyDown (score: 0)
>
> MainThread::INFO::2017-05-30 17:22:03,082::hosted_engine::
> 466::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host vmhost2.le1.uk.cxn-network.net (id: 2, score: 0)
>
> MainThread::INFO::2017-05-30 17:22:13,135::hosted_engine::
> 612::ovirt_hosted_engine_ha.agent.hosted_engine.
> HostedEngine::(_initialize_vdsm) Initializing VDSM
>
> MainThread::INFO::2017-05-30 17:22:15,305::hosted_engine::
> 639::ovirt_hosted_engine_ha.agent.hosted_engine.
> HostedEngine::(_initialize_storage_images) Connecting the storage
>
> MainThread::INFO::2017-05-30 17:22:15,306::storage_server::
> 219::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server)
> Connecting storage server
>
> MainThread::INFO::2017-05-30 17:22:19,618::storage_server::
> 226::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server)
> Connecting storage server
>
> MainThread::INFO::2017-05-30 17:22:19,626::storage_server::
> 233::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server)
> Refreshing the storage domain
>
> MainThread::INFO::2017-05-30 17:22:19,742::hosted_engine::
> 666::ovirt_hosted_engine_ha.agent.hosted_engine.
> HostedEngine::(_initialize_storage_images) Preparing images
>
> MainThread::INFO::2017-05-30 17:22:19,742::image::126::
> ovirt_hosted_engine_ha.lib.image.Image::(prepare_images) Preparing images
>
> MainThread::INFO::2017-05-30 17:22:22,140::hosted_engine::
> 669::ovirt_hosted_engine_ha.agent.hosted_engine.
> HostedEngine::(_initialize_storage_images) Reloading vm.conf from the
> shared storage domain
>
> MainThread::INFO::2017-05-30 17:22:22,140::config::206::
> ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.
> config::(refresh_local_conf_file) Trying to get a fresher copy of vm
> configuration from the OVF_STORE
>
> MainThread::INFO::2017-05-30 17:22:24,333::ovf_store::103::
> ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(scan) Found
> OVF_STORE: imgUUID:f4d55192-a2c8-4c2b-8e6c-46a49c30967c,
> volUUID:0ae2c43f-5883-4354-a33f-e68e21ae3733
>
> MainThread::INFO::2017-05-30 17:22:24,472::ovf_store::103::
> ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(scan) Found
> OVF_STORE: imgUUID:4ef3bd37-e48d-44a8-b906-5353de1a32cc,
> volUUID:d63b9f36-c5f1-4ddc-9d4a-e01f22023e73
>
> MainThread::INFO::2017-05-30 17:22:24,519::ovf_store::112::
> ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF)
> Extracting Engine VM OVF from the OVF_STORE
>
> MainThread::INFO::2017-05-30 17:22:24,520::ovf_store::119::
> ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF)
> OVF_STORE volume path: /rhev/data-center/mnt/nas1.le1.uk.cxn-network.net:
> _oVirt/4f9e46d7-594e-473c-b0c5-1770c5773a2e/images/4ef3bd37-
> e48d-44a8-b906-5353de1a32cc/d63b9f36-c5f1-4ddc-9d4a-e01f22023e73
>
> MainThread::INFO::2017-05-30 17:22:24,531::config::226::
> ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.
> config::(refresh_local_conf_file) Found an OVF for HE VM, trying to
> convert
>
> MainThread::INFO::2017-05-30 17:22:24,533::config::231::
> ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.
> config::(refresh_local_conf_file) Got vm.conf from OVF_STORE
>
> MainThread::INFO::2017-05-30 17:22:24,568::states::672::
> ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(score) Score is
> 0 due to unexpected vm shutdown at Tue May 30 17:18:39 2017
>
> MainThread::INFO::2017-05-30 17:22:24,568::hosted_engine::
> 461::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineUnexpectedlyDown (score: 0)
>
> MainThread::INFO::2017-05-30 17:22:24,568::hosted_engine::
> 466::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host vmhost2.le1.uk.cxn-network.net (id: 2, score: 0)
>
>
>
>
>
> It looks like the hosted engine was turned off at Tue May 30 17:18:38 2017
> without first being moved to maintenance mode.
>
> The logs above are timed 2017-05-30 17:22:24, about 4 minutes after the
> shutdown.
>
> The agent should have restarted it within a few minutes of those log
> entries.
>
> Next time, if you need to urgently bring it up again you can use:
>
> hosted-engine --vm-start
>
> as described in
> http://www.ovirt.org/documentation/self-hosted/chap-Troubleshooting/
>
>
>
>
>
> --
>
> Best Regards,
>
> Rizwan Qureshi
>
> VoIP Admin
>
> Ph: 01482xxxxxxx
>
>
>
>
> _______________________________________________
> Users mailing list
> Users at ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
>
>
>
>
> --
>
> *SANDRO BONAZZOLA*
>
> ASSOCIATE MANAGER, SOFTWARE ENGINEERING, EMEA ENG VIRTUALIZATION R&D
>
> Red Hat EMEA <https://www.redhat.com/>
>
> <https://red.ht/sig>
>
> *TRIED. TESTED. TRUSTED.* <https://red.ht/sig>
>
>   <https://red.ht/sig>



-- 

SANDRO BONAZZOLA

ASSOCIATE MANAGER, SOFTWARE ENGINEERING, EMEA ENG VIRTUALIZATION R&D

Red Hat EMEA <https://www.redhat.com/>
<https://red.ht/sig>
TRIED. TESTED. TRUSTED. <https://redhat.com/trusted>