On Wed, May 31, 2017 at 4:18 PM, Sandro Bonazzola <sbonazzo(a)redhat.com>
wrote:
The hosted engine fails to start due to:
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/vm.py", line 714, in _startUnderlyingVm
    self._run()
  File "/usr/share/vdsm/virt/vm.py", line 2026, in _run
    self._connection.createXML(domxml, flags),
  File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 123, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 917, in wrapper
    return func(inst, *args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 3782, in createXML
    if ret is None:raise libvirtError('virDomainCreateXML() failed', conn=self)
libvirtError: Cannot get interface MTU on 'None': No such device
The Hosted Engine VM has been configured with:
<interface address="None" type="bridge">
    <mac address="00:1a:4a:16:01:75"/>
    <model type="virtio"/>
    <source bridge="storage"/>
    <link state="up"/>
</interface>
also:
<interface address="None" type="bridge">
    <mac address="00:16:3e:21:57:02"/>
    <model type="virtio"/>
    <source bridge="None"/>
    <link state="up"/>
</interface>
I'm not sure how you ended up with this configuration.
Can you share the steps you took before hitting this issue?
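
For what it's worth, a quick way to spot this kind of problem is to scan the
domain XML for bridge interfaces whose source bridge is missing or is the
literal string "None". The sketch below is only illustrative: the function
name find_unset_bridges and the <devices> wrapper element are my own
assumptions and are not part of vdsm or libvirt. It uses only Python's
standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET

# Interface definitions copied from the Hosted Engine VM configuration above,
# wrapped in a <devices> element (an assumption) so they parse as one document.
DOMXML = """
<devices>
  <interface address="None" type="bridge">
    <mac address="00:1a:4a:16:01:75"/>
    <model type="virtio"/>
    <source bridge="storage"/>
    <link state="up"/>
  </interface>
  <interface address="None" type="bridge">
    <mac address="00:16:3e:21:57:02"/>
    <model type="virtio"/>
    <source bridge="None"/>
    <link state="up"/>
  </interface>
</devices>
"""


def find_unset_bridges(xml_text):
    """Return the MAC addresses of bridge interfaces whose source bridge
    is missing, empty, or the literal string 'None'."""
    bad = []
    root = ET.fromstring(xml_text)
    for iface in root.iter("interface"):
        if iface.get("type") != "bridge":
            continue
        source = iface.find("source")
        bridge = source.get("bridge") if source is not None else None
        if bridge in (None, "", "None"):
            mac = iface.find("mac")
            bad.append(mac.get("address") if mac is not None else "unknown")
    return bad


if __name__ == "__main__":
    for mac in find_unset_bridges(DOMXML):
        print("interface %s has no valid source bridge" % mac)
```

Run against the two interfaces above, this flags only the second one
(00:16:3e:21:57:02), which is the one libvirt cannot resolve to a device.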
On Wed, May 31, 2017 at 4:04 PM, Rizwan Qureshi <rqureshi(a)connexin.co.uk>
wrote:
> Hi Sandro,
>
> PFA the sos report and vdsm logs.
>
>
>
> *From:* Sandro Bonazzola [mailto:sbonazzo@redhat.com]
> *Sent:* Wednesday, May 31, 2017 11:04 AM
> *To:* Rizwan Qureshi <rqureshi(a)connexin.co.uk>; Martin Sivak <
> msivak(a)redhat.com>
> *Cc:* Michal Skrivanek <mskrivan(a)redhat.com>; users(a)ovirt.org
>
> *Subject:* Re: [ovirt-users] Engine Down Score is 0
>
>
>
>
>
>
>
> On Wed, May 31, 2017 at 11:10 AM, Rizwan Qureshi <rqureshi(a)connexin.co.uk>
> wrote:
>
> Hi Sandro,
>
> Thanks for your response.
>
>
>
> We have tried that already, but the VM shuts down after --vm-start
> executes.
>
>
>
> Also, apologies for the wrong information before. We have 3 dedicated hosts,
> not 1.
>
>
>
> Please see below the output of the --vm-status command:
>
>
>
>
>
> Can you please share sos report or at least vdsm logs from the hosts?
>
> Need to understand what happened to the vm.
>
>
>
>
>
>
>
> --== Host 1 status ==--
>
>
>
> Status up-to-date : True
>
> Hostname : vmhost1.le1.uk.cxn-network.net
>
> Host ID : 1
>
> Engine status : {"reason": "bad vm status", "health": "bad", "vm": "down", "detail": "down"}
>
> Score : 0
>
> stopped : False
>
> Local maintenance : False
>
> crc32 : 37a4ce89
>
> Host timestamp : 76387
>
> Extra metadata (valid at timestamp):
>
> metadata_parse_version=1
>
> metadata_feature_version=1
>
> timestamp=76387 (Wed May 31 10:04:47 2017)
>
> host-id=1
>
> score=0
>
> maintenance=False
>
> state=EngineUnexpectedlyDown
>
> stopped=False
>
> timeout=Thu Jan 1 22:17:00 1970
>
>
>
>
>
> --== Host 2 status ==--
>
>
>
> Status up-to-date : True
>
> Hostname : vmhost2.le1.uk.cxn-network.net
>
> Host ID : 2
>
> Engine status : {"reason": "bad vm status", "health": "bad", "vm": "down", "detail": "down"}
>
> Score : 0
>
> stopped : False
>
> Local maintenance : False
>
> crc32 : 937b5542
>
> Host timestamp : 76069
>
> Extra metadata (valid at timestamp):
>
> metadata_parse_version=1
>
> metadata_feature_version=1
>
> timestamp=76069 (Wed May 31 10:04:51 2017)
>
> host-id=2
>
> score=0
>
> maintenance=False
>
> state=EngineUnexpectedlyDown
>
> stopped=False
>
> timeout=Thu Jan 1 22:17:49 1970
>
>
>
>
>
> --== Host 3 status ==--
>
>
>
> Status up-to-date : True
>
> Hostname : vmhost3.le1.uk.cxn-network.net
>
> Host ID : 3
>
> Engine status : {"reason": "bad vm status", "health": "bad", "vm": "down", "detail": "down"}
>
> Score : 0
>
> stopped : False
>
> Local maintenance : False
>
> crc32 : 8ffac898
>
> Host timestamp : 76212
>
> Extra metadata (valid at timestamp):
>
> metadata_parse_version=1
>
> metadata_feature_version=1
>
> timestamp=76212 (Wed May 31 10:04:55 2017)
>
> host-id=3
>
> score=0
>
> maintenance=False
>
> state=EngineUnexpectedlyDown
>
> stopped=False
>
> timeout=Thu Jan 1 22:16:58 1970
>
>
>
> *From:* Sandro Bonazzola [mailto:sbonazzo@redhat.com]
> *Sent:* Wednesday, May 31, 2017 7:52 AM
> *To:* Rizwan Qureshi <rqureshi(a)connexin.co.uk>; Michal Skrivanek <
> mskrivan(a)redhat.com>
> *Cc:* users(a)ovirt.org
> *Subject:* Re: [ovirt-users] Engine Down Score is 0
>
>
>
>
>
>
>
> On Tue, May 30, 2017 at 6:32 PM, Rizwan Qureshi <rqureshi(a)connexin.co.uk>
> wrote:
>
> Hello Ovirt Users,
>
> I am new to ovirt.
>
>
>
> Hi, welcome to the oVirt community!
>
>
>
>
>
> Just trying to fix the issue with the engine which seems to be down and
> hence all our VMs which we are very much dependent upon are not working.
>
>
>
> Other VMs shouldn't be affected by the engine being unavailable; they
> should keep running if they were already started.
>
>
>
>
>
> Tried googling the log snippet but to no avail. Hoping to get some help
> from you guys.
>
>
>
> I am completely blank and don't know what's wrong with it.
>
>
>
> We have 3 Dell servers. One for Engine and the other two for nodes.
> Please see the log snippet from the engine server agent.log. Please let me
> know if more information is needed to debug the issue.
>
>
>
> So you have only 1 host dedicated to running hosted engine?
>
>
>
>
>
>
>
> MainThread::INFO::2017-05-30 17:21:51,649::hosted_engine::612::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_vdsm) Initializing VDSM
>
> MainThread::INFO::2017-05-30 17:21:53,820::hosted_engine::639::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Connecting the storage
>
> MainThread::INFO::2017-05-30 17:21:53,821::storage_server::219::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Connecting storage server
>
> MainThread::INFO::2017-05-30 17:21:58,134::storage_server::226::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Connecting storage server
>
> MainThread::INFO::2017-05-30 17:21:58,142::storage_server::233::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Refreshing the storage domain
>
> MainThread::INFO::2017-05-30 17:21:58,258::hosted_engine::666::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Preparing images
>
> MainThread::INFO::2017-05-30 17:21:58,258::image::126::ovirt_hosted_engine_ha.lib.image.Image::(prepare_images) Preparing images
>
> MainThread::INFO::2017-05-30 17:22:00,637::hosted_engine::669::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Reloading vm.conf from the shared storage domain
>
> MainThread::INFO::2017-05-30 17:22:00,638::config::206::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_local_conf_file) Trying to get a fresher copy of vm configuration from the OVF_STORE
>
> MainThread::INFO::2017-05-30 17:22:02,838::ovf_store::103::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(scan) Found OVF_STORE: imgUUID:f4d55192-a2c8-4c2b-8e6c-46a49c30967c, volUUID:0ae2c43f-5883-4354-a33f-e68e21ae3733
>
> MainThread::INFO::2017-05-30 17:22:02,977::ovf_store::103::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(scan) Found OVF_STORE: imgUUID:4ef3bd37-e48d-44a8-b906-5353de1a32cc, volUUID:d63b9f36-c5f1-4ddc-9d4a-e01f22023e73
>
> MainThread::INFO::2017-05-30 17:22:03,032::ovf_store::112::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF) Extracting Engine VM OVF from the OVF_STORE
>
> MainThread::INFO::2017-05-30 17:22:03,033::ovf_store::119::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF) OVF_STORE volume path: /rhev/data-center/mnt/nas1.le1.uk.cxn-network.net:_oVirt/4f9e46d7-594e-473c-b0c5-1770c5773a2e/images/4ef3bd37-e48d-44a8-b906-5353de1a32cc/d63b9f36-c5f1-4ddc-9d4a-e01f22023e73
>
> MainThread::INFO::2017-05-30 17:22:03,044::config::226::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_local_conf_file) Found an OVF for HE VM, trying to convert
>
> MainThread::INFO::2017-05-30 17:22:03,046::config::231::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_local_conf_file) Got vm.conf from OVF_STORE
>
> MainThread::INFO::2017-05-30 17:22:03,082::states::672::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(score) Score is 0 due to unexpected vm shutdown at Tue May 30 17:18:38 2017
>
> MainThread::INFO::2017-05-30 17:22:03,082::hosted_engine::461::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state EngineUnexpectedlyDown (score: 0)
>
> MainThread::INFO::2017-05-30 17:22:03,082::hosted_engine::466::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Best remote host vmhost2.le1.uk.cxn-network.net (id: 2, score: 0)
>
> MainThread::INFO::2017-05-30 17:22:13,135::hosted_engine::612::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_vdsm) Initializing VDSM
>
> MainThread::INFO::2017-05-30 17:22:15,305::hosted_engine::639::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Connecting the storage
>
> MainThread::INFO::2017-05-30 17:22:15,306::storage_server::219::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Connecting storage server
>
> MainThread::INFO::2017-05-30 17:22:19,618::storage_server::226::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Connecting storage server
>
> MainThread::INFO::2017-05-30 17:22:19,626::storage_server::233::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Refreshing the storage domain
>
> MainThread::INFO::2017-05-30 17:22:19,742::hosted_engine::666::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Preparing images
>
> MainThread::INFO::2017-05-30 17:22:19,742::image::126::ovirt_hosted_engine_ha.lib.image.Image::(prepare_images) Preparing images
>
> MainThread::INFO::2017-05-30 17:22:22,140::hosted_engine::669::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Reloading vm.conf from the shared storage domain
>
> MainThread::INFO::2017-05-30 17:22:22,140::config::206::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_local_conf_file) Trying to get a fresher copy of vm configuration from the OVF_STORE
>
> MainThread::INFO::2017-05-30 17:22:24,333::ovf_store::103::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(scan) Found OVF_STORE: imgUUID:f4d55192-a2c8-4c2b-8e6c-46a49c30967c, volUUID:0ae2c43f-5883-4354-a33f-e68e21ae3733
>
> MainThread::INFO::2017-05-30 17:22:24,472::ovf_store::103::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(scan) Found OVF_STORE: imgUUID:4ef3bd37-e48d-44a8-b906-5353de1a32cc, volUUID:d63b9f36-c5f1-4ddc-9d4a-e01f22023e73
>
> MainThread::INFO::2017-05-30 17:22:24,519::ovf_store::112::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF) Extracting Engine VM OVF from the OVF_STORE
>
> MainThread::INFO::2017-05-30 17:22:24,520::ovf_store::119::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF) OVF_STORE volume path: /rhev/data-center/mnt/nas1.le1.uk.cxn-network.net:_oVirt/4f9e46d7-594e-473c-b0c5-1770c5773a2e/images/4ef3bd37-e48d-44a8-b906-5353de1a32cc/d63b9f36-c5f1-4ddc-9d4a-e01f22023e73
>
> MainThread::INFO::2017-05-30 17:22:24,531::config::226::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_local_conf_file) Found an OVF for HE VM, trying to convert
>
> MainThread::INFO::2017-05-30 17:22:24,533::config::231::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_local_conf_file) Got vm.conf from OVF_STORE
>
> MainThread::INFO::2017-05-30 17:22:24,568::states::672::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(score) Score is 0 due to unexpected vm shutdown at Tue May 30 17:18:39 2017
>
> MainThread::INFO::2017-05-30 17:22:24,568::hosted_engine::461::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state EngineUnexpectedlyDown (score: 0)
>
> MainThread::INFO::2017-05-30 17:22:24,568::hosted_engine::466::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Best remote host vmhost2.le1.uk.cxn-network.net (id: 2, score: 0)
>
>
>
>
>
> It looks like the hosted engine was turned off at Tue May 30 17:18:38 2017
> without first being moved to maintenance.
>
> The logs above are timestamped 2017-05-30 17:22:24, so about 4 minutes
> after the shutdown.
>
> The agent should have restarted it within a few minutes of the logs above.
>
> Next time, if you need to urgently bring it up again you can use:
>
> hosted-engine --vm-start
>
> as described in
> http://www.ovirt.org/documentation/self-hosted/chap-Troubleshooting/
>
>
>
>
>
> --
>
> Best Regards,
>
> Rizwan Qureshi
>
> VoIP Admin
>
> Ph: 01482xxxxxxx
>
>
>
>
> _______________________________________________
> Users mailing list
> Users(a)ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
>
>
>
>
> --
>
> *SANDRO BONAZZOLA*
>
> ASSOCIATE MANAGER, SOFTWARE ENGINEERING, EMEA ENG VIRTUALIZATION R&D
>
> Red Hat EMEA <https://www.redhat.com/>
>
> *TRIED. TESTED. TRUSTED.* <https://red.ht/sig>
>
--
SANDRO BONAZZOLA
ASSOCIATE MANAGER, SOFTWARE ENGINEERING, EMEA ENG VIRTUALIZATION R&D
Red Hat EMEA <https://www.redhat.com/>
TRIED. TESTED. TRUSTED. <https://redhat.com/trusted>