[ovirt-users] HE in bad status, will not start following storage issue - HELP

Ian Neilsen ian.neilsen at gmail.com
Fri Mar 10 06:37:09 UTC 2017


I just noticed this in vdsm.log. It looks like the agent is trying to
start the hosted engine on both machines?

<on_poweroff>destroy</on_poweroff><on_reboot>destroy</on_reboot><on_crash>destroy</on_crash></domain>
Thread-7517::ERROR::2017-03-10 01:26:13,053::vm::773::virt.vm::(_startUnderlyingVm) vmId=`2419f9fe-4998-4b7a-9fe9-151571d20379`::The vm start process failed
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/vm.py", line 714, in _startUnderlyingVm
    self._run()
  File "/usr/share/vdsm/virt/vm.py", line 2026, in _run
    self._connection.createXML(domxml, flags),
  File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 123, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 917, in wrapper
    return func(inst, *args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 3782, in createXML
    if ret is None:raise libvirtError('virDomainCreateXML() failed', conn=self)

libvirtError: Failed to acquire lock: Permission denied
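
One thing worth ruling out after a storage fault is wrong ownership on the
storage domain files. This is an assumption about a possible cause of
"Failed to acquire lock: Permission denied", not something the log confirms.
oVirt expects storage domain content to be owned by vdsm:kvm (uid 36, gid 36).
A minimal sketch that lists entries with a different owner; the function name
find_wrong_owner is hypothetical and not part of vdsm or oVirt:

```python
import os

VDSM_UID = 36  # uid of the vdsm user on oVirt hosts
KVM_GID = 36   # gid of the kvm group on oVirt hosts


def find_wrong_owner(root, uid=VDSM_UID, gid=KVM_GID):
    """Return (path, uid, gid) for every entry under root not owned by uid:gid.

    Hypothetical helper for illustration only. It reads ownership with
    lstat() and never modifies anything.
    """
    mismatches = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            if st.st_uid != uid or st.st_gid != gid:
                mismatches.append((path, st.st_uid, st.st_gid))
    return mismatches
```

Calling find_wrong_owner() on the mount point of the hosted-engine storage
domain (which may not be the 192.168.3.10:_data mount shown in this log)
returns an empty list when everything under it is owned by 36:36.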

INFO::2017-03-10 01:26:13,054::vm::1330::virt.vm::(setDownStatus) vmId=`2419f9fe-4998-4b7a-9fe9-151571d20379`::Changed state to Down: Failed to acquire lock: Permission denied (code=1)
INFO::2017-03-10 01:26:13,054::guestagent::430::virt.vm::(stop) vmId=`2419f9fe-4998-4b7a-9fe9-151571d20379`::Stopping connection

DEBUG::2017-03-10 01:26:13,054::vmchannels::238::vds::(unregister) Delete fileno 56 from listener.
DEBUG::2017-03-10 01:26:13,055::vmchannels::66::vds::(_unregister_fd) Failed to unregister FD from epoll (ENOENT): 56
DEBUG::2017-03-10 01:26:13,055::__init__::209::jsonrpc.Notification::(emit) Sending event {"params": {"2419f9fe-4998-4b7a-9fe9-151571d20379": {"status": "Down", "exitReason": 1, "exitMessage": "Failed to acquire lock: Permission denied", "exitCode": 1}, "notify_time": 4308740560}, "jsonrpc": "2.0", "method": "|virt|VM_status|2419f9fe-4998-4b7a-9fe9-151571d20379"}
VM Channels Listener::DEBUG::2017-03-10 01:26:13,475::vmchannels::142::vds::(_do_del_channels) fileno 56 was removed from listener.
DEBUG::2017-03-10 01:26:14,430::check::296::storage.check::(_start_process) START check u'/rhev/data-center/mnt/glusterSD/192.168.3.10:_data/a08822ec-3f5b-4dba-ac2d-5510f0b4b6a2/dom_md/metadata' cmd=['/usr/bin/taskset', '--cpu-list', '0-39', '/usr/bin/dd', u'if=/rhev/data-center/mnt/glusterSD/192.168.3.10:_data/a08822ec-3f5b-4dba-ac2d-5510f0b4b6a2/dom_md/metadata', 'of=/dev/null', 'bs=4096', 'count=1', 'iflag=direct'] delay=0.00
DEBUG::2017-03-10 01:26:14,481::asyncevent::564::storage.asyncevent::(reap) Process <cpopen.CPopen object at 0x3ba6550> terminated (count=1)
DEBUG::2017-03-10 01:26:14,481::check::327::storage.check::(_check_completed) FINISH check u'/rhev/data-center/mnt/glusterSD/192.168.3.10:_data/a08822ec-3f5b-4dba-ac2d-5510f0b4b6a2/dom_md/metadata' rc=0 err=bytearray(b'0+1 records in\n0+1 records out\n300 bytes (300 B) copied, 8.7603e-05 s, 3.4 MB/s\n') elapsed=0.06
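
The VM_status notification in the excerpt above carries the same failure
reason as the traceback, in a form that is easier to process than the
surrounding log text. A small sketch of pulling the VM id, status and exit
message out of such an event follows. The function name summarize_vm_status
is hypothetical, and the payload is the one from this log with the mail line
wrapping removed:

```python
import json


def summarize_vm_status(event_text):
    """Return [(vm_id, status, exit_message)] from a VM_status event string.

    Hypothetical helper: entries in "params" that are not per-VM
    dictionaries (such as "notify_time") are skipped.
    """
    event = json.loads(event_text)
    rows = []
    for key, value in event["params"].items():
        if isinstance(value, dict) and "status" in value:
            rows.append((key, value["status"], value.get("exitMessage")))
    return rows


# Event payload copied from the vdsm log above, rejoined onto one string.
EVENT = (
    '{"params": {"2419f9fe-4998-4b7a-9fe9-151571d20379": '
    '{"status": "Down", "exitReason": 1, '
    '"exitMessage": "Failed to acquire lock: Permission denied", '
    '"exitCode": 1}, "notify_time": 4308740560}, "jsonrpc": "2.0", '
    '"method": "|virt|VM_status|2419f9fe-4998-4b7a-9fe9-151571d20379"}'
)

for vm_id, status, message in summarize_vm_status(EVENT):
    print("%s %s: %s" % (vm_id, status, message))
```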

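The storage.check entries above show the domain metadata being read with a
single direct-I/O dd call, finishing with rc=0 and 300 bytes read, so the
data domain at 192.168.3.10:_data was readable at that moment. A sketch of
extracting the byte count and copy time from that dd summary is below. This
is not vdsm's own parser, and the names DD_SUMMARY and parse_dd_stderr are
made up for illustration:

```python
import re

# Matches the GNU dd summary line, e.g.
# "300 bytes (300 B) copied, 8.7603e-05 s, 3.4 MB/s"
DD_SUMMARY = re.compile(
    r"(?P<bytes>\d+) bytes? \([^)]*\) copied, (?P<seconds>[0-9.eE+-]+) s"
)


def parse_dd_stderr(err):
    """Return (bytes_copied, seconds) from dd's stderr, or None if absent."""
    if isinstance(err, (bytes, bytearray)):
        err = err.decode("ascii", "replace")
    match = DD_SUMMARY.search(err)
    if match is None:
        return None
    return int(match.group("bytes")), float(match.group("seconds"))


# err value from the FINISH check line above, with the mail wrapping removed.
SAMPLE = bytearray(
    b"0+1 records in\n0+1 records out\n"
    b"300 bytes (300 B) copied, 8.7603e-05 s, 3.4 MB/s\n"
)

print(parse_dd_stderr(SAMPLE))
```

A result of None would mean dd printed no summary line, which is worth
treating as a failed read rather than as zero bytes.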

On 10 March 2017 at 10:40, Ian Neilsen <ian.neilsen at gmail.com> wrote:

> Hi All
>
> I had a storage issue with my Gluster volumes running under oVirt hosted
> engine. I now cannot start the hosted engine manager VM with
> "hosted-engine --vm-start".
> I've scoured the net to find a way, but can't seem to find anything
> concrete.
>
> Running CentOS 7, oVirt 4.0 and Gluster 3.8.9.
>
> How do I recover the engine manager? I'm at a loss!
>
> Engine status: the score was 0 on all nodes. Node 1 now reads 3400, but
> all others are still 0.
>
> {"reason": "bad vm status", "health": "bad", "vm": "down", "detail":
> "down"}
>
>
> Logs from agent.log
> ==================
>
> INFO::2017-03-09 19:32:52,600::state_decorators::51::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check) Global maintenance detected
> INFO::2017-03-09 19:32:52,603::hosted_engine::612::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_vdsm) Initializing VDSM
> INFO::2017-03-09 19:32:54,820::hosted_engine::639::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Connecting the storage
> INFO::2017-03-09 19:32:54,821::storage_server::219::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Connecting storage server
> INFO::2017-03-09 19:32:59,194::storage_server::226::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Connecting storage server
> INFO::2017-03-09 19:32:59,211::storage_server::233::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Refreshing the storage domain
> INFO::2017-03-09 19:32:59,328::hosted_engine::666::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Preparing images
> INFO::2017-03-09 19:32:59,328::image::126::ovirt_hosted_engine_ha.lib.image.Image::(prepare_images) Preparing images
> INFO::2017-03-09 19:33:01,748::hosted_engine::669::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Reloading vm.conf from the shared storage domain
> INFO::2017-03-09 19:33:01,748::config::206::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_local_conf_file) Trying to get a fresher copy of vm configuration from the OVF_STORE
> WARNING::2017-03-09 19:33:04,056::ovf_store::107::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(scan) Unable to find OVF_STORE
> ERROR::2017-03-09 19:33:04,058::config::235::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_local_conf_file) Unable to get vm.conf from OVF_STORE, falling back to initial vm.conf
>
> ovirt-ha-agent logs
> ================
>
> ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config ERROR Unable to get vm.conf from OVF_STORE, falling back to initial vm.conf
>
> vdsm
> ======
>
> vdsm vds.dispatcher ERROR SSL error during reading data: unexpected eof
>
> ovirt-ha-broker
> ============
>
> ovirt-ha-broker cpu_load_no_engine.EngineHealth ERROR Failed to getVmStats: 'pid'
>
> --
> Ian Neilsen
>
> Mobile: 0424 379 762
> Linkedin: http://au.linkedin.com/in/ianneilsen
> Twitter : ineilsen
>