[ovirt-users] Host Issue

Bryan Sockel Bryan.Sockel at altn.com
Thu Feb 2 15:18:00 UTC 2017


Hi,

I came into the office this morning to an issue with my oVirt setup.  On one 
of my hosts the / partition was completely full, causing the host to go into 
an unknown state.  I was able to clear out some space for the time being and 
am now attempting to recover that host.  VMs are still running and responding 
on the host.
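
For anyone wanting to confirm the same condition on their own host, here is a 
minimal sketch of a usage check on /.  It relies only on Python's standard 
shutil.disk_usage; the function name root_usage_percent is just an 
illustrative name, not anything from oVirt or VDSM.

```python
import shutil


def root_usage_percent(path="/"):
    """Return the percentage of used space on the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    # used / total ignores blocks reserved for root, so the figure can be
    # slightly lower than the Use% column reported by `df`.
    return 100.0 * usage.used / usage.total


if __name__ == "__main__":
    print("/ is {:.1f}% full".format(root_usage_percent("/")))
```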

I am using Gluster volumes in my configuration, and had to restart the gluster 
service on that host.  I also restarted the ovirt-ha-agent service.

I am seeing this entry in my agent.log every two seconds:

MainThread::INFO::2017-02-02 
09:11:19,606::util::214::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(connect_vdsm_json_rpc) 
Waiting for VDSM hardware info

In my vdsm.log I am seeing this:
jsonrpc.Executor/4::INFO::2017-02-02 
09:13:42,088::__init__::525::jsonrpc.JsonRpcServer::(_handle_request) In 
recovery, ignoring 'Host.getAllVmStats' in bridge with {}
jsonrpc.Executor/4::INFO::2017-02-02 
09:13:42,088::__init__::513::jsonrpc.JsonRpcServer::(_serveRequest) RPC call 
Host.getAllVmStats failed (error 99) in 0.00 seconds
jsonrpc.Executor/5::INFO::2017-02-02 
09:13:42,114::__init__::525::jsonrpc.JsonRpcServer::(_handle_request) In 
recovery, ignoring 'Host.getHardwareInfo' in bridge with {}
jsonrpc.Executor/5::INFO::2017-02-02 
09:13:42,115::__init__::513::jsonrpc.JsonRpcServer::(_serveRequest) RPC call 
Host.getHardwareInfo failed (error 99) in 0.00 seconds
jsonrpc.Executor/6::INFO::2017-02-02 
09:13:44,121::__init__::525::jsonrpc.JsonRpcServer::(_handle_request) In 
recovery, ignoring 'Host.getHardwareInfo' in bridge with {}
jsonrpc.Executor/6::INFO::2017-02-02 
09:13:44,122::__init__::513::jsonrpc.JsonRpcServer::(_serveRequest) RPC call 
Host.getHardwareInfo failed (error 99) in 0.00 seconds
jsonrpc.Executor/7::INFO::2017-02-02 
09:13:46,127::__init__::525::jsonrpc.JsonRpcServer::(_handle_request) In 
recovery, ignoring 'Host.getHardwareInfo' in bridge with {}
jsonrpc.Executor/7::INFO::2017-02-02 
09:13:46,127::__init__::513::jsonrpc.JsonRpcServer::(_serveRequest) RPC call 
Host.getHardwareInfo failed (error 99) in 0.00 seconds
clientIFinit::DEBUG::2017-02-02 
09:13:46,257::task::597::Storage.TaskManager.Task::(_updateState) 
Task=`ba88b701-9f7a-488e-9c80-3d61cec38053`::moving from state init -> state 
preparing
clientIFinit::INFO::2017-02-02 
09:13:46,258::logUtils::49::dispatcher::(wrapper) Run and protect: 
getConnectedStoragePoolsList(options=None)
clientIFinit::INFO::2017-02-02 
09:13:46,258::logUtils::52::dispatcher::(wrapper) Run and protect: 
getConnectedStoragePoolsList, Return response: {'poollist': []}
clientIFinit::DEBUG::2017-02-02 
09:13:46,258::task::1193::Storage.TaskManager.Task::(prepare) 
Task=`ba88b701-9f7a-488e-9c80-3d61cec38053`::finished: {'poollist': []}
clientIFinit::DEBUG::2017-02-02 
09:13:46,258::task::597::Storage.TaskManager.Task::(_updateState) 
Task=`ba88b701-9f7a-488e-9c80-3d61cec38053`::moving from state preparing -> 
state finished
clientIFinit::DEBUG::2017-02-02 
09:13:46,258::resourceManager::952::Storage.ResourceManager.Owner::(releaseAll) 
Owner.releaseAll requests {} resources {}
clientIFinit::DEBUG::2017-02-02 
09:13:46,259::resourceManager::989::Storage.ResourceManager.Owner::(cancelAll) 
Owner.cancelAll requests {}
clientIFinit::DEBUG::2017-02-02 
09:13:46,259::task::995::Storage.TaskManager.Task::(_decref) 
Task=`ba88b701-9f7a-488e-9c80-3d61cec38053`::ref 0 aborting False
clientIFinit::INFO::2017-02-02 
09:13:46,259::clientIF::558::vds::(_waitForStoragePool) recovery: waiting 
for storage pool to go up
jsonrpc.Executor/0::INFO::2017-02-02 
09:13:48,133::__init__::525::jsonrpc.JsonRpcServer::(_handle_request) In 
recovery, ignoring 'Host.getHardwareInfo' in bridge with {}
jsonrpc.Executor/0::INFO::2017-02-02 
09:13:48,134::__init__::513::jsonrpc.JsonRpcServer::(_serveRequest) RPC call 
Host.getHardwareInfo failed (error 99) in 0.00 seconds
jsonrpc.Executor/1::INFO::2017-02-02 
09:13:50,140::__init__::525::jsonrpc.JsonRpcServer::(_handle_request) In 
recovery, ignoring 'Host.getHardwareInfo' in bridge with {}
jsonrpc.Executor/1::INFO::2017-02-02 
09:13:50,140::__init__::513::jsonrpc.JsonRpcServer::(_serveRequest) RPC call 
Host.getHardwareInfo failed (error 99) in 0.00 seconds
clientIFinit::DEBUG::2017-02-02 
09:13:51,265::task::597::Storage.TaskManager.Task::(_updateState) 
Task=`e8c558b1-f5d0-49ea-ac92-51660d03636e`::moving from state init -> state 
preparing
clientIFinit::INFO::2017-02-02 
09:13:51,265::logUtils::49::dispatcher::(wrapper) Run and protect: 
getConnectedStoragePoolsList(options=None)
clientIFinit::INFO::2017-02-02 
09:13:51,265::logUtils::52::dispatcher::(wrapper) Run and protect: 
getConnectedStoragePoolsList, Return response: {'poollist': []}
clientIFinit::DEBUG::2017-02-02 
09:13:51,265::task::1193::Storage.TaskManager.Task::(prepare) 
Task=`e8c558b1-f5d0-49ea-ac92-51660d03636e`::finished: {'poollist': []}
clientIFinit::DEBUG::2017-02-02 
09:13:51,266::task::597::Storage.TaskManager.Task::(_updateState) 
Task=`e8c558b1-f5d0-49ea-ac92-51660d03636e`::moving from state preparing -> 
state finished
clientIFinit::DEBUG::2017-02-02 
09:13:51,266::resourceManager::952::Storage.ResourceManager.Owner::(releaseAll) 
Owner.releaseAll requests {} resources {}
clientIFinit::DEBUG::2017-02-02 
09:13:51,266::resourceManager::989::Storage.ResourceManager.Owner::(cancelAll) 
Owner.cancelAll requests {}
clientIFinit::DEBUG::2017-02-02 
09:13:51,266::task::995::Storage.TaskManager.Task::(_decref) 
Task=`e8c558b1-f5d0-49ea-ac92-51660d03636e`::ref 0 aborting False
clientIFinit::INFO::2017-02-02 
09:13:51,266::clientIF::558::vds::(_waitForStoragePool) recovery: waiting 
for storage pool to go up
jsonrpc.Executor/2::INFO::2017-02-02 
09:13:52,146::__init__::525::jsonrpc.JsonRpcServer::(_handle_request) In 
recovery, ignoring 'Host.getHardwareInfo' in bridge with {}
jsonrpc.Executor/2::INFO::2017-02-02 
09:13:52,147::__init__::513::jsonrpc.JsonRpcServer::(_serveRequest) RPC call 
Host.getHardwareInfo failed (error 99) in 0.01 seconds
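
To summarize an excerpt like the one above, here is a rough sketch that 
counts which RPC calls VDSM ignored while in recovery.  It assumes the 
message text "In recovery, ignoring '<method>'" exactly as it appears in the 
log lines pasted here; the names IGNORED_RE and count_ignored_calls are 
illustrative only, and the sample is copied from the excerpt above.

```python
import re
from collections import Counter

# Matches the recovery message as it appears in the excerpt. \s+ is used
# between words because the mail client wrapped the log lines, so a line
# break can fall between "In" and "recovery".
IGNORED_RE = re.compile(r"In\s+recovery,\s+ignoring\s+'([\w.]+)'")


def count_ignored_calls(log_text):
    """Count ignored RPC methods in a chunk of vdsm.log text."""
    return Counter(IGNORED_RE.findall(log_text))


# Three records taken from the excerpt, with the original line wrapping.
SAMPLE = """\
jsonrpc.Executor/4::INFO::2017-02-02
09:13:42,088::__init__::525::jsonrpc.JsonRpcServer::(_handle_request) In
recovery, ignoring 'Host.getAllVmStats' in bridge with {}
jsonrpc.Executor/5::INFO::2017-02-02
09:13:42,114::__init__::525::jsonrpc.JsonRpcServer::(_handle_request) In
recovery, ignoring 'Host.getHardwareInfo' in bridge with {}
jsonrpc.Executor/6::INFO::2017-02-02
09:13:44,121::__init__::525::jsonrpc.JsonRpcServer::(_handle_request) In
recovery, ignoring 'Host.getHardwareInfo' in bridge with {}
"""

if __name__ == "__main__":
    for method, count in count_ignored_calls(SAMPLE).most_common():
        print(method, count)
```

To run it against a real log, read the file contents and pass them to 
count_ignored_calls in place of SAMPLE.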

I attempted to move the host into maintenance mode, but it is unable to 
migrate any of the VMs on that host over to another host.  Is there a way to 
get my host running again without a reboot, or to migrate my VMs via the CLI 
so I can restart the host?

Thanks