
Hi,

I came into the office this morning to an issue with my oVirt setup. On one of my hosts the / partition was completely full, causing the host to go into an unknown state. I was able to clear out some space for the time being and am attempting to recover that host. The VMs are still running and responding on the host.

I am using Gluster volumes in my configuration, and had to restart the gluster service on that host. I also restarted the ovirt-ha-agent service.

I am seeing this entry in my agent.log every two seconds:

MainThread::INFO::2017-02-02 09:11:19,606::util::214::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(connect_vdsm_json_rpc) Waiting for VDSM hardware info

In my vdsm.log I am seeing this:

jsonrpc.Executor/4::INFO::2017-02-02 09:13:42,088::__init__::525::jsonrpc.JsonRpcServer::(_handle_request) In recovery, ignoring 'Host.getAllVmStats' in bridge with {}
jsonrpc.Executor/4::INFO::2017-02-02 09:13:42,088::__init__::513::jsonrpc.JsonRpcServer::(_serveRequest) RPC call Host.getAllVmStats failed (error 99) in 0.00 seconds
jsonrpc.Executor/5::INFO::2017-02-02 09:13:42,114::__init__::525::jsonrpc.JsonRpcServer::(_handle_request) In recovery, ignoring 'Host.getHardwareInfo' in bridge with {}
jsonrpc.Executor/5::INFO::2017-02-02 09:13:42,115::__init__::513::jsonrpc.JsonRpcServer::(_serveRequest) RPC call Host.getHardwareInfo failed (error 99) in 0.00 seconds
jsonrpc.Executor/6::INFO::2017-02-02 09:13:44,121::__init__::525::jsonrpc.JsonRpcServer::(_handle_request) In recovery, ignoring 'Host.getHardwareInfo' in bridge with {}
jsonrpc.Executor/6::INFO::2017-02-02 09:13:44,122::__init__::513::jsonrpc.JsonRpcServer::(_serveRequest) RPC call Host.getHardwareInfo failed (error 99) in 0.00 seconds
jsonrpc.Executor/7::INFO::2017-02-02 09:13:46,127::__init__::525::jsonrpc.JsonRpcServer::(_handle_request) In recovery, ignoring 'Host.getHardwareInfo' in bridge with {}
jsonrpc.Executor/7::INFO::2017-02-02 09:13:46,127::__init__::513::jsonrpc.JsonRpcServer::(_serveRequest) RPC call Host.getHardwareInfo failed (error 99) in 0.00 seconds
clientIFinit::DEBUG::2017-02-02 09:13:46,257::task::597::Storage.TaskManager.Task::(_updateState) Task=`ba88b701-9f7a-488e-9c80-3d61cec38053`::moving from state init -> state preparing
clientIFinit::INFO::2017-02-02 09:13:46,258::logUtils::49::dispatcher::(wrapper) Run and protect: getConnectedStoragePoolsList(options=None)
clientIFinit::INFO::2017-02-02 09:13:46,258::logUtils::52::dispatcher::(wrapper) Run and protect: getConnectedStoragePoolsList, Return response: {'poollist': []}
clientIFinit::DEBUG::2017-02-02 09:13:46,258::task::1193::Storage.TaskManager.Task::(prepare) Task=`ba88b701-9f7a-488e-9c80-3d61cec38053`::finished: {'poollist': []}
clientIFinit::DEBUG::2017-02-02 09:13:46,258::task::597::Storage.TaskManager.Task::(_updateState) Task=`ba88b701-9f7a-488e-9c80-3d61cec38053`::moving from state preparing -> state finished
clientIFinit::DEBUG::2017-02-02 09:13:46,258::resourceManager::952::Storage.ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}
clientIFinit::DEBUG::2017-02-02 09:13:46,259::resourceManager::989::Storage.ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
clientIFinit::DEBUG::2017-02-02 09:13:46,259::task::995::Storage.TaskManager.Task::(_decref) Task=`ba88b701-9f7a-488e-9c80-3d61cec38053`::ref 0 aborting False
clientIFinit::INFO::2017-02-02 09:13:46,259::clientIF::558::vds::(_waitForStoragePool) recovery: waiting for storage pool to go up
jsonrpc.Executor/0::INFO::2017-02-02 09:13:48,133::__init__::525::jsonrpc.JsonRpcServer::(_handle_request) In recovery, ignoring 'Host.getHardwareInfo' in bridge with {}
jsonrpc.Executor/0::INFO::2017-02-02 09:13:48,134::__init__::513::jsonrpc.JsonRpcServer::(_serveRequest) RPC call Host.getHardwareInfo failed (error 99) in 0.00 seconds
jsonrpc.Executor/1::INFO::2017-02-02 09:13:50,140::__init__::525::jsonrpc.JsonRpcServer::(_handle_request) In recovery, ignoring 'Host.getHardwareInfo' in bridge with {}
jsonrpc.Executor/1::INFO::2017-02-02 09:13:50,140::__init__::513::jsonrpc.JsonRpcServer::(_serveRequest) RPC call Host.getHardwareInfo failed (error 99) in 0.00 seconds
clientIFinit::DEBUG::2017-02-02 09:13:51,265::task::597::Storage.TaskManager.Task::(_updateState) Task=`e8c558b1-f5d0-49ea-ac92-51660d03636e`::moving from state init -> state preparing
clientIFinit::INFO::2017-02-02 09:13:51,265::logUtils::49::dispatcher::(wrapper) Run and protect: getConnectedStoragePoolsList(options=None)
clientIFinit::INFO::2017-02-02 09:13:51,265::logUtils::52::dispatcher::(wrapper) Run and protect: getConnectedStoragePoolsList, Return response: {'poollist': []}
clientIFinit::DEBUG::2017-02-02 09:13:51,265::task::1193::Storage.TaskManager.Task::(prepare) Task=`e8c558b1-f5d0-49ea-ac92-51660d03636e`::finished: {'poollist': []}
clientIFinit::DEBUG::2017-02-02 09:13:51,266::task::597::Storage.TaskManager.Task::(_updateState) Task=`e8c558b1-f5d0-49ea-ac92-51660d03636e`::moving from state preparing -> state finished
clientIFinit::DEBUG::2017-02-02 09:13:51,266::resourceManager::952::Storage.ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}
clientIFinit::DEBUG::2017-02-02 09:13:51,266::resourceManager::989::Storage.ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
clientIFinit::DEBUG::2017-02-02 09:13:51,266::task::995::Storage.TaskManager.Task::(_decref) Task=`e8c558b1-f5d0-49ea-ac92-51660d03636e`::ref 0 aborting False
clientIFinit::INFO::2017-02-02 09:13:51,266::clientIF::558::vds::(_waitForStoragePool) recovery: waiting for storage pool to go up
jsonrpc.Executor/2::INFO::2017-02-02 09:13:52,146::__init__::525::jsonrpc.JsonRpcServer::(_handle_request) In recovery, ignoring 'Host.getHardwareInfo' in bridge with {}
jsonrpc.Executor/2::INFO::2017-02-02 09:13:52,147::__init__::513::jsonrpc.JsonRpcServer::(_serveRequest) RPC call Host.getHardwareInfo failed (error 99) in 0.01 seconds

I attempted to move my service into maintenance mode, but it is unable to migrate any of the VMs on that host over to another host. Is there a way to get my host running again without a reboot, or to migrate my VMs via CLI so I can restart my host?

Thanks
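
For anyone triaging a similar loop, here is a minimal Python sketch (not from the original post) that tallies which RPC methods VDSM is rejecting and with which error code. The regular expression is derived only from the log lines quoted above, so other VDSM versions may log differently; the function name `count_failed_rpcs` is hypothetical.

```python
import re
from collections import Counter

# Pattern taken from the "RPC call ... failed (error N)" lines in the excerpt.
# Assumption: the message format matches the 2017-era log shown above.
FAILED_RPC = re.compile(r"RPC call (\S+) failed \(error (\d+)\)")


def count_failed_rpcs(log_text):
    """Return a Counter keyed by (method, error_code) for failed RPC calls."""
    return Counter(
        (match.group(1), int(match.group(2)))
        for match in FAILED_RPC.finditer(log_text)
    )


# Two lines copied from the vdsm.log excerpt above.
sample = (
    "jsonrpc.Executor/4::INFO::2017-02-02 09:13:42,088::__init__::513::"
    "jsonrpc.JsonRpcServer::(_serveRequest) RPC call Host.getAllVmStats "
    "failed (error 99) in 0.00 seconds\n"
    "jsonrpc.Executor/5::INFO::2017-02-02 09:13:42,115::__init__::513::"
    "jsonrpc.JsonRpcServer::(_serveRequest) RPC call Host.getHardwareInfo "
    "failed (error 99) in 0.00 seconds\n"
)
counts = count_failed_rpcs(sample)
for (method, code), n in sorted(counts.items()):
    print(f"{method}: error {code} x{n}")
```

If every failure carries the same error code, as in this excerpt, the calls are likely being refused for one shared reason rather than failing individually.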

Hi,

VDSM error 99 means RecoveryInProgress, and recovery might take some time depending on how many VMs there are. So I suggest you wait a bit longer for now and see what happens.

Best regards

--
Martin Sivak
oVirt / SLA

On Thu, Feb 2, 2017 at 4:18 PM, Bryan Sockel <Bryan.Sockel@altn.com> wrote:
> I attempted to move my service into maintenance mode, but it is unable to
> migrate any of the VMs on that host over to another host. Is there a way
> to get my host running again without a reboot, or to migrate my VMs via
> CLI so I can restart my host?
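
Following the suggestion to wait out the recovery, one rough way to tell whether VDSM is still recovering is to check the tail of vdsm.log for the two messages quoted in this thread. The Python sketch below is only a heuristic under stated assumptions: the marker strings are copied from the excerpt, the function name `still_in_recovery` and the window size are hypothetical, and the log path in the usage comment is the usual default but should be verified on your host.

```python
from collections import deque

# Marker strings copied from the vdsm.log excerpt in this thread.
RECOVERY_MARKERS = (
    "In recovery, ignoring",
    "recovery: waiting for storage pool to go up",
)


def still_in_recovery(lines, window=20):
    """Return True if any of the last `window` lines has a recovery marker."""
    tail = deque(lines, maxlen=window)  # keeps only the newest `window` lines
    return any(marker in line for line in tail for marker in RECOVERY_MARKERS)


# Usage sketch (path is an assumption; adjust for your installation):
#     with open("/var/log/vdsm/vdsm.log") as log:
#         print(still_in_recovery(log))
```

An absence of these markers only means the messages have stopped appearing; it does not by itself show that the host is healthy again.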