[ovirt-users] Host non-responsive after engine failure

spfma.tech at e.mail.fr spfma.tech at e.mail.fr
Tue Mar 27 10:08:26 UTC 2018


Hi,

I had an electrical failure on the server hosting the engine.
After the reboot I was able to access it again and log into the GUI, but the node that is currently online stays in "not responsive" status.
The network storage paths are still mounted and the VMs are still running, but I cannot regain control of them.

In vdsmd.log, I see many messages like these:
2018-03-27 12:03:11,281+0200 INFO (vmrecovery) [vds] recovery: waiting for storage pool to go up (clientIF:674)
2018-03-27 12:03:16,286+0200 INFO (vmrecovery) [vdsm.api] START getConnectedStoragePoolsList(options=None) from=internal, task_id=b90f550e-ee68-4a91-a7c6-3b60f11c3978 (api:46)
2018-03-27 12:03:16,286+0200 INFO (vmrecovery) [vdsm.api] FINISH getConnectedStoragePoolsList return={'poollist': []} from=internal, task_id=b90f550e-ee68-4a91-a7c6-3b60f11c3978 (api:52)
2018-03-27 12:03:16,287+0200 INFO (vmrecovery) [vds] recovery: waiting for storage pool to go up (clientIF:674)
2018-03-27 12:03:18,413+0200 INFO (periodic/3) [vdsm.api] START repoStats(domains=()) from=internal, task_id=067714b4-8172-4eec-92bb-6ac16586a657 (api:46)
2018-03-27 12:03:18,413+0200 INFO (periodic/3) [vdsm.api] FINISH repoStats return={} from=internal, task_id=067714b4-8172-4eec-92bb-6ac16586a657 (api:52)
2018-03-27 12:03:18,413+0200 INFO (periodic/3) [vdsm.api] START multipath_health() from=internal, task_id=e97421fb-5d5a-4291-9231-94bc1961cc49 (api:46)
2018-03-27 12:03:18,413+0200 INFO (periodic/3) [vdsm.api] FINISH multipath_health return={} from=internal, task_id=e97421fb-5d5a-4291-9231-94bc1961cc49 (api:52)
2018-03-27 12:03:20,458+0200 INFO (jsonrpc/6) [api.host] START getAllVmStats() from=::1,57576 (api:46)
2018-03-27 12:03:20,462+0200 INFO (jsonrpc/6) [api.host] FINISH getAllVmStats return={'status': {'message': 'Done', 'code': 0}, 'statsList': (suppressed)} from=::1,57576 (api:52)
2018-03-27 12:03:20,464+0200 INFO (jsonrpc/6) [jsonrpc.JsonRpcServer] RPC call Host.getAllVmStats succeeded in 0.01 seconds (__init__:573)
2018-03-27 12:03:20,474+0200 INFO (jsonrpc/7) [api.host] START getAllVmIoTunePolicies() from=::1,57576 (api:46)
2018-03-27 12:03:20,475+0200 INFO (jsonrpc/7) [api.host] FINISH getAllVmIoTunePolicies return={'status': {'message': 'Done', 'code': 0}, 'io_tune_policies_dict': {'c33a30ba-7fe8-4ff4-aeac-80cb396b9670': {'policy': [], 'current_values': [{'ioTune': {'write_bytes_sec': 0L, 'total_iops_sec': 0L, 'read_iops_sec': 0L, 'read_bytes_sec': 0L, 'write_iops_sec': 0L, 'total_bytes_sec': 0L}, 'path': '/rhev/data-center/mnt/10.100.2.132:_volume2_ovirt__vms__1/07efa4fe-06bc-498e-8f42-035461aef900/images/593f6f61-cb7f-4c53-b6e7-617964c222e9/329b2e8b-6cf9-4b39-9190-14a32697ce44', 'name': 'sda'}]}, 'e8a90739-7737-413e-8edc-a373192f4476': {'policy': [], 'current_values': [{'ioTune': {'write_bytes_sec': 0L, 'total_iops_sec': 0L, 'read_iops_sec': 0L, 'read_bytes_sec': 0L, 'write_iops_sec': 0L, 'total_bytes_sec': 0L}, 'path': '/rhev/data-center/mnt/10.100.2.132:_volume2_ovirt__vms__1/07efa4fe-06bc-498e-8f42-035461aef900/images/97e078f7-69c6-46c2-b620-26474cd65929/bbb4a1fb-5594-4750-be71-c6b55dca3257', 'name': 'vda'}]}, '3aec5ce4-691f-487c-a916-aa7f7a664d8c': {'policy': [], 'current_values': [{'ioTune': {'write_bytes_sec': 0L, 'total_iops_sec': 0L, 'read_iops_sec': 0L, 'read_bytes_sec': 0L, 'write_iops_sec': 0L, 'total_bytes_sec': 0L}, 'path': '/rhev/data-center/mnt/10.100.2.132:_volume2_ovirt__vms__1/07efa4fe-06bc-498e-8f42-035461aef900/images/46a65a1b-d00a-452d-ab9b-70862bb5c053/a4d2ad44-5577-4412-9a8c-819d1f12647a', 'name': 'sda'}, {'ioTune': {'write_bytes_sec': 0L, 'total_iops_sec': 0L, 'read_iops_sec': 0L, 'read_bytes_sec': 0L, 'write_iops_sec': 0L, 'total_bytes_sec': 0L}, 'path': '/rhev/data-center/mnt/10.100.2.132:_volume2_ovirt__vms__1/07efa4fe-06bc-498e-8f42-035461aef900/images/0c3a13ce-8f7a-4034-a8cc-12f795b8aa17/c48e0e37-e54b-4ca3-b3ed-b66ead9fad44', 'name': 'sdb'}]}, '5de1de8f-ac01-459f-b4b8-6d1ed05c8ca3': {'policy': [], 'current_values': [{'ioTune': {'write_bytes_sec': 0L, 'total_iops_sec': 0L, 'read_iops_sec': 0L, 'read_bytes_sec': 0L, 'write_iops_sec': 0L, 
'total_bytes_sec': 0L}, 'path': '/rhev/data-center/mnt/10.100.2.132:_volume2_ovirt__vms__1/07efa4fe-06bc-498e-8f42-035461aef900/images/320ac81c-7db7-4ec0-a271-755e91442b6a/8bfc95c5-318c-43dd-817f-6c7a8a7a5b43', 'name': 'sda'}, {'ioTune': {'write_bytes_sec': 0L, 'total_iops_sec': 0L, 'read_iops_sec': 0L, 'read_bytes_sec': 0L, 'write_iops_sec': 0L, 'total_bytes_sec': 0L}, 'path': '/rhev/data-center/mnt/10.100.2.132:_volume2_ovirt__vms__1/07efa4fe-06bc-498e-8f42-035461aef900/images/e7ad86bb-3c63-466b-82cf-687164c46f7b/613ea0ce-ed14-4185-b3fd-36490441f889', 'name': 'sdb'}]}, '5d548a09-a397-4aac-8b1f-39002e014f5f': {'policy': [], 'current_values': [{'ioTune': {'write_bytes_sec': 0L, 'total_iops_sec': 0L, 'read_iops_sec': 0L, 'read_bytes_sec': 0L, 'write_iops_sec': 0L, 'total_bytes_sec': 0L}, 'path': '/rhev/data-center/mnt/10.100.2.132:_volume2_ovirt__vms__1/07efa4fe-06bc-498e-8f42-035461aef900/images/c7421014-7c5f-45ad-a948-caa83b8ce3e7/ae0ba893-69af-4b67-a262-b739596d5c95', 'name': 'sda'}]}, '168b01b1-5ec8-41dd-808e-fa9f66cea718': {'policy': [], 'current_values': [{'ioTune': {'write_bytes_sec': 0L, 'total_iops_sec': 0L, 'read_iops_sec': 0L, 'read_bytes_sec': 0L, 'write_iops_sec': 0L, 'total_bytes_sec': 0L}, 'path': '/rhev/data-center/mnt/10.100.2.132:_volume2_ovirt__vms__1/07efa4fe-06bc-498e-8f42-035461aef900/images/b9b7902a-7a62-4826-bfda-dff260b9fcd1/d05db17c-9908-4bfb-a74b-4aa944510a56', 'name': 'vda'}, {'ioTune': {'write_bytes_sec': 0L, 'total_iops_sec': 0L, 'read_iops_sec': 0L, 'read_bytes_sec': 0L, 'write_iops_sec': 0L, 'total_bytes_sec': 0L}, 'path': '/rhev/data-center/mnt/10.100.2.132:_volume2_ovirt__vms__1/07efa4fe-06bc-498e-8f42-035461aef900/images/564b3848-b6d5-4deb-910f-5b6f2fdbccc5/4f89ff25-2d3b-40b9-9bbc-9a6b6995346c', 'name': 'vdb'}, {'ioTune': {'write_bytes_sec': 0L, 'total_iops_sec': 0L, 'read_iops_sec': 0L, 'read_bytes_sec': 0L, 'write_iops_sec': 0L, 'total_bytes_sec': 0L}, 'path': 
'/rhev/data-center/mnt/10.100.2.132:_volume2_ovirt__vms__1/07efa4fe-06bc-498e-8f42-035461aef900/images/738e0704-8484-483b-ae67-091715496152/2f811423-6bab-4966-9c00-9d3b72429328', 'name': 'vdc'}]}}} from=::1,57576 (api:52)
2018-03-27 12:03:20,475+0200 INFO (jsonrpc/7) [jsonrpc.JsonRpcServer] RPC call Host.getAllVmIoTunePolicies succeeded in 0.00 seconds (__init__:573)
2018-03-27 12:03:21,292+0200 INFO (vmrecovery) [vdsm.api] START getConnectedStoragePoolsList(options=None) from=internal, task_id=a35602b2-7d5c-4e87-86cd-ede17c62488f (api:46)
2018-03-27 12:03:21,292+0200 INFO (vmrecovery) [vdsm.api] FINISH getConnectedStoragePoolsList return={'poollist': []} from=internal, task_id=a35602b2-7d5c-4e87-86cd-ede17c62488f (api:52)
2018-03-27 12:03:21,293+0200 INFO (vmrecovery) [vds] recovery: waiting for storage pool to go up (clientIF:674)

So I see no errors there.
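
To double-check that reading of the log, here is a minimal sketch that tallies the log levels in lines shaped like the excerpt above and counts how often the "waiting for storage pool to go up" message repeats. The regular expression is inferred only from the lines quoted here, and the names `LINE_RE` and `summarize` are made up for illustration; neither is part of vdsm. Wrapped continuation lines that do not start with a timestamp are skipped.

```python
import re
from collections import Counter

# Assumed line layout, inferred from the excerpt above (not from vdsm itself):
# "<date> <time>,<ms><tz> <LEVEL> (<thread>) [<logger>] <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}[+-]\d{4}) "
    r"(?P<level>[A-Z]+) +\((?P<thread>[^)]+)\) \[(?P<logger>[^\]]+)\] (?P<msg>.*)$"
)


def summarize(lines):
    """Return (Counter of log levels, number of 'waiting for storage pool' lines)."""
    levels = Counter()
    waiting = 0
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # wrapped continuation line or a different log format
        levels[match.group("level")] += 1
        if "waiting for storage pool to go up" in match.group("msg"):
            waiting += 1
    return levels, waiting


# Three lines copied from the excerpt above.
sample = [
    "2018-03-27 12:03:11,281+0200 INFO (vmrecovery) [vds] recovery: waiting for storage pool to go up (clientIF:674)",
    "2018-03-27 12:03:16,286+0200 INFO (vmrecovery) [vdsm.api] FINISH getConnectedStoragePoolsList return={'poollist': []} from=internal, task_id=b90f550e-ee68-4a91-a7c6-3b60f11c3978 (api:52)",
    "2018-03-27 12:03:16,287+0200 INFO (vmrecovery) [vds] recovery: waiting for storage pool to go up (clientIF:674)",
]

levels, waiting = summarize(sample)
print(dict(levels), waiting)
```

Feeding it the whole log file (for example `summarize(open(path))`, with `path` pointing at the log) would show whether anything above INFO level appears at all, and how long the recovery thread has been waiting for a storage pool.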

But in the system messages log:
Mar 27 12:01:43 pfm-srv-virt-2 libvirtd: 2018-03-27 10:01:43.569+0000: 71793: error : qemuDomainAgentAvailable:6030 : Guest agent is not responding: QEMU guest agent is not connected

I have restarted the libvirtd and vdsmd services.

Is there anything else I should try?

Regards 
