Host non-responsive after engine failure

Hi,

I had an electrical failure on the server hosting the engine.
After the reboot I was able to reach the engine again and log into the GUI, but the node that is currently online does not leave the "not responsive" status.
The network storage paths are still mounted and the VMs are running, but I cannot regain control of the host.

In vdsmd.log, I have a lot of messages like these:

2018-03-27 12:03:11,281+0200 INFO (vmrecovery) [vds] recovery: waiting for storage pool to go up (clientIF:674)
2018-03-27 12:03:16,286+0200 INFO (vmrecovery) [vdsm.api] START getConnectedStoragePoolsList(options=None) from=internal, task_id=b90f550e-ee68-4a91-a7c6-3b60f11c3978 (api:46)
2018-03-27 12:03:16,286+0200 INFO (vmrecovery) [vdsm.api] FINISH getConnectedStoragePoolsList return={'poollist': []} from=internal, task_id=b90f550e-ee68-4a91-a7c6-3b60f11c3978 (api:52)
2018-03-27 12:03:16,287+0200 INFO (vmrecovery) [vds] recovery: waiting for storage pool to go up (clientIF:674)
2018-03-27 12:03:18,413+0200 INFO (periodic/3) [vdsm.api] START repoStats(domains=()) from=internal, task_id=067714b4-8172-4eec-92bb-6ac16586a657 (api:46)
2018-03-27 12:03:18,413+0200 INFO (periodic/3) [vdsm.api] FINISH repoStats return={} from=internal, task_id=067714b4-8172-4eec-92bb-6ac16586a657 (api:52)
2018-03-27 12:03:18,413+0200 INFO (periodic/3) [vdsm.api] START multipath_health() from=internal, task_id=e97421fb-5d5a-4291-9231-94bc1961cc49 (api:46)
2018-03-27 12:03:18,413+0200 INFO (periodic/3) [vdsm.api] FINISH multipath_health return={} from=internal, task_id=e97421fb-5d5a-4291-9231-94bc1961cc49 (api:52)
2018-03-27 12:03:20,458+0200 INFO (jsonrpc/6) [api.host] START getAllVmStats() from=::1,57576 (api:46)
2018-03-27 12:03:20,462+0200 INFO (jsonrpc/6) [api.host] FINISH getAllVmStats return={'status': {'message': 'Done', 'code': 0}, 'statsList': (suppressed)} from=::1,57576 (api:52)
2018-03-27 12:03:20,464+0200 INFO (jsonrpc/6) [jsonrpc.JsonRpcServer] RPC call Host.getAllVmStats succeeded in 0.01 seconds (__init__:573)
2018-03-27 12:03:20,474+0200 INFO (jsonrpc/7) [api.host] START getAllVmIoTunePolicies() from=::1,57576 (api:46)
2018-03-27 12:03:20,475+0200 INFO (jsonrpc/7) [api.host] FINISH getAllVmIoTunePolicies return={'status': {'message': 'Done', 'code': 0}, 'io_tune_policies_dict': {'c33a30ba-7fe8-4ff4-aeac-80cb396b9670': {'policy': [], 'current_values': [{'ioTune': {'write_bytes_sec': 0L, 'total_iops_sec': 0L, 'read_iops_sec': 0L, 'read_bytes_sec': 0L, 'write_iops_sec': 0L, 'total_bytes_sec': 0L}, 'path': '/rhev/data-center/mnt/10.100.2.132:_volume2_ovirt__vms__1/07efa4fe-06bc-498e-8f42-035461aef900/images/593f6f61-cb7f-4c53-b6e7-617964c222e9/329b2e8b-6cf9-4b39-9190-14a32697ce44', 'name': 'sda'}]}, 'e8a90739-7737-413e-8edc-a373192f4476': {'policy': [], 'current_values': [{'ioTune': {'write_bytes_sec': 0L, 'total_iops_sec': 0L, 'read_iops_sec': 0L, 'read_bytes_sec': 0L, 'write_iops_sec': 0L, 'total_bytes_sec': 0L}, 'path': '/rhev/data-center/mnt/10.100.2.132:_volume2_ovirt__vms__1/07efa4fe-06bc-498e-8f42-035461aef900/images/97e078f7-69c6-46c2-b620-26474cd65929/bbb4a1fb-5594-4750-be71-c6b55dca3257', 'name': 'vda'}]}, '3aec5ce4-691f-487c-a916-aa7f7a664d8c': {'policy': [], 'current_values': [{'ioTune': {'write_bytes_sec': 0L, 'total_iops_sec': 0L, 'read_iops_sec': 0L, 'read_bytes_sec': 0L, 'write_iops_sec': 0L, 'total_bytes_sec': 0L}, 'path': '/rhev/data-center/mnt/10.100.2.132:_volume2_ovirt__vms__1/07efa4fe-06bc-498e-8f42-035461aef900/images/46a65a1b-d00a-452d-ab9b-70862bb5c053/a4d2ad44-5577-4412-9a8c-819d1f12647a', 'name': 'sda'}, {'ioTune': {'write_bytes_sec': 0L, 'total_iops_sec': 0L, 'read_iops_sec': 0L, 'read_bytes_sec': 0L, 'write_iops_sec': 0L, 'total_bytes_sec': 0L}, 'path': '/rhev/data-center/mnt/10.100.2.132:_volume2_ovirt__vms__1/07efa4fe-06bc-498e-8f42-035461aef900/images/0c3a13ce-8f7a-4034-a8cc-12f795b8aa17/c48e0e37-e54b-4ca3-b3ed-b66ead9fad44', 'name': 'sdb'}]}, '5de1de8f-ac01-459f-b4b8-6d1ed05c8ca3': {'policy': [], 'current_values': [{'ioTune': {'write_bytes_sec': 0L, 'total_iops_sec': 0L, 'read_iops_sec': 0L, 'read_bytes_sec': 0L, 'write_iops_sec': 0L, 'total_bytes_sec': 0L}, 'path': '/rhev/data-center/mnt/10.100.2.132:_volume2_ovirt__vms__1/07efa4fe-06bc-498e-8f42-035461aef900/images/320ac81c-7db7-4ec0-a271-755e91442b6a/8bfc95c5-318c-43dd-817f-6c7a8a7a5b43', 'name': 'sda'}, {'ioTune': {'write_bytes_sec': 0L, 'total_iops_sec': 0L, 'read_iops_sec': 0L, 'read_bytes_sec': 0L, 'write_iops_sec': 0L, 'total_bytes_sec': 0L}, 'path': '/rhev/data-center/mnt/10.100.2.132:_volume2_ovirt__vms__1/07efa4fe-06bc-498e-8f42-035461aef900/images/e7ad86bb-3c63-466b-82cf-687164c46f7b/613ea0ce-ed14-4185-b3fd-36490441f889', 'name': 'sdb'}]}, '5d548a09-a397-4aac-8b1f-39002e014f5f': {'policy': [], 'current_values': [{'ioTune': {'write_bytes_sec': 0L, 'total_iops_sec': 0L, 'read_iops_sec': 0L, 'read_bytes_sec': 0L, 'write_iops_sec': 0L, 'total_bytes_sec': 0L}, 'path': '/rhev/data-center/mnt/10.100.2.132:_volume2_ovirt__vms__1/07efa4fe-06bc-498e-8f42-035461aef900/images/c7421014-7c5f-45ad-a948-caa83b8ce3e7/ae0ba893-69af-4b67-a262-b739596d5c95', 'name': 'sda'}]}, '168b01b1-5ec8-41dd-808e-fa9f66cea718': {'policy': [], 'current_values': [{'ioTune': {'write_bytes_sec': 0L, 'total_iops_sec': 0L, 'read_iops_sec': 0L, 'read_bytes_sec': 0L, 'write_iops_sec': 0L, 'total_bytes_sec': 0L}, 'path': '/rhev/data-center/mnt/10.100.2.132:_volume2_ovirt__vms__1/07efa4fe-06bc-498e-8f42-035461aef900/images/b9b7902a-7a62-4826-bfda-dff260b9fcd1/d05db17c-9908-4bfb-a74b-4aa944510a56', 'name': 'vda'}, {'ioTune': {'write_bytes_sec': 0L, 'total_iops_sec': 0L, 'read_iops_sec': 0L, 'read_bytes_sec': 0L, 'write_iops_sec': 0L, 'total_bytes_sec': 0L}, 'path': '/rhev/data-center/mnt/10.100.2.132:_volume2_ovirt__vms__1/07efa4fe-06bc-498e-8f42-035461aef900/images/564b3848-b6d5-4deb-910f-5b6f2fdbccc5/4f89ff25-2d3b-40b9-9bbc-9a6b6995346c', 'name': 'vdb'}, {'ioTune': {'write_bytes_sec': 0L, 'total_iops_sec': 0L, 'read_iops_sec': 0L, 'read_bytes_sec': 0L, 'write_iops_sec': 0L, 'total_bytes_sec': 0L}, 'path': '/rhev/data-center/mnt/10.100.2.132:_volume2_ovirt__vms__1/07efa4fe-06bc-498e-8f42-035461aef900/images/738e0704-8484-483b-ae67-091715496152/2f811423-6bab-4966-9c00-9d3b72429328', 'name': 'vdc'}]}}} from=::1,57576 (api:52)
2018-03-27 12:03:20,475+0200 INFO (jsonrpc/7) [jsonrpc.JsonRpcServer] RPC call Host.getAllVmIoTunePolicies succeeded in 0.00 seconds (__init__:573)
2018-03-27 12:03:21,292+0200 INFO (vmrecovery) [vdsm.api] START getConnectedStoragePoolsList(options=None) from=internal, task_id=a35602b2-7d5c-4e87-86cd-ede17c62488f (api:46)
2018-03-27 12:03:21,292+0200 INFO (vmrecovery) [vdsm.api] FINISH getConnectedStoragePoolsList return={'poollist': []} from=internal, task_id=a35602b2-7d5c-4e87-86cd-ede17c62488f (api:52)
2018-03-27 12:03:21,293+0200 INFO (vmrecovery) [vds] recovery: waiting for storage pool to go up (clientIF:674)

So I see no error there, only the recovery thread repeatedly waiting for a storage pool while getConnectedStoragePoolsList returns an empty pool list.

But in messages:

Mar 27 12:01:43 pfm-srv-virt-2 libvirtd: 2018-03-27 10:01:43.569+0000: 71793: error : qemuDomainAgentAvailable:6030 : Guest agent is not responding: QEMU guest agent is not connected

I have restarted the libvirtd and vdsmd services.

Is there something else to do?

Regards
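
For anyone checking a similar log, the "I see no error" observation can be verified mechanically rather than by eye. The following is only a minimal sketch: it assumes every entry has the shape seen in the excerpts above (timestamp, level, thread in parentheses, logger in brackets, message), and the names LINE_RE, WAITING, SAMPLE and summarize are hypothetical, not part of vdsm. It counts entries per level, collects anything that is not INFO or DEBUG, and counts the repeated "waiting for storage pool" messages.

```python
import re
from collections import Counter

# Assumed entry shape, taken from the excerpts in the message:
#   <timestamp> <LEVEL> (<thread>) [<logger>] <message>
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}[+-]\d{4})\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"\((?P<thread>[^)]+)\)\s+"
    r"\[(?P<logger>[^\]]+)\]\s+"
    r"(?P<msg>.*)$"
)

# Message text copied from the log excerpt above.
WAITING = "recovery: waiting for storage pool to go up"

# Three entries copied from the message, used as sample input.
SAMPLE = [
    "2018-03-27 12:03:11,281+0200 INFO (vmrecovery) [vds] "
    "recovery: waiting for storage pool to go up (clientIF:674)",
    "2018-03-27 12:03:16,286+0200 INFO (vmrecovery) [vdsm.api] "
    "FINISH getConnectedStoragePoolsList return={'poollist': []} "
    "from=internal, task_id=b90f550e-ee68-4a91-a7c6-3b60f11c3978 (api:52)",
    "2018-03-27 12:03:16,287+0200 INFO (vmrecovery) [vds] "
    "recovery: waiting for storage pool to go up (clientIF:674)",
]


def summarize(lines):
    """Return (entries per level, non-INFO/DEBUG entries, waiting count)."""
    levels = Counter()
    problems = []
    waiting = 0
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            # Lines that do not start a new entry (e.g. tracebacks) are skipped.
            continue
        levels[match.group("level")] += 1
        if match.group("level") not in ("INFO", "DEBUG"):
            problems.append(line.rstrip("\n"))
        if WAITING in match.group("msg"):
            waiting += 1
    return levels, problems, waiting


levels, problems, waiting = summarize(SAMPLE)
print(dict(levels), problems, waiting)
```

To run it against a real log, pass the lines of the file to summarize, for example summarize(open(path, errors="replace")), where path is wherever the vdsm log lives on the host. An empty problems list together with a growing waiting count matches the situation described in the message: nothing is logged as an error, but the storage pool never comes up.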
participants (1)
-
spfma.tech@e.mail.fr