HA Broker fails after 4.2 upgrade

Hello all,

I just upgraded my OVIRT instance to 4.2, the engine completed successfully, however after I upgraded the hosts the HA Broker will not start. The 2 hosts are running CentOS 7.4, running gluster and CTDB. The VIPS are up and can be reached from both hosts as well as I can mount the gluster storage.

The error from the agent.log:

MainThread::INFO::2017-12-20 21:02:19,219::agent::67::ovirt_hosted_engine_ha.agent.agent.Agent::(run) ovirt-hosted-engine-ha agent 2.2.2 started
MainThread::INFO::2017-12-20 21:02:19,346::hosted_engine::243::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_hostname) Found certificate common name: hm3svr01.hm3.loc
MainThread::INFO::2017-12-20 21:02:20,478::hosted_engine::525::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_broker) Initializing ha-broker connection
MainThread::INFO::2017-12-20 21:02:20,482::brokerlink::77::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) Starting monitor ping, options {'addr': '192.168.3.1'}
MainThread::ERROR::2017-12-20 21:02:20,483::hosted_engine::538::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_broker) Failed to start necessary monitors
MainThread::ERROR::2017-12-20 21:02:20,485::agent::144::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 131, in _run_agent
    return action(he)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 55, in action_proper
    return he.start_monitoring()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 416, in start_monitoring
    self._initialize_broker()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 535, in _initialize_broker
    m.get('options', {}))
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 83, in start_monitor
    .format(type, options, e))
RequestError: Failed to start monitor ping, options {'addr': '192.168.x.x'}: [Errno 2] No such file or directory

The broker.log:

MainThread::INFO::2017-12-20 23:06:19,405::monitor::50::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Finished loading submonitors
MainThread::INFO::2017-12-20 23:06:20,324::storage_backends::346::ovirt_hosted_engine_ha.lib.storage_backends::(connect) Connecting the storage
MainThread::INFO::2017-12-20 23:06:20,325::storage_server::252::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Connecting storage server
MainThread::INFO::2017-12-20 23:06:20,849::storage_server::259::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Connecting storage server
MainThread::WARNING::2017-12-20 23:06:20,913::storage_broker::96::ovirt_hosted_engine_ha.broker.storage_broker.StorageBroker::(__init__) Can't connect vdsm storage: Connection to storage server failed
MainThread::INFO::2017-12-20 23:06:22,087::broker::45::ovirt_hosted_engine_ha.broker.broker.Broker::(run) ovirt-hosted-engine-ha broker 2.2.2 started
MainThread::INFO::2017-12-20 23:06:22,088::monitor::40::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Searching for submonitors in /usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/submonitors
MainThread::INFO::2017-12-20 23:06:22,089::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor cpu-load
MainThread::INFO::2017-12-20 23:06:22,093::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor cpu-load-no-engine
MainThread::INFO::2017-12-20 23:06:22,146::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor engine-health
MainThread::INFO::2017-12-20 23:06:22,147::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mem-free
MainThread::INFO::2017-12-20 23:06:22,147::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mem-load
MainThread::INFO::2017-12-20 23:06:22,148::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mgmt-bridge
MainThread::INFO::2017-12-20 23:06:22,149::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor ping
MainThread::INFO::2017-12-20 23:06:22,149::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor storage-domain
MainThread::INFO::2017-12-20 23:06:22,150::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor cpu-load
MainThread::INFO::2017-12-20 23:06:22,151::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor cpu-load-no-engine
MainThread::INFO::2017-12-20 23:06:22,152::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor engine-health
MainThread::INFO::2017-12-20 23:06:22,153::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mem-free
MainThread::INFO::2017-12-20 23:06:22,153::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mem-load
MainThread::INFO::2017-12-20 23:06:22,154::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mgmt-bridge
MainThread::INFO::2017-12-20 23:06:22,154::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor ping
MainThread::INFO::2017-12-20 23:06:22,155::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor storage-domain

The VDSM log has a lot of JSON errors with the storage failing:

2017-12-20 23:13:00,311-0500 INFO  (jsonrpc/6) [vdsm.api] FINISH getStorageDomainInfo error=Storage domain does not exist: (u'1cc6cc89-571e-4b6a-9d41-c742d763e1cc',) from=::1,54630, task_id=ff009157-48f3-480c-b8fe-b8d0a791c922 (api:50)
2017-12-20 23:13:00,312-0500 ERROR (jsonrpc/6) [storage.TaskManager.Task] (Task='ff009157-48f3-480c-b8fe-b8d0a791c922') Unexpected error (task:875)
2017-12-20 23:13:00,314-0500 ERROR (jsonrpc/6) [storage.Dispatcher] FINISH getStorageDomainInfo error=Storage domain does not exist: (u'1cc6cc89-571e-4b6a-9d41-c742d763e1cc',) (dispatcher:82)
2017-12-20 23:13:00,314-0500 INFO  (jsonrpc/6) [jsonrpc.JsonRpcServer] RPC call StorageDomain.getInfo failed (error 358) in 0.48 seconds (__init__:573)
    raise convert_to_error(kind, result)
2017-12-20 23:13:03,092-0500 INFO  (jsonrpc/3) [vdsm.api] FINISH getStorageDomainInfo error=Storage domain does not exist: (u'1cc6cc89-571e-4b6a-9d41-c742d763e1cc',) from=::1,54632, task_id=39e022e5-db99-4bc4-88e1-9a218104b3c7 (api:50)
2017-12-20 23:13:03,093-0500 ERROR (jsonrpc/3) [storage.TaskManager.Task] (Task='39e022e5-db99-4bc4-88e1-9a218104b3c7') Unexpected error (task:875)
2017-12-20 23:13:03,095-0500 ERROR (jsonrpc/3) [storage.Dispatcher] FINISH getStorageDomainInfo error=Storage domain does not exist: (u'1cc6cc89-571e-4b6a-9d41-c742d763e1cc',) (dispatcher:82)
2017-12-20 23:13:03,095-0500 INFO  (jsonrpc/3) [jsonrpc.JsonRpcServer] RPC call StorageDomain.getInfo failed (error 358) in 0.49 seconds (__init__:573)
    raise convert_to_error(kind, result)
2017-12-20 23:13:07,568-0500 INFO  (jsonrpc/4) [vdsm.api] FINISH getStorageDomainInfo error=Storage domain does not exist: (u'1cc6cc89-571e-4b6a-9d41-c742d763e1cc',) from=::1,54640, task_id=c1b1b1a1-a7e6-494a-bda6-19c617820dec (api:50)
2017-12-20 23:13:07,569-0500 ERROR (jsonrpc/4) [storage.TaskManager.Task] (Task='c1b1b1a1-a7e6-494a-bda6-19c617820dec') Unexpected error (task:875)
2017-12-20 23:13:07,571-0500 ERROR (jsonrpc/4) [storage.Dispatcher] FINISH getStorageDomainInfo error=Storage domain does not exist: (u'1cc6cc89-571e-4b6a-9d41-c742d763e1cc',) (dispatcher:82)
2017-12-20 23:13:07,571-0500 INFO  (jsonrpc/4) [jsonrpc.JsonRpcServer] RPC call StorageDomain.getInfo failed (error 358) in 0.48 seconds (__init__:573)
    raise convert_to_error(kind, result)
2017-12-20 23:13:10,323-0500 INFO  (jsonrpc/0) [vdsm.api] FINISH getStorageDomainInfo error=Storage domain does not exist: (u'1cc6cc89-571e-4b6a-9d41-c742d763e1cc',) from=::1,54642, task_id=6354fa3d-933c-4fd0-9301-00f8abd29ec7 (api:50)
2017-12-20 23:13:10,323-0500 ERROR (jsonrpc/0) [storage.TaskManager.Task] (Task='6354fa3d-933c-4fd0-9301-00f8abd29ec7') Unexpected error (task:875)
2017-12-20 23:13:10,325-0500 ERROR (jsonrpc/0) [storage.Dispatcher] FINISH getStorageDomainInfo error=Storage domain does not exist: (u'1cc6cc89-571e-4b6a-9d41-c742d763e1cc',) (dispatcher:82)
2017-12-20 23:13:10,326-0500 INFO  (jsonrpc/0) [jsonrpc.JsonRpcServer] RPC call StorageDomain.getInfo failed (error 358) in 0.48 seconds (__init__:573)

Any help is appreciated.

Thanks,
Andy

Adding relevant developers. Andy, would you mind opening a bug at https://bugzilla.redhat.com/enter_bug.cgi?product=ovirt-hosted-engine-ha to track this?
--
SANDRO BONAZZOLA
ASSOCIATE MANAGER, SOFTWARE ENGINEERING, EMEA ENG VIRTUALIZATION R&D
Red Hat EMEA <https://www.redhat.com/>
<https://red.ht/sig>
TRIED. TESTED. TRUSTED. <https://redhat.com/trusted>

On Thu, Dec 21, 2017 at 5:13 AM, Andy <farkey_2000@yahoo.com> wrote:
Hello all,
I just upgraded my OVIRT instance to 4.2, the engine completed successfully, however after I upgraded the hosts the HA Broker will not start. The 2 hosts are running CentOS 7.4, running gluster and CTDB. The VIPS are up and can be reached from both hosts as well as I can mount the gluster storage.
The error from the agent.log:
MainThread::INFO::2017-12-20 21:02:19,219::agent::67::ovirt_hosted_engine_ha.agent.agent.Agent::(run) ovirt-hosted-engine-ha agent 2.2.2 started
MainThread::INFO::2017-12-20 21:02:19,346::hosted_engine::243::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_hostname) Found certificate common name: hm3svr01.hm3.loc
MainThread::INFO::2017-12-20 21:02:20,478::hosted_engine::525::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_broker) Initializing ha-broker connection
MainThread::INFO::2017-12-20 21:02:20,482::brokerlink::77::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) Starting monitor ping, options {'addr': '192.168.3.1'}
MainThread::ERROR::2017-12-20 21:02:20,483::hosted_engine::538::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_broker) Failed to start necessary monitors
MainThread::ERROR::2017-12-20 21:02:20,485::agent::144::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 131, in _run_agent
    return action(he)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 55, in action_proper
    return he.start_monitoring()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 416, in start_monitoring
    self._initialize_broker()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 535, in _initialize_broker
    m.get('options', {}))
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 83, in start_monitor
    .format(type, options, e))
RequestError: Failed to start monitor ping, options {'addr': '192.168.x.x'}: [Errno 2] No such file or directory
This simply means that the broker is not ready.
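To illustrate why a broker that is not ready can surface as "[Errno 2] No such file or directory", here is a minimal Python 3 sketch. It assumes the agent reaches the broker over a Unix domain socket, which is not stated in this thread; the path and the helper name `try_connect` are hypothetical and used only for the demonstration.

```python
import errno
import os
import socket
import tempfile


def try_connect(socket_path):
    """Try to connect to a Unix domain socket; return 0 or the errno on failure."""
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        client.connect(socket_path)
        return 0
    except OSError as exc:
        return exc.errno
    finally:
        client.close()


# Hypothetical path: no server has created this socket file, so it does not exist.
missing_path = os.path.join(tempfile.mkdtemp(), "broker.socket")
result = try_connect(missing_path)

# Show the symbolic errno name and its human-readable message.
print(errno.errorcode.get(result), os.strerror(result))
```

If that assumption holds, the agent error would be a symptom of the broker not listening yet rather than a problem with the ping target itself.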
The broker.log:
MainThread::INFO::2017-12-20 23:06:19,405::monitor::50::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Finished loading submonitors
MainThread::INFO::2017-12-20 23:06:20,324::storage_backends::346::ovirt_hosted_engine_ha.lib.storage_backends::(connect) Connecting the storage
MainThread::INFO::2017-12-20 23:06:20,325::storage_server::252::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Connecting storage server
MainThread::INFO::2017-12-20 23:06:20,849::storage_server::259::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Connecting storage server
MainThread::WARNING::2017-12-20 23:06:20,913::storage_broker::96::ovirt_hosted_engine_ha.broker.storage_broker.StorageBroker::(__init__) Can't connect vdsm storage: Connection to storage server failed
MainThread::INFO::2017-12-20 23:06:22,087::broker::45::ovirt_hosted_engine_ha.broker.broker.Broker::(run) ovirt-hosted-engine-ha broker 2.2.2 started
MainThread::INFO::2017-12-20 23:06:22,088::monitor::40::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Searching for submonitors in /usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/submonitors
MainThread::INFO::2017-12-20 23:06:22,089::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor cpu-load
MainThread::INFO::2017-12-20 23:06:22,093::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor cpu-load-no-engine
MainThread::INFO::2017-12-20 23:06:22,146::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor engine-health
MainThread::INFO::2017-12-20 23:06:22,147::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mem-free
MainThread::INFO::2017-12-20 23:06:22,147::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mem-load
MainThread::INFO::2017-12-20 23:06:22,148::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mgmt-bridge
MainThread::INFO::2017-12-20 23:06:22,149::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor ping
MainThread::INFO::2017-12-20 23:06:22,149::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor storage-domain
MainThread::INFO::2017-12-20 23:06:22,150::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor cpu-load
MainThread::INFO::2017-12-20 23:06:22,151::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor cpu-load-no-engine
MainThread::INFO::2017-12-20 23:06:22,152::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor engine-health
MainThread::INFO::2017-12-20 23:06:22,153::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mem-free
MainThread::INFO::2017-12-20 23:06:22,153::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mem-load
MainThread::INFO::2017-12-20 23:06:22,154::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mgmt-bridge
MainThread::INFO::2017-12-20 23:06:22,154::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor ping
MainThread::INFO::2017-12-20 23:06:22,155::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor storage-domain
Could you please change, in /etc/ovirt-hosted-engine-ha/broker-log.conf, from
[logger_root]
level=INFO
to
[logger_root]
level=DEBUG
then restart the broker service, wait a few minutes and then share its debug log?
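For anyone who prefers to script the change requested above, here is a minimal sketch in Python. It only rewrites the level= line inside the [logger_root] section of a text it is given; the helper name set_root_log_level and the sample file content are hypothetical, and the real broker-log.conf may contain other sections and keys.

```python
import re


def set_root_log_level(conf_text, level="DEBUG"):
    """Return conf_text with the level= line of [logger_root] set to `level`.

    Lines in every other section are left untouched.
    """
    out = []
    in_root = False
    for line in conf_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            # A new section header: remember whether it is [logger_root].
            in_root = stripped == "[logger_root]"
        elif in_root and re.match(r"level\s*=", stripped):
            line = "level=" + level
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical sample content, NOT copied from a real broker-log.conf.
sample = (
    "[loggers]\n"
    "keys=root\n"
    "\n"
    "[logger_root]\n"
    "level=INFO\n"
    "handlers=logfile\n"
    "\n"
    "[handler_logfile]\n"
    "level=INFO\n"
)

updated = set_root_log_level(sample)
print(updated)
```

To apply it for real, one would read /etc/ovirt-hosted-engine-ha/broker-log.conf, pass its content through the function, write the result back, and then restart the broker service (assumed here to be the ovirt-ha-broker systemd unit). Keeping a backup copy of the file first is a sensible precaution.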
The VDSM log has a lot of JSON errors with the storage failing:
2017-12-20 23:13:00,311-0500 INFO (jsonrpc/6) [vdsm.api] FINISH getStorageDomainInfo error=Storage domain does not exist: (u'1cc6cc89-571e-4b6a-9d41-c742d763e1cc',) from=::1,54630, task_id=ff009157-48f3-480c-b8fe-b8d0a791c922 (api:50)
2017-12-20 23:13:00,312-0500 ERROR (jsonrpc/6) [storage.TaskManager.Task] (Task='ff009157-48f3-480c-b8fe-b8d0a791c922') Unexpected error (task:875)
2017-12-20 23:13:00,314-0500 ERROR (jsonrpc/6) [storage.Dispatcher] FINISH getStorageDomainInfo error=Storage domain does not exist: (u'1cc6cc89-571e-4b6a-9d41-c742d763e1cc',) (dispatcher:82)
2017-12-20 23:13:00,314-0500 INFO (jsonrpc/6) [jsonrpc.JsonRpcServer] RPC call StorageDomain.getInfo failed (error 358) in 0.48 seconds (__init__:573)
    raise convert_to_error(kind, result)
2017-12-20 23:13:03,092-0500 INFO (jsonrpc/3) [vdsm.api] FINISH getStorageDomainInfo error=Storage domain does not exist: (u'1cc6cc89-571e-4b6a-9d41-c742d763e1cc',) from=::1,54632, task_id=39e022e5-db99-4bc4-88e1-9a218104b3c7 (api:50)
2017-12-20 23:13:03,093-0500 ERROR (jsonrpc/3) [storage.TaskManager.Task] (Task='39e022e5-db99-4bc4-88e1-9a218104b3c7') Unexpected error (task:875)
2017-12-20 23:13:03,095-0500 ERROR (jsonrpc/3) [storage.Dispatcher] FINISH getStorageDomainInfo error=Storage domain does not exist: (u'1cc6cc89-571e-4b6a-9d41-c742d763e1cc',) (dispatcher:82)
2017-12-20 23:13:03,095-0500 INFO (jsonrpc/3) [jsonrpc.JsonRpcServer] RPC call StorageDomain.getInfo failed (error 358) in 0.49 seconds (__init__:573)
    raise convert_to_error(kind, result)
2017-12-20 23:13:07,568-0500 INFO (jsonrpc/4) [vdsm.api] FINISH getStorageDomainInfo error=Storage domain does not exist: (u'1cc6cc89-571e-4b6a-9d41-c742d763e1cc',) from=::1,54640, task_id=c1b1b1a1-a7e6-494a-bda6-19c617820dec (api:50)
2017-12-20 23:13:07,569-0500 ERROR (jsonrpc/4) [storage.TaskManager.Task] (Task='c1b1b1a1-a7e6-494a-bda6-19c617820dec') Unexpected error (task:875)
2017-12-20 23:13:07,571-0500 ERROR (jsonrpc/4) [storage.Dispatcher] FINISH getStorageDomainInfo error=Storage domain does not exist: (u'1cc6cc89-571e-4b6a-9d41-c742d763e1cc',) (dispatcher:82)
2017-12-20 23:13:07,571-0500 INFO (jsonrpc/4) [jsonrpc.JsonRpcServer] RPC call StorageDomain.getInfo failed (error 358) in 0.48 seconds (__init__:573)
    raise convert_to_error(kind, result)
2017-12-20 23:13:10,323-0500 INFO (jsonrpc/0) [vdsm.api] FINISH getStorageDomainInfo error=Storage domain does not exist: (u'1cc6cc89-571e-4b6a-9d41-c742d763e1cc',) from=::1,54642, task_id=6354fa3d-933c-4fd0-9301-00f8abd29ec7 (api:50)
2017-12-20 23:13:10,323-0500 ERROR (jsonrpc/0) [storage.TaskManager.Task] (Task='6354fa3d-933c-4fd0-9301-00f8abd29ec7') Unexpected error (task:875)
2017-12-20 23:13:10,325-0500 ERROR (jsonrpc/0) [storage.Dispatcher] FINISH getStorageDomainInfo error=Storage domain does not exist: (u'1cc6cc89-571e-4b6a-9d41-c742d763e1cc',) (dispatcher:82)
2017-12-20 23:13:10,326-0500 INFO (jsonrpc/0) [jsonrpc.JsonRpcServer] RPC call StorageDomain.getInfo failed (error 358) in 0.48 seconds (__init__:573)
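Since the VDSM excerpt repeats the same failure every few seconds, it can help to condense it before posting. The sketch below is only an illustration: it counts the storage.Dispatcher "Storage domain does not exist" entries per call and domain UUID. The function name count_missing_domains and the regular expression are assumptions based solely on the line format shown in this thread, not on any VDSM tooling.

```python
import re
from collections import Counter

# Matches the storage.Dispatcher failure lines as they appear in this thread.
# The pattern is an assumption derived from those lines only.
MISSING_DOMAIN = re.compile(
    r"\[storage\.Dispatcher\] FINISH (\w+) "
    r"error=Storage domain does not exist: \(u?'([0-9a-f-]{36})',\)"
)


def count_missing_domains(log_text):
    """Count 'Storage domain does not exist' failures per (call, domain UUID)."""
    counts = Counter()
    for call, uuid in MISSING_DOMAIN.findall(log_text):
        counts[(call, uuid)] += 1
    return counts


# Two entries copied from the excerpt above.
sample = (
    "2017-12-20 23:13:00,314-0500 ERROR (jsonrpc/6) [storage.Dispatcher] "
    "FINISH getStorageDomainInfo error=Storage domain does not exist: "
    "(u'1cc6cc89-571e-4b6a-9d41-c742d763e1cc',) (dispatcher:82)\n"
    "2017-12-20 23:13:03,095-0500 ERROR (jsonrpc/3) [storage.Dispatcher] "
    "FINISH getStorageDomainInfo error=Storage domain does not exist: "
    "(u'1cc6cc89-571e-4b6a-9d41-c742d763e1cc',) (dispatcher:82)\n"
)

counts = count_missing_domains(sample)
for (call, uuid), n in counts.items():
    print("%s failed %d times for storage domain %s" % (call, n, uuid))
```

In the excerpt above every failure refers to the same storage domain UUID, which is the kind of detail such a summary makes obvious at a glance.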
Any help is appreciated.
thanks Andy
_______________________________________________ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users

Btw, lacking VDSM logs here, this seems to be the same issue Jason Brooks just reported here too. Hosted engine is trying to get storage info from VDSM and gets an error instead.
--
Martin Sivak
SLA / oVirt

On Thu, Dec 21, 2017 at 9:02 AM, Simone Tiraboschi <stirabos@redhat.com> wrote:
On Thu, Dec 21, 2017 at 5:13 AM, Andy <farkey_2000@yahoo.com> wrote:
Hello all,
I just upgraded my oVirt instance to 4.2. The engine upgrade completed successfully; however, after I upgraded the hosts, the HA broker will not start. The 2 hosts are running CentOS 7.4 with Gluster and CTDB. The VIPs are up and can be reached from both hosts, and I can mount the Gluster storage.
The error from the agent.log:
MainThread::INFO::2017-12-20 21:02:19,219::agent::67::ovirt_hosted_engine_ha.agent.agent.Agent::(run) ovirt-hosted-engine-ha agent 2.2.2 started
MainThread::INFO::2017-12-20 21:02:19,346::hosted_engine::243::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_hostname) Found certificate common name: hm3svr01.hm3.loc
MainThread::INFO::2017-12-20 21:02:20,478::hosted_engine::525::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_broker) Initializing ha-broker connection
MainThread::INFO::2017-12-20 21:02:20,482::brokerlink::77::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) Starting monitor ping, options {'addr': '192.168.3.1'}
MainThread::ERROR::2017-12-20 21:02:20,483::hosted_engine::538::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_broker) Failed to start necessary monitors
MainThread::ERROR::2017-12-20 21:02:20,485::agent::144::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 131, in _run_agent
    return action(he)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 55, in action_proper
    return he.start_monitoring()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 416, in start_monitoring
    self._initialize_broker()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 535, in _initialize_broker
    m.get('options', {}))
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 83, in start_monitor
    .format(type, options, e))
RequestError: Failed to start monitor ping, options {'addr': '192.168.x.x'}: [Errno 2] No such file or directory
This simply means that the broker is not ready.

Hello! I have a test system with one physical host and the hosted engine running on it. Storage is Gluster, but the hosted engine mounts it as NFS. After the upgrade, Gluster no longer activates NFS. The command "gluster volume set engine nfs.disable off" doesn't help. How can I re-enable NFS? Or better, how can I migrate the self-hosted engine to native GlusterFS?

Yep, will do momentarily.
Thanks,
Andy
On Dec 21, 2017, at 4:14 AM, Stefano Danzi <s.danzi@hawai.it> wrote:
Hello! I have a test system with one physical host and the hosted engine running on it. Storage is Gluster, but the hosted engine mounts it as NFS.
After the upgrade, Gluster no longer activates NFS. The command "gluster volume set engine nfs.disable off" doesn't help.
How can I re-enable NFS? Or better, how can I migrate the self-hosted engine to native GlusterFS?
participants (6)
- Andy
- Andy Kress
- Martin Sivak
- Sandro Bonazzola
- Simone Tiraboschi
- Stefano Danzi