
On Thu, 21 Jul 2016 14:43:50 -0400
Robert wrote:

RS> So after some debugging with Simone on irc, we've determined that the issue
RS> is the agent timing out trying to communicate with the broker. The problem
RS> is that we have no idea why.

So more detail attached.

The agent is sending:

MainThread::hosted_engine::436::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine
::(start_monitoring) Processing engine state <ovirt_hosted_engine_ha.agent.states.ReinitializeFSM object at 0x15d8c30>
MainThread::brokerlink::111::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink
::(notify) Trying: notify time=1469129518.85 type=state_transition detail=StartState-ReinitializeFSM hostname='poseidon.netsec'
MainThread::brokerlink::273::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink
::(_communicate) Sending request: notify time=1469129518.85 type=state_transition detail=StartState-ReinitializeFSM hostname='poseidon.netsec'

Which the broker sees:

Thread-1::util::69::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler
::(socket_readline) socket_readline in blocking mode
Thread-1::listener::163::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler
::(handle) Input: notify time=1469129518.85 type=state_transition detail=StartState-ReinitializeFSM hostname='poseidon.netsec'

It then refreshes the local config file:

Thread-1::config::251::ovirt_hosted_engine_ha.broker.notifications.Notifications.config
::(refresh_local_conf_file) Reading 'broker.conf' from '/rhev/data-center/mnt/ovirt-nfs.netsec:_ovirt_hosted-engine/2daba0ab-2b3d-4026-bcfc-1cd071c30038/images/a04a45b9-e780-4104-ad4b-d5901a5490c4/34a7

Which succeeds:

Thread-1::config::271::ovirt_hosted_engine_ha.broker.notifications.Notifications.config
::(refresh_local_conf_file) Writing to '/var/lib/ovirt-hosted-engine-ha/broker.conf'
Thread-1::config::278::ovirt_hosted_engine_ha.broker.notifications.Notifications.config
::(refresh_local_conf_file) local conf file was correctly written

And then ... nothing. It just hangs. Nothing more is logged by Thread-1.

Robert

-- 
Senior Software Engineer @ Parsons

Attachment: agent.log

***************************************************************************
MainThread::DEBUG::2016-07-21 15:31:58,847::hosted_engine::436::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Processing engine state <ovirt_hosted_engine_ha.agent.states.ReinitializeFSM object at 0x15d8c30>
MainThread::INFO::2016-07-21 15:31:58,847::brokerlink::111::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Trying: notify time=1469129518.85 type=state_transition detail=StartState-ReinitializeFSM hostname='poseidon.netsec'
MainThread::DEBUG::2016-07-21 15:31:58,847::brokerlink::273::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(_communicate) Sending request: notify time=1469129518.85 type=state_transition detail=StartState-ReinitializeFSM hostname='poseidon.netsec'
MainThread::DEBUG::2016-07-21 15:31:58,848::util::77::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(socket_readline) socket_readline with 30.0 seconds timeout
MainThread::DEBUG::2016-07-21 15:32:28,866::util::88::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(socket_readline) Connection timeout while reading from socket
MainThread::ERROR::2016-07-21 15:32:28,867::brokerlink::279::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(_communicate) Connection closed: Connection timed out
MainThread::DEBUG::2016-07-21 15:32:28,867::brokerlink::86::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(disconnect) Closing connection to ha-broker
MainThread::ERROR::2016-07-21 15:32:28,867::agent::205::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Error: 'Failed to start monitor state_transition, options {'hostname': 'poseidon.netsec'}: Connection timed out' - trying to restart agent
***************************************************************************

Attachment: broker.log

***************************************************************************
Thread-1::util::69::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler
::(socket_readline) socket_readline in blocking mode
Thread-1::listener::163::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler
::(handle) Input: notify time=1469129518.85 type=state_transition detail=StartState-ReinitializeFSM hostname='poseidon.netsec'
Thread-1::listener::238::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler
::(_dispatch) Request type notify from 139793244509952
Thread-1::notifications::46::ovirt_hosted_engine_ha.broker.notifications.Notifications
::(notify) nofity: {'hostname': 'poseidon.netsec', 'type': 'state_transition', 'detail': 'StartState-ReinitializeFSM', 'time': '1469129518.85'}
Thread-1::config::251::ovirt_hosted_engine_ha.broker.notifications.Notifications.config
::(refresh_local_conf_file) Reading 'broker.conf' from '/rhev/data-center/mnt/ovirt-nfs.netsec:_ovirt_hosted-engine/2daba0ab-2b3d-4026-bcfc-1cd071c30038/images/a04a45b9-e780-4104-ad4b-d5901a5490c4/34a7c70e-d6ca-482f-b414-d458f7f5f9de'
Thread-1::heconflib::69::ovirt_hosted_engine_ha.broker.notifications.Notifications.config
::(_dd_pipe_tar) executing: 'sudo -u vdsm dd if=/rhev/data-center/mnt/ovirt-nfs.netsec:_ovirt_hosted-engine/2daba0ab-2b3d-4026-bcfc-1cd071c30038/images/a04a45b9-e780-4104-ad4b-d5901a5490c4/34a7c70e-d6ca-482f-b414-d458f7f5f9de bs=4k'
Thread-1::heconflib::70::ovirt_hosted_engine_ha.broker.notifications.Notifications.config
::(_dd_pipe_tar) executing: 'tar -tvf -'
Thread-1::heconflib::88::ovirt_hosted_engine_ha.broker.notifications.Notifications.config
::(_dd_pipe_tar) stdout: -rw-r--r-- 0/0 7 1969-12-31 19:00 version
-rw-r--r-- 0/0 2572 1969-12-31 19:00 fhanswers.conf
-rw-r--r-- 0/0 861 1969-12-31 19:00 hosted-engine.conf
-rw-r--r-- 0/0 182 1969-12-31 19:00 broker.conf
-rw-r--r-- 0/0 1315 1969-12-31 19:00 vm.conf
Thread-1::heconflib::89::ovirt_hosted_engine_ha.broker.notifications.Notifications.config
::(_dd_pipe_tar) stderr:
Thread-1::heconflib::138::ovirt_hosted_engine_ha.broker.notifications.Notifications.config
::(extractConfFile) extracting 'broker.conf' from '/rhev/data-center/mnt/ovirt-nfs.netsec:_ovirt_hosted-engine/2daba0ab-2b3d-4026-bcfc-1cd071c30038/images/a04a45b9-e780-4104-ad4b-d5901a5490c4/34a7c70e-d6ca-482f-b414-d458f7f5f9de'
Thread-1::heconflib::69::ovirt_hosted_engine_ha.broker.notifications.Notifications.config
::(_dd_pipe_tar) executing: 'sudo -u vdsm dd if=/rhev/data-center/mnt/ovirt-nfs.netsec:_ovirt_hosted-engine/2daba0ab-2b3d-4026-bcfc-1cd071c30038/images/a04a45b9-e780-4104-ad4b-d5901a5490c4/34a7c70e-d6ca-482f-b414-d458f7f5f9de bs=4k'
Thread-1::heconflib::70::ovirt_hosted_engine_ha.broker.notifications.Notifications.config
::(_dd_pipe_tar) executing: 'tar -xOf - broker.conf'
Thread-1::heconflib::88::ovirt_hosted_engine_ha.broker.notifications.Notifications.config
::(_dd_pipe_tar) stdout: [email]
smtp-server = localhost
smtp-port = 25
destination-emails = root@localhost
source-email = root@localhost
[notify]
state_transition = maintenance|start|stop|migrate|up|down
Thread-1::heconflib::89::ovirt_hosted_engine_ha.broker.notifications.Notifications.config
::(_dd_pipe_tar) stderr:
Thread-1::config::271::ovirt_hosted_engine_ha.broker.notifications.Notifications.config
::(refresh_local_conf_file) Writing to '/var/lib/ovirt-hosted-engine-ha/broker.conf'
Thread-1::config::278::ovirt_hosted_engine_ha.broker.notifications.Notifications.config
::(refresh_local_conf_file) local conf file was correctly written
***************************************************************************
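
For anyone following along, the pattern in the two logs (the agent sends a request, the broker reads it but never writes a reply, and the agent gives up when its read times out) can be reproduced in miniature. This is only a sketch under stated assumptions, not the ovirt-hosted-engine-ha code: `readline_with_timeout` and `demo` are hypothetical names, the request string is shortened, and a `socket.socketpair()` stands in for the real agent/broker connection.

```python
import socket


def readline_with_timeout(sock, timeout):
    """Read one newline-terminated line from sock.

    Hypothetical helper for illustration only. The timeout applies to each
    recv() call, so socket.timeout is raised when no byte arrives in time.
    """
    sock.settimeout(timeout)
    buf = bytearray()
    while not buf.endswith(b"\n"):
        chunk = sock.recv(1)
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf += chunk
    return buf[:-1].decode()


def demo(reply, timeout=0.2):
    """Send a notify-style request; the 'broker' end answers only if reply is given."""
    agent, broker = socket.socketpair()
    try:
        agent.sendall(b"notify type=state_transition\n")
        # The broker end receives the request, as in broker.log.
        request = readline_with_timeout(broker, timeout)
        if reply is not None:
            broker.sendall(reply.encode() + b"\n")
        try:
            return request, readline_with_timeout(agent, timeout)
        except socket.timeout:
            # No reply was ever written, so the agent end times out.
            return request, "timed out"
    finally:
        agent.close()
        broker.close()


if __name__ == "__main__":
    print(demo("success"))  # broker replies in time
    print(demo(None))       # broker reads the request but never replies
```

The second call mirrors what the logs show: the request arrives intact, so the transport itself is fine, and the timeout on the agent side only says that the handler never got as far as writing a response.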