Btw, even without the VDSM logs, this seems to be the same issue Jason
Brooks just reported here too. Hosted engine is trying to get storage
info from VDSM and gets an error instead.
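
To check this from the VDSM side, a rough sketch like the one below could count which storage domain UUIDs show up in "Storage domain does not exist" errors. The function name is made up for illustration, and the regular expression assumes the message format seen in the VDSM excerpt quoted further down in this thread; it is not part of any oVirt tooling.

```python
import re
from collections import Counter

# Matches e.g.: Storage domain does not exist: (u'1cc6cc89-...-c742d763e1cc',)
# The pattern is an assumption based on the log excerpt quoted in this thread.
MISSING_SD = re.compile(
    r"Storage domain does not exist:\s*\(u?'([0-9a-f-]{36})',\)"
)


def count_missing_domains(log_text):
    """Return a Counter mapping storage domain UUID -> number of error lines."""
    return Counter(MISSING_SD.findall(log_text))
```

Feeding it the contents of the host's VDSM log (commonly /var/log/vdsm/vdsm.log, which is an assumption about this setup) would show whether every failure points at the same domain.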
--
Martin Sivak
SLA / oVirt
On Thu, Dec 21, 2017 at 9:02 AM, Simone Tiraboschi <stirabos(a)redhat.com> wrote:
On Thu, Dec 21, 2017 at 5:13 AM, Andy <farkey_2000(a)yahoo.com> wrote:
>
> Hello all,
>
> I just upgraded my oVirt instance to 4.2. The engine upgrade completed
> successfully; however, after I upgraded the hosts, the HA broker will not
> start. The 2 hosts are running CentOS 7.4 with Gluster and CTDB. The
> VIPs are up and can be reached from both hosts, and I can mount the
> gluster storage.
>
> The error from the agent.log:
>
> MainThread::INFO::2017-12-20 21:02:19,219::agent::67::ovirt_hosted_engine_ha.agent.agent.Agent::(run) ovirt-hosted-engine-ha agent 2.2.2 started
> MainThread::INFO::2017-12-20 21:02:19,346::hosted_engine::243::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_hostname) Found certificate common name: hm3svr01.hm3.loc
> MainThread::INFO::2017-12-20 21:02:20,478::hosted_engine::525::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_broker) Initializing ha-broker connection
> MainThread::INFO::2017-12-20 21:02:20,482::brokerlink::77::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) Starting monitor ping, options {'addr': '192.168.3.1'}
> MainThread::ERROR::2017-12-20 21:02:20,483::hosted_engine::538::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_broker) Failed to start necessary monitors
> MainThread::ERROR::2017-12-20 21:02:20,485::agent::144::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Traceback (most recent call last):
>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 131, in _run_agent
>     return action(he)
>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 55, in action_proper
>     return he.start_monitoring()
>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 416, in start_monitoring
>     self._initialize_broker()
>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 535, in _initialize_broker
>     m.get('options', {}))
>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 83, in start_monitor
>     .format(type, options, e))
> RequestError: Failed to start monitor ping, options {'addr': '192.168.x.x'}: [Errno 2] No such file or directory
This simply means that the broker is not ready.
>
>
>
> The broker.log:
>
> MainThread::INFO::2017-12-20 23:06:19,405::monitor::50::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Finished loading submonitors
> MainThread::INFO::2017-12-20 23:06:20,324::storage_backends::346::ovirt_hosted_engine_ha.lib.storage_backends::(connect) Connecting the storage
> MainThread::INFO::2017-12-20 23:06:20,325::storage_server::252::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Connecting storage server
> MainThread::INFO::2017-12-20 23:06:20,849::storage_server::259::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Connecting storage server
> MainThread::WARNING::2017-12-20 23:06:20,913::storage_broker::96::ovirt_hosted_engine_ha.broker.storage_broker.StorageBroker::(__init__) Can't connect vdsm storage: Connection to storage server failed
> MainThread::INFO::2017-12-20 23:06:22,087::broker::45::ovirt_hosted_engine_ha.broker.broker.Broker::(run) ovirt-hosted-engine-ha broker 2.2.2 started
> MainThread::INFO::2017-12-20 23:06:22,088::monitor::40::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Searching for submonitors in /usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/submonitors
> MainThread::INFO::2017-12-20 23:06:22,089::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor cpu-load
> MainThread::INFO::2017-12-20 23:06:22,093::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor cpu-load-no-engine
> MainThread::INFO::2017-12-20 23:06:22,146::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor engine-health
> MainThread::INFO::2017-12-20 23:06:22,147::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mem-free
> MainThread::INFO::2017-12-20 23:06:22,147::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mem-load
> MainThread::INFO::2017-12-20 23:06:22,148::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mgmt-bridge
> MainThread::INFO::2017-12-20 23:06:22,149::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor ping
> MainThread::INFO::2017-12-20 23:06:22,149::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor storage-domain
> MainThread::INFO::2017-12-20 23:06:22,150::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor cpu-load
> MainThread::INFO::2017-12-20 23:06:22,151::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor cpu-load-no-engine
> MainThread::INFO::2017-12-20 23:06:22,152::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor engine-health
> MainThread::INFO::2017-12-20 23:06:22,153::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mem-free
> MainThread::INFO::2017-12-20 23:06:22,153::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mem-load
> MainThread::INFO::2017-12-20 23:06:22,154::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mgmt-bridge
> MainThread::INFO::2017-12-20 23:06:22,154::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor ping
> MainThread::INFO::2017-12-20 23:06:22,155::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor storage-domain
>
Could you please change, in /etc/ovirt-hosted-engine-ha/broker-log.conf,

  [logger_root]
  level=INFO

to

  [logger_root]
  level=DEBUG

then restart the broker service, wait a few minutes, and share its debug log?
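
If editing the file by hand is inconvenient, the change could also be scripted. The following is only a minimal sketch: the helper name set_root_log_level is made up here, and it assumes the file follows the usual Python logging config layout with a level= line under [logger_root], as shown above. It rewrites only that one line and leaves all other sections untouched.

```python
import re


def set_root_log_level(conf_text, level="DEBUG"):
    """Return conf_text with the level= line of [logger_root] set to `level`.

    Hypothetical helper for illustration; only the [logger_root] section
    is modified, every other line is passed through unchanged.
    """
    out = []
    in_root = False
    for line in conf_text.splitlines(keepends=True):
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            # A new section header: remember whether it is [logger_root].
            in_root = stripped == "[logger_root]"
        elif in_root and re.match(r"level\s*=", stripped):
            newline = "\n" if line.endswith("\n") else ""
            line = "level=" + level + newline
        out.append(line)
    return "".join(out)
```

One would read /etc/ovirt-hosted-engine-ha/broker-log.conf, pass its contents through this function, write the result back as root, and then restart the broker (on a typical install the unit is ovirt-ha-broker, but please verify the name on your hosts).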
>
>
> The VDSM log has a lot of JSON errors with the storage failing:
>
> 2017-12-20 23:13:00,311-0500 INFO (jsonrpc/6) [vdsm.api] FINISH getStorageDomainInfo
> error=Storage domain does not exist:
> (u'1cc6cc89-571e-4b6a-9d41-c742d763e1cc',) from=::1,54630,
> task_id=ff009157-48f3-480c-b8fe-b8d0a791c922 (api:50)
> 2017-12-20 23:13:00,312-0500 ERROR (jsonrpc/6) [storage.TaskManager.Task]
> (Task='ff009157-48f3-480c-b8fe-b8d0a791c922') Unexpected error (task:875)
> 2017-12-20 23:13:00,314-0500 ERROR (jsonrpc/6) [storage.Dispatcher] FINISH
> getStorageDomainInfo error=Storage domain does not exist:
> (u'1cc6cc89-571e-4b6a-9d41-c742d763e1cc',) (dispatcher:82)
> 2017-12-20 23:13:00,314-0500 INFO (jsonrpc/6) [jsonrpc.JsonRpcServer] RPC
> call StorageDomain.getInfo failed (error 358) in 0.48 seconds (__init__:573)
> raise convert_to_error(kind, result)
> 2017-12-20 23:13:03,092-0500 INFO (jsonrpc/3) [vdsm.api] FINISH
> getStorageDomainInfo error=Storage domain does not exist:
> (u'1cc6cc89-571e-4b6a-9d41-c742d763e1cc',) from=::1,54632,
> task_id=39e022e5-db99-4bc4-88e1-9a218104b3c7 (api:50)
> 2017-12-20 23:13:03,093-0500 ERROR (jsonrpc/3) [storage.TaskManager.Task]
> (Task='39e022e5-db99-4bc4-88e1-9a218104b3c7') Unexpected error (task:875)
> 2017-12-20 23:13:03,095-0500 ERROR (jsonrpc/3) [storage.Dispatcher] FINISH
> getStorageDomainInfo error=Storage domain does not exist:
> (u'1cc6cc89-571e-4b6a-9d41-c742d763e1cc',) (dispatcher:82)
> 2017-12-20 23:13:03,095-0500 INFO (jsonrpc/3) [jsonrpc.JsonRpcServer] RPC
> call StorageDomain.getInfo failed (error 358) in 0.49 seconds (__init__:573)
> raise convert_to_error(kind, result)
> 2017-12-20 23:13:07,568-0500 INFO (jsonrpc/4) [vdsm.api] FINISH
> getStorageDomainInfo error=Storage domain does not exist:
> (u'1cc6cc89-571e-4b6a-9d41-c742d763e1cc',) from=::1,54640,
> task_id=c1b1b1a1-a7e6-494a-bda6-19c617820dec (api:50)
> 2017-12-20 23:13:07,569-0500 ERROR (jsonrpc/4) [storage.TaskManager.Task]
> (Task='c1b1b1a1-a7e6-494a-bda6-19c617820dec') Unexpected error (task:875)
> 2017-12-20 23:13:07,571-0500 ERROR (jsonrpc/4) [storage.Dispatcher] FINISH
> getStorageDomainInfo error=Storage domain does not exist:
> (u'1cc6cc89-571e-4b6a-9d41-c742d763e1cc',) (dispatcher:82)
> 2017-12-20 23:13:07,571-0500 INFO (jsonrpc/4) [jsonrpc.JsonRpcServer] RPC
> call StorageDomain.getInfo failed (error 358) in 0.48 seconds (__init__:573)
> raise convert_to_error(kind, result)
> 2017-12-20 23:13:10,323-0500 INFO (jsonrpc/0) [vdsm.api] FINISH
> getStorageDomainInfo error=Storage domain does not exist:
> (u'1cc6cc89-571e-4b6a-9d41-c742d763e1cc',) from=::1,54642,
> task_id=6354fa3d-933c-4fd0-9301-00f8abd29ec7 (api:50)
> 2017-12-20 23:13:10,323-0500 ERROR (jsonrpc/0) [storage.TaskManager.Task]
> (Task='6354fa3d-933c-4fd0-9301-00f8abd29ec7') Unexpected error (task:875)
> 2017-12-20 23:13:10,325-0500 ERROR (jsonrpc/0) [storage.Dispatcher] FINISH
> getStorageDomainInfo error=Storage domain does not exist:
> (u'1cc6cc89-571e-4b6a-9d41-c742d763e1cc',) (dispatcher:82)
> 2017-12-20 23:13:10,326-0500 INFO (jsonrpc/0) [jsonrpc.JsonRpcServer] RPC
> call StorageDomain.getInfo failed (error 358) in 0.48 seconds (__init__:573)
>
> Any help is appreciated.
>
> thanks Andy
>
> _______________________________________________
> Users mailing list
> Users(a)ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users