[ovirt-users] oVIRT 4.0.1 Hosted Engine Agent stops and can't be started anymore

Sandro Bonazzola sbonazzo at redhat.com
Mon Jul 25 08:56:17 UTC 2016


On Sun, Jul 24, 2016 at 12:19 AM, Matt . <yamakasi.014 at gmail.com> wrote:

> Hi Guys,
>
> I suddenly have an issue where my HA Agent stops running and can't be
> restarted after the following error.
>
> I'm able to start the HE manually on the command line on each host.
>
> Exception: Failed to start monitoring domain
> (sd_uuid=4093ad17-bef5-4e4b-9a16-259a98e20321, host_id=1):
> timeout during domain acquisition
> MainThread::WARNING::2016-07-22 13:20:05,059::hosted_engine::477::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::
> (start_monitoring) Error while monitoring engine: Failed to start
> monitoring domain
> (sd_uuid=4093ad17-bef5-4e4b-9a16-259a98e20321, host_id=1): timeout
> during domain acquisition
> MainThread::WARNING::2016-07-22 13:20:05,059::hosted_engine::480::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::
> (start_monitoring) Unexpected error
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 451, in start_monitoring
>     self._initialize_domain_monitor()
>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 831, in _initialize_domain_monitor
>     raise Exception(msg)
> Exception: Failed to start monitoring domain
> (sd_uuid=4093ad17-bef5-4e4b-9a16-259a98e20321, host_id=1):
> timeout during domain acquisition
> MainThread::ERROR::2016-07-22 13:20:05,060::hosted_engine::493::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::
> (start_monitoring) Shutting down the agent because of 3 failures in a row!
> MainThread::INFO::2016-07-22 13:20:07,096::hosted_engine::860::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::
> (_get_domain_monitor_status) VDSM domain monitor status: PENDING
> MainThread::INFO::2016-07-22 13:20:07,122::hosted_engine::786::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::
> (_stop_domain_monitor) Failed to stop monitoring domain
> (sd_uuid=4093ad17-bef5-4e4b-9a16-259a98e20321):
> Error 900 from stopMonitoringDomain: Storage domain is member of pool:
> 'domain=4093ad17-bef5-4e4b-9a16-259a98e20321'
> MainThread::INFO::2016-07-22
> 13:20:07,129::agent::143::ovirt_hosted_engine_ha.agent.agent.Agent::(run)
> Agent shutting down
>
> This part concerns me actually:
>
> Error 900 from stopMonitoringDomain: Storage domain is member of pool:
> 'domain=4093ad17-bef5-4e4b-9a16-259a98e20321'
>
>
Can you please provide a full sos report from the failing host?
Was this host upgraded from a previous version or freshly installed?
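
While the report is being collected, the WARNING and ERROR entries can be
pulled out of the agent log with a short script. This is only a sketch: the
entry layout in ENTRY_RE is inferred from the excerpt quoted above, the names
parse_entries and problems are made up for illustration, and SAMPLE is copied
from that excerpt with the wrapped lines joined.

```python
import re

# Assumed layout of one agent log entry, inferred from the quoted excerpt:
#   Thread::LEVEL::YYYY-MM-DD HH:MM:SS,mmm::module::lineno::logger::(func) message
ENTRY_RE = re.compile(
    r"^(?P<thread>\w+)::(?P<level>[A-Z]+)::"
    r"(?P<date>\d{4}-\d{2}-\d{2})\s+(?P<time>\d{2}:\d{2}:\d{2},\d{3})::"
    r"(?P<module>\w+)::(?P<lineno>\d+)::(?P<logger>[\w.]+)::"
    r"\((?P<func>\w+)\)\s*(?P<message>.*)$"
)

# Two entries taken from the excerpt above (line wrapping removed).
SAMPLE = (
    "MainThread::ERROR::2016-07-22 13:20:05,060::hosted_engine::493::"
    "ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::"
    "(start_monitoring) Shutting down the agent because of 3 failures in a row!\n"
    "MainThread::INFO::2016-07-22 13:20:07,096::hosted_engine::860::"
    "ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::"
    "(_get_domain_monitor_status) VDSM domain monitor status: PENDING\n"
)


def parse_entries(text):
    """Split log text into entries (hypothetical helper).

    Lines that do not start a new entry, such as traceback lines, are
    appended to the message of the entry before them.
    """
    entries = []
    for line in text.splitlines():
        match = ENTRY_RE.match(line)
        if match:
            entries.append(match.groupdict())
        elif entries and line.strip():
            entries[-1]["message"] += " " + line.strip()
    return entries


def problems(entries, levels=("WARNING", "ERROR")):
    """Keep only the entries logged at one of the given levels."""
    return [entry for entry in entries if entry["level"] in levels]


if __name__ == "__main__":
    for entry in problems(parse_entries(SAMPLE)):
        print(entry["date"], entry["time"], entry["level"],
              entry["func"], entry["message"])
```

To run it against a real host, read the agent log into a string and pass it
to parse_entries in place of SAMPLE. The log is usually at
/var/log/ovirt-hosted-engine-ha/agent.log, but please check the path on your
installation.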




>
> I'm running oVirt 4.0.1
>
> Or would you like a bug report for this?
>
>
> Thanks!
>
> Matt



-- 
Sandro Bonazzola