[ovirt-users] [Call for feedback] did you install/update to 4.1.0?

Simone Tiraboschi stirabos at redhat.com
Fri Feb 3 12:39:43 UTC 2017


I see an ERROR on stopMonitoringDomain there, but I cannot find the
corresponding startMonitoringDomain; could you please look for it?
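
One way to look for it is a small script that lists every line mentioning either verb. This is only a sketch: the function name find_monitor_calls is made up for illustration, and it assumes nothing about the vdsm.log layout beyond the two verb names appearing somewhere in the line.

```python
import re

# Verb names taken from this thread; the rest of the log line is not parsed.
PATTERN = re.compile(r"(start|stop)MonitoringDomain")


def find_monitor_calls(lines, sd_uuid=None):
    """Return (line_number, verb, text) for each line mentioning either verb.

    If sd_uuid is given, keep only lines that also contain that UUID.
    """
    hits = []
    for number, line in enumerate(lines, start=1):
        match = PATTERN.search(line)
        if match is None:
            continue
        if sd_uuid is not None and sd_uuid not in line:
            continue
        hits.append((number, match.group(1), line.rstrip("\n")))
    return hits


# Example use (the log path is an assumption; adjust it to your host):
#
#   with open("/var/log/vdsm/vdsm.log", errors="replace") as handle:
#       for number, verb, text in find_monitor_calls(
#               handle, "7c8deaa8-be02-4aaf-b9b4-ddc8da99ad96"):
#           print(number, verb, text)
```

If only "stop" entries show up for the storage domain UUID, the matching start request is probably in an older, rotated log file.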

On Fri, Feb 3, 2017 at 1:16 PM, Ralf Schenk <rs at databay.de> wrote:

> Hello,
>
> attached is my vdsm.log from the host with hosted-engine-ha, covering the
> time frame of the agent timeout. The host no longer works for the hosted
> engine (it works in oVirt and is active); it simply stopped working for
> engine-ha after the update.
>
> At 2017-02-02 19:25:34,248 you'll find an error corresponding to the agent
> timeout error.
>
> Bye
>
>
>
> Am 03.02.2017 um 11:28 schrieb Simone Tiraboschi:
>
> 3. Three of my hosts have the hosted engine deployed for HA. At first all
>>> three were marked by a crown (the running one was gold and the others were silver).
>>> After upgrading, the hosted engine HA deployed on host 3 is not active anymore.
>>>
>>> I can't get this host back with a working ovirt-ha-agent/broker. I already
>>> rebooted and manually restarted the services, but it isn't able to get the
>>> cluster state according to
>>> "hosted-engine --vm-status". The other hosts report this host's status as
>>> "unknown stale-data".
>>>
>>> I already shut down all agents on all hosts and issued a "hosted-engine
>>> --reinitialize-lockspace" but that didn't help.
>>>
>>> The agent stops working after a timeout error, according to the log:
>>>
>>> MainThread::INFO::2017-02-02 19:24:52,040::hosted_engine::841::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status) VDSM domain monitor status: PENDING
>>> MainThread::INFO::2017-02-02 19:24:59,185::hosted_engine::841::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status) VDSM domain monitor status: PENDING
>>> MainThread::INFO::2017-02-02 19:25:06,333::hosted_engine::841::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status) VDSM domain monitor status: PENDING
>>> MainThread::INFO::2017-02-02 19:25:13,554::hosted_engine::841::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status) VDSM domain monitor status: PENDING
>>> MainThread::INFO::2017-02-02 19:25:20,710::hosted_engine::841::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status) VDSM domain monitor status: PENDING
>>> MainThread::INFO::2017-02-02 19:25:27,865::hosted_engine::841::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status) VDSM domain monitor status: PENDING
>>> MainThread::ERROR::2017-02-02 19:25:27,866::hosted_engine::815::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_domain_monitor) Failed to start monitoring domain (sd_uuid=7c8deaa8-be02-4aaf-b9b4-ddc8da99ad96, host_id=3): timeout during domain acquisition
>>> MainThread::WARNING::2017-02-02 19:25:27,866::hosted_engine::469::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Error while monitoring engine: Failed to start monitoring domain (sd_uuid=7c8deaa8-be02-4aaf-b9b4-ddc8da99ad96, host_id=3): timeout during domain acquisition
>>> MainThread::WARNING::2017-02-02 19:25:27,866::hosted_engine::472::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Unexpected error
>>> Traceback (most recent call last):
>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 443, in start_monitoring
>>>     self._initialize_domain_monitor()
>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 816, in _initialize_domain_monitor
>>>     raise Exception(msg)
>>> Exception: Failed to start monitoring domain (sd_uuid=7c8deaa8-be02-4aaf-b9b4-ddc8da99ad96, host_id=3): timeout during domain acquisition
>>> MainThread::ERROR::2017-02-02 19:25:27,866::hosted_engine::485::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Shutting down the agent because of 3 failures in a row!
>>> MainThread::INFO::2017-02-02 19:25:32,087::hosted_engine::841::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status) VDSM domain monitor status: PENDING
>>> MainThread::INFO::2017-02-02 19:25:34,250::hosted_engine::769::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_stop_domain_monitor) Failed to stop monitoring domain (sd_uuid=7c8deaa8-be02-4aaf-b9b4-ddc8da99ad96): Storage domain is member of pool: u'domain=7c8deaa8-be02-4aaf-b9b4-ddc8da99ad96'
>>> MainThread::INFO::2017-02-02 19:25:34,254::agent::143::ovirt_hosted_engine_ha.agent.agent.Agent::(run) Agent shutting down
>>>
>> Simone, Martin, can you please follow up on this?
>>
>
> Ralf, could you please attach the vdsm logs from one of your hosts for the
> relevant time frame?
>
>
> --
>
>
> Ralf Schenk
> fon +49 (0) 24 05 / 40 83 70
> fax +49 (0) 24 05 / 40 83 759
> mail rs at databay.de
>
> Databay AG
> Jens-Otto-Krag-Straße 11
> D-52146 Würselen
> www.databay.de <http://www.databay.de>
>
> Sitz/Amtsgericht Aachen • HRB:8437 • USt-IdNr.: DE 210844202
> Vorstand: Ralf Schenk, Dipl.-Ing. Jens Conze, Aresch Yavari, Dipl.-Kfm.
> Philipp Hermanns
> Aufsichtsratsvorsitzender: Wilhelm Dohmen
> ------------------------------
>