[ovirt-users] HA agent fails to start

Richard Neuboeck hawk at tbi.univie.ac.at
Thu Apr 14 10:51:29 UTC 2016


On 04/13/2016 10:00 AM, Simone Tiraboschi wrote:
> On Wed, Apr 13, 2016 at 9:38 AM, Richard Neuboeck <hawk at tbi.univie.ac.at> wrote:
>> The answers file shows the setup time of both machines.
>>
>> On both machines hosted-engine.conf got rotated right before I wrote
>> this mail. Is it possible that I managed to interrupt the rotation with
>> the reboot so the backup was accurate but the update not yet written to
>> hosted-engine.conf?
> 
> AFAIK we don't have any rotation mechanism for that file; do you have
> something else in place on that host?

Those machines are all CentOS 7.2 minimal installs. The only changes
I make are installing vim, replacing postfix with exim, and replacing
firewalld with iptables-services. Then I add the oVirt repos (3.6 and
3.6-snapshot) and deploy the host.

But checking lsof shows that 'ovirt-ha-agent --no-daemon' has the
config file open (and the one ending with ~):

# lsof | grep 'hosted-engine.conf~'
ovirt-ha- 193446 vdsm 351u REG 253,0 1021 135070683 /etc/ovirt-hosted-engine/hosted-engine.conf~
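
For reference, the manual workaround I described earlier (restore the
configuration from the ~ backup, then restart the HA agent) can be
sketched roughly as below. This is only a sketch: restore_if_empty is a
hypothetical helper name I made up for illustration, it is not part of
oVirt, and it assumes the ~ backup really holds the last good
configuration.

```shell
#!/bin/sh
# Sketch only: restore an empty config file from its '~' backup.
# restore_if_empty is a hypothetical helper, not an oVirt tool.
restore_if_empty() {
    conf=$1
    backup=${2:-}
    # Default to the backup name seen on the hosts: same path plus '~'
    [ -n "$backup" ] || backup="${conf}~"

    # -s is true only if the file exists and is larger than zero bytes
    if [ -s "$conf" ]; then
        echo "unchanged: $conf is not empty"
        return 0
    fi
    if [ ! -s "$backup" ]; then
        echo "error: backup $backup is missing or empty" >&2
        return 1
    fi
    # cp -p keeps the mode and timestamps of the backup
    cp -p "$backup" "$conf" && echo "restored: $conf from $backup"
}

# Example (as root), followed by a manual restart of the agent:
#   restore_if_empty /etc/ovirt-hosted-engine/hosted-engine.conf
#   systemctl restart ovirt-ha-agent
```

This does not explain why the file ends up empty after a reboot; it
only automates the step I currently do by hand.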


>> [root@cube-two ~]# ls -l /etc/ovirt-hosted-engine
>> total 16
>> -rw-r--r--. 1 root root 3252 Apr  8 10:35 answers.conf
>> -rw-r--r--. 1 root root 1021 Apr 13 09:31 hosted-engine.conf
>> -rw-r--r--. 1 root root 1021 Apr 13 09:30 hosted-engine.conf~
>>
>> [root@cube-three ~]# ls -l /etc/ovirt-hosted-engine
>> total 16
>> -rw-r--r--. 1 root root 3233 Apr 11 08:02 answers.conf
>> -rw-r--r--. 1 root root 1002 Apr 13 09:31 hosted-engine.conf
>> -rw-r--r--. 1 root root 1002 Apr 13 09:31 hosted-engine.conf~
>>
>> On 12.04.16 16:01, Simone Tiraboschi wrote:
>>> Everything seems fine here,
>>> /etc/ovirt-hosted-engine/hosted-engine.conf seems to be correctly
>>> created with the right name.
>>> Can you please check the latest modification time of your
>>> /etc/ovirt-hosted-engine/hosted-engine.conf~ and compare it with the
>>> setup time?
>>>
>>> On Tue, Apr 12, 2016 at 2:34 PM, Richard Neuboeck <hawk at tbi.univie.ac.at> wrote:
>>>> On 04/12/2016 11:32 AM, Simone Tiraboschi wrote:
>>>>> On Mon, Apr 11, 2016 at 8:11 AM, Richard Neuboeck <hawk at tbi.univie.ac.at> wrote:
>>>>>> Hi oVirt Group,
>>>>>>
>>>>>> in my attempts to get all aspects of oVirt 3.6 up and running I
>>>>>> stumbled upon something I'm not sure how to fix:
>>>>>>
>>>>>> Initially I installed a hosted engine setup. After that I added
>>>>>> another HA host (with hosted-engine --deploy). The host was
>>>>>> registered in the Engine correctly and HA agent came up as expected.
>>>>>>
>>>>>> However if I reboot the second host (through the Engine UI or
>>>>>> manually) HA agent fails to start. The reason seems to be that
>>>>>> /etc/ovirt-hosted-engine/hosted-engine.conf is empty. The backup
>>>>>> file ending with ~ exists though.
>>>>>
>>>>> Can you please attach hosted-engine-setup logs from your additional hosts?
>>>>> AFAIK our code will never take a ~ ending backup of that file.
>>>>
>>>> ovirt-hosted-engine-setup logs from both additional hosts are
>>>> attached to this mail.
>>>>
>>>>>
>>>>>> Here are the log messages from the journal:
>>>>>> Apr 11 07:29:39 cube-two.tbi.univie.ac.at systemd[1]: Starting oVirt Hosted Engine High Availability Monitoring Agent...
>>>>>> Apr 11 07:29:39 cube-two.tbi.univie.ac.at ovirt-ha-agent[3747]: INFO:ovirt_hosted_engine_ha.agent.agent.Agent:ovirt-hosted-engine-ha agent 1.3.5.3-0.0.master started
>>>>>> Apr 11 07:29:39 cube-two.tbi.univie.ac.at ovirt-ha-agent[3747]: INFO:ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:Found certificate common name: cube-two.tbi.univie.ac.at
>>>>>> Apr 11 07:29:39 cube-two.tbi.univie.ac.at ovirt-ha-agent[3747]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Hosted Engine is not configured. Shutting down.
>>>>>> Apr 11 07:29:39 cube-two.tbi.univie.ac.at ovirt-ha-agent[3747]: ERROR:ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:Hosted Engine is not configured. Shutting down.
>>>>>> Apr 11 07:29:39 cube-two.tbi.univie.ac.at ovirt-ha-agent[3747]: INFO:ovirt_hosted_engine_ha.agent.agent.Agent:Agent shutting down
>>>>>> Apr 11 07:29:39 cube-two.tbi.univie.ac.at systemd[1]: ovirt-ha-agent.service: main process exited, code=exited, status=255/n/a
>>>>>>
>>>>>> If I restore the configuration from the backup file and manually
>>>>>> restart the HA agent it's working properly.
>>>>>>
>>>>>> For testing purposes I added a third HA host, which turned out to
>>>>>> behave exactly the same.
>>>>>>
>>>>>> Any help would be appreciated!
>>>>>> Thanks
>>>>>> Cheers
>>>>>> Richard
>>>>>>
>>>>>> --
>>>>>> /dev/null
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> Users mailing list
>>>>>> Users at ovirt.org
>>>>>> http://lists.ovirt.org/mailman/listinfo/users
>>>>>>
>>>>
>>>>
>>>> --
>>>> /dev/null


-- 
/dev/null
