On 04/14/2016 02:14 PM, Simone Tiraboschi wrote:
On Thu, Apr 14, 2016 at 12:51 PM, Richard Neuboeck <hawk(a)tbi.univie.ac.at> wrote:
> On 04/13/2016 10:00 AM, Simone Tiraboschi wrote:
>> On Wed, Apr 13, 2016 at 9:38 AM, Richard Neuboeck <hawk(a)tbi.univie.ac.at> wrote:
>>> The answers file shows the setup time of both machines.
>>>
>>> On both machines hosted-engine.conf got rotated right before I wrote
>>> this mail. Is it possible that I managed to interrupt the rotation with
>>> the reboot so the backup was accurate but the update not yet written to
>>> hosted-engine.conf?
>>
>> AFAIK we don't have any rotation mechanism for that file; is something
>> else in place on that host?
>
> Those machines are all CentOS 7.2 minimal installs. The only changes
> I make are installing vim, replacing postfix with exim, and replacing
> firewalld with iptables-services. Then I add the oVirt repos (3.6 and
> 3.6-snapshot) and deploy the host.
>
> But checking lsof shows that 'ovirt-ha-agent --no-daemon' has access
> to the config file (and the one ending with ~):
>
> # lsof | grep 'hosted-engine.conf~'
> ovirt-ha- 193446 vdsm 351u REG 253,0 1021 135070683 /etc/ovirt-hosted-engine/hosted-engine.conf~

This is not very relevant if the file was renamed after
ovirt-ha-agent opened it.
Try this:

[root@c72he20160405h1 ovirt-hosted-engine-setup]# tail -n1 -f /etc/ovirt-hosted-engine/hosted-engine.conf &
[1] 28866
[root@c72he20160405h1 ovirt-hosted-engine-setup]# port=

[root@c72he20160405h1 ovirt-hosted-engine-setup]# lsof | grep hosted-engine.conf
tail 28866 root 3r REG 253,0 1014 1595898 /etc/ovirt-hosted-engine/hosted-engine.conf
[root@c72he20160405h1 ovirt-hosted-engine-setup]# mv /etc/ovirt-hosted-engine/hosted-engine.conf /etc/ovirt-hosted-engine/hosted-engine.conf_123
[root@c72he20160405h1 ovirt-hosted-engine-setup]# lsof | grep hosted-engine.conf
tail 28866 root 3r REG 253,0 1014 1595898 /etc/ovirt-hosted-engine/hosted-engine.conf_123
[root@c72he20160405h1 ovirt-hosted-engine-setup]#

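The effect shown in the session above (an open descriptor keeps pointing at the same file after a rename, so lsof reports the new name) can also be sketched in Python. This is only an illustration on a throwaway temporary directory; the file names are borrowed from the session, and the helper name fd_follows_rename is made up for this example.

```python
import os
import tempfile


def fd_follows_rename():
    """Open a file, rename it, and check that the open descriptor still
    refers to the same inode, which is now reachable under the new name."""
    with tempfile.TemporaryDirectory() as workdir:
        old_path = os.path.join(workdir, "hosted-engine.conf")
        new_path = old_path + "_123"
        with open(old_path, "w") as handle:
            handle.write("port=\n")
        with open(old_path) as reader:
            inode_before = os.fstat(reader.fileno()).st_ino
            os.rename(old_path, new_path)  # same effect as the mv above
            inode_after = os.fstat(reader.fileno()).st_ino
            inode_at_new_name = os.stat(new_path).st_ino
            old_name_gone = not os.path.exists(old_path)
    return inode_before == inode_after == inode_at_new_name and old_name_gone


if __name__ == "__main__":
    print(fd_follows_rename())
```

So a process listed by lsof with a renamed file is not necessarily the one that did the renaming; it may simply have opened the file earlier.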
I've issued the commands you suggested, but I don't see how that
helps to find the process accessing the config files.
After I moved the hosted-engine.conf file, the HA agent crashed,
logging that the config file is not available.
Here is the output from every command:
# tail -n1 -f /etc/ovirt-hosted-engine/hosted-engine.conf &
[1] 167865
[root@cube-two ~]# port=
# lsof | grep hosted-engine.conf
ovirt-ha- 166609 vdsm 5u REG 253,0 1021 134433491 /etc/ovirt-hosted-engine/hosted-engine.conf (deleted)
ovirt-ha- 166609 vdsm 7u REG 253,0 1021 134433453 /etc/ovirt-hosted-engine/hosted-engine.conf (deleted)
ovirt-ha- 166609 vdsm 8u REG 253,0 1021 134433489 /etc/ovirt-hosted-engine/hosted-engine.conf (deleted)
ovirt-ha- 166609 vdsm 9u REG 253,0 1021 134433493 /etc/ovirt-hosted-engine/hosted-engine.conf~
ovirt-ha- 166609 vdsm 10u REG 253,0 1021 134433495 /etc/ovirt-hosted-engine/hosted-engine.conf
tail 167865 root 3r REG 253,0 1021 134433493 /etc/ovirt-hosted-engine/hosted-engine.conf~
# mv /etc/ovirt-hosted-engine/hosted-engine.conf /etc/ovirt-hosted-engine/hosted-engine.conf_123
# lsof | grep hosted-engine.conf
ovirt-ha- 166609 vdsm 5u REG 253,0 1021 134433491 /etc/ovirt-hosted-engine/hosted-engine.conf (deleted)
ovirt-ha- 166609 vdsm 7u REG 253,0 1021 134433453 /etc/ovirt-hosted-engine/hosted-engine.conf (deleted)
ovirt-ha- 166609 vdsm 8u REG 253,0 1021 134433489 /etc/ovirt-hosted-engine/hosted-engine.conf (deleted)
ovirt-ha- 166609 vdsm 9u REG 253,0 1021 134433493 /etc/ovirt-hosted-engine/hosted-engine.conf (deleted)
ovirt-ha- 166609 vdsm 10u REG 253,0 1021 134433495 /etc/ovirt-hosted-engine/hosted-engine.conf (deleted)
ovirt-ha- 166609 vdsm 12u REG 253,0 1021 134433498 /etc/ovirt-hosted-engine/hosted-engine.conf~
ovirt-ha- 166609 vdsm 13u REG 253,0 1021 134433499 /etc/ovirt-hosted-engine/hosted-engine.conf_123
tail 167865 root 3r REG 253,0 1021 134433493 /etc/ovirt-hosted-engine/hosted-engine.conf (deleted)
The issue is understanding who renames that file on your host.
From what I've seen so far it looks like a child of vdsm accesses
/etc/ovirt-hosted-engine/hosted-engine.conf periodically but is not
responsible for the ~ file.
# auditctl -w /etc/ovirt-hosted-engine/hosted-engine.conf
and
# auditctl -w /etc/ovirt-hosted-engine/hosted-engine.conf~
auditd.log shows this:
type=SYSCALL msg=audit(1460639783.613:482590): arch=c000003e
syscall=2 success=yes exit=75 a0=7f29b400f0b0 a1=0 a2=1b6 a3=24
items=1 ppid=1 pid=3701 auid=4294967295 uid=36 gid=36 euid=36
suid=36 fsuid=36 egid=36 sgid=36 fsgid=36 tty=(none) ses=4294967295
comm="jsonrpc.Executo" exe="/usr/bin/python2.7"
subj=system_u:system_r:virtd_t:s0-s0:c0.c1023 key=(null)
type=CWD msg=audit(1460639783.613:482590): cwd="/"
type=PATH msg=audit(1460639783.613:482590): item=0
name="/etc/ovirt-hosted-engine/hosted-engine.conf" inode=134433499
dev=fd:00 mode=0100644 ouid=0 ogid=0 rdev=00:00
obj=system_u:object_r:etc_t:s0 objtype=NORMAL
Now that the HA agent is dead I'm removing the ~ file and starting
the HA agent again. The ~ file immediately appears again.
# rm hosted-engine.conf~
rm: remove regular file ‘hosted-engine.conf~’? y
[root@cube-two ovirt-hosted-engine]# ls -l
total 6800
-rw-r--r--. 1 root root 3252 Apr 8 10:35 answers.conf
-rw-r--r--. 1 root root 6948582 Apr 14 14:48 ha-trace.log
-rw-r--r--. 1 root root 1021 Apr 14 15:07 hosted-engine.conf
-rw-r--r--. 1 root root 413 Apr 8 10:35 iptables.example
[root@cube-two ovirt-hosted-engine]# systemctl start ovirt-ha-agent
[root@cube-two ovirt-hosted-engine]# ls -l
total 6804
-rw-r--r--. 1 root root 3252 Apr 8 10:35 answers.conf
-rw-r--r--. 1 root root 6948582 Apr 14 14:48 ha-trace.log
-rw-r--r--. 1 root root 1021 Apr 14 15:18 hosted-engine.conf
-rw-r--r--. 1 root root 1021 Apr 14 15:07 hosted-engine.conf~
-rw-r--r--. 1 root root 413 Apr 8 10:35 iptables.example
auditd.log shows that the ~ file is moved into place, but not which
process issued the mv:
type=CONFIG_CHANGE msg=audit(1460639919.277:482750): auid=4294967295
ses=4294967295 op="updated_rules"
path="/etc/ovirt-hosted-engine/hosted-engine.conf~" key=(null)
list=4 res=1
type=SYSCALL msg=audit(1460639919.277:482751): arch=c000003e
syscall=82 success=yes exit=0 a0=7ffe4b3c0e90 a1=7ffe4b3bf920
a2=7f68083a2778 a3=7ffe4b3bf680 items=5 ppid=170233 pid=170234
auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0
sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="mv"
exe="/usr/bin/mv" subj=system_u:system_r:unconfined_service_t:s0
key=(null)
type=CWD msg=audit(1460639919.277:482751): cwd="/"
type=PATH msg=audit(1460639919.277:482751): item=0
name="/etc/ovirt-hosted-engine/" inode=69555 dev=fd:00 mode=040755
ouid=0 ogid=0 rdev=00:00 obj=system_u:object_r:etc_t:s0 objtype=PARENT
type=PATH msg=audit(1460639919.277:482751): item=1
name="/etc/ovirt-hosted-engine/" inode=69555 dev=fd:00 mode=040755
ouid=0 ogid=0 rdev=00:00 obj=system_u:object_r:etc_t:s0 objtype=PARENT
type=PATH msg=audit(1460639919.277:482751): item=2
name="/etc/ovirt-hosted-engine/hosted-engine.conf" inode=134433453
dev=fd:00 mode=0100644 ouid=0 ogid=0 rdev=00:00
obj=system_u:object_r:etc_t:s0 objtype=DELETE
type=PATH msg=audit(1460639919.277:482751): item=3
name="/etc/ovirt-hosted-engine/hosted-engine.conf~" inode=134433499
dev=fd:00 mode=0100644 ouid=0 ogid=0 rdev=00:00
obj=system_u:object_r:etc_t:s0 objtype=DELETE
type=PATH msg=audit(1460639919.277:482751): item=4
name="/etc/ovirt-hosted-engine/hosted-engine.conf~" inode=134433453
dev=fd:00 mode=0100644 ouid=0 ogid=0 rdev=00:00
obj=system_u:object_r:etc_t:s0 objtype=CREATE
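For reading the SYSCALL records above: with arch=c000003e (x86_64), syscall 2 is open and syscall 82 is rename. A rough Python sketch for pulling those fields out of a record could look like the following. The helper names and the two-entry syscall table are made up for this example, and the sample records are shortened copies of the ones above.

```python
import shlex

# Partial x86_64 syscall table: only the two numbers seen in this thread.
X86_64_SYSCALLS = {2: "open", 82: "rename"}


def parse_audit_record(record):
    """Split one audit record into a dict of its key=value fields."""
    fields = {}
    for token in shlex.split(record):
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


def summarize(record):
    """Return (syscall name, command, executable) for a SYSCALL record."""
    fields = parse_audit_record(record)
    number = int(fields["syscall"])
    name = X86_64_SYSCALLS.get(number, "syscall %d" % number)
    return name, fields.get("comm"), fields.get("exe")


if __name__ == "__main__":
    first = ('type=SYSCALL msg=audit(1460639783.613:482590): arch=c000003e '
             'syscall=2 success=yes exit=75 ppid=1 pid=3701 uid=36 '
             'comm="jsonrpc.Executo" exe="/usr/bin/python2.7"')
    second = ('type=SYSCALL msg=audit(1460639919.277:482751): arch=c000003e '
              'syscall=82 success=yes exit=0 ppid=170233 pid=170234 uid=0 '
              'comm="mv" exe="/usr/bin/mv"')
    print(summarize(first))
    print(summarize(second))
```

Read this way, the first record is a vdsm thread (uid 36) opening the file, while the second is /usr/bin/mv running as root and performing the rename.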
As a rule of thumb, a file name ending in a tilde (~) usually just
means that it is a backup created by a text editor or a similar program.
If anyone other than me had access to these systems I would
guess the same. But since I'm not editing anything in
/etc/ovirt-hosted-engine there must be another reason. And there is.
Aside from auditd I also ran the agent under strace, just to make
sure the rename comes from the HA agent.
[root@cube-two ~]# strace -o ha-trace.log -f /usr/share/ovirt-hosted-engine-ha/ovirt-ha-agent --no-daemon
Looking at the trace log I found this:
183409 statfs("/etc/ovirt-hosted-engine/.", {f_type=0x58465342,
f_bsize=4096, f_blocks=13100800, f_bfree=12523576,
f_bavail=12523576, f_files=52428800, f_ffree=52379892,
f_fsid={64768, 0}, f_namelen=255, f_frsize=4096}) = 0
183409 rename("/etc/ovirt-hosted-engine/hosted-engine.conf",
"/etc/ovirt-hosted-engine/hosted-engine.conf~") = 0
183409 rename("/var/lib/ovirt-hosted-engine-ha/tmpNjTElr",
"/etc/ovirt-hosted-engine/hosted-engine.conf") = 0
183409 newfstatat(AT_FDCWD,
"/etc/ovirt-hosted-engine/hosted-engine.conf",
{st_mode=S_IFREG|0600, st_size=1021, ...}, AT_SYMLINK_NOFOLLOW) = 0
183409 open("/etc/ovirt-hosted-engine/hosted-engine.conf",
O_RDONLY|O_NOFOLLOW) = 3
Putting it all together, I started reading the HA agent sources and
found the function _wrote_updated_conf_file in
/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/upgrade.py
which issues a mv -b, and that is what creates the ~ file.
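To make the sequence explicit, the two rename() calls from the strace output can be mimicked in a few lines of Python. This is only a sketch of what mv -b appears to do in the trace, not code taken from upgrade.py; the function name replace_with_backup is made up for this example, and it assumes both paths are on the same filesystem.

```python
import os


def replace_with_backup(new_file, target):
    """Mimic the two rename() calls seen in the strace output:
    first keep the current target as 'target~', then move the new
    file into place under the target name."""
    if os.path.exists(target):
        os.rename(target, target + "~")  # old config becomes the ~ backup
    os.rename(new_file, target)          # temporary file becomes the config
```

Every call leaves a fresh ~ file behind, which matches the ~ file reappearing as soon as the agent is started.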
The question now is why this is done so frequently, especially
since there are no modifications to the file. Is this
behavior normal?
[root@cube-two ~]# diff /etc/ovirt-hosted-engine/hosted-engine.conf*
[root@cube-two ~]#
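Since the diff is empty, one conceivable way to avoid the needless rotation would be to compare the contents first and only replace the file when something changed. The following is purely a hypothetical sketch, not what the HA agent does and not a proposed patch; replace_if_changed is a made-up name.

```python
import filecmp
import os


def replace_if_changed(new_file, target):
    """Install new_file as target only when the content differs.

    Returns True if target was replaced, False if it was left alone.
    """
    if os.path.exists(target) and filecmp.cmp(new_file, target, shallow=False):
        os.remove(new_file)  # identical content: discard the temporary copy
        return False
    if os.path.exists(target):
        os.rename(target, target + "~")  # keep one backup, as mv -b would
    os.rename(new_file, target)
    return True
```

With a check like this the config file and its ~ backup would only be touched when the content really differs.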
>>> [root@cube-two ~]# ls -l /etc/ovirt-hosted-engine
>>> total 16
>>> -rw-r--r--. 1 root root 3252 Apr 8 10:35 answers.conf
>>> -rw-r--r--. 1 root root 1021 Apr 13 09:31 hosted-engine.conf
>>> -rw-r--r--. 1 root root 1021 Apr 13 09:30 hosted-engine.conf~
>>>
>>> [root@cube-three ~]# ls -l /etc/ovirt-hosted-engine
>>> total 16
>>> -rw-r--r--. 1 root root 3233 Apr 11 08:02 answers.conf
>>> -rw-r--r--. 1 root root 1002 Apr 13 09:31 hosted-engine.conf
>>> -rw-r--r--. 1 root root 1002 Apr 13 09:31 hosted-engine.conf~
>>>
>>> On 12.04.16 16:01, Simone Tiraboschi wrote:
>>>> Everything seems fine here;
>>>> /etc/ovirt-hosted-engine/hosted-engine.conf seems to be correctly
>>>> created with the right name.
>>>> Can you please check the latest modification time of your
>>>> /etc/ovirt-hosted-engine/hosted-engine.conf~ and compare it with the
>>>> setup time?
>>>>
>>>> On Tue, Apr 12, 2016 at 2:34 PM, Richard Neuboeck <hawk(a)tbi.univie.ac.at> wrote:
>>>>> On 04/12/2016 11:32 AM, Simone Tiraboschi wrote:
>>>>>> On Mon, Apr 11, 2016 at 8:11 AM, Richard Neuboeck <hawk(a)tbi.univie.ac.at> wrote:
>>>>>>> Hi oVirt Group,
>>>>>>>
>>>>>>> in my attempts to get all aspects of oVirt 3.6 up and running I
>>>>>>> stumbled upon something I'm not sure how to fix:
>>>>>>>
>>>>>>> Initially I installed a hosted engine setup. After that I added
>>>>>>> another HA host (with hosted-engine --deploy). The host was
>>>>>>> registered in the Engine correctly and HA agent came up as expected.
>>>>>>>
>>>>>>> However if I reboot the second host (through the Engine UI or
>>>>>>> manually) HA agent fails to start. The reason seems to be that
>>>>>>> /etc/ovirt-hosted-engine/hosted-engine.conf is empty. The backup
>>>>>>> file ending with ~ exists though.
>>>>>>
>>>>>> Can you please attach hosted-engine-setup logs from your additional hosts?
>>>>>> AFAIK our code will never take a ~ ending backup of that file.
>>>>>
>>>>> ovirt-hosted-engine-setup logs from both additional hosts are
>>>>> attached to this mail.
>>>>>
>>>>>>
>>>>>>> Here are the log messages from the journal:
>>>>>>> Apr 11 07:29:39 cube-two.tbi.univie.ac.at systemd[1]: Starting oVirt
>>>>>>> Hosted Engine High Availability Monitoring Agent...
>>>>>>> Apr 11 07:29:39 cube-two.tbi.univie.ac.at ovirt-ha-agent[3747]:
>>>>>>> INFO:ovirt_hosted_engine_ha.agent.agent.Agent:ovirt-hosted-engine-ha
>>>>>>> agent 1.3.5.3-0.0.master started
>>>>>>> Apr 11 07:29:39 cube-two.tbi.univie.ac.at ovirt-ha-agent[3747]:
>>>>>>> INFO:ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:Found
>>>>>>> certificate common name: cube-two.tbi.univie.ac.at
>>>>>>> Apr 11 07:29:39 cube-two.tbi.univie.ac.at ovirt-ha-agent[3747]:
>>>>>>> ovirt-ha-agent
>>>>>>> ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Hosted
>>>>>>> Engine is not configured. Shutting down.
>>>>>>> Apr 11 07:29:39 cube-two.tbi.univie.ac.at ovirt-ha-agent[3747]:
>>>>>>> ERROR:ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:Hosted
>>>>>>> Engine is not configured. Shutting down.
>>>>>>> Apr 11 07:29:39 cube-two.tbi.univie.ac.at ovirt-ha-agent[3747]:
>>>>>>> INFO:ovirt_hosted_engine_ha.agent.agent.Agent:Agent shutting down
>>>>>>> Apr 11 07:29:39 cube-two.tbi.univie.ac.at systemd[1]:
>>>>>>> ovirt-ha-agent.service: main process exited, code=exited, status=255/n/a
>>>>>>>
>>>>>>> If I restore the configuration from the backup file and manually
>>>>>>> restart the HA agent it's working properly.
>>>>>>>
>>>>>>> For testing purposes I added a third HA host which turned out to
>>>>>>> behave exactly the same.
>>>>>>>
>>>>>>> Any help would be appreciated!
>>>>>>> Thanks
>>>>>>> Cheers
>>>>>>> Richard
>>>>>>>
>>>>>>> --
>>>>>>> /dev/null
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Users mailing list
>>>>>>> Users(a)ovirt.org
>>>>>>> http://lists.ovirt.org/mailman/listinfo/users
>>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> /dev/null
>>>
>
>
> --
> /dev/null
>
-- 
/dev/null