Could the libvirt-guests restart cause this issue?
Anyway, can you try removing
ovirt-ansible-roles-1.1.1-0.1.master.20170830145950.el7.centos.noarch.rpm
from tested and see whether it reproduces?
On Thu, Aug 31, 2017 at 8:55 AM, Barak Korren <bkorren(a)redhat.com> wrote:
On 30 August 2017 at 22:20, Martin Perina <mperina(a)redhat.com> wrote:
>
>>
>> So we're back to square one.
>> Another possible culprit may be ansible: Vdsm is stopped two seconds
>> after ansible logs in to the host.
>>
>> Aug 30 11:26:24 lago-basic-suite-master-host-0 systemd: Starting
>> Session 10 of user root.
>> Aug 30 11:26:25 lago-basic-suite-master-host-0 python: ansible-setup
>> Invoked with filter=* gather_subset=['all']
>> fact_path=/etc/ansible/facts.d gather_timeout=10
>> Aug 30 11:26:25 lago-basic-suite-master-host-0 python: ansible-command
>> Invoked with warn=True executable=None _uses_shell=False
>> _raw_params=bash -c "rpm -qi vdsm | grep -oE
>> 'Version\\s+:\\s+[0-9\\.]+' | awk '{print $3}'" removes=None
>> creates=None chdir=None
>> Aug 30 11:26:26 lago-basic-suite-master-host-0 python: ansible-systemd
>> Invoked with no_block=False name=libvirt-guests enabled=True
>> daemon_reload=False state=started user=False masked=None
>> Aug 30 11:26:26 lago-basic-suite-master-host-0 systemd: Reloading.
>> Aug 30 11:26:26 lago-basic-suite-master-host-0 systemd: Cannot add
>> dependency job for unit lvm2-lvmetad.socket, ignoring: Unit is masked.
>> Aug 30 11:26:26 lago-basic-suite-master-host-0 systemd: Stopped MOM
>> instance configured for VDSM purposes.
>> Aug 30 11:26:26 lago-basic-suite-master-host-0 systemd: Stopping
>> Virtual Desktop Server Manager...
>>
>>
>> Could it be that it triggers a systemd reload that makes systemd croak
>> on the vdsm-mom cycle?
>
>
> We are not restarting VDSM within the ovirt-host-deploy Ansible role; the
> VDSM restart is performed in the host-deploy part, same as in previous
> versions.
>
> Within ovirt-host-deploy-firewalld we only enable and restart the firewalld
> service.
>
Comparing a successful add-host flow [1] to a failed one [2], we notice
that in the failed add-host flow ansible logs in twice (session 10 and
session 11). Could the two be related? Notice that Session 11 uses
the OLD way (awk+grep based) to find vdsm's version.
Aug 30 05:55:53 lago-basic-suite-master-host-0 systemd-logind: New
session 10 of user root.
Aug 30 05:55:53 lago-basic-suite-master-host-0 systemd: Starting
Session 10 of user root.
Aug 30 05:55:53 lago-basic-suite-master-host-0 python: ansible-setup
Invoked with filter=* gather_subset=['all']
fact_path=/etc/ansible/facts.d gather_timeout=10
Aug 30 05:55:54 lago-basic-suite-master-host-0 python: ansible-command
Invoked with warn=True executable=None _uses_shell=False
_raw_params=bash -c "rpm -qi vdsm | grep -oE
'Version\\s+:\\s+[0-9\\.]+' | awk '{print $3}'" removes=None
creates=None chdir=None
Aug 30 05:55:54 lago-basic-suite-master-host-0 python: ansible-systemd
Invoked with no_block=False name=libvirt-guests enabled=True
daemon_reload=False state=started user=False masked=None
Aug 30 05:55:54 lago-basic-suite-master-host-0 systemd: Reloading.
Aug 30 05:55:55 lago-basic-suite-master-host-0 systemd: Cannot add
dependency job for unit lvm2-lvmetad.socket, ignoring: Unit is masked.
Aug 30 05:55:55 lago-basic-suite-master-host-0 systemd: Stopped MOM
instance configured for VDSM purposes.
Aug 30 05:55:55 lago-basic-suite-master-host-0 systemd: Stopping
Virtual Desktop Server Manager...
Aug 30 05:55:55 lago-basic-suite-master-host-0 systemd: Starting
Suspend Active Libvirt Guests...
Aug 30 05:55:55 lago-basic-suite-master-host-0 systemd: Started
Suspend Active Libvirt Guests.
Aug 30 05:55:55 lago-basic-suite-master-host-0 journal: libvirt
version: 2.0.0, package: 10.el7_3.9 (CentOS BuildSystem
<http://bugs.centos.org>, 2017-05-25-20:52:28,
c1bm.rdu2.centos.org)
Aug 30 05:55:55 lago-basic-suite-master-host-0 journal: hostname:
lago-basic-suite-master-host-0.lago.local
Aug 30 05:55:55 lago-basic-suite-master-host-0 vdsmd_init_common.sh:
vdsm: Running run_final_hooks
Aug 30 05:55:55 lago-basic-suite-master-host-0 journal: End of file
while reading data: Input/output error
Aug 30 05:55:55 lago-basic-suite-master-host-0 systemd: Stopped
Virtual Desktop Server Manager.
Aug 30 05:55:55 lago-basic-suite-master-host-0 python: ansible-command
Invoked with warn=True executable=None _uses_shell=False
_raw_params=bash -c "rpm -q vdsm --qf '%{VERSION}'" removes=None
creates=None chdir=None
Aug 30 05:55:55 lago-basic-suite-master-host-0 python: ansible-systemd
Invoked with no_block=False name=iptables enabled=False
daemon_reload=False state=stopped user=False masked=None
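
For what it's worth, this kind of comparison could be scripted instead of done by eye. Below is a minimal sketch, assuming the syslog entries have first been unwrapped to one entry per line. The names parse_line, summarize and SAMPLE are made up for illustration, and the year is passed in explicitly because syslog timestamps do not carry one. The sample entries are copied from the excerpt above.

```python
import re
from datetime import datetime

# One syslog entry per line: "<Mon DD HH:MM:SS> <host> <tag>: <message>".
SYSLOG_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+"
    r"(?P<tag>[^:]+):\s+(?P<msg>.*)$"
)


def parse_line(line, year=2017):
    """Return (timestamp, tag, message) for a syslog line, or None."""
    m = SYSLOG_RE.match(line.strip())
    if m is None:
        return None
    # Syslog timestamps carry no year, so the caller supplies one.
    ts = datetime.strptime(f"{year} {m.group('ts')}", "%Y %b %d %H:%M:%S")
    return ts, m.group("tag"), m.group("msg")


def summarize(lines):
    """Collect session starts, vdsm version queries and the vdsmd stop time."""
    summary = {"sessions": [], "version_queries": [], "vdsm_stopping": None}
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        ts, tag, msg = parsed
        session = re.search(r"Starting Session (\d+) of user root", msg)
        if tag == "systemd" and session:
            summary["sessions"].append((int(session.group(1)), ts))
        elif "ansible-command" in msg and "rpm -qi vdsm" in msg:
            # Older query: rpm -qi piped through grep and awk.
            summary["version_queries"].append((ts, "old (grep+awk)"))
        elif "ansible-command" in msg and "rpm -q vdsm --qf" in msg:
            # Newer query: rpm -q with a query format string.
            summary["version_queries"].append((ts, "new (--qf)"))
        elif tag == "systemd" and msg.startswith(
            "Stopping Virtual Desktop Server Manager"
        ):
            summary["vdsm_stopping"] = ts
    return summary


# Entries copied from the log excerpt above, unwrapped to one per line.
SAMPLE = [
    "Aug 30 05:55:53 lago-basic-suite-master-host-0 systemd: Starting Session 10 of user root.",
    r'''Aug 30 05:55:54 lago-basic-suite-master-host-0 python: ansible-command Invoked with warn=True executable=None _uses_shell=False _raw_params=bash -c "rpm -qi vdsm | grep -oE 'Version\\s+:\\s+[0-9\\.]+' | awk '{print $3}'" removes=None creates=None chdir=None''',
    "Aug 30 05:55:55 lago-basic-suite-master-host-0 systemd: Stopping Virtual Desktop Server Manager...",
    r'''Aug 30 05:55:55 lago-basic-suite-master-host-0 python: ansible-command Invoked with warn=True executable=None _uses_shell=False _raw_params=bash -c "rpm -q vdsm --qf '%{VERSION}'" removes=None creates=None chdir=None''',
]

if __name__ == "__main__":
    result = summarize(SAMPLE)
    for number, started in result["sessions"]:
        print("session", number, "started at", started.time())
    for when, style in result["version_queries"]:
        print("version query at", when.time(), "->", style)
    if result["vdsm_stopping"] and result["sessions"]:
        gap = result["vdsm_stopping"] - result["sessions"][0][1]
        print("seconds from first session to vdsmd stop:", gap.total_seconds())
```

Running the same summary over the full messages files from [1] and [2] would show side by side how many sessions each run opened and which version query each one issued.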
[1]: http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/2197/artifa...
[2]: http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/2151/artifa...
--
Barak Korren
RHV DevOps team , RHCE, RHCi
Red Hat EMEA