[ovirt-devel] [ OST Failure Report ] [ oVirt master ] [ 2017-08-30 ] [add_hosts]

Ondra Machacek omachace at redhat.com
Thu Aug 31 07:57:41 UTC 2017


The difference between the two is this patch:

 https://github.com/oVirt/ovirt-ansible/pull/62/files#diff-f7199e24fce211caec68e106db3fc4ddR28

Could the libvirt-guests restart cause this issue?

Anyway, can you try removing
ovirt-ansible-roles-1.1.1-0.1.master.20170830145950.el7.centos.noarch.rpm
from tested and see if the issue reproduces?
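
One way to check the ordering in the quoted logs below is to measure how
long after the ansible-systemd call for libvirt-guests systemd starts
stopping VDSM. The following Python is only a minimal sketch under a few
assumptions: the input is unwrapped /var/log/messages lines in the format
quoted below, the year is not present in the log and is assumed to be 2017,
and the names parse_events and seconds_until_vdsm_stop are made up for
illustration.

```python
import re
from datetime import datetime

# Syslog prefix as seen in the quoted excerpts: "Aug 30 05:55:54 host tag: message"
LINE_RE = re.compile(r"^(\w{3} +\d+ \d{2}:\d{2}:\d{2}) (\S+) ([^:]+): (.*)$")

# Four lines taken from the quoted excerpt, joined back into single lines.
SAMPLE = [
    "Aug 30 05:55:54 lago-basic-suite-master-host-0 python: ansible-systemd "
    "Invoked with no_block=False name=libvirt-guests enabled=True "
    "daemon_reload=False state=started user=False masked=None",
    "Aug 30 05:55:54 lago-basic-suite-master-host-0 systemd: Reloading.",
    "Aug 30 05:55:55 lago-basic-suite-master-host-0 systemd: Stopped MOM "
    "instance configured for VDSM purposes.",
    "Aug 30 05:55:55 lago-basic-suite-master-host-0 systemd: Stopping "
    "Virtual Desktop Server Manager...",
]


def parse_events(lines, year=2017):
    """Return (timestamp, tag, message) tuples; the year is an assumption."""
    events = []
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip anything that is not a syslog-style line
        stamp, _host, tag, message = match.groups()
        # %b expects English month abbreviations (C/English locale).
        when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
        events.append((when, tag, message))
    return events


def seconds_until_vdsm_stop(events, unit="libvirt-guests"):
    """Seconds between the last ansible-systemd call for `unit` and the
    next 'Stopping Virtual Desktop Server Manager' line, or None."""
    invoked = None
    for when, tag, message in events:
        if (tag == "python" and message.startswith("ansible-systemd")
                and f"name={unit} " in message):
            invoked = when
        elif (invoked is not None and tag == "systemd"
              and message.startswith("Stopping Virtual Desktop Server Manager")):
            return (when - invoked).total_seconds()
    return None


if __name__ == "__main__":
    print(seconds_until_vdsm_stop(parse_events(SAMPLE)))
```

On the four sample lines this reports a gap of one second, which matches the
timestamps in the quoted excerpt. It only shows ordering, not causation, so
it is a starting point for comparing the successful and failed runs rather
than a diagnosis.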

On Thu, Aug 31, 2017 at 8:55 AM, Barak Korren <bkorren at redhat.com> wrote:
> On 30 August 2017 at 22:20, Martin Perina <mperina at redhat.com> wrote:
>>
>>>
>>> So we're back to square one.
>>> Another possible culprit may be Ansible: VDSM is stopped two seconds
>>> after Ansible logs in to the host.
>>>
>>> Aug 30 11:26:24 lago-basic-suite-master-host-0 systemd: Starting
>>> Session 10 of user root.
>>> Aug 30 11:26:25 lago-basic-suite-master-host-0 python: ansible-setup
>>> Invoked with filter=* gather_subset=['all']
>>> fact_path=/etc/ansible/facts.d gather_timeout=10
>>> Aug 30 11:26:25 lago-basic-suite-master-host-0 python: ansible-command
>>> Invoked with warn=True executable=None _uses_shell=False
>>> _raw_params=bash -c "rpm -qi vdsm | grep -oE
>>> 'Version\\s+:\\s+[0-9\\.]+' | awk '{print $3}'" removes=None
>>> creates=None chdir=None
>>> Aug 30 11:26:26 lago-basic-suite-master-host-0 python: ansible-systemd
>>> Invoked with no_block=False name=libvirt-guests enabled=True
>>> daemon_reload=False state=started user=False masked=None
>>> Aug 30 11:26:26 lago-basic-suite-master-host-0 systemd: Reloading.
>>> Aug 30 11:26:26 lago-basic-suite-master-host-0 systemd: Cannot add
>>> dependency job for unit lvm2-lvmetad.socket, ignoring: Unit is masked.
>>> Aug 30 11:26:26 lago-basic-suite-master-host-0 systemd: Stopped MOM
>>> instance configured for VDSM purposes.
>>> Aug 30 11:26:26 lago-basic-suite-master-host-0 systemd: Stopping
>>> Virtual Desktop Server Manager...
>>>
>>>
>>> Could it be that it triggers a systemd reload that makes systemd croak
>>> on the vdsm-mom cycle?
>>
>>
>> We are not restarting VDSM within the ovirt-host-deploy Ansible role; the
>> VDSM restart is performed in the host-deploy part, same as in previous
>> versions.
>>
>> Within ovirt-host-deploy-firewalld we only enable and restart the firewalld
>> service.
>>
>
> Comparing a successful add-host flow [1] to a failed one [2], we notice
> that in the failed add-host flow Ansible logs in twice (session 10 and
> session 11). Could it be somehow related? Notice that Session 11 uses
> the OLD way (awk+grep based) to find vdsm's version.
>
> Aug 30 05:55:53 lago-basic-suite-master-host-0 systemd-logind: New
> session 10 of user root.
> Aug 30 05:55:53 lago-basic-suite-master-host-0 systemd: Starting
> Session 10 of user root.
> Aug 30 05:55:53 lago-basic-suite-master-host-0 python: ansible-setup
> Invoked with filter=* gather_subset=['all']
> fact_path=/etc/ansible/facts.d gather_timeout=10
> Aug 30 05:55:54 lago-basic-suite-master-host-0 python: ansible-command
> Invoked with warn=True executable=None _uses_shell=False
> _raw_params=bash -c "rpm -qi vdsm | grep -oE
> 'Version\\s+:\\s+[0-9\\.]+' | awk '{print $3}'" removes=None
> creates=None chdir=None
> Aug 30 05:55:54 lago-basic-suite-master-host-0 python: ansible-systemd
> Invoked with no_block=False name=libvirt-guests enabled=True
> daemon_reload=False state=started user=False masked=None
> Aug 30 05:55:54 lago-basic-suite-master-host-0 systemd: Reloading.
> Aug 30 05:55:55 lago-basic-suite-master-host-0 systemd: Cannot add
> dependency job for unit lvm2-lvmetad.socket, ignoring: Unit is masked.
> Aug 30 05:55:55 lago-basic-suite-master-host-0 systemd: Stopped MOM
> instance configured for VDSM purposes.
> Aug 30 05:55:55 lago-basic-suite-master-host-0 systemd: Stopping
> Virtual Desktop Server Manager...
> Aug 30 05:55:55 lago-basic-suite-master-host-0 systemd: Starting
> Suspend Active Libvirt Guests...
> Aug 30 05:55:55 lago-basic-suite-master-host-0 systemd: Started
> Suspend Active Libvirt Guests.
> Aug 30 05:55:55 lago-basic-suite-master-host-0 journal: libvirt
> version: 2.0.0, package: 10.el7_3.9 (CentOS BuildSystem
> <http://bugs.centos.org>, 2017-05-25-20:52:28, c1bm.rdu2.centos.org)
> Aug 30 05:55:55 lago-basic-suite-master-host-0 journal: hostname:
> lago-basic-suite-master-host-0.lago.local
> Aug 30 05:55:55 lago-basic-suite-master-host-0 vdsmd_init_common.sh:
> vdsm: Running run_final_hooks
> Aug 30 05:55:55 lago-basic-suite-master-host-0 journal: End of file
> while reading data: Input/output error
> Aug 30 05:55:55 lago-basic-suite-master-host-0 systemd: Stopped
> Virtual Desktop Server Manager.
> Aug 30 05:55:55 lago-basic-suite-master-host-0 python: ansible-command
> Invoked with warn=True executable=None _uses_shell=False
> _raw_params=bash -c "rpm -q vdsm --qf '%{VERSION}'" removes=None
> creates=None chdir=None
> Aug 30 05:55:55 lago-basic-suite-master-host-0 python: ansible-systemd
> Invoked with no_block=False name=iptables enabled=False
> daemon_reload=False state=stopped user=False masked=None
>
> [1]: http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/2197/artifact/exported-artifacts/basic-suit-master-el7/test_logs/basic-suite-master/post-002_bootstrap.py/lago-basic-suite-master-host-0/_var_log/messages/*view*/
>
>
> [2]: http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/2151/artifact/exported-artifacts/basic-suit-master-el7/test_logs/basic-suite-master/post-002_bootstrap.py/lago-basic-suite-master-host-0/_var_log/messages/*view*/
>
> --
> Barak Korren
> RHV DevOps team , RHCE, RHCi
> Red Hat EMEA
> redhat.com | TRIED. TESTED. TRUSTED. | redhat.com/trusted

