So with ovirt-ansible-roles-1.1.0 (which is the last officially released version) everything runs fine and the host is added properly (tested several times on CentOS 7.3).

But we need to fix executing builds from GitHub, otherwise we cannot continue working with ovirt-ansible-roles on GitHub:

1. If you add the comment 'ci build please' to a GitHub PR, a build is executed and the result is a repo with an RPM to be used in OST (but this build is not passed to the queue to be added to the tested repo).

2. When a PR is merged, a build is executed either automatically or by some other comment/action (available to maintainers only), and if the build is OK, it can be queued to be added to the tested repo.

Without the above we just cannot continue working on ovirt-ansible-roles on GitHub.
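
To make the two requested behaviours concrete, here is a rough sketch of them as a decision function. It is purely illustrative: the event names, the `Action` type and the `decide` function are hypothetical and do not describe how the CI is actually implemented. For point 2 it assumes the automatic variant (build on merge).

```python
from typing import NamedTuple


class Action(NamedTuple):
    build: bool        # run a build and publish a repo with the RPM
    queue_if_ok: bool  # pass a successful build on to the queue for the tested repo


def decide(event: str, comment: str = "") -> Action:
    """Map a (hypothetical) GitHub event to the requested CI behaviour."""
    # Point 1: an explicit request on a PR builds, but never queues.
    if event == "pr_comment" and comment.strip().lower() == "ci build please":
        return Action(build=True, queue_if_ok=False)
    # Point 2: a merged PR builds, and a good build may be queued.
    if event == "pr_merged":
        return Action(build=True, queue_if_ok=True)
    # Anything else triggers nothing.
    return Action(build=False, queue_if_ok=False)
```

If the maintainer-only comment variant of point 2 is preferred, only the second condition would need to change.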

Thanks

Martin



On Thu, Aug 31, 2017 at 8:55 AM, Barak Korren <bkorren@redhat.com> wrote:
On 30 August 2017 at 22:20, Martin Perina <mperina@redhat.com> wrote:
>
>>
>> So we're back to square one.
>> Another possible culprit may be ansible: Vdsm is stopped two seconds
>> after ansible logs in to the host.
>>
>> Aug 30 11:26:24 lago-basic-suite-master-host-0 systemd: Starting
>> Session 10 of user root.
>> Aug 30 11:26:25 lago-basic-suite-master-host-0 python: ansible-setup
>> Invoked with filter=* gather_subset=['all']
>> fact_path=/etc/ansible/facts.d gather_timeout=10
>> Aug 30 11:26:25 lago-basic-suite-master-host-0 python: ansible-command
>> Invoked with warn=True executable=None _uses_shell=False
>> _raw_params=bash -c "rpm -qi vdsm | grep -oE
>> 'Version\\s+:\\s+[0-9\\.]+' | awk '{print $3}'" removes=None
>> creates=None chdir=None
>> Aug 30 11:26:26 lago-basic-suite-master-host-0 python: ansible-systemd
>> Invoked with no_block=False name=libvirt-guests enabled=True
>> daemon_reload=False state=started user=False masked=None
>> Aug 30 11:26:26 lago-basic-suite-master-host-0 systemd: Reloading.
>> Aug 30 11:26:26 lago-basic-suite-master-host-0 systemd: Cannot add
>> dependency job for unit lvm2-lvmetad.socket, ignoring: Unit is masked.
>> Aug 30 11:26:26 lago-basic-suite-master-host-0 systemd: Stopped MOM
>> instance configured for VDSM purposes.
>> Aug 30 11:26:26 lago-basic-suite-master-host-0 systemd: Stopping
>> Virtual Desktop Server Manager...
>>
>>
>> Could it be that it triggers a systemd reload that makes systemd croak
>> on the vdsm-mom cycle?
>
>
> We are not restarting VDSM within the ovirt-host-deploy Ansible role; the
> VDSM restart is performed in the host-deploy part, same as in previous
> versions.
>
> Within ovirt-host-deploy-firewalld we only enable and restart the
> firewalld service.
>

Comparing a successful add-host flow [1] to a failed one [2], we notice
that in the failed add-host flow ansible logs in twice (session 10 and
session 11). Could it be somehow related? Notice that Session 11 uses
the OLD way (awk+grep based) to find vdsm's version.

Aug 30 05:55:53 lago-basic-suite-master-host-0 systemd-logind: New
session 10 of user root.
Aug 30 05:55:53 lago-basic-suite-master-host-0 systemd: Starting
Session 10 of user root.
Aug 30 05:55:53 lago-basic-suite-master-host-0 python: ansible-setup
Invoked with filter=* gather_subset=['all']
fact_path=/etc/ansible/facts.d gather_timeout=10
Aug 30 05:55:54 lago-basic-suite-master-host-0 python: ansible-command
Invoked with warn=True executable=None _uses_shell=False
_raw_params=bash -c "rpm -qi vdsm | grep -oE
'Version\\s+:\\s+[0-9\\.]+' | awk '{print $3}'" removes=None
creates=None chdir=None
Aug 30 05:55:54 lago-basic-suite-master-host-0 python: ansible-systemd
Invoked with no_block=False name=libvirt-guests enabled=True
daemon_reload=False state=started user=False masked=None
Aug 30 05:55:54 lago-basic-suite-master-host-0 systemd: Reloading.
Aug 30 05:55:55 lago-basic-suite-master-host-0 systemd: Cannot add
dependency job for unit lvm2-lvmetad.socket, ignoring: Unit is masked.
Aug 30 05:55:55 lago-basic-suite-master-host-0 systemd: Stopped MOM
instance configured for VDSM purposes.
Aug 30 05:55:55 lago-basic-suite-master-host-0 systemd: Stopping
Virtual Desktop Server Manager...
Aug 30 05:55:55 lago-basic-suite-master-host-0 systemd: Starting
Suspend Active Libvirt Guests...
Aug 30 05:55:55 lago-basic-suite-master-host-0 systemd: Started
Suspend Active Libvirt Guests.
Aug 30 05:55:55 lago-basic-suite-master-host-0 journal: libvirt
version: 2.0.0, package: 10.el7_3.9 (CentOS BuildSystem
<http://bugs.centos.org>, 2017-05-25-20:52:28, c1bm.rdu2.centos.org)
Aug 30 05:55:55 lago-basic-suite-master-host-0 journal: hostname:
lago-basic-suite-master-host-0.lago.local
Aug 30 05:55:55 lago-basic-suite-master-host-0 vdsmd_init_common.sh:
vdsm: Running run_final_hooks
Aug 30 05:55:55 lago-basic-suite-master-host-0 journal: End of file
while reading data: Input/output error
Aug 30 05:55:55 lago-basic-suite-master-host-0 systemd: Stopped
Virtual Desktop Server Manager.
Aug 30 05:55:55 lago-basic-suite-master-host-0 python: ansible-command
Invoked with warn=True executable=None _uses_shell=False
_raw_params=bash -c "rpm -q vdsm --qf '%{VERSION}'" removes=None
creates=None chdir=None
Aug 30 05:55:55 lago-basic-suite-master-host-0 python: ansible-systemd
Invoked with no_block=False name=iptables enabled=False
daemon_reload=False state=stopped user=False masked=None
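
For anyone who wants to repeat this comparison on the full logs, a minimal Python sketch follows. It lists session starts and ansible module invocations from a messages file. It assumes each log entry is on a single line (the excerpts above are wrapped by the mail client), and the `summarize` helper is a hypothetical name, not part of any existing tool.

```python
import re

# "<Mon> <day> <HH:MM:SS> <host> <program>: <message>"
LINE_RE = re.compile(r"^(\w{3}\s+\d+\s[\d:]{8})\s(\S+)\s([^:]+):\s(.*)$")


def summarize(lines):
    """Return (timestamp, kind, detail) for session starts and ansible calls."""
    events = []
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip lines that are not in the expected syslog format
        ts, _host, prog, msg = m.groups()
        if prog == "systemd-logind" and msg.startswith("New session"):
            # "New session 10 of user root." -> "10"
            events.append((ts, "session", msg.split()[2]))
        elif prog == "python" and msg.startswith("ansible-"):
            # "ansible-setup Invoked with ..." -> "ansible-setup"
            events.append((ts, "ansible", msg.split()[0]))
    return events
```

Running it over the messages files from [1] and [2] and comparing the two event lists should show whether the extra session and the duplicate version check appear only in the failed run.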

[1]: http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/2197/artifact/exported-artifacts/basic-suit-master-el7/test_logs/basic-suite-master/post-002_bootstrap.py/lago-basic-suite-master-host-0/_var_log/messages/*view*/


[2]: http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/2151/artifact/exported-artifacts/basic-suit-master-el7/test_logs/basic-suite-master/post-002_bootstrap.py/lago-basic-suite-master-host-0/_var_log/messages/*view*/

--
Barak Korren
RHV DevOps team, RHCE, RHCi
Red Hat EMEA
redhat.com | TRIED. TESTED. TRUSTED. | redhat.com/trusted