On Mon, Jan 27, 2020 at 12:30 PM Yedidyah Bar David <didi@redhat.com> wrote:
On Sun, Jan 26, 2020 at 9:07 PM Marcin Sobczyk <msobczyk@redhat.com> wrote:


On 1/26/20 12:17 PM, Marcin Sobczyk wrote:
Hi,

I recently posted and merged [1], which makes supervdsmd depend on libvirt's socket units.
Not sure how the machine ended up with masked libvirt sockets, though.
Could you please share the full deployment logs with me off-list?

Looking at the Ansible deployment log, I can see that it started around 10:39:

2020-01-26 10:39:10 IST - TASK [Gathering Facts] *********************************************************
2020-01-26 10:39:16 IST - ok: [didi-centos8-host.lab.eng.tlv2.redhat.com]

but 'systemctl show libvirtd-tls.socket' on the host gives me:

StateChangeTimestamp=Sun 2020-01-26 08:43:36 IST

so I think that the libvirt sockets were masked even before the deployment process started.
The host's uptime is ~7 days, so maybe it was contaminated earlier?
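
One way to check the masked state without going through systemctl is to look for the /dev/null symlinks that 'systemctl mask' creates. The following is only a minimal sketch: the function name and the directory parameter are made up for illustration, and the unit list is simply the five sockets discussed in this thread.

```shell
#!/bin/sh
# Sketch: print the libvirt socket units that are currently masked.
# 'systemctl mask' works by creating a symlink to /dev/null under
# /etc/systemd/system, so we look for exactly that.
# The optional first argument (hypothetical, for illustration) lets the
# function be pointed at a scratch directory instead of the real one.
list_masked_libvirt_sockets() {
    unit_dir="${1:-/etc/systemd/system}"
    for unit in libvirtd.socket libvirtd-ro.socket libvirtd-admin.socket \
                libvirtd-tls.socket libvirtd-tcp.socket; do
        path="$unit_dir/$unit"
        # A masked unit is a symlink whose target is /dev/null.
        if [ -L "$path" ] && [ "$(readlink "$path")" = "/dev/null" ]; then
            printf '%s\n' "$unit"
        fi
    done
}
```

Calling list_masked_libvirt_sockets with no argument on the host should print one masked unit per line, and nothing if none of the five is masked.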

Can you reproduce this in a clean env?

I searched, and eventually found the cause:

I saw that /etc/systemd/system/libvirtd-tls.socket's timestamp is from when I ran 'dnf update'.
So I checked 'rpm -q --scripts' on all the packages I updated, and found this, in libvirt-daemon:

posttrans scriptlet (using /bin/sh):
if [ -f /var/lib/rpm-state/libvirt/restart ]; then
    # See if user has previously modified their install to
    # tell libvirtd to use --listen
    grep -E '^LIBVIRTD_ARGS=.*--listen' /etc/sysconfig/libvirtd 1>/dev/null 2>&1
    if test $? = 0
    then
        # Then lets keep honouring --listen and *not* use
        # systemd socket activation, because switching things
        # might confuse mgmt tool like puppet/ansible that
        # expect the old style libvirtd
        /bin/systemctl mask libvirtd.socket >/dev/null 2>&1 || :
        /bin/systemctl mask libvirtd-ro.socket >/dev/null 2>&1 || :
        /bin/systemctl mask libvirtd-admin.socket >/dev/null 2>&1 || :
        /bin/systemctl mask libvirtd-tls.socket >/dev/null 2>&1 || :
        /bin/systemctl mask libvirtd-tcp.socket >/dev/null 2>&1 || :
    else

So the flow is, more-or-less:

Install vdsm

Start it somehow. I think it's configured to start automatically, so a reboot is enough.


To clarify: I meant that something in vdsm's systemd configuration configures libvirt
(including adding 'LIBVIRTD_ARGS=--listen' to /etc/sysconfig/libvirtd).
 
dnf update libvirt-daemon

This masks libvirtd-tls.socket.
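
To tell in advance whether a host is exposed to this flow, the grep from the scriptlet can be run by hand. This is a minimal sketch, assuming the scriptlet quoted above is what does the masking. The function name and the file parameter are invented for illustration, and it covers only the --listen test, not the /var/lib/rpm-state/libvirt/restart check.

```shell
#!/bin/sh
# Sketch: succeed (exit 0) if the sysconfig file tells libvirtd to use
# --listen, which is the condition under which the posttrans scriptlet
# masks the socket units. The grep pattern is copied from the scriptlet.
# The optional first argument (hypothetical) overrides the file path.
listen_configured() {
    conf="${1:-/etc/sysconfig/libvirtd}"
    grep -E '^LIBVIRTD_ARGS=.*--listen' "$conf" >/dev/null 2>&1
}
```

For example, 'listen_configured && echo "update will mask the sockets"' run before 'dnf update libvirt-daemon' would flag an affected host. A missing or unreadable file counts as not configured.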

Please handle :-), thanks!
 

Regards, Marcin


Thanks, Marcin


[1] https://gerrit.ovirt.org/#/c/105334/

On 1/26/20 11:53 AM, Yedidyah Bar David wrote:
Hi all,

I just tried 'hosted-engine --deploy' on a fully updated CentOS
8/ovirt-master-snapshot machine. It failed while adding the host to
the engine. engine.log has:

2020-01-26 10:41:47,825+02 ERROR [org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand] (EE-ManagedThreadFactory-engine-Thread-1) [11ba00a7] Host installation failed for host 'efd6cb8a-935d-4812-b35c-3fbde5651b5a', 'didi-centos8-host.lab.eng.tlv2.redhat.com': Task Start and enable services failed to execute:
2020-01-26 10:41:47,836+02 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-engine-Thread-1) [11ba00a7] START, SetVdsStatusVDSCommand(HostName = didi-centos8-host.lab.eng.tlv2.redhat.com, SetVdsStatusVDSCommandParameters:{hostId='efd6cb8a-935d-4812-b35c-3fbde5651b5a', status='InstallFailed', nonOperationalReason='NONE', stopSpmFailureLogged='false', maintenanceReason='null'}), log id: 4f107d5d
2020-01-26 10:41:47,901+02 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-engine-Thread-1) [11ba00a7] FINISH, SetVdsStatusVDSCommand, return: , log id: 4f107d5d
2020-01-26 10:41:48,002+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-1) [11ba00a7] EVENT_ID: VDS_INSTALL_FAILED(505), Host didi-centos8-host.lab.eng.tlv2.redhat.com installation failed. Task Start and enable services failed to execute: .

The code emitting this error seems to be in
backend/manager/modules/bll/src/main/java/org/ovirt/engine/core/common/utils/ansible/AnsibleRunnerHTTPClient.java:

    String.format("Task %1$s failed to execute: %2$s", task, "") // stdout, stderr?

It seems someone considered also logging stdout/stderr but never
decided either way. Is this tracked somewhere?

ovirt-host-deploy-ansible-20200126103902-didi-centos8-host.lab.eng.tlv2.redhat.com-11ba00a7.log
has:

2020-01-26 10:41:38 IST - TASK [ovirt-host-deploy-vdsm : Start and enable services] **********************
2020-01-26 10:41:47 IST -
2020-01-26 10:41:47 IST - {
   "status" : "OK",
   "msg" : "",
   "data" : {
     "event" : "runner_on_failed",
...
           "msg" : "Unable to start service vdsmd.service: Failed to start vdsmd.service: Unit libvirtd-tcp.socket is masked.\n",
           "_ansible_item_label" : "vdsmd.service"

'systemctl status libvirtd-tcp.socket' indeed still says it's masked. The package is:

# rpm -qif /usr/lib/systemd/system/libvirtd-tcp.socket
Name        : libvirt-daemon
Version     : 5.6.0
Release     : 6.el8
Architecture: x86_64
Install Date: Mon 20 Jan 2020 08:23:12 AM IST
Group       : Unspecified
Size        : 1320922
License     : LGPLv2+
Signature   : RSA/SHA1, Wed 08 Jan 2020 11:06:38 AM IST, Key ID 695b5f7eff3e3445
Source RPM  : libvirt-5.6.0-6.el8.src.rpm
Build Date  : Wed 08 Jan 2020 11:06:04 AM IST
Build Host  : copr-builder-156909441.novalocal
Relocations : (not relocatable)
URL         : https://libvirt.org/
Summary     : Server side daemon and supporting files for libvirt library
Description :
Server side daemon required to manage the virtualization capabilities
of recent versions of Linux. Requires a hypervisor specific sub-RPM
for specific drivers.

Known issue?

Thanks and best regards,




--
Didi

