Hi Giorgio,

Do you have a staging/test (non-production) environment?
I built a test ovirt-node-ng image that includes this package, and if you want you can download it from here:
https://jenkins.ovirt.org/job/ovirt-node-ng-image_standard-check-patch/176/artifact/check-patch.el7.x86_64/

If you do test it, please let us know whether it resolves the issue for you.
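
In case it helps with the test, here is a rough sketch of one way to check the result from the engine side. It reads `openssl s_client` output and prints the issuer of the host certificate; in the output quoted below, a host that still had the self-signed certificate showed "VDSM Certificate Authority" there. The function name `host_cert_issuer` is made up for this example, and the parsing assumes the "Certificate chain" layout shown in this thread.

```shell
#!/bin/sh
# Hypothetical helper (not part of oVirt): print the issuer of
# certificate 0 from `openssl s_client` output read on stdin.
# Prints nothing if no certificate chain is present.
host_cert_issuer() {
    awk '
        /^Certificate chain/ { in_chain = 1; next }
        in_chain && /^---/   { exit }
        in_chain && /^[ \t]*i:/ {
            # The first "i:" line after the header is the issuer of cert 0.
            sub(/^[ \t]*i:/, "")
            print
            exit
        }
    '
}

# Example (host address taken from this thread; adjust as needed):
#   openssl s_client -connect 172.20.22.78:54321 </dev/null 2>/dev/null | host_cert_issuer
```

An empty result corresponds to the "no peer certificate available" case, and an issuer other than the VDSM self-signed CA suggests the enrollment took effect.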

Thanks in advance,

On Tue, May 12, 2020 at 6:57 PM Giorgio Biacchi <giorgio@di.unimi.it> wrote:
On 12/05/2020 17:07, Dominik Holler wrote:
>
>
> On Tue, May 12, 2020 at 4:25 PM Giorgio Biacchi <giorgio@di.unimi.it> wrote:
>
>     On 5/12/20 12:28 PM, Dominik Holler wrote:
>      >
>      >
>      > On Tue, May 12, 2020 at 8:49 AM Giorgio Biacchi <giorgio@di.unimi.it> wrote:
>      >
>      >     On 5/11/20 5:53 PM, Dominik Holler wrote:
>      >     >
>      >     >
>      >     > On Mon, May 11, 2020 at 12:31 PM Giorgio Biacchi <giorgio@di.unimi.it> wrote:
>      >     >
>      >     >     Hi list,
>      >     >     I've spent a couple of days trying to understand why this was
>      >     >     happening...
>      >     >
>      >     >     For the installation I have a well-tested installation server with a
>      >     >     custom kickstart file to set up ssh keys and custom hooks for InfiniBand,
>      >     >     and I'm installing oVirt Node 4.3.9 via PXE. This is particularly useful
>      >     >     when I have to install a bunch of blades at once. In the past I had no
>      >     >     issues and all was working like a charm until now, when some hardware
>      >     >     failed and I had to replace it.
>      >     >
>      >     >     As expected, I have no issues in the node installation process. The
>      >     >     trouble begins when I try to add the node: installation fails and in
>      >     >     the UI I have an exclamation mark with the message "Host has no default
>      >     >     route.", but I can ping and ssh to the host from the manager. The
>      >     >     problem is somewhere else, in the communication between the engine and
>      >     >     vdsmd, preventing the engine from refreshing the host capabilities.
>      >     >
>      >     >     So from the engine I tried:
>      >     >
>      >     >     [root@manager ~]# openssl s_client -connect 172.20.22.78:54321
>      >     >     CONNECTED(00000003)
>      >     >     ---
>      >     >     Certificate chain
>      >     >       0 s:/CN=cn128.lagrange.di.unimi.it/O=VDSM Certificate
>      >     >         i:/CN=VDSM Certificate Authority
>      >     >       1 s:/CN=VDSM Certificate Authority
>      >     >         i:/CN=VDSM Certificate Authority
>      >     >     ---
>      >     >
>      >     >     The host still has the self-signed vdsm certificate, and on the
>      >     >     host in vdsm.log I find:
>      >     >
>      >     >     2020-05-11 09:52:25,433+0000 ERROR (Reactor thread) [ProtocolDetector.SSLHandshakeDispatcher] ssl handshake: SSLError, address: ::ffff:159.149.129.220 (sslutils:264)
>      >     >
>      >     >     So I tried to enroll the certificate from the UI, and from the events
>      >     >     tab I saw that the enrollment was successful, but:
>      >     >
>      >     >     [root@manager ~]# openssl s_client -connect 172.20.22.78:54321
>      >     >
>      >     >     140084336994192:error:140790E5:SSL routines:ssl23_write:ssl handshake failure:s23_lib.c:177:
>      >     >     CONNECTED(00000003)
>      >     >     ---
>      >     >     no peer certificate available
>      >     >     ---
>      >     >
>      >     >     there's still some issue with the certificates.. so on the host again:
>      >     >
>      >     >     [root@cn128 vdsm]# find /etc/pki/vdsm/ -type f -cmin -10 | xargs ls -l
>      >     >     -rw-------. 1 root kvm  1424 May 11 09:56 /etc/pki/vdsm/certs/cacert.pem
>      >     >     -rw-------. 1 root kvm  5108 May 11 09:57 /etc/pki/vdsm/certs/vdsmcert.pem
>      >     >     -r--r-----. 1 root kvm  1704 May 11 09:56 /etc/pki/vdsm/keys/vdsmkey.pem
>      >     >     -rw-r--r--. 1 root root 1424 May 11 09:57 /etc/pki/vdsm/libvirt-spice/ca-cert.pem
>      >     >     -rw-r--r--. 1 root root 5108 May 11 09:57 /etc/pki/vdsm/libvirt-spice/server-cert.pem
>      >     >     -r--r-----. 1 root root 1704 May 11 09:56 /etc/pki/vdsm/libvirt-spice/server-key.pem
>      >     >
>      >     >     It seems that cacert.pem and vdsmcert.pem have the wrong ownership
>      >     >     (owned by root and readable only by the owner). Let's try to fix it:
>      >     >
>      >     >     [root@cn128 vdsm]# chown 36:36 /etc/pki/vdsm/certs/cacert.pem /etc/pki/vdsm/certs/vdsmcert.pem
>      >     >
>      >     >     And now:
>      >     >
>      >     >     [root@manager ~]# openssl s_client -connect 172.20.22.78:54321| less
>      >     >     CONNECTED(00000003)
>      >     >     ---
>      >     >     Certificate chain
>      >     >       0 s:/O=lagrange.di.unimi.it/CN=172.20.22.78
>      >     >         i:/C=US/O=lagrange.di.unimi.it/CN=cn305.lagrange.di.unimi.it.35941
>      >     >       1 s:/C=US/O=lagrange.di.unimi.it/CN=cn305.lagrange.di.unimi.it.35941
>      >     >         i:/C=US/O=lagrange.di.unimi.it/CN=cn305.lagrange.di.unimi.it.35941
>      >     >     ---
>      >     >
>      >     >     Now I can finally refresh the host capabilities and set up the host
>      >     >     networks.
>      >     >
>      >     >     All the relevant logs are attached. I don't know if I've found a
>      >     >     bug, but this is the first time I have had so much trouble adding a
>      >     >     new host, so I decided to share my experience with the list.
>      >     >
>      >     >
>      >     > Thanks for raising this.
>      >     >
>      >     > On adding the host there is an error about vdsm-hook-nestedvt which I
>      >     > cannot interpret; maybe someone else can.
>      >     > In vdsm.log I noticed a strange behavior of setupNetworks. Can you
>      >     > please share the corresponding supervdsm.log, too?
>      >     >
>      >     >
>      >     >
>      >     >     Cheers
>      >     >     --
>      >     >     gb
>      >     >
>      >     >     PGP Key: http://pgp.mit.edu/
>      >     >     Primary key fingerprint: C510 0765 943E EBED A4F2 69D3 16CC DC90 B9CB 0F34
>      >     >     _______________________________________________
>      >     >     Users mailing list -- users@ovirt.org
>      >     >     To unsubscribe send an email to users-leave@ovirt.org
>      >     >     Privacy Statement: https://www.ovirt.org/privacy-policy.html
>      >     >     oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
>      >     >     List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/6JTU3HB4WCI27WSLGEOSLMPYFU22EX5H/
>      >     >
>      >     Hi,
>      >     I don't think that the missing vdsm-hook-nestedvt is a problem. In our
>      >     environment we have one engine but multiple clusters, and that hook is
>      >     only needed on one cluster to enable nested virtualization.
>      >
>      >     See attachment for supervdsm.log.
>      >
>      >
>      > Thanks, network config flows looked fine.
>      >
>      > Maybe
>      > https://bugzilla.redhat.com/1794485
>      > is the root for this issue?
>      >
>      >
>      >     Regards
>      >     --
>      >     gb
>      >
>      >     PGP Key: http://pgp.mit.edu/
>      >     Primary key fingerprint: C510 0765 943E EBED A4F2 69D3 16CC DC90 B9CB 0F34
>      >
>
>     I removed the file
>     /usr/share/ovirt-host-deploy/plugins/ovirt-host-deploy/vdsmhooks/packages.d/vdsm-hook-nestedvt.centos
>     from the engine host (the content of the file was "vdsm-hook-nestedvt")
>     and reinstalled another host, and now the installation works correctly.
>
>
> This is a great hint. Do you have an idea where this file comes from?

Yes, it was a change made by another member of our staff to automate the
installation of that hook. As far as I know this is the correct way to
add additional packages during the host installation, but I still have
no idea why the required package cannot be found, even via yum install,
as I wrote before.

So now the real question is: why can't I install vdsm-hook-nestedvt via yum?

And even if it's now clear that this is the reason why the installation
process fails, I wasn't expecting such a big failure. The hook itself
is not strictly necessary to have a working host, so I was expecting a
warning rather than a failure.
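
For reference, a rough sketch of a pre-check that could catch this kind of problem before a host is deployed. It walks the files in a packages.d directory and prints every listed package that yum cannot find. The function name `missing_packages` is made up for this example, and it assumes one package name per line, as in the file described in this thread. It would need to run on a machine whose enabled repos match those of the host being installed.

```shell
#!/bin/sh
# Hypothetical pre-check (not part of ovirt-host-deploy): print the
# package names listed in files under a packages.d directory that
# yum cannot find. Assumes one package name per line.
missing_packages() {
    dir=$1
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        # The "|| [ -n ... ]" keeps a last line that lacks a trailing newline.
        while IFS= read -r pkg || [ -n "$pkg" ]; do
            [ -n "$pkg" ] || continue
            # `yum -q list <name>` exits non-zero when no such package
            # is installed or available; stdin is redirected so yum
            # does not consume the file being read.
            yum -q list "$pkg" >/dev/null 2>&1 </dev/null || echo "$pkg"
        done < "$f"
    done
}

# Example (directory path as quoted in this thread):
#   missing_packages /usr/share/ovirt-host-deploy/plugins/ovirt-host-deploy/vdsmhooks/packages.d
```

Any name it prints is a package the deployment would presumably fail to install, which could then be reported as a warning up front.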

But at least I'm glad I've found the cause of the failure

>
>     So the problem is that during the host installation vdsm-hook-nestedvt
>     cannot be found/downloaded from the repos and this, somehow, breaks the
>     installation process, the certificate enrollment and so on..
>
>     As a matter of fact if I try:
>
>     [root@cn127 ~]# yum install vdsm-hook-nestedvt
>     Loaded plugins: enabled_repos_upload, fastestmirror, imgbased-persist, package_upload, product-id,
>                    : search-disabled-repos, subscription-manager, vdsmupgrade, versionlock
>     This system is not registered with an entitlement server. You can use subscription-manager to register.
>     Loading mirror speeds from cached hostfile
>       * ovirt-4.3-epel: epel.mirror.far.fi
>     No package vdsm-hook-nestedvt available.
>     Error: Nothing to do
>     Uploading Enabled Repositories Report
>     Cannot upload enabled repos report, is this client registered?
>
>     Thanks for the support.
>
>     --
>     gb
>
>     PGP Key: http://pgp.mit.edu/
>     Primary key fingerprint: C510 0765 943E EBED A4F2 69D3 16CC DC90 B9CB 0F34
>

--
gb

PGP Key: http://pgp.mit.edu/
Primary key fingerprint: C510 0765 943E EBED A4F2 69D3 16CC DC90 B9CB 0F34



--

Lev Veyde

Senior Software Engineer, RHCE | RHCVA | MCITP

Red Hat Israel

lev@redhat.com | lveyde@redhat.com