On Tue, May 12, 2020 at 8:49 AM Giorgio Biacchi <giorgio@di.unimi.it> wrote:
On 5/11/20 5:53 PM, Dominik Holler wrote:
>
>
> On Mon, May 11, 2020 at 12:31 PM Giorgio Biacchi <giorgio@di.unimi.it> wrote:
>
>     Hi list,
>     I've spent a couple of days trying to understand why this was
>     happening...
>
>     For the installation I have a well-tested installation server with a
>     custom kickstart file that sets up SSH keys and custom hooks for
>     InfiniBand, and I'm installing oVirt Node 4.3.9 via PXE. This is
>     particularly useful when I have to install a bunch of blades at once.
>     In the past I had no issues and everything worked like a charm, until
>     now, when some hardware failed and I had to replace it.
>
>     As expected, I have no issues in the node installation process. The
>     trouble begins when I try to add the node: the installation fails and
>     in the UI I get an exclamation mark with the message "Host has no
>     default route.", but I can ping the host and ssh to it from the
>     manager. The problem is somewhere else, in the communication between
>     the engine and vdsmd, and it prevents the engine from refreshing the
>     host capabilities.
>
>     So from the engine I tried:
>
>     [root@manager ~]# openssl s_client -connect 172.20.22.78:54321
>     CONNECTED(00000003)
>     ---
>     Certificate chain
>       0 s:/CN=cn128.lagrange.di.unimi.it/O=VDSM Certificate
>         i:/CN=VDSM Certificate Authority
>       1 s:/CN=VDSM Certificate Authority
>         i:/CN=VDSM Certificate Authority
>     ---
>
>     The host still has the self-signed VDSM certificate, and in vdsm.log
>     on the host I find:
>
>     2020-05-11 09:52:25,433+0000 ERROR (Reactor thread)
>     [ProtocolDetector.SSLHandshakeDispatcher] ssl handshake: SSLError,
>     address: ::ffff:159.149.129.220 (sslutils:264)
>
>     So I tried to enroll the certificate from the UI, and in the events
>     tab I saw that the enrollment was successful, but:
>
>     [root@manager ~]# openssl s_client -connect 172.20.22.78:54321
>
>     140084336994192:error:140790E5:SSL routines:ssl23_write:ssl handshake
>     failure:s23_lib.c:177:
>     CONNECTED(00000003)
>     ---
>     no peer certificate available
>     ---
>
>     There is still some issue with the certificates, so on the host again:
>
>     [root@cn128 vdsm]# find /etc/pki/vdsm/ -type f -cmin -10| xargs ls -l
>     -rw-------. 1 root kvm  1424 May 11 09:56 /etc/pki/vdsm/certs/cacert.pem
>     -rw-------. 1 root kvm  5108 May 11 09:57 /etc/pki/vdsm/certs/vdsmcert.pem
>     -r--r-----. 1 root kvm  1704 May 11 09:56 /etc/pki/vdsm/keys/vdsmkey.pem
>     -rw-r--r--. 1 root root 1424 May 11 09:57 /etc/pki/vdsm/libvirt-spice/ca-cert.pem
>     -rw-r--r--. 1 root root 5108 May 11 09:57 /etc/pki/vdsm/libvirt-spice/server-cert.pem
>     -r--r-----. 1 root root 1704 May 11 09:56 /etc/pki/vdsm/libvirt-spice/server-key.pem
>
>     It seems that cacert.pem and vdsmcert.pem have the wrong ownership:
>     they belong to root with mode 0600, so only root can read them.
>     Let's try to fix it:
>
>     [root@cn128 vdsm]# chown 36:36 /etc/pki/vdsm/certs/cacert.pem /etc/pki/vdsm/certs/vdsmcert.pem
>
>     And now:
>
>     [root@manager ~]# openssl s_client -connect 172.20.22.78:54321| less
>     CONNECTED(00000003)
>     ---
>     Certificate chain
>       0 s:/O=lagrange.di.unimi.it/CN=172.20.22.78
>         i:/C=US/O=lagrange.di.unimi.it/CN=cn305.lagrange.di.unimi.it.35941
>       1 s:/C=US/O=lagrange.di.unimi.it/CN=cn305.lagrange.di.unimi.it.35941
>         i:/C=US/O=lagrange.di.unimi.it/CN=cn305.lagrange.di.unimi.it.35941
>     ---
>
>     Now I can finally refresh the host capabilities and set up the host
>     networks.
>
>     All the relevant logs are attached. I don't know whether I've found a
>     bug, but this is the first time I've had so much trouble adding a new
>     host, so I decided to share my experience with the list.
>
>
> Thanks for raising this.
>
> When adding the host there is an error about vdsm-hook-nestedvt which I
> cannot interpret; maybe someone else can.
> In vdsm.log I noticed strange behavior from setupNetworks. Can you
> please share the corresponding supervdsm.log, too?
>
>  
>
>     Cheers
>     --
>     gb
>
>     PGP Key: http://pgp.mit.edu/
>     Primary key fingerprint: C510 0765 943E EBED A4F2 69D3 16CC DC90
>     B9CB 0F34
>     _______________________________________________
>     Users mailing list -- users@ovirt.org
>     To unsubscribe send an email to users-leave@ovirt.org
>     Privacy Statement: https://www.ovirt.org/privacy-policy.html
>     oVirt Code of Conduct:
>     https://www.ovirt.org/community/about/community-guidelines/
>     List Archives:
>     https://lists.ovirt.org/archives/list/users@ovirt.org/message/6JTU3HB4WCI27WSLGEOSLMPYFU22EX5H/
>
Hi,
I don't think that the missing vdsm-hook-nestedvt is a problem. In our
environment we have one engine but multiple clusters, and that hook is
only needed on one cluster to enable nested virtualization.

See attachment for supervdsm.log.


Thanks, network config flows looked fine.

Maybe
https://bugzilla.redhat.com/1794485
is the root cause of this issue?
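
For anyone who wants to check other hosts for the same symptom, the ownership check done by hand above could be sketched roughly as follows. This is only a sketch, assuming GNU find and stat as found on oVirt Node; the function name list_owner_mismatches is made up for illustration and is not part of vdsm, and 36:36 is simply the uid:gid used in the chown above, not a value taken from the vdsm packaging.

```shell
#!/bin/sh
# Hypothetical helper (not part of vdsm or oVirt): list regular files under
# a directory whose numeric owner or group differs from the expected values.
# Prints one "uid:gid path" line per mismatch and nothing if all files match.
# Assumes GNU find (-uid/-gid) and GNU stat (-c).
list_owner_mismatches() {
    dir=$1
    want_uid=$2
    want_gid=$3
    find "$dir" -type f \( ! -uid "$want_uid" -o ! -gid "$want_gid" \) \
        -exec stat -c '%u:%g %n' {} +
}

# Example for the directory discussed in this thread (assumed values):
# list_owner_mismatches /etc/pki/vdsm/certs 36 36
```

With the state shown in the ls output above, this would be expected to list cacert.pem and vdsmcert.pem (owned by root, uid 0) before the chown and print nothing for them afterwards.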
 
Regards
--
gb

PGP Key: http://pgp.mit.edu/
Primary key fingerprint: C510 0765 943E EBED A4F2 69D3 16CC DC90 B9CB 0F34