Hi,


On Wed, Sep 8, 2021 at 1:36 AM Erdősi Péter <erdosi.peter@kifu.gov.hu> wrote:
Hello!

We've started upgrading one of our oVirt clusters to the latest 4.4 minor version (4.4.8.5-1.el8), and we found a bug in a recent change to the aaa-ldap plugin.
The bug appeared with the IPv4/IPv6 selection introduced in ovirt-engine-extension-aaa-ldap 1.4.4.

We dug down the rabbit hole, and it looks like our DNS setup is simply not compatible with the plugin after this commit: https://github.com/oVirt/ovirt-engine-extension-aaa-ldap/commit/4c0f2e72df88ce653ce552057554465fb901820f

I don't know how the industry actually uses CNAME records, but we have a rule that services and the machines themselves get independent DNS names. This comes in handy when you migrate or upgrade a service such as LDAP to another machine, and I think it should be supported.

Actually, we haven't changed the way we resolve DNS records; we have only changed how we detect whether to resolve to IPv4, IPv6, or both. In previous versions we treated an IP version as relevant if an active interface with that IP version existed on the engine machine. That was not reliable enough, so now we look up the addresses of the default gateway and ask for A and/or AAAA records depending on the results we get.


Our LDAP service names are ldap-master.niif.hu and ldap.niif.hu, but in reality there are two servers, each with its own FQDN.

The actual DNS structure:

host -t CNAME ldap-master.niif.hu
ldap-master.niif.hu is an alias for elm.niif.hu.

host -t CNAME ldap.niif.hu
ldap.niif.hu is an alias for holly.ldap.einfra.hu.

host elm.niif.hu
elm.niif.hu has address 193.225.14.175
elm.niif.hu has IPv6 address 2001:738:0:701::3

host holly.ldap.einfra.hu
holly.ldap.einfra.hu has address 193.224.92.6
holly.ldap.einfra.hu has IPv6 address 2001:738:0:431::6
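
To make the alias layout above concrete, here is a minimal Python sketch of how a lookup on a service name is expected to follow the CNAME to the machine name before returning addresses. It only models the records shown in the `host` output as in-memory tables and never queries real DNS; the function name `resolve` and its parameters are hypothetical, chosen for illustration.

```python
# Records copied from the `host` output above; nothing here touches real DNS.
CNAME = {
    "ldap-master.niif.hu": "elm.niif.hu",
    "ldap.niif.hu": "holly.ldap.einfra.hu",
}
A = {
    "elm.niif.hu": ["193.225.14.175"],
    "holly.ldap.einfra.hu": ["193.224.92.6"],
}
AAAA = {
    "elm.niif.hu": ["2001:738:0:701::3"],
    "holly.ldap.einfra.hu": ["2001:738:0:431::6"],
}


def resolve(name, want_v4=True, want_v6=True, max_hops=8):
    """Follow CNAME aliases from `name`, then return (canonical_name, addresses)."""
    hops = 0
    while name in CNAME:
        hops += 1
        if hops > max_hops:
            # Guard against alias loops in a misconfigured table.
            raise RuntimeError("CNAME chain too long or looping")
        name = CNAME[name]
    addresses = []
    if want_v4:
        addresses += A.get(name, [])
    if want_v6:
        addresses += AAAA.get(name, [])
    return name, addresses
```

For example, `resolve("ldap-master.niif.hu")` returns the canonical name `elm.niif.hu` together with its IPv4 and IPv6 addresses, while the client (and the certificate check) still uses the service name it was configured with.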

The commit above (as far as I understand) only tries to resolve A and AAAA records in DNS, and drops the connection if they are not found. Of course, the certificate only has ldap-master.niif.hu and ldap.niif.hu in it, so using holly and elm does not solve the problem. (Also, if the service is migrated, the service domain will be kept but not the machine's FQDN, since we cannot afford to shut down one of our LDAP servers for a migration window.)

We tried downgrading the package to 1.4.3, which works fine.

The actual error looks like this (engine.log):

2021-09-07 15:53:09,833+02 WARN  [org.ovirt.engine.extension.aaa.ldap.AuthnExtension] (default task-1) [] [ovirt-engine-extension-aaa-ldap.authn::NIIFLdap-authn] Cannot initialize LDAP framework, deferring initialization. Error: An error occurred while attempting to connect to server ldap-master.niif.hu:636:  IOException(LDAPException(resultCode=91 (connect error), errorMessage='An error occurred while attempting to establish a connection to server ldap-master.niif.hu/193.225.14.175:636:  UnknownHostException(ldap-master.niif.hu), ldapSDKVersion=4.0.14, revision=c0fb784eebf9d36a67c736d0428fb3577f2e25bb'))
2021-09-07 15:53:09,833+02 ERROR [org.ovirt.engine.core.sso.servlets.InteractiveAuthServlet] (default task-1) [] Internal Server Error: An error occurred while attempting to connect to server ldap-master.niif.hu:636:  IOException(LDAPException(resultCode=91 (connect error), errorMessage='An error occurred while attempting to establish a connection to server ldap-master.niif.hu/193.225.14.175:636:  UnknownHostException(ldap-master.niif.hu), ldapSDKVersion=4.0.14, revision=c0fb784eebf9d36a67c736d0428fb3577f2e25bb'))
2021-09-07 15:53:09,833+02 ERROR [org.ovirt.engine.core.sso.service.SsoService] (default task-1) [] An error occurred while attempting to connect to server ldap-master.niif.hu:636:  IOException(LDAPException(resultCode=91 (connect error), errorMessage='An error occurred while attempting to establish a connection to server ldap-master.niif.hu/193.225.14.175:636:  UnknownHostException(ldap-master.niif.hu), ldapSDKVersion=4.0.14, revision=c0fb784eebf9d36a67c736d0428fb3577f2e25bb'))
2021-09-07 15:53:09,854+02 ERROR [org.ovirt.engine.core.aaa.servlet.SsoPostLoginServlet] (default task-1) [] server_error: An error occurred while attempting to connect to server ldap-master.niif.hu:636:  IOException(LDAPException(resultCode=91 (connect error), errorMessage='An error occurred while attempting to establish a connection to server ldap-master.niif.hu/193.225.14.175:636:  UnknownHostException(ldap-master.niif.hu), ldapSDKVersion=4.0.14, revision=c0fb784eebf9d36a67c736d0428fb3577f2e25bb'))

This UnknownHostException is a bit confusing, but I guess you have hit the same issue we found with some of our DNS servers. If you try to fetch both A and AAAA records from the DNS server at once, the request simply times out after a while, and unboundid-ldapsdk reports that as an UnknownHostException. If you ask the same DNS server for the A and AAAA records in separate requests, both records are returned immediately. Unfortunately, we haven't been able to find out why this happens on some of our testing servers while it works perfectly fine on others. But you can run a test using the attached Java test tool:

javac Test.java
java -cp . Test <DNS_SERVER_IP> <LDAP FQDN>

The test tool just asks for the A record, then the AAAA record, and then both at once, so if you hit the above issue you will see an exception when asking for both records at once.
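
The attached Test.java is not reproduced here. As a rough stand-in, the same three lookups can be sketched in Python with only the standard library. This is an approximation under stated assumptions: it goes through the system resolver via `socket.getaddrinfo` instead of querying a specific DNS server, so it may not behave exactly like the Java tool; the names `lookup` and `probe` are hypothetical, and port 636 is simply the LDAPS port seen in the log above.

```python
import socket
import sys

# The three lookups described above: IPv4 only, IPv6 only, then both in one call.
FAMILIES = {
    "A only (IPv4)": socket.AF_INET,
    "AAAA only (IPv6)": socket.AF_INET6,
    "A and AAAA together": socket.AF_UNSPEC,
}


def lookup(host, family, port=636):
    """Return the sorted unique addresses for `host`, or the resolver error as text."""
    try:
        infos = socket.getaddrinfo(host, port, family=family, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        return f"FAILED: {exc}"
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address.
    return sorted({info[4][0] for info in infos})


def probe(host):
    """Run all three lookups and return a dict of label -> result."""
    return {label: lookup(host, family) for label, family in FAMILIES.items()}


if __name__ == "__main__" and len(sys.argv) > 1:
    for label, result in probe(sys.argv[1]).items():
        print(f"{label}: {result}")
```

Saved under any file name and run with the LDAP FQDN as the only argument, it prints one line per lookup. If the first two succeed but the combined lookup fails or hangs, that would be consistent with the behaviour described above, though it does not prove the same cause.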

Anyway, if you want to fix this issue in production, you will need to either fix your DNS server (but again, we have no idea what the issue is) or disable either IPv4 or IPv6 address usage in the aaa-ldap configuration file. For example, to use only IPv4, edit your /etc/ovirt-engine/aaa/<PROFILE>.properties and add the following options:

pool.default.socketfactory.resolver.detectIPVersion = false
pool.default.socketfactory.resolver.supportIPv6 = false

Regards,
Martin


When we tried to run the setup tool, it also looked broken (after that, we copied the required files to /etc/ovirt-engine/aaa and /etc/ovirt-engine/extensions.d/ from another, working hosted engine). So we have tested both the plugin itself and the setup tool, but no luck.

I think this (CNAME in DNS) should work with the plugin.

Could you please investigate this issue? (We're happy to help test a fixed version or patch if needed, but we don't feel we have the knowledge to create the patch ourselves.)

Regards:
 Peter

--

Erdősi Péter
Informatikus, IKT Fejlesztési Főosztály

Kormányzati Informatikai Fejlesztési Ügynökség
cím: 1134 Budapest, Váci út 35.
tel: +36 1 450 3080   e-mail: erdosi.peter@kifu.gov.hu

KIFÜ - www.kifu.gov.hu


--
Martin Perina
Manager, Software Engineering
Red Hat Czech s.r.o.