Hi all,

On Fri, Feb 14, 2020 at 6:45 PM Florian Nolden <f.nolden@xilloc.com> wrote:

Thanks, Fredy, for your great help. Setting the Banner and PrintMotd options on all 3 nodes helped me to succeed with the installation.


Thanks a lot for the report!
 
On Fri, Feb 14, 2020 at 4:23 PM Fredy Sanchez <fredy.sanchez@modmed.com> wrote:
Banner none
PrintMotd no

# systemctl restart sshd
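
For anyone who wants to script this across the nodes, here is a minimal sketch of applying the two settings above. The helper name set_sshd_option is made up for illustration and is not part of any oVirt tooling. It relies on sshd using the first value it obtains for a keyword, so it writes the setting at the top of the file and drops any other active (uncommented) lines for the same keyword. Please back up sshd_config before trying anything like it.

```shell
#!/bin/sh
# Sketch only: set_sshd_option is a hypothetical helper, not an oVirt tool.
# Usage: set_sshd_option FILE KEYWORD VALUE
set_sshd_option() {
    file="$1"; key="$2"; value="$3"
    tmp="$(mktemp)" || return 1
    {
        # Put the wanted setting first; sshd uses the first value it finds.
        printf '%s %s\n' "$key" "$value"
        # Keep every other line except active lines for the same keyword
        # (keywords are case-insensitive; commented lines are left alone).
        awk -v k="$key" 'tolower($1) != tolower(k)' "$file"
    } > "$tmp" && cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Example, as root on each node (assumes the stock config path):
#   set_sshd_option /etc/ssh/sshd_config Banner none
#   set_sshd_option /etc/ssh/sshd_config PrintMotd no
#   sshd -t && systemctl restart sshd
```

Running sshd -t before the restart checks the edited file for syntax errors, so a typo does not lock you out of the host.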

That should be fixed in the ovirt-node images. 

I think I agree. Would you like to open a bug about this?

I wonder what we can/should do with EL7 hosts (non-ovirt-node).

We also need to check how 4.4 behaves. There, host-deploy was fully rewritten using ansible. I have no idea how sensitive ansible is to these banners (compared with otopi, which is very sensitive). Adding Dana.

Best regards,
 


If gluster installed successfully, you don't have to reinstall it.
Just run the hyperconverged install again from cockpit; it will detect the existing gluster install and ask whether you want to re-use it. Re-using worked for me. The only thing I'd point out here is that glusterd was not enabled on my servers automagically; I had to enable it and start it by hand before cockpit picked it up.
# systemctl enable glusterd --now
# systemctl status glusterd

Gluster was running fine for me, so that was not needed in my case.

Also,
# tail -f /var/log/secure
while the install is going will help you see if there's a problem with ssh, other than the banners.

--
Fredy

On Fri, Feb 14, 2020 at 9:32 AM Florian Nolden <f.nolden@xilloc.com> wrote:

On Fri, Feb 14, 2020 at 12:21 PM Fredy Sanchez <fredy.sanchez@modmed.com> wrote:
Hi Florian, 

In my case, Didi's suggestions got me thinking, and I ultimately traced this to the ssh banners; they must be disabled. You can do this in sshd_config. I do think that logging could be better for this issue, and that the host up check should incorporate things other than ssh, even if just a ping. Good luck.

Hi Fredy,

thanks for the reply.

So I just have to uncomment "Banner none" in /etc/ssh/sshd_config on all 3 nodes, and run the redeploy in cockpit?
Or did you also reinstall the nodes and the gluster storage?
--
Fredy

On Fri, Feb 14, 2020, 4:55 AM Florian Nolden <f.nolden@xilloc.com> wrote:
I'm also stuck with that issue.

I have 
3x  HP ProLiant DL360 G7

1x 1gbit => as control network
3x 1gbit => bond0 as Lan
2x 10gbit => bond1 as gluster network

I installed oVirt Node 4.3.8 on all 3 servers,
configured the networks using cockpit,
then installed the hosted engine with cockpit:
[ INFO ] TASK [ovirt.hosted_engine_setup : Wait for the host to be up]
[ ERROR ] fatal: [localhost]: FAILED! => {"ansible_facts": {"ovirt_hosts": [{"address": "x-c01-n01.lan.xilloc.com", "affinity_labels": [], "auto_numa_status": "unknown", "certificate": {"organization": "lan.xilloc.com", "subject": "O=lan.xilloc.com,CN=x-c01-n01.lan.xilloc.com"}, "cluster": {"href": "/ovirt-engine/api/clusters/3dff6890-4e7b-11ea-90cb-00163e6a7afe", "id": "3dff6890-4e7b-11ea-90cb-00163e6a7afe"}, "comment": "", "cpu": {"speed": 0.0, "topology": {}}, "device_passthrough": {"enabled": false}, "devices": [], "external_network_provider_configurations": [], "external_status": "ok", "hardware_information": {"supported_rng_sources": []}, "hooks": [], "href": "/ovirt-engine/api/hosts/ded7aa60-4a5e-456e-b899-dd7fc25cc7b3", "id": "ded7aa60-4a5e-456e-b899-dd7fc25cc7b3", "katello_errata": [], "kdump_status": "unknown", "ksm": {"enabled": false}, "max_scheduling_memory": 0, "memory": 0, "name": "x-c01-n01.lan.xilloc.com", "network_attachments": [], "nics": [], "numa_nodes": [], "numa_supported": false, "os": {"custom_kernel_cmdline": ""}, "permissions": [], "port": 54321, "power_management": {"automatic_pm_enabled": true, "enabled": false, "kdump_detection": true, "pm_proxies": []}, "protocol": "stomp", "se_linux": {}, "spm": {"priority": 5, "status": "none"}, "ssh": {"fingerprint": "SHA256:lWc/BuE5WukHd95WwfmFW2ee8VPJ2VugvJeI0puMlh4", "port": 22}, "statistics": [], "status": "non_responsive", "storage_connection_extensions": [], "summary": {"total": 0}, "tags": [], "transparent_huge_pages": {"enabled": false}, "type": "ovirt_node", "unmanaged_networks": [], "update_available": false, "vgpu_placement": "consolidated"}]}, "attempts": 120, "changed": false, "deprecations": [{"msg": "The 'ovirt_host_facts' module has been renamed to 'ovirt_host_info', and the renamed one no longer returns ansible_facts", "version": "2.13"}]}


What is the best approach now to install an oVirt hosted engine?

Kind regards,

Florian Nolden

Head of IT at Xilloc Medical B.V.

www.xilloc.com “Get aHead with patient specific implants” 

Xilloc Medical B.V., Urmonderbaan 22 Gate 2, Building 110, 6167 RD Sittard-Geleen




On Mon, Jan 27, 2020 at 7:56 AM Yedidyah Bar David <didi@redhat.com> wrote:
On Sun, Jan 26, 2020 at 8:45 PM Fredy Sanchez <fredy.sanchez@modmed.com> wrote:
Hi all,

[root@bric-ovirt-1 ~]# cat /etc/*release*
CentOS Linux release 7.7.1908 (Core)
[root@bric-ovirt-1 ~]# yum info ovirt-engine-appliance
Installed Packages
Name        : ovirt-engine-appliance
Arch        : x86_64
Version     : 4.3
Release     : 20191121.1.el7
Size        : 1.0 G
Repo        : installed
From repo   : ovirt-4.3

Same situation as https://bugzilla.redhat.com/show_bug.cgi?id=1787267. The error message almost everywhere is some red herring message about ansible

You are right that it's misleading, but were the errors below the only ones you got from ansible?
 
[ INFO  ] TASK [ovirt.hosted_engine_setup : Wait for the host to be up]
[ ERROR ] fatal: [localhost]: FAILED! => {"ansible_facts": {"ovirt_hosts": []}, "attempts": 120, "changed": false, "deprecations": [{"msg": "The 'ovirt_host_facts' module has been renamed to 'ovirt_host_info', and the renamed one no longer returns ansible_facts", "version": "2.13"}]}
[ INFO  ] TASK [ovirt.hosted_engine_setup : Notify the user about a failure]
[ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "msg": "The system may not be provisioned according to the playbook results: please check the logs for the issue, fix accordingly or re-deploy from scratch.\n"}
[ ERROR ] Failed to execute stage 'Closing up': Failed executing ansible-playbook
[ INFO  ] Stage: Termination
[ ERROR ] Hosted Engine deployment failed: please check the logs for the issue, fix accordingly or re-deploy from scratch.
          Log file is located at /var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-20200126170315-req4qb.log

But the "real" problem seems to be SSH related, as you can see below

Indeed
 
[root@bric-ovirt-1 ovirt-engine]# pwd
/var/log/ovirt-hosted-engine-setup/engine-logs-2020-01-26T17:19:28Z/ovirt-engine
[root@bric-ovirt-1 ovirt-engine]# grep -i error engine.log
2020-01-26 17:26:50,178Z ERROR [org.ovirt.engine.core.bll.hostdeploy.AddVdsCommand] (default task-1) [2341fd23-f0c7-4f1c-ad48-88af20c2d04b] Failed to establish session with host 'bric-ovirt-1.corp.modmed.com': SSH session closed during connection 'root@bric-ovirt-1.corp.modmed.com'

Please check/share the entire portion of engine.log, from where it starts trying to ssh until it gives up.
 
2020-01-26 17:26:50,205Z ERROR [org.ovirt.engine.api.restapi.resource.AbstractBackendResource] (default task-1) [] Operation Failed: [Cannot add Host. Connecting to host via SSH has failed, verify that the host is reachable (IP address, routable address etc.) You may refer to the engine.log file for further details.]

The funny thing is that the engine can indeed ssh to bric-ovirt-1 (physical host). See below

[root@bric-ovirt-1 ovirt-hosted-engine-setup]# cat /etc/hosts
192.168.1.52 bric-ovirt-engine.corp.modmed.com # temporary entry added by hosted-engine-setup for the bootstrap VM
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
#::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
10.130.0.50 bric-ovirt-engine bric-ovirt-engine.corp.modmed.com
10.130.0.51 bric-ovirt-1 bric-ovirt-1.corp.modmed.com
10.130.0.52 bric-ovirt-2 bric-ovirt-2.corp.modmed.com
10.130.0.53 bric-ovirt-3 bric-ovirt-3.corp.modmed.com
192.168.0.1 bric-ovirt-1gluster bric-ovirt-1gluster.corp.modmed.com
192.168.0.2 bric-ovirt-2gluster bric-ovirt-2gluster.corp.modmed.com
192.168.0.3 bric-ovirt-3gluster bric-ovirt-3gluster.corp.modmed.com
[root@bric-ovirt-1 ovirt-hosted-engine-setup]#

[root@bric-ovirt-1 ~]# ssh 192.168.1.52
Last login: Sun Jan 26 17:55:20 2020 from 192.168.1.1
[root@bric-ovirt-engine ~]#
[root@bric-ovirt-engine ~]#
[root@bric-ovirt-engine ~]# ssh bric-ovirt-1
Password:
Password:
Last failed login: Sun Jan 26 18:17:16 UTC 2020 from 192.168.1.52 on ssh:notty
There was 1 failed login attempt since the last successful login.
Last login: Sun Jan 26 18:16:46 2020
###################################################################
# UNAUTHORIZED ACCESS TO THIS SYSTEM IS PROHIBITED                #
#                                                                 #
# This system is the property of Modernizing Medicine, Inc.       #
# It is for authorized Company business purposes only.            #
# All connections are monitored and recorded.                     #
# Disconnect IMMEDIATELY if you are not an authorized user!       #
###################################################################
[root@bric-ovirt-1 ~]#
[root@bric-ovirt-1 ~]#
[root@bric-ovirt-1 ~]# exit
logout
Connection to bric-ovirt-1 closed.
[root@bric-ovirt-engine ~]#
[root@bric-ovirt-engine ~]#
[root@bric-ovirt-engine ~]# ssh bric-ovirt-1.corp.modmed.com
Password:
Last login: Sun Jan 26 18:17:22 2020 from 192.168.1.52
###################################################################
# UNAUTHORIZED ACCESS TO THIS SYSTEM IS PROHIBITED                #
#                                                                 #
# This system is the property of Modernizing Medicine, Inc.       #
# It is for authorized Company business purposes only.            #
# All connections are monitored and recorded.                     #
# Disconnect IMMEDIATELY if you are not an authorized user!       #
###################################################################

Can you please try this, from the engine machine:


If this outputs the above "PROHIBITED" note, you'll have to configure your
scripts etc. to not output it on non-interactive shells. Otherwise, this
confuses the engine - it can't really distinguish between your own output
and the output of the commands it runs there.
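
As a rough illustration of the advice above, assuming the notice is printed by a login script (for example something sourced from /etc/profile.d) rather than by sshd itself: such a script can check whether the shell is interactive before printing anything. The function name and the banner path below are hypothetical.

```shell
#!/bin/sh
# Sketch of a login-script guard. show_banner_if_interactive and the
# banner path used in the example are hypothetical names for illustration.
show_banner_if_interactive() {
    banner_file="$1"
    case "$-" in
        *i*)
            # Interactive shell: a person is at the terminal, show the notice.
            if [ -r "$banner_file" ]; then
                cat "$banner_file"
            fi
            ;;
        *)
            # Non-interactive shell (e.g. a command run over ssh by the
            # engine): print nothing so command output stays clean.
            ;;
    esac
}

# Example call from a profile script (hypothetical path):
#   show_banner_if_interactive /etc/company-banner.txt
```

The shell sets the "i" flag in $- only for interactive sessions, so commands that the engine runs over ssh would see no extra output from this script, while people logging in still get the notice.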
 
[root@bric-ovirt-1 ~]# exit
logout
Connection to bric-ovirt-1.corp.modmed.com closed.
[root@bric-ovirt-engine ~]#
[root@bric-ovirt-engine ~]#
[root@bric-ovirt-engine ~]# exit
logout
Connection to 192.168.1.52 closed.
[root@bric-ovirt-1 ~]#

So, what gives? I already disabled all ssh security in the physical host, and whitelisted all potential IPs from the engine using firewalld. Regardless, the engine can ssh to the host as root :-(. Is there maybe another user that's used for the "Wait for the host to be up" SSH test? Yes, I tried both passwords and certificates.

No, that's root. You can also see that in the log.
 


Maybe what's really happening is that engine is not getting the right IP? bric-ovirt-engine is supposed to get 10.130.0.50, instead it never gets there, getting 192.168.1.52 from virbr0 in bric-ovirt-1. See below.

That's by design. For details, if interested, see "Hosted Engine 4.3 Deep Dive" presentation:

 

 --== HOST NETWORK CONFIGURATION ==--
          Please indicate the gateway IP address [10.130.0.1]
          Please indicate a nic to set ovirtmgmt bridge on: (p4p1, p5p1) [p4p1]:
--== VM CONFIGURATION ==--
You may specify a unicast MAC address for the VM or accept a randomly generated default [00:16:3e:17:1d:f8]:
          How should the engine VM network be configured (DHCP, Static)[DHCP]? static
          Please enter the IP address to be used for the engine VM []: 10.130.0.50
[ INFO  ] The engine VM will be configured to use 10.130.0.50/25
          Please provide a comma-separated list (max 3) of IP addresses of domain name servers for the engine VM
          Engine VM DNS (leave it empty to skip) [10.130.0.2,10.130.0.3]:
          Add lines for the appliance itself and for this host to /etc/hosts on the engine VM?
          Note: ensuring that this host could resolve the engine VM hostname is still up to you
          (Yes, No)[No] Yes

[root@bric-ovirt-1 ~]# ip addr
3: p4p1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:0a:f7:f1:c6:80 brd ff:ff:ff:ff:ff:ff
    inet 10.130.0.51/25 brd 10.130.0.127 scope global noprefixroute p4p1
       valid_lft forever preferred_lft forever
28: virbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 52:54:00:25:7b:6f brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.1/24 brd 192.168.1.255 scope global virbr0
       valid_lft forever preferred_lft forever
29: virbr0-nic: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast master virbr0 state DOWN group default qlen 1000
    link/ether 52:54:00:25:7b:6f brd ff:ff:ff:ff:ff:ff
30: vnet0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master virbr0 state UNKNOWN group default qlen 1000
    link/ether fe:16:3e:17:1d:f8 brd ff:ff:ff:ff:ff:ff

The newly created engine VM does remain up even after hosted-engine --deploy errors out; just at the wrong IP. I haven't been able to make it get its real IP.

This happens only after the real engine VM is created, connected to the correct network.

The current engine vm you see is a libvirt VM connected to its default (internal) network.
 
At any rate, thank you very much for taking a look at my very long email. Any and all help would be really appreciated.

Good luck and best regards,
--
Didi
_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-leave@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/AZFPSDPBK3BJUB2NESCOWQ7FQT572Y5I/

CONFIDENTIALITY NOTICE: This e-mail message may contain material protected by the Health Insurance Portability and Accountability Act of 1996 and its implementing regulations and other state and federal laws and legal privileges. This message is only for the personal and confidential use of the individuals or organization to whom the message is addressed. If you are an unintended recipient, you have received this message in error, and any reading, distributing, copying or disclosure is unauthorized and strictly prohibited.  All recipients are hereby notified that any unauthorized receipt does not waive any confidentiality obligations or privileges. If you have received this message in error, please notify the sender immediately at the above email address and confirm that you have deleted or destroyed the message.






--
Didi