New subject: Unable to deploy Hyperconverged Engine Node - v4.3.3

13 May 2019

Hi everyone,

I am trying a Gluster hyperconverged deployment; the Gluster part completed successfully. All hosts run CentOS 7.6.1810 (fresh install): two HP DL20 G9 (for VMs) and one HP 120 G7 (which hosts the Gluster arbiter volumes). Unfortunately I am unable to deploy the Engine; both the CLI and the GUI approach fail with the error below. At first sight it looks similar to https://lists.ovirt.org/pipermail/users/2018-March/087802.html but I've configured a static IP (same subnet as the host), no DHCP. I also tried to force IPv4 with "/usr/sbin/ovirt-hosted-engine-setup --4", but the very same error was thrown on every attempt to deploy the engine:

[ ERROR ] fatal: [localhost]: FAILED! => {"ansible_facts": {"ovirt_hosts": [{"address": "sub.sub.domain.tld", "affinity_labels": [], "auto_numa_status": "unknown", "certificate": {"organization": "sub.domain.tld", "subject": "O=sub.domain.tld,CN=sub.sub.domain.tld"}, "cluster": {"href": "/ovirt-engine/api/clusters/f083f056-74fd-11e9-bba9-00163e522076", "id": "f083f056-74fd-11e9-bba9-00163e522076"}, "comment": "", "cpu": {"speed": 0.0, "topology": {}}, "device_passthrough": {"enabled": false}, "devices": [], "external_network_provider_configurations": [], "external_status": "ok", "hardware_information": {"supported_rng_sources": []}, "hooks": [], "href": "/ovirt-engine/api/hosts/dc4f5c15-4989-4454-ba46-3bd600796b69", "id": "dc4f5c15-4989-4454-ba46-3bd600796b69", "katello_errata": [], "kdump_status": "unknown", "ksm": {"enabled": false}, "max_scheduling_memory": 0, "memory": 0, "name": "sub.sub.domain.tld", "network_attachments": [], "nics": [], "numa_nodes": [], "numa_supported": false, "os": {"custom_kernel_cmdline": ""}, "permissions": [], "port": 54321, "power_management": {"automatic_pm_enabled": true, "enabled": false, "kdump_detection": true, "pm_proxies": []}, "protocol": "stomp", "se_linux": {}, "spm": {"priority": 5, "status": "none"}, "ssh": {"fingerprint": "SHA256:L8YyAMcxLFJEng+CoDympwkpMwoagcBafI4fpLP4Kk0", "port": 22}, "statistics": [], "status": "install_failed", "storage_connection_extensions": [], "summary": {"total": 0}, "tags": [], "transparent_huge_pages": {"enabled": false}, "type": "rhel", "unmanaged_networks": [], "update_available": false, "vgpu_placement": "consolidated"}]}, "attempts": 120, "changed": false}
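Since the dump is hard to read, here is a minimal Python sketch that pulls out the fields that seem to matter. It assumes the JSON part after "FAILED! =>" has been copied into a string; the function name summarize_host_facts and the trimmed-down sample are made up for illustration, only the keys come from the output above.

```python
import json


def summarize_host_facts(payload):
    """Return name, address and status for each host in the failure payload."""
    data = json.loads(payload)
    hosts = data.get("ansible_facts", {}).get("ovirt_hosts", [])
    return [
        {
            "name": host.get("name"),
            "address": host.get("address"),
            "status": host.get("status"),
        }
        for host in hosts
    ]


# Trimmed-down sample: same keys as the real output, most fields left out.
sample = (
    '{"ansible_facts": {"ovirt_hosts": [{"address": "sub.sub.domain.tld", '
    '"name": "sub.sub.domain.tld", "status": "install_failed"}]}, '
    '"attempts": 120, "changed": false}'
)

for host in summarize_host_facts(sample):
    print(host["name"], host["status"])
```

For the output above, the interesting part is that the host status is "install_failed", so the task seems to give up because the host installation itself failed.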

Unfortunately, the error message gives me no real idea of where to look. The engine VM being deployed is listed as a KVM VM, is reachable through the bridge, and seems to have started up completely; I can even access the Engine web interface (engine01.sub.domain.tld/ovirt-engine).

In /var/log/messages the following can be found ...

"May 13 12:40:55 host ansible-async_wrapper.py: 15505 still running (86015)
May 13 12:40:57 host python: ansible-ovirt_host_facts Invoked with all_content=False pattern=name=sub.sub.domain.tld fetch_nested=False nested_attributes=[] auth={'timeout': 0, 'url': 'https://engine01.sub.domain.tld/ovirt-engine/api', 'insecure': True, 'kerberos': False, 'compress': True, 'headers': None, 'token': '8s-vELzQqNTR6l7-KRuqnYLE3sVwVWU5NxiNWzc-s2CllaQG_5YZ32fCFkVsAgwEyLWjPIOxvyS-_4js-VYFFQ', 'ca_file': None}"

... and after 120 attempts Ansible stops and fails with a deployment error. When I retry after removing the VM and running ovirt-hosted-engine-cleanup, the very same error is thrown.

What is a bit weird are these entries under /var/log/ovirt-hosted-engine-setup/:

./engine-logs-2019-05-13T12:26:20Z/ovirt-engine/engine.log:2019-05-13 12:34:40,369Z ERROR [org.ovirt.engine.core.uutils.ssh.SSHDialog] (EE-ManagedThreadFactory-engine-Thread-1) [12746235] SSH error running command root@sub.sub.domain.tld:'umask 0077; MYTMP="$(TMPDIR="${OVIRT_TMPDIR}" mktemp -d -t ovirt-XXXXXXXXXX)"; trap "chmod -R u+rwX \"${MYTMP}\" > /dev/null 2>&1; rm -fr \"${MYTMP}\" > /dev/null 2>&1" 0; tar --warning=no-timestamp -C "${MYTMP}" -x &&  "${MYTMP}"/ovirt-host-deploy DIALOG/dialect=str:machine DIALOG/customization=bool:True': RuntimeException: Unexpected error during execution: bash: /tmp/ovirt-pTVEEzlb8b/ovirt-host-deploy: Permission denied
./engine-logs-2019-05-13T12:26:20Z/ovirt-engine/engine.log:2019-05-13 12:34:40,406Z ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-1) [12746235] EVENT_ID: VDS_INSTALL_IN_PROGRESS_ERROR(511), An error has occurred during installation of Host sub.sub.domain.tld: Unexpected error during execution: bash: /tmp/ovirt-pTVEEzlb8b/ovirt-host-deploy: Permission denied

Could that be the cause and how can I fix it? What else do you guys need?
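One thing I have not ruled out yet (this is only an assumption on my side, nothing in the logs confirms it): "Permission denied" when executing a file under /tmp can occur when /tmp is mounted with the noexec option. Below is a minimal Python sketch to check this on the host. The helper names mount_options and is_noexec are made up; the code only parses the standard /proc/mounts format.

```python
import os


def mount_options(mounts_text, mount_point):
    """Return the option list for mount_point from /proc/mounts-style text, or None."""
    options = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, fs type, comma-separated options, dump, pass
        if len(fields) >= 4 and fields[1] == mount_point:
            options = fields[3].split(",")  # a later entry overrides an earlier one
    return options


def is_noexec(mounts_text, mount_point="/tmp"):
    """True if mount_point is a separate mount that carries the noexec option."""
    options = mount_options(mounts_text, mount_point)
    return options is not None and "noexec" in options


if __name__ == "__main__" and os.path.exists("/proc/mounts"):
    with open("/proc/mounts") as mounts_file:
        print("/tmp mounted noexec:", is_noexec(mounts_file.read()))
```

If /tmp is not a separate mount point this prints False, and the options of the file system containing /tmp would have to be checked instead (for example by calling is_noexec with "/").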

Thanks in advance, Martin
