Hi Simone,

 

Yes, the MAC address in answers.conf: OVEHOSTED_VM/vmMACAddr=

is added as a reservation to the DHCP server, so in theory 10.0.0.109 should be assigned. 

 

However, perhaps DHCP is not working, so I have just switched to a static IP instead:

OVEHOSTED_VM/cloudinitVMStaticCIDR=str:10.0.0.109/24

(let me know if this isn’t the correct way)
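For what it's worth, here is a minimal Python sketch (standard library only) that I would use to sanity-check the value itself: it confirms the CIDR parses and that the engine address falls in the same subnet as the host, 10.0.0.171. It says nothing about whether hosted-engine accepts the key in this form, and treating "str:" as just the answer-file type prefix is my assumption. The function name same_subnet is made up for illustration.

```python
import ipaddress


def same_subnet(answer_value: str, host_ip: str) -> bool:
    """Return True if host_ip lies in the network described by answer_value."""
    # Assumption: "str:" is only the answer-file type prefix, so strip it.
    prefix = "str:"
    cidr = answer_value[len(prefix):] if answer_value.startswith(prefix) else answer_value
    # ip_interface raises ValueError if the address/prefix is malformed.
    engine = ipaddress.ip_interface(cidr)
    return ipaddress.ip_address(host_ip) in engine.network


print(same_subnet("str:10.0.0.109/24", "10.0.0.171"))
```

If this prints False, the engine VM and the host would be on different subnets, which would be worth fixing before another deploy attempt.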

 

My host fails to get an IP automatically from this DHCP server, so it is quite possible the engine’s DHCP has been failing too.  Each time the host boots, I must run dhclient to receive an IP address.  Anyway, after changing this and re-running hosted-engine --deploy, it failed with:

 

[ INFO  ] TASK [Copy local VM disk to shared storage]

[ INFO  ] changed: [localhost]

[ INFO  ] TASK [show local_vm_ip.std_out_lines[0] that will be written to etc hosts]

[ INFO  ] ok: [localhost]

[ INFO  ] TASK [show FQDN]

[ INFO  ] ok: [localhost]

[ INFO  ] TASK [Clean /etc/hosts on the host]

[ ERROR ] fatal: [localhost]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: list object has no element 0\n\nThe error appears to have been in '/usr/share/ovirt-hosted-engine-setup/ansible/create_target_vm.yml': line 400, column 5, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n    debug: var=FQDN\n  - name: Clean /etc/hosts on the host\n    ^ here\n"}

 

I have just tried deploying using the web UI and got the same error.  I suspect the “undefined variable” is local_vm_ip.stdout_lines[0].  My new debug task that tries to output this is:

  - name: show local_vm_ip.std_out_lines[0] that will be written to etc hosts
    debug: var=local_vm_ip.stdout_lines[0]

 

You can see the output of this above.  I think I was mistaken to suggest the value of this is localhost; localhost is just the machine this task ran on.  I don’t think the list local_vm_ip.stdout_lines is defined.  Any more ideas?
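Then again, the wording "list object has no element 0" could also be read as the list existing but being empty, rather than not existing at all. That is only my interpretation of the message, not something the logs confirm. The plain-Python analogy below (not Ansible or Jinja2 code; first_line is a made-up helper) shows the two cases I am trying to tell apart:

```python
# Plain-Python analogy only; this is not Ansible or Jinja2 code.
def first_line(result: dict) -> str:
    """Describe what indexing stdout_lines[0] would find in a registered result."""
    if "stdout_lines" not in result:
        return "stdout_lines is not defined"
    lines = result["stdout_lines"]
    if not lines:
        return "stdout_lines is defined but empty, so there is no element 0"
    return lines[0]


print(first_line({}))
print(first_line({"stdout_lines": []}))
print(first_line({"stdout_lines": ["192.168.124.51"]}))
```

If the second case is what is happening, the command that registers local_vm_ip ran but printed nothing, so a debug task on the whole local_vm_ip variable rather than on element 0 might show more.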

 

Many thanks

 

From: Simone Tiraboschi <stirabos@redhat.com>
Sent: 09 October 2018 16:51
To: B Holmes <me@brendanh.com>
Cc: users <users@ovirt.org>
Subject: Re: [ovirt-users] Re: Diary of hosted engine install woes

 

 

On Tue, Oct 9, 2018 at 4:54 PM <me@brendanh.com> wrote:

I've added a record to the DNS server here:
ovirt-engine.example.com  10.0.0.109

 

OK, and how will the engine VM get that address?

Are you using DHCP? Do you have a DHCP reservation for the MAC address you are using on the engine VM?

Are you configuring it with a static IP?

 


This IP address is on the physical network that the host is on (host is on 10.0.0.171).  I trust this is correct and I should not resolve to a natted IP instead.  I notice that regardless of this record, the name ovirt-engine.example.com resolves to a natted IP: 192.168.124.51 because the ansible script adds an entry to /etc/hosts:
192.168.124.51  ovirt-engine.example.com
While the script is running, I can successfully ping ovirt-engine.example.com, and it responds on 192.168.124.51.  So as you say: "host can correctly resolve the name of the engine VM", but it's not the DNS record's IP.  If I remove the DNS record and run hosted-engine --deploy, I get this error:
[ ERROR ] Host name is not valid: ovirt-engine.example.com did not resolve into an IP address

Anyway, I added back the DNS record and ran hosted-engine --deploy again; it failed at:
[ INFO  ] TASK [Clean /etc/hosts on the host]
[ ERROR ] fatal: [localhost]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: list object has no element 0\n\nThe error appears to have been in '/usr/share/ovirt-hosted-engine-setup/ansible/create_target_vm.yml': line 396, column 5, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n    changed_when: True\n  - name: Clean /etc/hosts on the host\n    ^ here\n"}

To debug, I added tasks to create_target_vm.yml that output the values of local_vm_ip.stdout_lines[0] and FQDN, which are used in this task, then ran the usual deploy command again.  They are both localhost:
[ INFO  ] TASK [show local_vm_ip.std_out_lines[0] that will be written to etc hosts]
[ INFO  ] ok: [localhost]
[ INFO  ] TASK [show FQDN]
[ INFO  ] ok: [localhost]

This time, it gets past [Clean /etc/hosts on the host], but hangs at [ INFO  ] TASK [Check engine VM health], same as before.

 

This is fine: the bootstrap local VM runs over a natted network; once ready, it will be shut down and moved to the shared storage. At that point it will be restarted on your management network.

 

I ran cat on /etc/hosts while it was hanging, and it contained:
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

The ovirt-engine.example.com entry has been deleted!  I pinged ovirt-engine.example.com and it now resolves to its IP on the physical network: 10.0.0.109.  So I added back this /etc/hosts entry:
192.168.124.51  ovirt-engine.example.com

 

Please avoid this.

 


It subsequently errored:
[ ERROR ] fatal: [localhost]: FAILED! => {"attempts": 120, "changed": true, "cmd": ["hosted-engine", "--vm-status", "--json"], "delta": "0:00:00.167559", "end": "2018-10-09 15:43:41.947274", "rc": 0, "start": "2018-10-09 15:43:41.779715", "stderr": "", "stderr_lines": [], "stdout": "{\"1\": {\"conf_on_shared_storage\": true, \"live-data\": true, \"extra\": \"metadata_parse_version=1\\nmetadata_feature_version=1\\ntimestamp=6810 (Tue Oct  9 15:43:36 2018)\\nhost-id=1\\nscore=3400\\nvm_conf_refresh_time=6810 (Tue Oct  9 15:43:37 2018)\\nconf_on_shared_storage=True\\nmaintenance=False\\nstate=EngineStarting\\nstopped=False\\n\", \"hostname\": \"host\", \"host-id\": 1, \"engine-status\": {\"reason\": \"failed liveliness check\", \"health\": \"bad\", \"vm\": \"up\", \"detail\": \"Up\"}, \"score\": 3400, \"stopped\": false, \"maintenance\": false, \"crc32\": \"c5d76f8b\", \"local_conf_timestamp\": 6810, \"host-ts\": 6810}, \"global_maintenance\": false}", "stdout_lines": ["{\"1\": {\"conf_on_shared_storage\": true, \"live-data\": true, \"extra\": \"metadata_parse_version=1\\nmetadata_feature_version=1\\ntimestamp=6810 (Tue Oct  9 15:43:36 2018)\\nhost-id=1\\nscore=3400\\nvm_conf_refresh_time=6810 (Tue Oct  9 15:43:37 2018)\\nconf_on_shared_storage=True\\nmaintenance=False\\nstate=EngineStarting\\nstopped=False\\n\", \"hostname\": \"host\", \"host-id\": 1, \"engine-status\": {\"reason\": \"failed liveliness check\", \"health\": \"bad\", \"vm\": \"up\", \"detail\": \"Up\"}, \"score\": 3400, \"stopped\": false, \"maintenance\": false, \"crc32\": \"c5d76f8b\", \"local_conf_timestamp\": 6810, \"host-ts\": 6810}, \"global_maintenance\": false}"]}
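The part of that output that matters seems to be the engine-status block: the VM is reported as up while the liveliness check fails. Below is a minimal Python sketch for pulling that block out, assuming the stdout of hosted-engine --vm-status --json has been captured as a string. The sample is trimmed from the output above, and the helper name engine_status is my own.

```python
import json


def engine_status(stdout: str) -> dict:
    """Map each host id to its engine-status dict, skipping non-host keys."""
    data = json.loads(stdout)
    return {
        key: value["engine-status"]
        for key, value in data.items()
        # Top-level keys such as "global_maintenance" are not host entries.
        if isinstance(value, dict) and "engine-status" in value
    }


# Trimmed copy of the JSON from the error above.
sample = (
    '{"1": {"hostname": "host", "host-id": 1, "score": 3400, '
    '"engine-status": {"reason": "failed liveliness check", '
    '"health": "bad", "vm": "up", "detail": "Up"}}, '
    '"global_maintenance": false}'
)

for host_id, status in engine_status(sample).items():
    print(host_id, status["vm"], status["health"], status.get("reason"))
```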

How can I check the hosted-engine's IP address to ensure name resolution is correct?

 

You can connect to that VM with VNC and check the IP there.

 
