Hi Simone,
“restart from scratch deploying with a static IP”. Okay, I have reinstalled the host
using oVirt Node from scratch. I am assigning a static IP using the attached answers.conf
which contains:
OVEHOSTED_VM/cloudinitVMStaticCIDR=str:10.0.0.109/24
create_target_vm.yml and all other Red Hat code is as-shipped. I’m getting the same
error:
[ INFO ] TASK [Copy /etc/hosts back to the Hosted Engine VM]
[ INFO ] changed: [localhost]
[ INFO ] TASK [Copy local VM disk to shared storage]
[ INFO ] changed: [localhost]
[ INFO ] TASK [Clean /etc/hosts on the host]
[ ERROR ] fatal: [localhost]: FAILED! => {"msg": "The task includes an
option with an undefined variable. The error was: list object has no element 0\n\nThe
error appears to have been in
'/usr/share/ovirt-hosted-engine-setup/ansible/create_target_vm.yml': line 396,
column 5, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe
offending line appears to be:\n\n changed_when: True\n - name: Clean /etc/hosts on the
host\n ^ here\n"}
Any ideas? How can I debug this failure to assign an IP (undefined variable)?
Many thanks,
Brendan
From: Simone Tiraboschi <stirabos(a)redhat.com>
Sent: 10 October 2018 12:06
To: B Holmes <me(a)brendanh.com>
Cc: users <users(a)ovirt.org>
Subject: Re: [ovirt-users] Re: Diary of hosted engine install woes
On Tue, Oct 9, 2018 at 11:50 PM Brendan Holmes <me(a)brendanh.com> wrote:
Hi Simone,
Yes the MAC address in answers.conf: OVEHOSTED_VM/vmMACAddr=
is added as a reservation to the DHCP server, so in theory 10.0.0.109 should be assigned.
However perhaps DHCP is not working. I have just changed to a static IP instead:
OVEHOSTED_VM/cloudinitVMStaticCIDR=str:10.0.0.109/24
(let me know if this isn’t the correct way)
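One way to sanity-check the value itself, independent of the installer, is to parse it. The sketch below is only an illustration: it assumes the answer-file line has the form KEY=str:VALUE as quoted above, and the helper name parse_static_cidr is hypothetical. It checks that the address is a well-formed CIDR and that the host address mentioned in this thread (10.0.0.171) falls in the same network. It does not confirm that the installer accepts or applies the setting.

```python
import ipaddress


def parse_static_cidr(answer_line):
    """Split 'KEY=str:VALUE' and parse VALUE as an interface address (IP/prefix)."""
    key, _, raw = answer_line.partition("=")
    type_tag, _, value = raw.partition(":")
    if type_tag != "str":
        raise ValueError(f"unexpected type tag {type_tag!r}")
    # ip_interface raises ValueError if the address or prefix is malformed.
    return key, ipaddress.ip_interface(value)


key, iface = parse_static_cidr("OVEHOSTED_VM/cloudinitVMStaticCIDR=str:10.0.0.109/24")
host_ip = ipaddress.ip_address("10.0.0.171")  # host address quoted in this thread

print(key)
print("engine address:", iface.ip)
print("network:", iface.network)
print("host in same network:", host_ip in iface.network)
```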
My host fails to get an IP automatically from this DHCP server, so it is quite possible
the engine’s DHCP has been failing too. Each time the host boots, I must type dhclient in
order to receive an IP address. Anyway, after changing this and re-running
hosted-engine --deploy, it failed due to:
[ INFO ] TASK [Copy local VM disk to shared storage]
[ INFO ] changed: [localhost]
[ INFO ] TASK [show local_vm_ip.std_out_lines[0] that will be written to etc hosts]
[ INFO ] ok: [localhost]
[ INFO ] TASK [show FQDN]
[ INFO ] ok: [localhost]
[ INFO ] TASK [Clean /etc/hosts on the host]
[ ERROR ] fatal: [localhost]: FAILED! => {"msg": "The task includes an
option with an undefined variable. The error was: list object has no element 0\n\nThe
error appears to have been in
'/usr/share/ovirt-hosted-engine-setup/ansible/create_target_vm.yml': line 400,
column 5, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe
offending line appears to be:\n\n debug: var=FQDN\n - name: Clean /etc/hosts on the
host\n ^ here\n"}
I have just tried deploying using the web UI and got the same error. I suspect the “undefined
variable” is local_vm_ip.stdout_lines[0]. My new debug task that tries to output this
is:
- name: show local_vm_ip.std_out_lines[0] that will be written to etc hosts
  debug: var=local_vm_ip.stdout_lines[0]
You can see the output of this above. I think I was mistaken to suggest the value of this
is localhost. Localhost is just the machine this task ran on. I don’t think the list
local_vm_ip.stdout_lines is defined. Any more ideas?
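For what it is worth, "list object has no element 0" reads like an index into an empty list rather than a missing variable, which would fit a command that printed nothing. The Python sketch below only illustrates that reading. It assumes stdout_lines is simply the command's stdout split into lines, and first_line is a hypothetical helper, not part of the shipped playbook.

```python
def first_line(stdout):
    """Return the first line of command output, or None if there was none."""
    lines = stdout.splitlines()  # "" -> [], so lines[0] would raise IndexError
    return lines[0] if lines else None


# Command that printed an address: the first element exists.
print(first_line("192.168.124.51\n"))

# Command that printed nothing: the list is defined but empty.
print(first_line(""))
```

If that reading is right, the thing to look at is why the task that registers local_vm_ip returned no output, rather than the "Clean /etc/hosts" task that later indexes it.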
The issue is on a task that isn't part of the code we are shipping.
I can just suggest to simply reinstall the rpm to get rid of any modification and restart
from scratch deploying with a static IP if your DHCP server is not properly working.
Many thanks
From: Simone Tiraboschi <stirabos(a)redhat.com>
Sent: 09 October 2018 16:51
To: B Holmes <me(a)brendanh.com>
Cc: users <users(a)ovirt.org>
Subject: Re: [ovirt-users] Re: Diary of hosted engine install woes
On Tue, Oct 9, 2018 at 4:54 PM <me(a)brendanh.com> wrote:
I've added a record to the DNS server here:
ovirt-engine.example.com 10.0.0.109
OK, and how will the engine VM get that address?
Are you using DHCP? Do you have a DHCP reservation for the MAC address you are using on
the engine VM?
Are you configuring it with a static IP?
This IP address is on the physical network that the host is on (host is on 10.0.0.171). I
trust this is correct and I should not resolve to a natted IP instead. I notice that
regardless of this record, the name ovirt-engine.example.com resolves to a natted IP:
192.168.124.51, because the ansible script adds an entry to /etc/hosts:
192.168.124.51 ovirt-engine.example.com
While the script is running, I can successfully ping ovirt-engine.example.com; it responds
on 192.168.124.51. So as you say: "host can correctly resolve the name of the engine VM",
but it's not the DNS record's IP. If I remove the DNS record and run hosted-engine --deploy,
I get this error:
[ ERROR ] Host name is not valid: ovirt-engine.example.com did not resolve into an IP address
Anyway, I added back the DNS record and ran the hosted-engine --deploy command; it failed at:
[ INFO ] TASK [Clean /etc/hosts on the host]
[ ERROR ] fatal: [localhost]: FAILED! => {"msg": "The task includes an
option with an undefined variable. The error was: list object has no element 0\n\nThe
error appears to have been in
'/usr/share/ovirt-hosted-engine-setup/ansible/create_target_vm.yml': line 396,
column 5, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe
offending line appears to be:\n\n changed_when: True\n - name: Clean /etc/hosts on the
host\n ^ here\n"}
To debug, I added tasks to create_target_vm.yml that output the values of
local_vm_ip.stdout_lines[0] and FQDN that are used in this task, then ran the usual
deploy command again. They are both localhost:
[ INFO ] TASK [show local_vm_ip.std_out_lines[0] that will be written to etc hosts]
[ INFO ] ok: [localhost]
[ INFO ] TASK [show FQDN]
[ INFO ] ok: [localhost]
This time, it gets past [Clean /etc/hosts on the host], but hangs at [ INFO ] TASK [Check
engine VM health] same as before.
This is fine: the bootstrap local VM runs over a natted network; then, once ready, it will
be shut down and moved to the shared storage. At that point it will be restarted on your
management network.
I catted /etc/hosts while it was hanging and it contains:
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
The ovirt-engine.example.com entry has been deleted! I pinged ovirt-engine.example.com
and it now resolves to its IP on the physical network: 10.0.0.109. So I added back this
/etc/hosts entry:
192.168.124.51 ovirt-engine.example.com
Please avoid this.
It subsequently errored:
[ ERROR ] fatal: [localhost]: FAILED! => {"attempts": 120,
"changed": true, "cmd": ["hosted-engine",
"--vm-status", "--json"], "delta":
"0:00:00.167559", "end": "2018-10-09 15:43:41.947274",
"rc": 0, "start": "2018-10-09 15:43:41.779715",
"stderr": "", "stderr_lines": [], "stdout":
"{\"1\": {\"conf_on_shared_storage\": true,
\"live-data\": true, \"extra\":
\"metadata_parse_version=1\\nmetadata_feature_version=1\\ntimestamp=6810 (Tue Oct 9
15:43:36 2018)\\nhost-id=1\\nscore=3400\\nvm_conf_refresh_time=6810 (Tue Oct 9 15:43:37
2018)\\nconf_on_shared_storage=True\\nmaintenance=False\\nstate=EngineStarting\\nstopped=False\\n\",
\"hostname\": \"host\", \"host-id\": 1,
\"engine-status\": {\"reason\": \"failed liveliness check\",
\"health\": \"bad\", \"vm\": \"up\",
\"detail\": \"Up\"}, \"score\": 3400, \"stopped\":
false, \"maintenance\": false, \"crc32\": \"c5d76f8b\",
\"local_conf_timestamp\": 6810, \"host-ts\": 6810},
\"global_maintenance\": false}", "stdout_lines":
["{\"1\": {\"conf_on_shared_storage\": true, \"live-data\": true, \"extra\":
\"metadata_parse_version=1\\nmetadata_feature_version=1\\ntimestamp=6810 (Tue Oct 9
15:43:36 2018)\\nhost-id=1\\nscore=3400\\nvm_conf_refresh_time=6810 (Tue Oct 9 15:43:37
2018)\\nconf_on_shared_storage=True\\nmaintenance=False\\nstate=EngineStarting\\nstopped=False\\n\",
\"hostname\": \"host\", \"host-id\": 1,
\"engine-status\": {\"reason\": \"failed liveliness check\",
\"health\": \"bad\", \"vm\": \"up\",
\"detail\": \"Up\"}, \"score\": 3400, \"stopped\":
false, \"maintenance\": false, \"crc32\": \"c5d76f8b\",
\"local_conf_timestamp\": 6810, \"host-ts\": 6810},
\"global_maintenance\": false}"]}
How can I check the hosted-engine's IP address to ensure name resolution is correct?
You can connect to that VM with VNC and check the IP there.
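The JSON printed by hosted-engine --vm-status --json in the failed task above is hard to read inline. A minimal Python sketch for pulling out just the engine-status part follows. The sample is a trimmed copy of the stdout quoted in this thread, and the helper name engine_health is hypothetical. It only summarises what the command already reported and does not reveal the VM's IP address.

```python
import json


def engine_health(vm_status_json):
    """Map host id -> engine-status dict, skipping non-host keys."""
    status = json.loads(vm_status_json)
    return {
        host_id: entry["engine-status"]
        for host_id, entry in status.items()
        # "global_maintenance" is a plain boolean, not a per-host entry.
        if isinstance(entry, dict) and "engine-status" in entry
    }


# Trimmed copy of the stdout from the failed task in this thread.
sample = json.dumps({
    "1": {
        "hostname": "host",
        "engine-status": {"reason": "failed liveliness check",
                          "health": "bad", "vm": "up", "detail": "Up"},
        "score": 3400,
    },
    "global_maintenance": False,
})

for host_id, es in engine_health(sample).items():
    print(host_id, es["vm"], es["health"], es.get("reason", ""))
```

On a host, the same function could be fed the real command output, for example via subprocess.run(["hosted-engine", "--vm-status", "--json"], capture_output=True, text=True).stdout. In the output quoted above, the VM is reported as up while the health is bad, which suggests the VM is running but the engine inside it is not reachable.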