On Sun, Jul 3, 2016 at 5:57 AM, Kevin Hung <khung(a)nullaxiom.com> wrote:
Looks like there still needs to be some work done on oVirt 4.0 Node and
ovirt-hosted-engine-setup before it's ready for general consumption. I have
spent days trying to get this to work, and only got it running (on one host)
after encountering 8 serious issues (7 below and the initial glusterfs one).
I have not been able to successfully deploy a second host (see issue 7
below). I will be moving back to deploying hosts using CentOS (with either
oVirt 4.0 or oVirt 3.6) as I need a working oVirt deployment up and running.
In case anyone is interested in reproducing the issues, I used the Node ISO
here [1] and the latest (7/2/2016) engine appliance OVA here [2]. Those seem
to be the "official" files as far as I can tell (which is difficult to
confirm, as the documentation is not clear).
List of issues:
1. The error I mentioned seems to be a problem with the code. I bypassed it
by deleting /usr/libexec/vdsm/hooks/before_network_setup/50_fcoe.
2. ovirt-hosted-engine-setup is unable to connect to the vdsm service if the
FQDN of the node is not resolvable (i.e. if a DNS server is not entered in
the initial setup). This should be checked in either the initial oVirt Node
setup process or the beginning of ovirt-hosted-engine-setup.
3. The management bridge does not get created properly when the server is
set up with a manually configured DNS server and running NetworkManager (the
default on Node). It seems like a bug has been filed for this back in 2014
[3].
4. Using cloud-init with default values to customize the engine appliance
can fail on the line "Creating/refreshing DWH database schema" if it takes
longer than 600 seconds to return output. This may apply to any other step
that takes a long time to complete. The VM no longer appears to exist
after the setup exits, so I am unable to debug.

600 seconds seems more than a reasonable time to create an empty DB;
if it requires more than 10 minutes for a simple/short operation, there
is probably something strange with the storage.

5. Without using cloud-init, the setup creates an engine VM that I cannot
log into (it does not seem to use the engine admin password or a blank
password).

Yes, the engine VM host-name and its root password are configured via
cloud-init and there is no default password.
If you want to avoid using cloud-init, you have to reset the root
password of the engine VM as you would for any el7 machine.

6. Destroying the VM (option 4) leaves the files intact on the shared
storage so I cannot restart setup without deleting those first. This may be
intentional, but the use of kvm terminology (destroy for power off) is not
common, not to mention that "virsh -r list --all" does not list the VM
anymore.

On failures, there is not just the engine VM disk but a whole storage
domain for hosted-engine which also contains ancillary disks.
Re-deploying over a dirty storage is not supported, so please clean up
the whole storage domain on failures.

7. Unable to deploy second host through web UI (error "Failed to configure
management network on host node2 due to setup networks failure.") or using

This is not hosted-engine specific:
https://bugzilla.redhat.com/show_bug.cgi?id=1350763

ovirt-hosted-engine-setup (it looks like it can't connect to or doesn't
start the broker service).

8. Random errors to stderr: "vcpu0 unhandled rdmsr" (this seems to be an

Are you running in a nested env?

upstream bug) and "multipath: error getting device" (this has been an issue
for years with oVirt and seems to be due to multipathing being on by default
even for systems where that does not apply).
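
For issue 1, the hook fails because it imports a module that is not present
on the host ("No module named netconfpersistence"). A less drastic check than
deleting the hook is to confirm whether the module can be located at all. The
following is only a minimal sketch: the helper name module_available is
hypothetical, and it assumes a Python 3 interpreter (importlib.util.find_spec
is not available on the Python 2.7 shown in the traceback).

```python
import importlib.util


def module_available(dotted_name):
    """Return True if the named module can be located.

    Note: for a dotted name, find_spec imports the parent package(s)
    in order to search them, but not the final module itself.
    """
    try:
        return importlib.util.find_spec(dotted_name) is not None
    except (ImportError, ValueError):
        # ImportError: a parent package (e.g. "vdsm") is itself missing.
        # ValueError: the module is loaded but has no usable __spec__.
        return False


if __name__ == "__main__":
    # Module name taken from the ImportError in the setup log.
    name = "vdsm.netconfpersistence"
    print(name, "importable:", module_available(name))
```

Running this on the node with the same interpreter that vdsm uses would show
whether the import problem is in the installed vdsm package or only in the
hook's environment.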
[1] http://resources.ovirt.org/pub/ovirt-4.0/iso/ovirt-node-ng-installer/ovir...
[2] http://jenkins.ovirt.org/view/All/job/ovirt-appliance_ovirt-4.0_build-art...
[3] https://bugzilla.redhat.com/show_bug.cgi?id=1160423
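
Issue 2 suggests checking up front that the node's FQDN is resolvable. A
pre-flight check along these lines could be run before starting
ovirt-hosted-engine-setup. This is a sketch under assumptions, not what the
setup tool does: the helper name resolves is hypothetical, and it only asks
the system resolver (DNS or /etc/hosts) whether the name maps to any address.

```python
import socket


def resolves(hostname):
    """Return True if the system resolver can map hostname to an address."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except (socket.gaierror, UnicodeError):
        # gaierror: the resolver could not find the name.
        # UnicodeError: the name is not a valid (IDNA-encodable) hostname.
        return False


if __name__ == "__main__":
    fqdn = socket.getfqdn()
    if resolves(fqdn):
        print("OK: %s resolves" % fqdn)
    else:
        print("WARNING: %s does not resolve; fix DNS or /etc/hosts first" % fqdn)
```

A passing check here does not prove that vdsm will be reachable; it only
rules out the unresolvable-hostname case described in issue 2.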
On 7/1/2016 8:37 PM, Kevin Hung wrote:
>
> It looks like I'm now getting an error when the deployment tries to
> configure the management bridge.
>
> Setup log:
>
> 2016-07-01 20:29:47 INFO otopi.plugins.gr_he_common.network.bridge bridge._misc:372 Configuring the management bridge
> 2016-07-01 20:29:48 DEBUG otopi.plugins.gr_he_common.network.bridge bridge._misc:384 networks: {'ovirtmgmt': {'nic': 'eno1', 'ipaddr': u'192.168.1.211', 'netmask': u'255.255.255.0', 'bootproto': u'none', 'gateway': u'192.168.1.1', 'defaultRoute': True}}
> 2016-07-01 20:29:48 DEBUG otopi.plugins.gr_he_common.network.bridge bridge._misc:385 bonds: {}
> 2016-07-01 20:29:48 DEBUG otopi.plugins.gr_he_common.network.bridge bridge._misc:386 options: {'connectivityCheck': False}
> 2016-07-01 20:29:48 DEBUG otopi.context context._executeMethod:142 method exception
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/site-packages/otopi/context.py", line 132, in _executeMethod
>     method['method']()
>   File "/usr/share/ovirt-hosted-engine-setup/scripts/../plugins/gr-he-common/network/bridge.py", line 387, in _misc
>     _setupNetworks(conn, networks, bonds, options)
>   File "/usr/share/ovirt-hosted-engine-setup/scripts/../plugins/gr-he-common/network/bridge.py", line 405, in _setupNetworks
>     'message: "%s"' % (networks, code, message))
> RuntimeError: Failed to setup networks {'ovirtmgmt': {'nic': 'eno1', 'ipaddr': u'192.168.1.211', 'netmask': u'255.255.255.0', 'bootproto': u'none', 'gateway': u'192.168.1.1', 'defaultRoute': True}}. Error code: "78" message: "Hook error: Hook Error: ('Traceback (most recent call last):\n File "/usr/libexec/vdsm/hooks/before_network_setup/50_fcoe", line 18, in <module>\n from vdsm.netconfpersistence import RunningConfig\nImportError: No module named netconfpersistence\n',)"
> 2016-07-01 20:29:48 ERROR otopi.context context._executeMethod:151 Failed to execute stage 'Misc configuration': Failed to setup networks {'ovirtmgmt': {'nic': 'eno1', 'ipaddr': u'192.168.1.211', 'netmask': u'255.255.255.0', 'bootproto': u'none', 'gateway': u'192.168.1.1', 'defaultRoute': True}}. Error code: "78" message: "Hook error: Hook Error: ('Traceback (most recent call last):\n File "/usr/libexec/vdsm/hooks/before_network_setup/50_fcoe", line 18, in <module>\n from vdsm.netconfpersistence import RunningConfig\nImportError: No module named netconfpersistence\n',)"
>
>
> On 7/1/2016 5:21 PM, Kevin Hung wrote:
>>
>> Thank you Sahina, that was the issue. I upgraded my glusterfs server to
>> 3.7.11 and I was able to continue with the deployment. I am seeing other
>> issues with deployment, but I will look into those myself first. Bug has
>> been logged [1].
>>
>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1352165
>>