[ovirt-users] Unable to set up oVirt 4.0 HE using glusterfs storage

Simone Tiraboschi stirabos at redhat.com
Mon Jul 4 08:38:27 UTC 2016


On Sun, Jul 3, 2016 at 5:57 AM, Kevin Hung <khung at nullaxiom.com> wrote:
> Looks like there still needs to be some work done on oVirt 4.0 Node and
> ovirt-hosted-engine-setup before it's ready for general consumption. I have
> spent days trying to get this to work, and only got it running (on one host)
> after encountering 8 serious issues (7 below and the initial glusterfs one).
> I have not been able to successfully deploy a second host (see issue 7
> below). I will be moving back to deploying hosts using CentOS (with either
> oVirt 4.0 or oVirt 3.6) as I need a working oVirt deployment up and running.
>
> In case anyone is interested in reproducing the issues, I used the Node ISO
> here [1] and the latest (7/2/2016) engine appliance OVA here [2]. Those seem
> to be the "official" files as far as I can tell (which is difficult as the
> documentation is not clear).
>
> List of issues:
> 1. The error I mentioned seems to be a problem with the code. I bypassed it
> by deleting /usr/libexec/vdsm/hooks/before_network_setup/50_fcoe.
> 2. ovirt-hosted-engine-setup is unable to connect to the vdsm service if the
> FQDN of the node is not resolvable (i.e. if a DNS server is not entered in
> the initial setup). This should be checked in either the initial oVirt Node
> setup process or the beginning of ovirt-hosted-engine-setup.
> 3. The management bridge does not get created properly when the server is
> set up with a manually configured DNS server and running NetworkManager (the
> default on Node). It seems like a bug has been filed for this back in 2014.
> [3]
> 4. Using cloud-init with default values to customize the engine appliance
> can fail on the line "Creating/refreshing DWH database schema" if it takes
> longer than 600 seconds to return output. This may apply to any other step
> that takes a long time to complete. The VM no longer appears to exist
> after the setup exits, so I am unable to debug.

600 seconds seems more than a reasonable time to create an empty DB;
if a simple, short operation takes more than 10 minutes, there is
probably something wrong with the storage.
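
A rough way to check whether the storage is the slow part is to time a
few small synchronous writes on the mounted storage path. The sketch
below assumes Python 3; the function name, the 1 MiB payload and the
number of rounds are illustrative choices, not part of any oVirt tool,
and the target directory must be replaced with your own mount point.

```python
import os
import tempfile
import time


def timed_sync_write(directory, size_bytes=1024 * 1024, rounds=5):
    """Return the duration in seconds of each fsync'd write in `directory`."""
    payload = b"\0" * size_bytes
    durations = []
    for _ in range(rounds):
        # Create a scratch file on the storage under test.
        fd, path = tempfile.mkstemp(dir=directory)
        try:
            with os.fdopen(fd, "wb") as handle:
                start = time.monotonic()
                handle.write(payload)
                handle.flush()
                # Force the data to the backing storage before stopping the clock.
                os.fsync(handle.fileno())
                durations.append(time.monotonic() - start)
        finally:
            # Always remove the scratch file, even if the write failed.
            os.unlink(path)
    return durations


if __name__ == "__main__":
    # Assumption: replace this with the mount point of the storage under test.
    target = tempfile.gettempdir()
    results = timed_sync_write(target)
    print("slowest fsync'd 1 MiB write: %.3f s" % max(results))
```

If single writes of this size take seconds rather than milliseconds,
the storage is a likely suspect for the timeout.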

> 5. Without using cloud-init, the setup creates an engine VM that I cannot
> log into (it does not seem to use the engine admin password or a blank
> password).

Yes, the engine VM hostname and its root password are configured via
cloud-init, and there is no default password.
If you want to avoid using cloud-init, you have to reset the root
password of the engine VM as you would on any EL7 machine.

> 6. Destroying the VM (option 4) leaves the files intact on the shared
> storage so I cannot restart setup without deleting those first. This may be
> intentional, but the use of kvm terminology (destroy for power off) is not
> common, not to mention that "virsh -r list --all" does not list the VM
> anymore.

On failure, what remains is not just the engine VM disk but a whole
hosted-engine storage domain, which also contains ancillary disks.
Re-deploying over dirty storage is not supported, so please clean up
the whole storage domain after a failure.
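
Before restarting the setup, it can help to confirm that nothing is
left on the storage. The following is a minimal read-only sketch,
assuming Python 3 and that the storage is mounted locally; the function
name is hypothetical and the script deletes nothing.

```python
import os
import sys


def storage_leftovers(mount_point):
    """Return (file_count, total_bytes) for all files under `mount_point`."""
    file_count = 0
    total_bytes = 0
    for root, _dirs, files in os.walk(mount_point):
        for name in files:
            file_count += 1
            # lstat avoids following symlinks out of the storage tree.
            total_bytes += os.lstat(os.path.join(root, name)).st_size
    return file_count, total_bytes


if __name__ == "__main__":
    # Assumption: pass the local mount point of the storage as the argument.
    count, size = storage_leftovers(sys.argv[1])
    if count:
        print("storage is not clean: %d files, %d bytes" % (count, size))
    else:
        print("no files found")
```

Note that this only counts files; empty directories left behind by a
failed run are not reported.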

> 7. Unable to deploy second host through web UI (error "Failed to configure
> management network on host node2 due to setup networks failure.") or using

This is not hosted-engine specific:
https://bugzilla.redhat.com/show_bug.cgi?id=1350763

> ovirt-hosted-engine-setup (it looks like it can't connect to or doesn't
> start the broker service).
> 8. Random errors to stderr: "vcpu0 unhandled rdmsr" (this seems to be an

Are you running in a nested environment?
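
One quick way to tell is to look for the "hypervisor" CPU flag, which
x86 Linux kernels list in /proc/cpuinfo when the host itself runs under
a hypervisor. A minimal sketch, assuming Python 3 on an x86 Linux host;
the function name is illustrative.

```python
def has_hypervisor_flag(cpuinfo_text):
    """Return True if any 'flags' line in the cpuinfo text lists 'hypervisor'."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Only inspect the per-CPU 'flags' lines, and match whole flag names.
        if sep and key.strip() == "flags" and "hypervisor" in value.split():
            return True
    return False


if __name__ == "__main__":
    with open("/proc/cpuinfo") as handle:
        if has_hypervisor_flag(handle.read()):
            print("this host appears to be a virtual machine")
        else:
            print("no hypervisor flag found")
```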

> upstream bug) and "multipath: error getting device" (this has been an issue
> for years with oVirt and seems to be due to multipathing being on by default
> even for systems where that does not apply).
>
> [1]
> http://resources.ovirt.org/pub/ovirt-4.0/iso/ovirt-node-ng-installer/ovirt-node-ng-installer-ovirt-4.0-2016062412.iso
> [2]
> http://jenkins.ovirt.org/view/All/job/ovirt-appliance_ovirt-4.0_build-artifacts-el7-x86_64/
> [3] https://bugzilla.redhat.com/show_bug.cgi?id=1160423
>
>
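
Regarding issue 2 above, the FQDN can be checked by hand before
starting the setup. The sketch below assumes Python 3 and uses only the
standard socket module; the helper name is hypothetical and this is not
the check performed by ovirt-hosted-engine-setup itself.

```python
import socket


def resolve_host(name):
    """Return the sorted addresses `name` resolves to, or [] if it does not."""
    try:
        infos = socket.getaddrinfo(name, None)
    except socket.gaierror:
        # The name could not be resolved via DNS or /etc/hosts.
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); the
    # address is the first element of sockaddr.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    fqdn = socket.getfqdn()
    addresses = resolve_host(fqdn)
    if addresses:
        print("%s resolves to %s" % (fqdn, ", ".join(addresses)))
    else:
        print("%s does not resolve; fix DNS or /etc/hosts first" % fqdn)
```
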
> On 7/1/2016 8:37 PM, Kevin Hung wrote:
>>
>> It looks like I'm now getting an error when the deployment tries to
>> configure the management bridge.
>>
>> Setup log:
>>
>> 2016-07-01 20:29:47 INFO otopi.plugins.gr_he_common.network.bridge
>> bridge._misc:
>> 372 Configuring the management bridge
>> 2016-07-01 20:29:48 DEBUG otopi.plugins.gr_he_common.network.bridge
>> bridge._misc
>> :384 networks: {'ovirtmgmt': {'nic': 'eno1', 'ipaddr': u'192.168.1.211',
>> 'netmask': u'255.255.255.0', 'bootproto': u'none', 'gateway':
>> u'192.168.1.1', 'defaultRoute': True}}
>> 2016-07-01 20:29:48 DEBUG otopi.plugins.gr_he_common.network.bridge
>> bridge._misc
>> :385 bonds: {}
>> 2016-07-01 20:29:48 DEBUG otopi.plugins.gr_he_common.network.bridge
>> bridge._misc
>> :386 options: {'connectivityCheck': False}
>> 2016-07-01 20:29:48 DEBUG otopi.context context._executeMethod:142 method
>> exception
>> Traceback (most recent call last):
>>   File "/usr/lib/python2.7/site-packages/otopi/context.py", line 132, in
>> _executeMethod
>>     method['method']()
>>   File
>> "/usr/share/ovirt-hosted-engine-setup/scripts/../plugins/gr-he-common/network/bridge.py",
>> line 387, in _misc
>>     _setupNetworks(conn, networks, bonds, options)
>>   File
>> "/usr/share/ovirt-hosted-engine-setup/scripts/../plugins/gr-he-common/network/bridge.py",
>> line 405, in _setupNetworks
>>     'message: "%s"' % (networks, code, message))
>> RuntimeError: Failed to setup networks {'ovirtmgmt': {'nic': 'eno1',
>> 'ipaddr': u'192.168.1.211', 'netmask': u'255.255.255.0', 'bootproto':
>> u'none', 'gateway': u'192.168.1.1', 'defaultRoute': True}}. Error code: "78"
>> message: "Hook error: Hook Error: ('Traceback (most recent call last):\n
>> File "/usr/libexec/vdsm/hooks/before_network_setup/50_fcoe", line 18, in
>> <module>\n    from vdsm.netconfpersistence import
>> RunningConfig\nImportError: No module named netconfpersistence\n',)"
>> 2016-07-01 20:29:48 ERROR otopi.context context._executeMethod:151 Failed
>> to execute stage 'Misc configuration': Failed to setup networks
>> {'ovirtmgmt': {'nic': 'eno1', 'ipaddr': u'192.168.1.211', 'netmask':
>> u'255.255.255.0', 'bootproto': u'none', 'gateway': u'192.168.1.1',
>> 'defaultRoute': True}}. Error code: "78" message: "Hook error: Hook Error:
>> ('Traceback (most recent call last):\n File
>> "/usr/libexec/vdsm/hooks/before_network_setup/50_fcoe", line 18, in
>> <module>\n    from vdsm.netconfpersistence import
>> RunningConfig\nImportError: No module named netconfpersistence\n',)"
>>
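
The traceback above comes down to a hook importing a module that is not
installed. Whether a given module can be found is easy to test before
running the setup. This is a sketch for Python 3 (the host above ran
Python 2.7, where importlib.util.find_spec is not available); the
helper name is illustrative.

```python
import importlib.util
import sys


def module_available(dotted_name):
    """Return True if `dotted_name` can be located by the import system."""
    try:
        return importlib.util.find_spec(dotted_name) is not None
    except ImportError:
        # Raised when a parent package (for example "vdsm") is itself
        # missing, or is not a package.
        return False


if __name__ == "__main__":
    # Default to the module named in the traceback.
    name = sys.argv[1] if len(sys.argv) > 1 else "vdsm.netconfpersistence"
    print("%s: %s" % (name, "found" if module_available(name) else "missing"))
```
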
>>
>> On 7/1/2016 5:21 PM, Kevin Hung wrote:
>>>
>>> Thank you Sahina, that was the issue. I upgraded my glusterfs server to
>>> 3.7.11 and I was able to continue with the deployment. I am seeing other
>>> issues with deployment, but I will look into those myself first. Bug has
>>> been logged [1].
>>>
>>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1352165
>>>
>


