[ovirt-users] Failure during self-hosted deployment: exception configuring management bridge

David Sommerseth davids at redhat.com
Tue May 13 04:59:47 EDT 2014


On 13/05/14 00:35, Bob Doolittle wrote:
> Also - is there a bugID for this new issue?
> 
> The one I quoted is supposed to only affect non-existent device names.
> Why is this affecting valid device names as well, and only in the VDSM
> context?

Antonio may correct me here, but I believe it's caused by vdsm using
libnl-1.x and py-ethtool using libnl3.  We've discovered an issue with
this combination, where libnl-1.x is able to invalidate the netlink
socket libnl3 gives py-ethtool, rendering py-ethtool useless.

This issue is somewhat tracked in this bz:
<https://bugzilla.redhat.com/show_bug.cgi?id=1078312>

This is actually quite a delicate issue: I believe there are some
fixes in vdsm, py-ethtool has some patches to improve the error
handling (which should help vdsm too), and we're waiting for an official
libnl3 update to tackle the socket handling better.

I have hopes that once the libnl3 fixes get out, much of this will be
solved.
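
Until those fixes land, a caller can at least avoid dying on the
SystemError shown in the vdsm traceback below. This is only a minimal
sketch, not the actual vdsm or py-ethtool patch: safe_ipv6_addresses is
a hypothetical helper name, and it assumes only that the device object
exposes get_ipv6_addresses(), as the traceback shows.

```python
def safe_ipv6_addresses(dev_info):
    """Return the device's IPv6 addresses, or an empty list on failure.

    Hypothetical helper, not part of vdsm or py-ethtool. dev_info is
    assumed to be any object with a get_ipv6_addresses() method.
    """
    try:
        return list(dev_info.get_ipv6_addresses())
    except SystemError:
        # "error return without exception set" means the C extension
        # signalled failure without setting a Python exception. Treat
        # the address list as unavailable instead of propagating.
        return []
```

This hides the symptom only; the netlink socket is still unusable for
that call, so the real fix remains the library updates.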


David S.


> On 05/12/2014 06:21 PM, Dan Kenigsberg wrote:
>> On Mon, May 12, 2014 at 05:53:10PM -0400, Bob Doolittle wrote:
>>> On 05/12/2014 02:49 PM, Bob Doolittle wrote:
>>>> Hi,
>>>>
>>>> I'm trying to set up a fresh system on F19, using oVirt 3.4.
>>>>
>>>> When running hosted-engine --deploy, it fails during "Configuring the
>>>> management bridge". The ovirt-hosted-engine-setup log shows:
>>>>
>>>> 2014-05-12 13:59:35 INFO otopi.plugins.ovirt_hosted_engine_setup.network.bridge bridge._misc:196 Configuring the management bridge
>>>> 2014-05-12 13:59:35 DEBUG otopi.context context._executeMethod:152 method exception
>>>> Traceback (most recent call last):
>>>>   File "/usr/lib/python2.7/site-packages/otopi/context.py", line 142, in _executeMethod
>>>>     method['method']()
>>>>   File "/usr/share/ovirt-hosted-engine-setup/scripts/../plugins/ovirt-hosted-engine-setup/network/bridge.py", line 201, in _misc
>>>>     ].s.getVdsCapabilities()['info']['nics'][nics]
>>>> KeyError: 'info'
>>>> 2014-05-12 13:59:35 ERROR otopi.context context._executeMethod:161 Failed to execute stage 'Misc configuration': 'info'
>>>>
>>>>
>>>> The vdsm.log shows:
>>>>
>>>> Thread-14::DEBUG::2014-05-12 13:59:35,840::BindingXMLRPC::1067::vds::(wrapper) client [127.0.0.1]::call getCapabilities with () {}
>>>> Thread-14::DEBUG::2014-05-12 13:59:35,875::utils::642::root::(execCmd) '/sbin/ip route show to 0.0.0.0/0 table all' (cwd None)
>>>> Thread-14::DEBUG::2014-05-12 13:59:35,879::utils::662::root::(execCmd) SUCCESS: <err> = ''; <rc> = 0
>>>> Thread-14::ERROR::2014-05-12 13:59:35,882::BindingXMLRPC::1086::vds::(wrapper) unexpected error
>>>> Traceback (most recent call last):
>>>>   File "/usr/share/vdsm/BindingXMLRPC.py", line 1070, in wrapper
>>>>     res = f(*args, **kwargs)
>>>>   File "/usr/share/vdsm/BindingXMLRPC.py", line 393, in getCapabilities
>>>>     ret = api.getCapabilities()
>>>>   File "/usr/share/vdsm/API.py", line 1185, in getCapabilities
>>>>     c = caps.get()
>>>>   File "/usr/share/vdsm/caps.py", line 369, in get
>>>>     caps.update(netinfo.get())
>>>>   File "/usr/lib64/python2.7/site-packages/vdsm/netinfo.py", line 566, in get
>>>>     d['nics'][dev.name] = _nicinfo(dev.name, paddr)
>>>>   File "/usr/lib64/python2.7/site-packages/vdsm/netinfo.py", line 516, in _nicinfo
>>>>     info = _devinfo(nic)
>>>>   File "/usr/lib64/python2.7/site-packages/vdsm/netinfo.py", line 536, in _devinfo
>>>>     ipv4addr, ipv4netmask, ipv6addrs = getIpInfo(dev)
>>>>   File "/usr/lib64/python2.7/site-packages/vdsm/netinfo.py", line 317, in getIpInfo
>>>>     ipv6addrs = devInfo.get_ipv6_addresses()
>>>> SystemError: error return without exception set
>>>>
>>>>
>>>> I have two NICs - a wireless NIC which is disabled, and an ethernet NIC
>>>> "p3p1" which is statically configured via network-scripts.
>>>>
>>>> I've also attached the output of "ip addr".
>>>>
>>>> I also notice some disturbing-looking messages in the vdsm log during
>>>> setupMultipath, including "Panic: Error initializing IRS" and then
>>>> subsequent lvm-related errors during StorageRefresh. Those did not
>>>> abort the deployment, however. What do those failures indicate?
>>> This looks a lot like a new manifestation of:
>>> https://bugzilla.redhat.com/show_bug.cgi?id=1057772
>> Which version of Vdsm are you using? ovirt-3.4.1's vdsm-4.14.7 should
>> have fixed that problem.
>>
>>> I even instrumented the code in
>>> /usr/lib64/python2.7/site-packages/vdsm/netinfo.py
>>>
>>> The device name ("p3p1") being passed in is correct (I even tried
>>> setting the string directly), but the returned object is empty.
>>>
>>> If I start python by hand and run ethtool.get_interfaces_info("p3p1") it
>>> returns the correct data.
>>>
>>> So it seems as though the code is somehow environmentally sensitive.
>>> I'm not sure what it is about my environment that would cause issues
>>> here however, since presumably this is working for others...
>> I'm afraid this has recently been tickled by a release of python-ethtool
>> to Fedora 19.
> 


