Also - is there a bugID for this new issue?
The one I quoted is supposed to only affect non-existent device names.
Why is this affecting valid device names as well, and only in the VDSM
context?
Antonio may correct me here, but I believe it's caused by vdsm using
libnl-1.x and py-ethtool using libnl3. We've discovered an issue with
this combination, where libnl-1.x is able to invalidate the netlink
socket libnl3 gives py-ethtool, rendering py-ethtool useless.
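
One rough way to check whether a given process really has both libnl
generations mapped is to scan its /proc/<pid>/maps. The following is only
a sketch: the helper names (loaded_libnl, mixes_libnl_generations,
current_process_libnl) are made up for illustration, and it assumes the
usual shared-object naming (libnl.so.1* for libnl-1.x, libnl-*3.so* for
libnl3).

```python
def loaded_libnl(maps_text):
    """Return the sorted libnl shared-object names found in a maps dump.

    maps_text is the content of /proc/<pid>/maps. Helper name and the
    library naming pattern are assumptions, not part of vdsm or py-ethtool.
    """
    found = set()
    for line in maps_text.splitlines():
        # address perms offset dev inode pathname
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping, no backing file
        base = fields[5].strip().rsplit("/", 1)[-1]
        if base.startswith("libnl") and ".so" in base:
            found.add(base)
    return sorted(found)


def mixes_libnl_generations(libs):
    """True if both a libnl-1.x and a libnl3 object appear in libs."""
    has_v1 = any(name.startswith("libnl.so.1") for name in libs)
    has_v3 = any("-3.so" in name for name in libs)
    return has_v1 and has_v3


def current_process_libnl():
    """Inspect the calling process (Linux only)."""
    with open("/proc/self/maps") as maps:
        return loaded_libnl(maps.read())
```

Calling current_process_libnl() from inside the affected process (after
both ethtool and the libnl-1.x user have been imported) and passing the
result to mixes_libnl_generations() would show whether the combination
described above is present. It says nothing about whether the socket has
actually been invalidated.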
This issue is somewhat tracked in this bz:
<
This is actually quite a delicate issue: I believe there are some
fixes in vdsm, py-ethtool has some patches to improve the error
handling (which should help vdsm too), and we're waiting for an official
libnl3 update to tackle the socket handling better.
I have hopes that once the libnl3 fixes get out, much of this will be
solved.
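
Until then, a caller can at least avoid dying on the SystemError seen in
the traceback quoted below. This is a minimal sketch of that kind of
defensive handling, not the actual vdsm or py-ethtool patch: the function
name safe_ipv6_addresses is hypothetical, and the only thing taken from
the report is that the device object exposes get_ipv6_addresses().

```python
def safe_ipv6_addresses(dev_info):
    """Return the device's IPv6 addresses, or None if the lookup broke.

    dev_info is any object with a get_ipv6_addresses() method, such as the
    device objects in the quoted traceback. None (rather than an empty
    list) marks "lookup failed", so callers can tell it apart from
    "no addresses configured".
    """
    try:
        return list(dev_info.get_ipv6_addresses())
    except SystemError:
        # "error return without exception set": the C extension returned
        # NULL without raising, which is what the invalidated netlink
        # socket appears to cause in this report.
        return None
```

This only hides the symptom: the caller still gets no address data for
the device, so the real fix remains the socket handling.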
David S.
On 05/12/2014 06:21 PM, Dan Kenigsberg wrote:
> On Mon, May 12, 2014 at 05:53:10PM -0400, Bob Doolittle wrote:
>> On 05/12/2014 02:49 PM, Bob Doolittle wrote:
>>> Hi,
>>>
>>> I'm trying to set up a fresh system on F19, using oVirt 3.4.
>>>
>>> When running hosted-engine --deploy, it fails during "Configuring the
>>> management bridge". The ovirt-hosted-engine-setup log shows:
>>>
>>> 2014-05-12 13:59:35 INFO otopi.plugins.ovirt_hosted_engine_setup.network.bridge bridge._misc:196 Configuring the management bridge
>>> 2014-05-12 13:59:35 DEBUG otopi.context context._executeMethod:152 method exception
>>> Traceback (most recent call last):
>>>   File "/usr/lib/python2.7/site-packages/otopi/context.py", line 142, in _executeMethod
>>>     method['method']()
>>>   File "/usr/share/ovirt-hosted-engine-setup/scripts/../plugins/ovirt-hosted-engine-setup/network/bridge.py", line 201, in _misc
>>>     ].s.getVdsCapabilities()['info']['nics'][nics]
>>> KeyError: 'info'
>>> 2014-05-12 13:59:35 ERROR otopi.context context._executeMethod:161 Failed to execute stage 'Misc configuration': 'info'
>>>
>>>
>>> The vdsm.log shows:
>>>
>>> Thread-14::DEBUG::2014-05-12 13:59:35,840::BindingXMLRPC::1067::vds::(wrapper) client [127.0.0.1]::call getCapabilities with () {}
>>> Thread-14::DEBUG::2014-05-12 13:59:35,875::utils::642::root::(execCmd) '/sbin/ip route show to 0.0.0.0/0 table all' (cwd None)
>>> Thread-14::DEBUG::2014-05-12 13:59:35,879::utils::662::root::(execCmd) SUCCESS: <err> = ''; <rc> = 0
>>> Thread-14::ERROR::2014-05-12 13:59:35,882::BindingXMLRPC::1086::vds::(wrapper) unexpected error
>>> Traceback (most recent call last):
>>>   File "/usr/share/vdsm/BindingXMLRPC.py", line 1070, in wrapper
>>>     res = f(*args, **kwargs)
>>>   File "/usr/share/vdsm/BindingXMLRPC.py", line 393, in getCapabilities
>>>     ret = api.getCapabilities()
>>>   File "/usr/share/vdsm/API.py", line 1185, in getCapabilities
>>>     c = caps.get()
>>>   File "/usr/share/vdsm/caps.py", line 369, in get
>>>     caps.update(netinfo.get())
>>>   File "/usr/lib64/python2.7/site-packages/vdsm/netinfo.py", line 566, in get
>>>     d['nics'][dev.name] = _nicinfo(dev.name, paddr)
>>>   File "/usr/lib64/python2.7/site-packages/vdsm/netinfo.py", line 516, in _nicinfo
>>>     info = _devinfo(nic)
>>>   File "/usr/lib64/python2.7/site-packages/vdsm/netinfo.py", line 536, in _devinfo
>>>     ipv4addr, ipv4netmask, ipv6addrs = getIpInfo(dev)
>>>   File "/usr/lib64/python2.7/site-packages/vdsm/netinfo.py", line 317, in getIpInfo
>>>     ipv6addrs = devInfo.get_ipv6_addresses()
>>> SystemError: error return without exception set
>>>
>>>
>>> I have two NICs - a wireless NIC which is disabled, and an ethernet NIC
>>> "p3p1" which is statically configured via network-scripts.
>>>
>>> I've also attached the output of "ip addr".
>>>
>>> I also notice some disturbing looking messages in the vdsm log during
>>> setupMultipath, including "Panic: Error initializing IRS" and then
>>> subsequent lvm-related errors during StorageRefresh. Those did not
>>> abort the deployment, however. What do those failures indicate?
>> This looks a lot like a new manifestation of:
>>
>> https://bugzilla.redhat.com/show_bug.cgi?id=1057772
> Which version of Vdsm are you using? ovirt-3.4.1's vdsm-4.14.7 should
> have fixed that problem.
>
>> I even instrumented the code in
>> /usr/lib64/python2.7/site-packages/vdsm/netinfo.py
>>
>> The device name ("p3p1") being passed in is correct (I even tried
>> setting the string directly), but the returned object is empty.
>>
>> If I start python by hand and run ethtool.get_interfaces_info("p3p1")
>> it returns the correct data.
>>
>> So it seems as though the code is somehow environmentally sensitive.
>> I'm not sure what it is about my environment that would cause issues
>> here however, since presumably this is working for others...
> I'm afraid this has recently been tickled by a release of python-ethtool
> to Fedora 19.