[ovirt-users] VDSM Network Bug or Feature?

ml at ohnewald.net ml at ohnewald.net
Sun Apr 5 11:19:43 UTC 2015


On 25.03.2015 at 15:34, Dan Kenigsberg wrote:
> On Wed, Mar 25, 2015 at 10:31:14AM +0100, ml at ohnewald.net wrote:
>> Hello List,
>>
>> I think I found a nasty bug (or feature) in oVirt.
>>
>> One of my network cards was set up with dhcp. At this specific time there
>> was not yet a dhcp server set up which could respond to dhcp requests.
>>
>> Therefore my network interface was not able to obtain an IP address. This
>> "failure" meant that my ovirtmgmt bridge would not get started.
>>
>> __Maybe__ because ovirtmgmt comes alphanumerically after dbvlan116? All my
>> bonding interfaces (bond0 and bond1) started just fine.
>>
>> I was able to solve it by moving my /sbin/dhclient to /sbin/dhclient.backup
>> and creating a dummy bash script that just runs `exit 0` as /sbin/dhclient.
>>
>> Then the network startup process progressed to my ovirtmgmt interface.
>> From then on I was able to connect to and manage my host again, and to
>> switch my dbvlan116 interface from dhcp to none.
>>
>>
>> Here is the process list it seems to loop in:
>>
>>
>> root      2554  0.0  0.0 115612  1988 ?        S<   10:06   0:00 /bin/bash
>> /etc/sysconfig/network-scripts/ifup-eth ifcfg-dbvlan116
>> root      2594  0.0  0.0 104208 15620 ?        S<   10:06   0:00
>> /sbin/dhclient -H ovirt-node06-stgt -1 -q -lf
>> /var/lib/dhclient/dhclient--dbvlan116.lease -pf /var/run/
>> root     32047  0.0  0.0 115348  1676 ?        S<s  10:06   0:00 /bin/sh
>> /usr/libexec/vdsm/vdsmd_init_common.sh --pre-start
>> root     32142  1.5  0.0 348460 24952 ?        S<   10:06   0:00
>> /usr/bin/python /usr/share/vdsm/vdsm-restore-net-config
>>
>>
>> Just killing the dhclient does not seem to work. It keeps retrying.
>>
>>
>> I reported a bug before, but maybe it's better to discuss it here first and
>> explain the bug properly so that the Bugtracker guys know what I mean and
>> what the problem is? :)
> Good. But could you share the bug number?

I have not created a bug yet.
>
>> Maybe it's best to start the ovirtmgmt interface first? Otherwise a wrongly
>> configured interface will lock you out of the system.
>>
> I don't think I understood what the bug is, and when it shows up.
> Let's start with the basics. Which platform are you using? el6? el7?
CentOS7 + EL7
> - do you have NetworkManager or firewalld running?
No
> - Which vdsm version are you using?

vdsm-jsonrpc-4.16.10-8.gitc937927.el7.noarch
vdsm-yajsonrpc-4.16.10-8.gitc937927.el7.noarch
vdsm-python-zombiereaper-4.16.10-8.gitc937927.el7.noarch
vdsm-cli-4.16.10-8.gitc937927.el7.noarch
vdsm-python-4.16.10-8.gitc937927.el7.noarch
vdsm-4.16.10-8.gitc937927.el7.x86_64
vdsm-gluster-4.16.10-8.gitc937927.el7.noarch
vdsm-xmlrpc-4.16.10-8.gitc937927.el7.noarch

> - How did you configure the networks? From Engine? Manually?
 From Engine.
> - Can you share your /var/lib/vdsm/persistence/netconf
find /var/lib/vdsm/persistence/netconf/
/var/lib/vdsm/persistence/netconf/
/var/lib/vdsm/persistence/netconf/bonds
/var/lib/vdsm/persistence/netconf/bonds/bond1
/var/lib/vdsm/persistence/netconf/bonds/bond0
/var/lib/vdsm/persistence/netconf/nets
/var/lib/vdsm/persistence/netconf/nets/dbvlan116
/var/lib/vdsm/persistence/netconf/nets/san5nach7
/var/lib/vdsm/persistence/netconf/nets/san5nach6
/var/lib/vdsm/persistence/netconf/nets/vlan111
/var/lib/vdsm/persistence/netconf/nets/ovirtmgmt



  cat /var/lib/vdsm/persistence/netconf/bonds/bond1
{"nics": ["enp5s0f0", "enp5s0f1"], "options": "mode=0 miimon=100"}


cat /var/lib/vdsm/persistence/netconf/bonds/bond0
{"nics": ["enp3s0f0", "enp3s0f1"], "options": "mode=0 miimon=100"}



cat /var/lib/vdsm/persistence/netconf/nets/dbvlan116  => this was set to DHCP
{"nic": "enp7s0f1", "vlan": "116", "STP": "no", "bridged": "true", "mtu": "1500"}

cat /var/lib/vdsm/persistence/netconf/nets/san5nach7
{"bondingOptions": "mode=0 miimon=100", "ipaddr": "10.10.3.5", "bonding": "bond1", "mtu": "9000", "netmask": "255.255.255.0", "STP": "no", "bridged": "true"}


cat /var/lib/vdsm/persistence/netconf/nets/san5nach6
{"bondingOptions": "mode=0 miimon=100", "ipaddr": "10.10.1.5", "bonding": "bond0", "mtu": "9000", "netmask": "255.255.255.0", "bridged": "false"}

cat /var/lib/vdsm/persistence/netconf/nets/ovirtmgmt
{"nic": "enp7s0f0", "ipaddr": "192.168.43.124", "mtu": "1500", "netmask": "255.255.255.0", "STP": "no", "bridged": "true"}

>
> Do you say that `service vdsm start` hangs forever?

What I am saying is:

IF an interface A is set up with DHCP (and does not get an IP address
for whatever reason), then the network startup will not move on to interface B.


In my case:
===========
IF dbvlan116 does not get a DHCP response, it will NOT move on and
bring up my ovirtmgmt interface.

However:
===========
My bond0 and bond1 interfaces were up.


I guess this is because it starts with the B* (as in bond) interfaces,
then moves on to my D* (as in dbvlan) interfaces and then the rest of
the alphanumeric chain...
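
To illustrate that guess: if the names from my netconf listing above are
simply sorted as strings, dbvlan116 lands before ovirtmgmt. This is only a
sketch of plain alphabetical ordering; I have not confirmed that the
initscripts or vdsm actually order the interfaces this way.

```python
# Names taken from the netconf listing above. Plain string sorting is an
# assumption about the startup order, not confirmed vdsm/initscripts behaviour.
names = ["bond1", "bond0", "dbvlan116", "san5nach7", "san5nach6",
         "vlan111", "ovirtmgmt"]

startup_order = sorted(names)
print(startup_order)
# ['bond0', 'bond1', 'dbvlan116', 'ovirtmgmt', 'san5nach6', 'san5nach7', 'vlan111']
```

With that order, a dbvlan116 that blocks on DHCP would sit in front of
ovirtmgmt, while bond0 and bond1 would already be up.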

I hope I was able to explain it well enough.

I think the management interface should always start first, otherwise you
are not able to correct configuration problems like this.
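
A minimal sketch of what I mean, assuming the startup order is just a sorted
list of network names. The function name management_first and the default
"ovirtmgmt" argument are hypothetical and only for illustration; this is not
vdsm code.

```python
def management_first(names, mgmt="ovirtmgmt"):
    """Return names with the management network first, the rest alphabetically."""
    # False sorts before True, so the management network gets the lowest key.
    return sorted(names, key=lambda name: (name != mgmt, name))


print(management_first(["bond0", "bond1", "dbvlan116", "ovirtmgmt", "vlan111"]))
# ['ovirtmgmt', 'bond0', 'bond1', 'dbvlan116', 'vlan111']
```

In my setup ovirtmgmt sits directly on enp7s0f0, so it does not depend on the
bonds being up first. A setup with the management network on top of a bond
would need that bond started before it.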

Thanks,
Mario


> Regards,
> Dan.



