[ovirt-users] 3.6 loses network on reboot
Pavel Gashev
Pax at acronis.com
Tue Mar 1 01:33:52 EST 2016
I have seen this a few times. The first reboot after a new node setup sometimes fails to bring the network up. It tries to remove a network interface that doesn't exist.
Steps to recover:
1. Remove /var/lib/vdsm/persistence/netconf
2. Remove /var/run/vdsm/netconf
3. Configure the network manually
4. Start the vdsmd service
5. Configure the network again using the web UI. Make sure that the configuration is synced.
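The recovery steps above can be sketched as a small dry-run script. This is only a sketch under assumptions: the `render` helper and the `STEPS` table are hypothetical names, and using `rm -rf` and `systemctl start vdsmd` for steps 1, 2 and 4 is my reading of the list, not something stated in the thread. Steps 3 and 5 stay manual. Nothing is executed; the script only prints what would be run.

```python
import shlex

# (description, argv) pairs mirroring the five recovery steps.
# argv is None for steps that have to be done by hand.
# ASSUMPTION: "rm -rf" and "systemctl start vdsmd" are illustrative choices.
STEPS = [
    ("Remove /var/lib/vdsm/persistence/netconf",
     ["rm", "-rf", "/var/lib/vdsm/persistence/netconf"]),
    ("Remove /var/run/vdsm/netconf",
     ["rm", "-rf", "/var/run/vdsm/netconf"]),
    ("Configure the network manually", None),
    ("Start the vdsmd service", ["systemctl", "start", "vdsmd"]),
    ("Configure the network again in the web UI and check it is synced", None),
]


def render(steps=STEPS):
    """Return one printable line per step without running anything."""
    lines = []
    for number, (description, argv) in enumerate(steps, start=1):
        if argv is None:
            lines.append("# %d. %s (manual)" % (number, description))
        else:
            command = " ".join(shlex.quote(arg) for arg in argv)
            lines.append("%s  # %d. %s" % (command, number, description))
    return lines


if __name__ == "__main__":
    print("\n".join(render()))
```

Keeping it as a dry run is deliberate: the order matters (the network has to be up before vdsmd starts), so it is safer to read the commands and run them one at a time as root.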
On Mon, 2016-02-29 at 15:58 +0200, Dan Kenigsberg wrote:
This sounds very bad. Changing the subject, so the wider, more
problematic issue is visible.
Did any other user see this behavior?
On Mon, Feb 29, 2016 at 06:27:46AM +0000, David LeVene wrote:
Hi Dan,
Answers as follows:
# rpm -qa | grep -i vdsm
vdsm-jsonrpc-4.17.18-1.el7.noarch
vdsm-hook-vmfex-4.17.18-1.el7.noarch
vdsm-infra-4.17.18-1.el7.noarch
vdsm-4.17.18-1.el7.noarch
vdsm-python-4.17.18-1.el7.noarch
vdsm-yajsonrpc-4.17.18-1.el7.noarch
vdsm-cli-4.17.18-1.el7.noarch
vdsm-xmlrpc-4.17.18-1.el7.noarch
vdsm-hook-vmfex-dev-4.17.18-1.el7.noarch
This folder used to contain the ifcfg-ovirtmgmt bridge setup, as well as route-ovirtmgmt and rule-ovirtmgmt, but they were removed after the reboot.
# ls -althr | grep ifcfg
-rw-r--r--. 1 root root 254 Sep 16 21:21 ifcfg-lo
-rw-r--r--. 1 root root 120 Feb 25 14:07 ifcfg-enp7s0f0
-rw-rw-r--. 1 root root 174 Feb 25 14:40 ifcfg-enp6s0
I think I modified ifcfg-enp6s0 to get networking up again (e.g. it was set to use the bridge, but the bridge wasn't configured). That was a few days ago. If it's important, I can reboot the box again to see what state it comes up in.
# cat ifcfg-enp6s0
BOOTPROTO="none"
IPADDR="10.80.10.117"
NETMASK="255.255.255.0"
GATEWAY="10.80.10.1"
DEVICE="enp6s0"
HWADDR="00:25:b5:00:0b:4f"
ONBOOT=yes
PEERDNS=yes
PEERROUTES=yes
MTU=1500
# cat ifcfg-enp7s0f0
# Generated by VDSM version 4.17.18-1.el7
DEVICE=enp7s0f0
ONBOOT=yes
MTU=1500
HWADDR=00:25:b5:00:0b:0f
NM_CONTROLLED=no
# find /var/lib/vdsm/persistence
/var/lib/vdsm/persistence
/var/lib/vdsm/persistence/netconf
/var/lib/vdsm/persistence/netconf.1456371473833165545
/var/lib/vdsm/persistence/netconf.1456371473833165545/nets
/var/lib/vdsm/persistence/netconf.1456371473833165545/nets/ovirtmgmt
# cat /var/lib/vdsm/persistence/netconf.1456371473833165545/nets/ovirtmgmt
{
"nic": "enp6s0",
"ipaddr": "10.80.10.117",
"mtu": "1500",
"netmask": "255.255.255.0",
"STP": "no",
"bridged": "true",
"gateway": "10.80.10.1",
"defaultRoute": true
}
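To make the mismatch between the two outputs above easier to see, here is a minimal sketch that parses the ifcfg-enp6s0 content and the persisted ovirtmgmt JSON exactly as pasted and compares them. The helper name `parse_ifcfg` is hypothetical, and the check relies on the initscripts convention that a NIC enslaved to a bridge carries a `BRIDGE=` key.

```python
import json
import shlex

# File contents copied from the output above.
IFCFG_ENP6S0 = '''\
BOOTPROTO="none"
IPADDR="10.80.10.117"
NETMASK="255.255.255.0"
GATEWAY="10.80.10.1"
DEVICE="enp6s0"
HWADDR="00:25:b5:00:0b:4f"
ONBOOT=yes
PEERDNS=yes
PEERROUTES=yes
MTU=1500
'''

PERSISTED_OVIRTMGMT = '''\
{
"nic": "enp6s0",
"ipaddr": "10.80.10.117",
"mtu": "1500",
"netmask": "255.255.255.0",
"STP": "no",
"bridged": "true",
"gateway": "10.80.10.1",
"defaultRoute": true
}
'''


def parse_ifcfg(text):
    """Parse KEY=VALUE lines (hypothetical helper); shlex strips the quotes."""
    settings = {}
    for token in shlex.split(text, comments=True):
        key, _, value = token.partition("=")
        settings[key] = value
    return settings


ifcfg = parse_ifcfg(IFCFG_ENP6S0)
persisted = json.loads(PERSISTED_OVIRTMGMT)

# The persisted network is bridged, yet the NIC file has no BRIDGE= key
# and carries the management address itself.
same_address = ifcfg["IPADDR"] == persisted["ipaddr"]
enslaved_to_bridge = "BRIDGE" in ifcfg

if __name__ == "__main__":
    print("persisted bridged:", persisted["bridged"])
    print("NIC enslaved to a bridge:", enslaved_to_bridge)
    print("address sits on the NIC:", same_address)
```

This only restates what the two pasted files say: the persisted configuration describes a bridged ovirtmgmt on enp6s0, while the manually edited ifcfg file puts the address directly on the NIC.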
Supervdsm log is attached.
Have you edited ifcfg-ovirtmgmt manually? Can you somehow reproduce it,
and share its content?
Do you have NetworkManager running? Which version?
It seems that Vdsm has two bugs: on boot, initscripts end up setting an
IPv6 address that Vdsm never requested.
restore-net::INFO::2016-02-25 14:14:58,024::vdsm-restore-net-config::261::root::(_find_changed_or_missing) ovirtmgmt is different or missing from persistent configuration. current: {'nic': 'enp6s0', 'dhcpv6': False, 'ipaddr': '10.80.10.117', 'mtu': '1500', 'netmask': '255.255.255.0', 'bootproto': 'none', 'stp': False, 'bridged': True, 'ipv6addr': ['2400:7d00:110:3:225:b5ff:fe00:b4f/64'], 'gateway': '10.80.10.1', 'defaultRoute': True}, persisted: {u'nic': u'enp6s0', 'dhcpv6': False, u'ipaddr': u'10.80.10.117', u'mtu': '1500', u'netmask': u'255.255.255.0', 'bootproto': 'none', 'stp': False, u'bridged': True, u'gateway': u'10.80.10.1', u'defaultRoute': True}
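The two dictionaries in the log line above are hard to compare by eye. A minimal sketch, assuming nothing beyond the Python standard library, that parses both as printed and lists only the keys that differ (the function name `config_diff` is hypothetical and is not part of Vdsm):

```python
import ast

# Both dictionaries copied verbatim from the restore-net log line above.
CURRENT = (
    "{'nic': 'enp6s0', 'dhcpv6': False, 'ipaddr': '10.80.10.117', "
    "'mtu': '1500', 'netmask': '255.255.255.0', 'bootproto': 'none', "
    "'stp': False, 'bridged': True, "
    "'ipv6addr': ['2400:7d00:110:3:225:b5ff:fe00:b4f/64'], "
    "'gateway': '10.80.10.1', 'defaultRoute': True}"
)
PERSISTED = (
    "{u'nic': u'enp6s0', 'dhcpv6': False, u'ipaddr': u'10.80.10.117', "
    "u'mtu': '1500', u'netmask': u'255.255.255.0', 'bootproto': 'none', "
    "'stp': False, u'bridged': True, u'gateway': u'10.80.10.1', "
    "u'defaultRoute': True}"
)


def config_diff(current, persisted):
    """Return {key: (current_value, persisted_value)} for differing keys.

    A key missing on one side is reported with None on that side.
    """
    diff = {}
    for key in set(current) | set(persisted):
        if current.get(key) != persisted.get(key):
            diff[key] = (current.get(key), persisted.get(key))
    return diff


current = ast.literal_eval(CURRENT)
persisted = ast.literal_eval(PERSISTED)
diff = config_diff(current, persisted)

if __name__ == "__main__":
    for key, (cur, per) in sorted(diff.items()):
        print("%s: current=%r persisted=%r" % (key, cur, per))
```

The only differing key is `ipv6addr`. Its interface identifier (225:b5ff:fe00:b4f) matches the EUI-64 form of the enp6s0 MAC address 00:25:b5:00:0b:4f, which is consistent with an autoconfigured address rather than one that was requested explicitly.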
Then, Vdsm tries to drop the
unsolicited address, but fails. Both must be fixed ASAP.
restore-net::ERROR::2016-02-25 14:14:59,490::__init__::58::root::(__exit__) Failed rollback transaction last known good network.
Traceback (most recent call last):
  File "/usr/share/vdsm/network/api.py", line 918, in setupNetworks
    keep_bridge=keep_bridge)
  File "/usr/share/vdsm/network/api.py", line 222, in wrapped
    ret = func(**attrs)
  File "/usr/share/vdsm/network/api.py", line 502, in _delNetwork
    configurator.removeQoS(net_ent)
  File "/usr/share/vdsm/network/configurators/__init__.py", line 122, in removeQoS
    qos.remove_outbound(top_device)
  File "/usr/share/vdsm/network/configurators/qos.py", line 60, in remove_outbound
    device, pref=_NON_VLANNED_ID if vlan_tag is None else vlan_tag)
  File "/usr/share/vdsm/network/tc/filter.py", line 31, in delete
    _wrapper.process_request(command)
  File "/usr/share/vdsm/network/tc/_wrapper.py", line 38, in process_request
    raise TrafficControlException(retcode, err, command)
TrafficControlException: (None, 'Message truncated', ['/usr/sbin/tc', 'filter', 'del', 'dev', 'enp6s0', 'pref', '5000'])
Regards,
Dan.