3.5.1 net config persistence

Hello,
There are a number of bugs [1] reported these days about the network configuration of hosts whose interfaces are configured manually, with bonding and VLANs. These /etc/sysconfig/network-scripts/ifcfg.* files are wiped by vdsm after rebooting.
I see that there are people at Red Hat working on these; some cases were reproduced in lab conditions and some were not.
I upgraded 3 DCs from 3.4.? to 3.5.1 and faced this issue (loss of all the network files) inconsistently.
I thought I had finally coped with this problem by adding net_persistence = ifcfg to /etc/vdsm/vdsm.conf and indeed, when restarting vdsmd and the network, the files were preserved.
That was before I observed that some action [2] led to /etc/vdsm/vdsm.conf being renamed to /etc/vdsm/vdsm.conf.some_timestamp and the original being replaced by a very short file with no network persistence setting at all.
I have not identified [2]. It could be:
- some actions made by me through the Web UI?
- service vdsmd restart?
- reboots?
I'm sure that some Red Hat people know what could be responsible for renaming /etc/vdsm/vdsm.conf to /etc/vdsm/vdsm.conf.some_timestamp, and I hope they are working closely with Dan Kenigsberg and Michael Burman, who helped a lot on these issues (or maybe THEY are the coders responsible for this?)
[1]:
- https://bugzilla.redhat.com/show_bug.cgi?id=1154399
- https://bugzilla.redhat.com/show_bug.cgi?id=1188251
- and more or less related: https://bugzilla.redhat.com/show_bug.cgi?id=1134346
-- Nicolas Ecarnot
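For anyone who wants to verify whether the workaround is still in place after a restart or an upgrade, here is a minimal Python sketch that reads vdsm.conf-style text and reports the net_persistence value. It assumes the option lives in the [vars] section and that "unified" is what applies when the option is absent; the function name is hypothetical and none of this is taken from vdsm itself.

```python
import configparser


def net_persistence_mode(conf_text, default="unified"):
    """Return the net_persistence value found in vdsm.conf-style text.

    Assumptions: the option sits in the [vars] section, and `default`
    ("unified") is what applies when the option is missing.
    """
    # Disable interpolation so a literal '%' in the file cannot raise errors.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    return parser.get("vars", "net_persistence", fallback=default)


if __name__ == "__main__":
    sample = "[vars]\nnet_persistence = ifcfg\n"
    print(net_persistence_mode(sample))
    # On a host one could pass the real file content instead:
    # print(net_persistence_mode(open("/etc/vdsm/vdsm.conf").read()))
```

Running this after each upgrade step would show at which step the setting disappears.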

On Fri, Mar 20, 2015 at 10:14:54AM +0100, Nicolas Ecarnot wrote:
Hello,
There are a number of bugs [1] reported these days about the network configuration of hosts whose interfaces are configured manually, with bonding and VLANs. These /etc/sysconfig/network-scripts/ifcfg.* files are wiped by vdsm after rebooting.
I see that there are people at Red Hat working on these; some cases were reproduced in lab conditions and some were not.
I upgraded 3 DCs from 3.4.? to 3.5.1 and faced this issue (loss of all the network files) inconsistently.
I thought I had finally coped with this problem by adding net_persistence = ifcfg to /etc/vdsm/vdsm.conf and indeed, when restarting vdsmd and the network, the files were preserved.
That was before I observed that some action [2] led to /etc/vdsm/vdsm.conf being renamed to /etc/vdsm/vdsm.conf.some_timestamp and the original being replaced by a very short file with no network persistence setting at all.
I have not identified [2]. It could be:
- some actions made by me through the Web UI?
- service vdsmd restart?
- reboots?
I'm sure that some Red Hat people know what could be responsible for renaming /etc/vdsm/vdsm.conf to /etc/vdsm/vdsm.conf.some_timestamp, and I hope they are working closely with Dan Kenigsberg and Michael Burman, who helped a lot on these issues (or maybe THEY are the coders responsible for this?)
[1]:
- https://bugzilla.redhat.com/show_bug.cgi?id=1154399
- https://bugzilla.redhat.com/show_bug.cgi?id=1188251
- and more or less related: https://bugzilla.redhat.com/show_bug.cgi?id=1134346
Thanks for reporting this issue. We are well aware of it, and working hard to fix it. Unfortunately, there were several bugs in the process of upgrading ifcfg-based network configuration to vdsm's own "unified persistence" that sits under /var/lib/vdsm/persistence/netconf.
Would you share which platform you are using? el6? el7? ovirt-node, or plain install?
There is a recent report that ovirt-node may be restarting networking while vdsm starts up, which may well explain the problem and its unpredictability. Is this the case with you?
Regarding /etc/vdsm/vdsm.conf: vdsm never renames it. Could it be rpm's new behavior (replacing vdsm.conf.rpmsave)? Or could it be the node, Fabian?

On 20/03/2015 14:40, Dan Kenigsberg wrote:
Would you share which platform you are using? el6? el7? ovirt-node, or plain install?
We are using CentOS 6.6 on all our hosts, minimal install. Same for the manager: bare metal, standalone, not hosted.
There is a recent report that ovirt-node may be restarting networking while vdsm starts up, which may well explain the problem and its unpredictability. Is this the case with you?
We have not used ovirt-node for 3 years, for various reasons.
Regarding /etc/vdsm/vdsm.conf: vdsm never renames it. Could it be rpm's new behavior (replacing vdsm.conf.rpmsave)? Or could it be the node, Fabian?
Let us stay prudent: I did indeed do some yum upgrades, BUT I made every step in a very modular way:
- first upgrade the manager
- then put one host in maintenance
- add the 3.5.1 repo on the host
- then reinstall it from the web GUI (upgrading the relevant packages)
- then bring it up, migrate some VMs to it, and test it
- then put it back into maintenance
- then yum upgrade it
- then reboot it
- and so on
I won't explain every step, but I did that in a very cautious way, taking time for each of them, and repeated this whole process more than 20 times.
I don't get why it works like a charm on most of them and hits the issues mentioned above on some of them.
To answer the renaming comment: yes Dan, some package upgrade renamed vdsm.conf to rpmsave, BUT I was explicitly talking about an additional renaming to vdsm.conf.201503191220 something, and I never saw a package upgrade do that.
Just a final word: though I sound grumpy and find this issue a real pain, I am actually absolutely amazed by all the work done by the whole oVirt community and the Red Hat people :)
-- Nicolas Ecarnot
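To tell the two kinds of renaming apart on a host, a small Python sketch like the following could sort the copies of vdsm.conf into rpm leftovers and timestamped backups. The .rpmsave, .rpmnew and .rpmorig suffixes are the ones rpm itself uses; the timestamp pattern (12 or more digits, modelled on the vdsm.conf.201503191220 name above) and the function name are assumptions made for illustration only.

```python
import re

# Assumption: the unexplained copies end in a run of at least 12 digits,
# as in the vdsm.conf.201503191220 example from the message above.
TIMESTAMP_SUFFIX = re.compile(r"^vdsm\.conf\.\d{12,}$")
RPM_SUFFIXES = (".rpmsave", ".rpmnew", ".rpmorig")


def classify_backups(names):
    """Group vdsm.conf.* file names by the kind of backup they look like."""
    groups = {"rpm": [], "timestamped": [], "other": []}
    for name in sorted(names):
        if not name.startswith("vdsm.conf."):
            continue  # skip vdsm.conf itself and unrelated files
        if name.endswith(RPM_SUFFIXES):
            groups["rpm"].append(name)
        elif TIMESTAMP_SUFFIX.match(name):
            groups["timestamped"].append(name)
        else:
            groups["other"].append(name)
    return groups


if __name__ == "__main__":
    import os
    # On a host: list /etc/vdsm and print what was found.
    if os.path.isdir("/etc/vdsm"):
        print(classify_backups(os.listdir("/etc/vdsm")))
```

Comparing the modification times of the timestamped copies with the yum and vdsm logs might then help narrow down which action created them.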

I’ve encountered these issues on both new and upgraded systems with bonded connections. The new system seems especially bad with bonds, and I’ve taken to immediately switching my hosts to the ifcfg persistence method. CentOS 6 and 7 hosts.
If it matters, I’m good with setting up my own network config, and sometimes I REALLY DO NOT WANT oVirt to change it, especially with VLANs and gluster coexistence. I can see the goal, but it seems pretty far from it right now, so I’m very happy that there’s a way to switch back to “system” control of those things.

On Fri, Mar 20, 2015 at 02:01:25PM -0500, Darrell Budic wrote:
I’ve encountered these issues on both new and upgraded systems with bonded connections. The new system seems especially bad with bonds, and I’ve taken to immediately switching my hosts to the ifcfg persistence method. CentOS 6 and 7 hosts.
There have been multiple issues regarding net config upgrade. We might have nailed an important one regarding ovirt-node. However, I'd like to learn more about your report regarding new systems. Your report sounds similar to
Bug 1203422 - vdsm should restore networks much earlier, to let net-dependent services start
If it matters, I’m good with setting up my own network config, and sometimes I REALLY DO NOT WANT oVirt to change it, especially with VLANs and gluster coexistence. I can see the goal, but it seems pretty far from it right now, so I’m very happy that there’s a way to switch back to “system” control of those things.
Besides Vdsm's slowness to start the network, what are your reasons for not wanting ovirt to touch your ifcfg files? BTW, even today ovirt overwrites ifcfg files, but only at network definition time, not on every boot.

On Mar 23, 2015, at 12:35 PM, Dan Kenigsberg <danken@redhat.com> wrote:
There have been multiple issues regarding net config upgrade. We might have nailed an important one regarding ovirt-node.
However, I'd like to learn more about your report regarding new systems. Your report sounds similar to Bug 1203422 - vdsm should restore networks much earlier, to let net-dependent services start
Caveat: I don’t have systems available to recreate this at the moment, so this is from memory of what I go through on a new host setup.
I haven’t filed bugs because I’ve seen several that look like mine, and until recently I couldn’t be sure my problems weren’t being caused by upgrades from older systems. Whenever I experience issues, it’s related to installing onto a new host system and creating the bonds either in or outside of ovirt: the next time I reboot that host, the bonds do not get created, so none of the networks come up and I need to get on a console to fix things.
Besides Vdsm's slowness to start the network, what are your reasons for not wanting ovirt to touch your ifcfg files? BTW, even today ovirt overwrites ifcfg files, but only at network definition time, not on every boot.
I don’t actually notice the slowness, but my mgmt, access, and gluster storage networks depend on the bonded network config to function. I’d like to have them up at boot and not wait for vdsmd to bring them up. Similar to Bug 1203422, but my problem is that the bonds don’t get created at boot, so no other networks that depend on them can come up.
Also, I have set up my gluster backend to use specific interfaces and IP addresses, and I’d like it if oVirt didn’t mess with them.
These are all things I can work around with ifcfg files, so I prefer them. I’ve taken to saving my ifcfg-* files so I can easily replace them if ovirt does things to them I don’t like (like setting ONBOOT=no). I did catch that it only alters them when defining a network, which means I can easily adjust things as needed.
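The habit of keeping saved ifcfg-* copies can be turned into a quick check. The Python sketch below compares a saved copy with the current file content and reports which keys differ, for example ONBOOT flipping from yes to no. The function names and the sample contents are hypothetical, and the parser only handles plain KEY=value lines, not the full shell syntax that ifcfg files allow.

```python
def parse_ifcfg(text):
    """Parse simple KEY=value lines of an ifcfg file into a dict."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments and anything that is not KEY=value
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip().strip("\"'")
    return settings


def changed_keys(saved_text, current_text):
    """Return {key: (saved_value, current_value)} for every key that differs."""
    saved = parse_ifcfg(saved_text)
    current = parse_ifcfg(current_text)
    return {
        key: (saved.get(key), current.get(key))
        for key in sorted(set(saved) | set(current))
        if saved.get(key) != current.get(key)
    }


if __name__ == "__main__":
    # Hypothetical contents of a saved and a current ifcfg-bond0.
    saved = 'DEVICE=bond0\nONBOOT=yes\nBONDING_OPTS="mode=4 miimon=100"\n'
    current = 'DEVICE=bond0\nONBOOT=no\nBONDING_OPTS="mode=4 miimon=100"\n'
    print(changed_keys(saved, current))
```

A key that is missing on one side shows up with None in that position, so removed or added settings are reported as well.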

I should add that I do one thing that may be considered unusual. I have a bunch of systems with 2 1Gb links on them, and I’m building them on one link, then manually converting them to bonded links before configuring them as ovirt host nodes. Since I have no other dedicated interfaces, all of my networking depends on the bonded interface for connectivity.
participants (3)
- Dan Kenigsberg
- Darrell Budic
- Nicolas Ecarnot