routing table and wrong ip rule attribution on hosts

Hi all,

I've finished migrating from 4.4.4 to 4.4.9 and I'm facing a strange issue with the routing tables on my hosts: the interfaces that carry an IP address (in particular gluster and migration, which require one) are no longer in the usual route table "254" (or "0").

For instance:

[root@fuego ~]# nmcli con sh gluster |grep ipv4.route-table
ipv4.route-table: 202179335
[root@fuego ~]# nmcli con sh migration |grep ipv4.route-table
ipv4.route-table: 316605387

but ovirtmgmt:

[root@fuego ~]# nmcli con sh ovirtmgmt |grep ipv4.route-table
ipv4.route-table: 254 (main)

and, as a result, the main route table has no entries for those networks:

[root@ ~]# ip ro
default via 10.34.100.65 dev ovirtmgmt proto dhcp metric 425
10.34.100.0/24 dev ovirtmgmt proto kernel scope link src 10.34.100.116 metric 425

None of the affected hosts can ping each other on these interfaces, and live migrations systematically fail.

This behaviour is new with 4.4.9, and I don't know whether it is a new (and unfinished) network feature introduced with CentOS Stream to deal with packet filtering.

A simple workaround would be "nmcli connection mod migration ipv4.route-table 0 && nmcli con up migration", but I'd like to understand why such strange (and useless?) route tables are now assigned, seemingly at random.

-- 
Nathanaël Blanchet
Supervision réseau
SIRE
227 avenue Professeur-Jean-Louis-Viala
34193 MONTPELLIER CEDEX 5
Tél. 33 (0)4 67 54 84 55
Fax 33 (0)4 67 54 84 14
blanchet@abes.fr
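A minimal sketch, not taken from the thread, of how the per-connection check above could be repeated for every active connection at once. It assumes nmcli's terse `-g` (get-values) mode is available; the function name `list_route_tables` is hypothetical.

```shell
# Print "connection<TAB>ipv4.route-table" for each active NetworkManager connection.
# Hypothetical helper; relies only on `nmcli -g FIELD connection show ...`.
list_route_tables() {
    nmcli -g NAME connection show --active | while IFS= read -r name; do
        # </dev/null keeps the inner call from consuming the loop's input.
        table=$(nmcli -g ipv4.route-table connection show "$name" </dev/null)
        printf '%s\t%s\n' "$name" "$table"
    done
}
```

Calling `list_route_tables` on a host should show at a glance which connections have been moved out of the main table.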

On Mon, Nov 29, 2021 at 11:47 PM Nathanaël Blanchet <blanchet@abes.fr> wrote:
> Hi all,

Hi,

> I've finished migrating from 4.4.4 to 4.4.9 and I'm facing a strange issue with the routing tables on my hosts: the interfaces that carry an IP address (in particular gluster and migration, which require one) are no longer in the usual route table "254" (or "0").

Only the network with the default route role will be in the default table (254). This has been the case for quite a while. What changed in 4.4.8 is that NetworkManager is now aware of that; before, the routes were managed outside of NM, which might have caused some issues.

> For instance:
> [root@fuego ~]# nmcli con sh gluster |grep ipv4.route-table
> ipv4.route-table: 202179335
> [root@fuego ~]# nmcli con sh migration |grep ipv4.route-table
> ipv4.route-table: 316605387
> but ovirtmgmt:
> [root@fuego ~]# nmcli con sh ovirtmgmt |grep ipv4.route-table
> ipv4.route-table: 254 (main)
> and, as a result, the main route table has no entries for those networks:
> [root@ ~]# ip ro
> default via 10.34.100.65 dev ovirtmgmt proto dhcp metric 425
> 10.34.100.0/24 dev ovirtmgmt proto kernel scope link src 10.34.100.116 metric 425

Well, the main table should contain only the default route gateway. You can take a look at the other routes with:

ip route show table all

> None of the affected hosts can ping each other on these interfaces, and live migrations systematically fail.

That might be a different issue related to BZ#2022354 <https://bugzilla.redhat.com/2022354>. To check whether that's really the case, please take a look in the oVirt engine; there you should see all affected networks as out-of-sync. The BZ lists two possible workarounds.

> This behaviour is new with 4.4.9, and I don't know whether it is a new (and unfinished) network feature introduced with CentOS Stream to deal with packet filtering.
> A simple workaround would be "nmcli connection mod migration ipv4.route-table 0 && nmcli con up migration", but I'd like to understand why such strange (and useless?) route tables are now assigned, seemingly at random.

I would strongly advise against that, because there should be only one default route in the default table, except in some backup scenarios.

Let us know if it's the mentioned bug; if not, we can investigate deeper what might be wrong.

Thank you.
Best Regards,
Ales

-- 
Ales Musil
Software Engineer - RHV Network
Red Hat EMEA <https://www.redhat.com>
amusil@redhat.com IM: amusil
<https://red.ht/sig>
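The output of "ip route show table all" can get long on a host with many networks. The following is a rough sketch, not part of the original reply, that condenses it to one "table, prefix, device" line per route. The helper name `routes_by_table` is hypothetical, and it assumes (as iproute2 normally does) that routes in the main table are printed without a "table" field.

```shell
# Read `ip -4 route show table all` output on stdin and print
# "table<TAB>prefix<TAB>device" for every route.
# Routes printed without a "table" keyword are reported as "main".
# Note: lines from the local table start with a route type (e.g. "local"),
# so for those the second column shows the type rather than a prefix.
routes_by_table() {
    awk '{
        table = "main"; dev = "-"
        for (i = 1; i < NF; i++) {
            if ($i == "table") table = $(i + 1)
            if ($i == "dev")   dev = $(i + 1)
        }
        print table "\t" $1 "\t" dev
    }'
}

# Usage on a host:
#   ip -4 route show table all | routes_by_table
```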

On 30/11/2021 at 07:25, Ales Musil wrote:
> Well, the main table should contain only the default route gateway. You can take a look at the other routes with: ip route show table all

Indeed, other routes exist:

[root@fuego ~]# ip ro sh table all
10.34.101.0/24 dev gluster table 202179335 proto kernel scope link src 10.34.101.140 metric 426
10.34.106.0/23 dev admin table 100729354 proto kernel scope link src 10.34.106.72 metric 425
10.34.108.0/23 dev migration table 316605387 proto kernel scope link src 10.34.108.56 metric 427

but the kernel doesn't seem to use them the way it would if they were in the main table.
>> None of the concerned hosts can ping each other on such interfaces, and live migrations systematically fail.
> That might be a different issue related to BZ#2022354 <https://bugzilla.redhat.com/2022354>. To check if that's really the case please take a look into oVirt engine and there you should see all affected networks out-of-sync. On the BZ there are two possible workarounds.

It doesn't seem to be that BZ, because no network is out of sync in my case, but the issue could share the same root cause: the NM routing table integration.
>> A simple workaround would be "nmcli connection mod migration ipv4.route-table 0 && nmcli con up migration", but I'd like to understand why such strange (and useless?) route tables are now assigned, seemingly at random.
> I would highly suggest against that because the default route in the default table should be only one, with exception to some backup scenarios.

Note that this command doesn't add a default route on top of the main one; it only adds the link routes (with their source address) of the defined networks, which make the hosts reachable on those networks:

[root@fuego ~]# ip ro
default via 10.34.100.65 dev ovirtmgmt proto dhcp metric 425
10.34.100.0/24 dev ovirtmgmt proto kernel scope link src 10.34.100.116 metric 425
10.34.106.0/23 dev admin proto kernel scope link src 10.34.107.76 metric 450
10.34.108.0/23 dev migration proto kernel scope link src 10.34.108.121 metric 465

This behaviour is the same as before 4.4.8 and lets live migration work, because the kernel now knows to route each network through the correct bridge/interface.

In my opinion, you should be able to reproduce the bug easily, because it is the same on my 10 hosts.

Thanks for your help.
-- Nathanaël Blanchet
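Routes in a non-main table are only consulted when a policy rule ("ip rule") selects that table, so one way to test the observation above is to look for tables that hold routes but that no rule refers to. The following is a sketch under that assumption, not something posted in the thread; the helper name `tables_without_rule` is hypothetical, and it takes both command outputs as files so it can be run on saved data.

```shell
# List route tables that contain routes but are not the target of any policy rule.
# $1: file with the output of `ip -4 rule show`
# $2: file with the output of `ip -4 route show table all`
tables_without_rule() {
    awk '
        FILENAME == ARGV[1] {
            # Rule lines look like "32766: from all lookup main".
            for (i = 1; i < NF; i++) if ($i == "lookup") ruled[$(i + 1)] = 1
            next
        }
        {
            # Route lines outside the main table carry "table <id>".
            for (i = 1; i < NF; i++)
                if ($i == "table" && !($(i + 1) in ruled)) orphan[$(i + 1)] = 1
        }
        END { for (t in orphan) print t }
    ' "$1" "$2" | sort
}
```

In bash this could be called as `tables_without_rule <(ip -4 rule show) <(ip -4 route show table all)`; any table ID it prints has routes that the kernel would never look up.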

On Tue, Nov 30, 2021 at 10:08 AM Nathanaël Blanchet <blanchet@abes.fr> wrote:
> Note that this command doesn't add a default route on top of the main one; it only adds the link routes (with their source address) of the defined networks, which make the hosts reachable on those networks.
> This behaviour is the same as before 4.4.8 and lets live migration work, because the kernel now knows to route each network through the correct bridge/interface.
> In my opinion, you should be able to reproduce the bug easily, because it is the same on my 10 hosts.
> Thanks for your help.
If I understand correctly, your networks do not have any gateway associated with them (except the one with the default route role), right? So you are essentially missing the network routes in the main table. In that case you can work around it by setting the table to main, as you did, or by copying the routes.

An even better option would be to add a gateway to those networks, so that the route rules can be created properly; these rules then tell the kernel where to route packets coming from those networks.

This was working before because NM was not aware that we had some routes in different tables, and it created the network routes in the default table. Now the question is whether this is a bug, since the earlier behaviour was more or less unintentional. If you feel it should work the same way as before, please open a bug and we can discuss it there.

Thanks,
Ales
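For the "copy the routes" workaround mentioned above, a dry-run sketch may help. It is an illustration only, not a command from the thread: it reads "ip -4 route show table all" output and prints (without executing) one "ip route replace" command per link route found in a given table. The helper name `copy_routes_to_main` is hypothetical, and routes added this way are not persistent and may be rewritten by NetworkManager when the connection is reactivated.

```shell
# Dry run: print the commands that would copy the link routes of table $1
# into the main table. Input (stdin): output of `ip -4 route show table all`.
# Nothing is changed on the host; review the printed commands before running them.
copy_routes_to_main() {
    awk -v want="$1" '{
        table = ""; dev = ""; src = ""
        for (i = 1; i < NF; i++) {
            if ($i == "table") table = $(i + 1)
            if ($i == "dev")   dev = $(i + 1)
            if ($i == "src")   src = $(i + 1)
        }
        if (table == want && dev != "" && src != "")
            print "ip route replace " $1 " dev " dev " src " src " table main"
    }'
}

# Usage on a host (table ID taken from `nmcli con sh migration`):
#   ip -4 route show table all | copy_routes_to_main 316605387
```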
participants (2): Ales Musil, Nathanaël Blanchet