Unable to start VM on host with OVS networking

I have a production ovirt setup that's gone through multiple updates over the years. At some point when 4.0 or 4.1 came out, I switched from legacy networking to OVS, and everything worked perfectly until I upgraded to 4.2. Since I upgraded to 4.2, I've been getting messages that the networks were all out of sync, but everything continued working properly. Today I tracked down the network sync problem, fixed it on one of my three hosts, and then attempted to start a VM on the host. It refused to start with the error message: "Unable to add bridge ovirtmgmt port vnet0: Operation not supported". From what I can tell, the xml being generated is still for the old legacy network. I completely reinstalled the node, using the latest 4.2.3 node ISO image, and it still doesn't work. In the cluster, the switch type is "OVS (Experimental)" (and this option can't be changed, apparently), the compatibility version is 4.2, the firewall type is firewalld and there's no "Default Network Provider". I suspect that my upgrades have somehow left my system in half OVS/half legacy mode, but I'm not sure how to move it all the way to OVS mode and I don't want to mess with the other two hosts until I'm sure I've got it figured out. My (compressed) vdsm.log is at https://www.lesbg.com/jdieter/vdsm.log.xz and my (compressed) supervdsm.log is at https://www.lesbg.com/jdieter/supervdsm.log.xz. If anyone could point me in the right direction to get this fixed, I'd sure appreciate it. Jonathan

Hi Jonathan, it seems somewhat similar to what I've been struggeling with. I've worked around it with a hook script until someone can explain from where it should discover that the interface should be created on an ovs bridge. Look at the thread with topic "hosted engine with openvswitch". /Sverker Den 2018-05-17 kl. 22:09, skrev Jonathan Dieter:
I have a production ovirt setup that's gone through multiple updates over the years. At some point when 4.0 or 4.1 came out, I switched from legacy networking to OVS, and everything worked perfectly until I upgraded to 4.2. Since I upgraded to 4.2, I've been getting messages that the networks were all out of sync, but everything continued working properly.
Today I tracked down the network sync problem, fixed it on one of my three hosts, and then attempted to start a VM on the host. It refused to start with the error message: "Unable to add bridge ovirtmgmt port vnet0: Operation not supported". From what I can tell, the xml being generated is still for the old legacy network. I completely reinstalled the node, using the latest 4.2.3 node ISO image, and it still doesn't work.
In the cluster, the switch type is "OVS (Experimental)" (and this option can't be changed, apparently), the compatibility version is 4.2, the firewall type is firewalld and there's no "Default Network Provider".
I suspect that my upgrades have somehow left my system in half OVS/half legacy mode, but I'm not sure how to move it all the way to OVS mode and I don't want to mess with the other two hosts until I'm sure I've got it figured out.
My (compressed) vdsm.log is at https://www.lesbg.com/jdieter/vdsm.log.xz and my (compressed) supervdsm.log is at https://www.lesbg.com/jdieter/supervdsm.log.xz.
If anyone could point me in the right direction to get this fixed, I'd sure appreciate it.
Jonathan _______________________________________________ Users mailing list -- users@ovirt.org To unsubscribe send an email to users-leave@ovirt.org

On Mon, 2018-05-21 at 02:48 +0200, Sverker Abrahamsson wrote:
Hi Jonathan, it seems somewhat similar to what I've been struggeling with. I've worked around it with a hook script until someone can explain from where it should discover that the interface should be created on an ovs bridge. Look at the thread with topic "hosted engine with openvswitch".
/Sverker
Thanks, I've looked at it. I've temporarily bodged my network configuration back into the non-synchronized form that is actually working. Looking at the code, it seems that vdsm is getting the XML straight from the engine, but the engine doesn't seem to handle the OVS stuff right. I haven't figured out whether it's because vdsm is supposed to do some hook-like foo to make it work or whether engine is just missing the right code. Jonathan

On Tue, May 22, 2018 at 4:14 PM, Jonathan Dieter <jdieter@lesbg.com> wrote:
On Mon, 2018-05-21 at 02:48 +0200, Sverker Abrahamsson wrote:
Hi Jonathan, it seems somewhat similar to what I've been struggeling with. I've worked around it with a hook script until someone can explain from where it should discover that the interface should be created on an ovs bridge. Look at the thread with topic "hosted engine with openvswitch".
/Sverker
Thanks, I've looked at it. I've temporarily bodged my network configuration back into the non-synchronized form that is actually working.
Looking at the code, it seems that vdsm is getting the XML straight from the engine, but the engine doesn't seem to handle the OVS stuff right. I haven't figured out whether it's because vdsm is supposed to do some hook-like foo to make it work or whether engine is just missing the right code.
Hi, please check also [ovirt-users] update from 4.1.9 to 4.2.3 and OVN doubt
Jonathan _______________________________________________ Users mailing list -- users@ovirt.org To unsubscribe send an email to users-leave@ovirt.org

The answer is.. OVN replaced OVS as the networking technology. You cannot switch back to legacy, they disabled switching between ovs and legacy in the default (1st) datacenter using the gui. You can however, use Ansible to switch it. Remove the VDSM ovs setting, it will just mess you up, and it's not supported in 4.2 To able to migrate a VM in 4.2, you have use OVN with OVS. I did this a few months back, on a 4.2.2 hosted-engine setup: 0) To setup a node in a cluster, make sure the cluster is in OVS, not legacy. 1) Make sure you have an OVN controller setup somewhere. Default appears to be the ovirt-hosted-engine. a) you should also have the external network provider for OVN configured also; see the web interface. 2) when you install the node, make sure it has openvswitch installed and running - ie: a) 'systemctl status openvswitch' says it's up and running. (be sure it's enable also) b) 'ovs-vsctl show' has vdsm bridges listed, and possibly a br-int bridge. 3) if there is no br-int bridge, do 'vdsm-tool ovn-config ovn-controller-ip host-ip' 4) when you have configured several nodes in the OVN, you should see them listed as geneve devices in 'ovs-vsctl show', ie: This is a 4 node cluster, so the other 3 nodes are expected: [ root at d8-r12-c1-n3 ~]# ovs-vsctl show 42df28ba-ffd6-4e61-b7b2-219576da51ab Bridge br-int fail_mode: secure Port "ovn-27461b-0" Interface "ovn-27461b-0" type: geneve options: {csum="true", key=flow, remote_ip="192.168.85.91"} Port "vnet1" Interface "vnet1" Port "ovn-a1c08f-0" Interface "ovn-a1c08f-0" type: geneve options: {csum="true", key=flow, remote_ip="192.168.85.87"} Port "patch-br-int-to-f7a19c7d-021a-455d-bf3a-c15e212d8831" Interface "patch-br-int-to-f7a19c7d-021a-455d-bf3a-c15e212d8831" type: patch options: {peer="patch-f7a19c7d-021a-455d-bf3a-c15e212d8831-to-br-int"} Port "vnet0" Interface "vnet0" Port "patch-br-int-to-7874ba85-8f6f-4e43-9535-5a1b1353a9ec" Interface "patch-br-int-to-7874ba85-8f6f-4e43-9535-5a1b1353a9ec" type: patch options: {peer="patch-7874ba85-8f6f-4e43-9535-5a1b1353a9ec-to-br-int"} Port "ovn-8da92c-0" Interface "ovn-8da92c-0" type: geneve options: {csum="true", key=flow, remote_ip="192.168.85.95"} Port br-int Interface br-int type: internal Bridge "vdsmbr_LZmj3uJ1" Port "vdsmbr_LZmj3uJ1" Interface "vdsmbr_LZmj3uJ1" type: internal Port "net211" tag: 211 Interface "net211" type: internal Port "eno2" Interface "eno2" Bridge "vdsmbr_e7rcnufp" Port "vdsmbr_e7rcnufp" Interface "vdsmbr_e7rcnufp" type: internal Port ipmi tag: 20 Interface ipmi type: internal Port ovirtmgmt tag: 50 Interface ovirtmgmt type: internal Port "patch-f7a19c7d-021a-455d-bf3a-c15e212d8831-to-br-int" Interface "patch-f7a19c7d-021a-455d-bf3a-c15e212d8831-to-br-int" type: patch options: {peer="patch-br-int-to-f7a19c7d-021a-455d-bf3a-c15e212d8831"} Port "eno1" Interface "eno1" Port "patch-7874ba85-8f6f-4e43-9535-5a1b1353a9ec-to-br-int" Interface "patch-7874ba85-8f6f-4e43-9535-5a1b1353a9ec-to-br-int" type: patch options: {peer="patch-br-int-to-7874ba85-8f6f-4e43-9535-5a1b1353a9ec"} ovs_version: "2.7.3" 5) Create in the cluster the legacy style bridge networks - ie, ovirtmgmt, etc. Do this just like you where creating them for the legacy network. Define the VLAN #, the MTU, etc. 6) Now, create in the network config, the OVN networks - ie, ovn-ovirtmgmt is on an external provider (select OVN), and make sure 'connect to physical network' is checked, and the correct network from step 5 is picked. Save this off. This will connect the two networks together in a bridge, and all services are visible to both ie dhcp, dns.. 7) when you create the VM, select the OVN network interface, not the legacy bridge interface (this is why I decided to prefix with 'ovn-'). 8) Create the vm, start it, migrate, stop, re-start, etc, it all should work now. Lots of reading.. lots of interesting stuff found.. finally figured this out after reading a bunch of bug fixes for the latest RC (released today) The only doc link: https://ovirt.org/develop/release-management/features/network/provider-physi...

I believe the user community deserves a little background for this decision. OVS has been "experimental" since ovirt-4.0.z with migration disabled by default. We were not aware of huge benefits it had over the default Linux bridge, and did not expect people to be using it in important deployments. I would love to hear your experience regarding our OVS support, and why you have chosen it. In ovirt-4.2.0, the way in which VM libvirt definition is built has changed considerably, and takes place in ovirt-engine, not in vdsm. The vdsm code that supports OVS connectivity was disabled in ovirt-4.2.3 which means that indeed, the experimental OVS feature is no longer available for direct usage (unless you still use cluster compatibility level 4.1) However, as Thomas Davis explains, with OVN + physnet, ovirt-4.2 gives you a matching functionality, including live migration out-of-the-box. OVS switchtype was upgraded from "experimental" to "tech-preview". I'd like to drop the advisory altogether, but we keep it because we still have bugs and missing features comparing to Linux bridge clusters. We've blocked changing the switchtype of existing clusters because this functionality is buggy (particularly on the SPM host), and as of ovirt-4.2, we do not have code to support live migration from a Linux bridge host to an OVS one. Only cold migration is possible. We kept it open over REST to allow testing and bugfixes to that flow, as well as usage by careful users. Thanks for using oVirt and its new features, and for engaging with the community. Regards, Dan. On Tue, May 22, 2018 at 9:20 PM, <tadavis@lbl.gov> wrote:
The answer is..
OVN replaced OVS as the networking technology. You cannot switch back to legacy, they disabled switching between ovs and legacy in the default (1st) datacenter using the gui. You can however, use Ansible to switch it.
Remove the VDSM ovs setting, it will just mess you up, and it's not supported in 4.2
To able to migrate a VM in 4.2, you have use OVN with OVS.
I did this a few months back, on a 4.2.2 hosted-engine setup:
0) To setup a node in a cluster, make sure the cluster is in OVS, not legacy.
1) Make sure you have an OVN controller setup somewhere. Default appears to be the ovirt-hosted-engine. a) you should also have the external network provider for OVN configured also; see the web interface.
2) when you install the node, make sure it has openvswitch installed and running - ie: a) 'systemctl status openvswitch' says it's up and running. (be sure it's enable also) b) 'ovs-vsctl show' has vdsm bridges listed, and possibly a br-int bridge.
3) if there is no br-int bridge, do 'vdsm-tool ovn-config ovn-controller-ip host-ip'
4) when you have configured several nodes in the OVN, you should see them listed as geneve devices in 'ovs-vsctl show', ie:
This is a 4 node cluster, so the other 3 nodes are expected:
[ root at d8-r12-c1-n3 ~]# ovs-vsctl show 42df28ba-ffd6-4e61-b7b2-219576da51ab Bridge br-int fail_mode: secure Port "ovn-27461b-0" Interface "ovn-27461b-0" type: geneve options: {csum="true", key=flow, remote_ip="192.168.85.91"} Port "vnet1" Interface "vnet1" Port "ovn-a1c08f-0" Interface "ovn-a1c08f-0" type: geneve options: {csum="true", key=flow, remote_ip="192.168.85.87"} Port "patch-br-int-to-f7a19c7d-021a-455d-bf3a-c15e212d8831" Interface "patch-br-int-to-f7a19c7d-021a-455d-bf3a-c15e212d8831" type: patch options: {peer="patch-f7a19c7d-021a-455d-bf3a-c15e212d8831-to-br-int"} Port "vnet0" Interface "vnet0" Port "patch-br-int-to-7874ba85-8f6f-4e43-9535-5a1b1353a9ec" Interface "patch-br-int-to-7874ba85-8f6f-4e43-9535-5a1b1353a9ec" type: patch options: {peer="patch-7874ba85-8f6f-4e43-9535-5a1b1353a9ec-to-br-int"} Port "ovn-8da92c-0" Interface "ovn-8da92c-0" type: geneve options: {csum="true", key=flow, remote_ip="192.168.85.95"} Port br-int Interface br-int type: internal Bridge "vdsmbr_LZmj3uJ1" Port "vdsmbr_LZmj3uJ1" Interface "vdsmbr_LZmj3uJ1" type: internal Port "net211" tag: 211 Interface "net211" type: internal Port "eno2" Interface "eno2" Bridge "vdsmbr_e7rcnufp" Port "vdsmbr_e7rcnufp" Interface "vdsmbr_e7rcnufp" type: internal Port ipmi tag: 20 Interface ipmi type: internal Port ovirtmgmt tag: 50 Interface ovirtmgmt type: internal Port "patch-f7a19c7d-021a-455d-bf3a-c15e212d8831-to-br-int" Interface "patch-f7a19c7d-021a-455d-bf3a-c15e212d8831-to-br-int" type: patch options: {peer="patch-br-int-to-f7a19c7d-021a-455d-bf3a-c15e212d8831"} Port "eno1" Interface "eno1" Port "patch-7874ba85-8f6f-4e43-9535-5a1b1353a9ec-to-br-int" Interface "patch-7874ba85-8f6f-4e43-9535-5a1b1353a9ec-to-br-int" type: patch options: {peer="patch-br-int-to-7874ba85-8f6f-4e43-9535-5a1b1353a9ec"} ovs_version: "2.7.3"
5) Create in the cluster the legacy style bridge networks - ie, ovirtmgmt, etc. Do this just like you where creating them for the legacy network. Define the VLAN #, the MTU, etc.
6) Now, create in the network config, the OVN networks - ie, ovn-ovirtmgmt is on an external provider (select OVN), and make sure 'connect to physical network' is checked, and the correct network from step 5 is picked. Save this off.
This will connect the two networks together in a bridge, and all services are visible to both ie dhcp, dns..
7) when you create the VM, select the OVN network interface, not the legacy bridge interface (this is why I decided to prefix with 'ovn-').
8) Create the vm, start it, migrate, stop, re-start, etc, it all should work now.
Lots of reading.. lots of interesting stuff found.. finally figured this out after reading a bunch of bug fixes for the latest RC (released today)
The only doc link:
https://ovirt.org/develop/release-management/features/network/provider-physi... _______________________________________________ Users mailing list -- users@ovirt.org To unsubscribe send an email to users-leave@ovirt.org

Do I even tell you how I do the conversion from legacy to OVS? It does use ansible, and creates the vdsm nework configuration in /var/lib/vdsm for an OVS/OVN based network. It works great, the nodes join right up to oVirt engine with no configuration needed. But I know that's unsupported. :) But if your using Hosted Engine, you cannot use OVN/OVS on the Hosted Engine nodes with the 4.2.3 release because VDSM no longer creates the proper VM config.. which also means it's a chicken/egg problem. What comes first, the engine or the ovn controller? thomas

In my setup the reason why I attempted to use OVS/OVN is that I (currently) have two hosts in a data center (Hetzner) which does not have a common subnet. Each of these hosts have one public IPv4 address. Because of that I want the ovirt networks to be on private ip ranges and they do not correspond to any physical network interfaces. I've been able to work around that by making dummy interfaces visible for vdsm to which it can bind the ovirtmgmt bridge. It also means that I need to connect the obirtmgmt bridges on the two hosts with tunnels. With OVS I can set that up manually and it works great, issue is that vdsm wants to control the network interfaces and creates an OVS bridge with random name to attach ovirtmgmt port to, hence I have no convenient way to connect these bridges. I could put a hook in after_network_setup to create tunnels, but I suppose the OVS bridges needs to have the same name on both sides. I want to use ovirt for a test environment with quite complex network, hence why OVN is desired to be able to set it up virtual. With OVN + physnet as you describe it I assume I should get the desired functionality as ovirtmgmt needs to be present on all hosts. Since you disabled changing switch type in engine gui I changed it in db, don't know if anything more is needed, and in the vdsm persistent file. On vdsm level that works fine, the OVS bridge with random name is created with ovirtmgmt port, but as you write since 4.2.3 it no longer creates the port for hosted engine on the OVS bridge. I worked around that with a hook script that add the missing elements to interface section of the vm xml. My understanding is that for OVN+physnet then OVS switch is needed but doesn't then the ovirtmgmt port has to be on an OVS bridge? If so then hosted engine must be able to connect it's port to the same bridge. Next concern is where should the OVN databases be? With hosted-engine setup they are created on the vm which seems like a chicken and the egg issue to me as how could the hosts where ovn-controller runs be able to connect to db before the vm has started? In my setup the vm only have private address which means the other host is not able to reach ovn database until the virtual network has established. Therefore I created OVN databases on one of the hosts, and configured with ssl cert/keys so that they are able to communicate fine. I configured ovirt-provider-ovn running on hosted engine vm to connect to the db host but even when ovn-remote is set to the correct address on the host the requests from ovirt-provider-ovn towards port 6641 still goes to the engine vm. Are there any additional steps that is needed to take? /Sverker Den 2018-05-25 kl. 17:41, skrev Dan Kenigsberg:
I believe the user community deserves a little background for this decision. OVS has been "experimental" since ovirt-4.0.z with migration disabled by default. We were not aware of huge benefits it had over the default Linux bridge, and did not expect people to be using it in important deployments.
I would love to hear your experience regarding our OVS support, and why you have chosen it.
In ovirt-4.2.0, the way in which VM libvirt definition is built has changed considerably, and takes place in ovirt-engine, not in vdsm. The vdsm code that supports OVS connectivity was disabled in ovirt-4.2.3 which means that indeed, the experimental OVS feature is no longer available for direct usage (unless you still use cluster compatibility level 4.1)
However, as Thomas Davis explains, with OVN + physnet, ovirt-4.2 gives you a matching functionality, including live migration out-of-the-box. OVS switchtype was upgraded from "experimental" to "tech-preview". I'd like to drop the advisory altogether, but we keep it because we still have bugs and missing features comparing to Linux bridge clusters.
We've blocked changing the switchtype of existing clusters because this functionality is buggy (particularly on the SPM host), and as of ovirt-4.2, we do not have code to support live migration from a Linux bridge host to an OVS one. Only cold migration is possible. We kept it open over REST to allow testing and bugfixes to that flow, as well as usage by careful users.
Thanks for using oVirt and its new features, and for engaging with the community.
Regards, Dan.
On Tue, May 22, 2018 at 9:20 PM, <tadavis@lbl.gov> wrote:
The answer is..
OVN replaced OVS as the networking technology. You cannot switch back to legacy, they disabled switching between ovs and legacy in the default (1st) datacenter using the gui. You can however, use Ansible to switch it.
Remove the VDSM ovs setting, it will just mess you up, and it's not supported in 4.2
To able to migrate a VM in 4.2, you have use OVN with OVS.
I did this a few months back, on a 4.2.2 hosted-engine setup:
0) To setup a node in a cluster, make sure the cluster is in OVS, not legacy.
1) Make sure you have an OVN controller setup somewhere. Default appears to be the ovirt-hosted-engine. a) you should also have the external network provider for OVN configured also; see the web interface.
2) when you install the node, make sure it has openvswitch installed and running - ie: a) 'systemctl status openvswitch' says it's up and running. (be sure it's enable also) b) 'ovs-vsctl show' has vdsm bridges listed, and possibly a br-int bridge.
3) if there is no br-int bridge, do 'vdsm-tool ovn-config ovn-controller-ip host-ip'
4) when you have configured several nodes in the OVN, you should see them listed as geneve devices in 'ovs-vsctl show', ie:
This is a 4 node cluster, so the other 3 nodes are expected:
[ root at d8-r12-c1-n3 ~]# ovs-vsctl show 42df28ba-ffd6-4e61-b7b2-219576da51ab Bridge br-int fail_mode: secure Port "ovn-27461b-0" Interface "ovn-27461b-0" type: geneve options: {csum="true", key=flow, remote_ip="192.168.85.91"} Port "vnet1" Interface "vnet1" Port "ovn-a1c08f-0" Interface "ovn-a1c08f-0" type: geneve options: {csum="true", key=flow, remote_ip="192.168.85.87"} Port "patch-br-int-to-f7a19c7d-021a-455d-bf3a-c15e212d8831" Interface "patch-br-int-to-f7a19c7d-021a-455d-bf3a-c15e212d8831" type: patch options: {peer="patch-f7a19c7d-021a-455d-bf3a-c15e212d8831-to-br-int"} Port "vnet0" Interface "vnet0" Port "patch-br-int-to-7874ba85-8f6f-4e43-9535-5a1b1353a9ec" Interface "patch-br-int-to-7874ba85-8f6f-4e43-9535-5a1b1353a9ec" type: patch options: {peer="patch-7874ba85-8f6f-4e43-9535-5a1b1353a9ec-to-br-int"} Port "ovn-8da92c-0" Interface "ovn-8da92c-0" type: geneve options: {csum="true", key=flow, remote_ip="192.168.85.95"} Port br-int Interface br-int type: internal Bridge "vdsmbr_LZmj3uJ1" Port "vdsmbr_LZmj3uJ1" Interface "vdsmbr_LZmj3uJ1" type: internal Port "net211" tag: 211 Interface "net211" type: internal Port "eno2" Interface "eno2" Bridge "vdsmbr_e7rcnufp" Port "vdsmbr_e7rcnufp" Interface "vdsmbr_e7rcnufp" type: internal Port ipmi tag: 20 Interface ipmi type: internal Port ovirtmgmt tag: 50 Interface ovirtmgmt type: internal Port "patch-f7a19c7d-021a-455d-bf3a-c15e212d8831-to-br-int" Interface "patch-f7a19c7d-021a-455d-bf3a-c15e212d8831-to-br-int" type: patch options: {peer="patch-br-int-to-f7a19c7d-021a-455d-bf3a-c15e212d8831"} Port "eno1" Interface "eno1" Port "patch-7874ba85-8f6f-4e43-9535-5a1b1353a9ec-to-br-int" Interface "patch-7874ba85-8f6f-4e43-9535-5a1b1353a9ec-to-br-int" type: patch options: {peer="patch-br-int-to-7874ba85-8f6f-4e43-9535-5a1b1353a9ec"} ovs_version: "2.7.3"
5) Create in the cluster the legacy style bridge networks - ie, ovirtmgmt, etc. Do this just like you where creating them for the legacy network. Define the VLAN #, the MTU, etc.
6) Now, create in the network config, the OVN networks - ie, ovn-ovirtmgmt is on an external provider (select OVN), and make sure 'connect to physical network' is checked, and the correct network from step 5 is picked. Save this off.
This will connect the two networks together in a bridge, and all services are visible to both ie dhcp, dns..
7) when you create the VM, select the OVN network interface, not the legacy bridge interface (this is why I decided to prefix with 'ovn-').
8) Create the vm, start it, migrate, stop, re-start, etc, it all should work now.
Lots of reading.. lots of interesting stuff found.. finally figured this out after reading a bunch of bug fixes for the latest RC (released today)
The only doc link:
https://ovirt.org/develop/release-management/features/network/provider-physi... _______________________________________________ Users mailing list -- users@ovirt.org To unsubscribe send an email to users-leave@ovirt.org
Users mailing list -- users@ovirt.org To unsubscribe send an email to users-leave@ovirt.org
participants (5)
-
Dan Kenigsberg
-
Jonathan Dieter
-
Simone Tiraboschi
-
Sverker Abrahamsson
-
tadavis@lbl.gov