Need to enable STP on ovirt bridges

Hello. I have been trying to figure out an issue for a very long time. The issue is that the Ethernet and 10Gb FC links on my cluster are disabled any time a migration occurs. I believe this is because STP needs to be turned on in order to participate with the switch; however, there does not seem to be any way to tell oVirt to stop turning it off, which is very frustrating. After adding a cron job that enables STP on all bridges every minute, the migration issue disappears. Is there any way at all to do without this cron job and keep STP set to ON, without having to resort to such a silly solution?

Here are some details about my systems, if you need them. SELinux is disabled.

[root@swm-02 ~]# rpm -qa | grep ovirt
ovirt-imageio-common-1.5.1-0.el7.x86_64
ovirt-release43-4.3.5.2-1.el7.noarch
ovirt-imageio-daemon-1.5.1-0.el7.noarch
ovirt-vmconsole-host-1.0.7-2.el7.noarch
ovirt-hosted-engine-setup-2.3.11-1.el7.noarch
ovirt-ansible-hosted-engine-setup-1.0.26-1.el7.noarch
python2-ovirt-host-deploy-1.8.0-1.el7.noarch
ovirt-ansible-engine-setup-1.1.9-1.el7.noarch
python2-ovirt-setup-lib-1.2.0-1.el7.noarch
cockpit-machines-ovirt-195.1-1.el7.noarch
ovirt-hosted-engine-ha-2.3.3-1.el7.noarch
ovirt-vmconsole-1.0.7-2.el7.noarch
cockpit-ovirt-dashboard-0.13.5-1.el7.noarch
ovirt-provider-ovn-driver-1.2.22-1.el7.noarch
ovirt-host-deploy-common-1.8.0-1.el7.noarch
ovirt-host-4.3.4-1.el7.x86_64
python-ovirt-engine-sdk4-4.3.2-2.el7.x86_64
ovirt-host-dependencies-4.3.4-1.el7.x86_64
ovirt-ansible-repositories-1.1.5-1.el7.noarch

[root@swm-02 ~]# cat /etc/redhat-release
CentOS Linux release 7.6.1810 (Core)

[root@swm-02 ~]# uname -r
3.10.0-957.27.2.el7.x86_64
You have new mail in /var/spool/mail/root

[root@swm-02 ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: em1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master test state UP group default qlen 1000
    link/ether d4:ae:52:8d:50:48 brd ff:ff:ff:ff:ff:ff
3: em2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether d4:ae:52:8d:50:49 brd ff:ff:ff:ff:ff:ff
4: p1p1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovirtmgmt state UP group default qlen 1000
    link/ether 90:e2:ba:1e:14:80 brd ff:ff:ff:ff:ff:ff
5: p1p2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 90:e2:ba:1e:14:81 brd ff:ff:ff:ff:ff:ff
6: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether a2:b8:d6:e8:b3:d8 brd ff:ff:ff:ff:ff:ff
7: br-int: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 96:a0:c1:4a:45:4b brd ff:ff:ff:ff:ff:ff
25: test: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether d4:ae:52:8d:50:48 brd ff:ff:ff:ff:ff:ff
    inet 10.15.11.21/24 brd 10.15.11.255 scope global test
       valid_lft forever preferred_lft forever
26: ovirtmgmt: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 90:e2:ba:1e:14:80 brd ff:ff:ff:ff:ff:ff
    inet 10.15.28.31/24 brd 10.15.28.255 scope global ovirtmgmt
       valid_lft forever preferred_lft forever
27: ;vdsmdummy;: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 62:e5:e5:07:99:eb brd ff:ff:ff:ff:ff:ff
29: vnet0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovirtmgmt state UNKNOWN group default qlen 1000
    link/ether fe:6f:9c:95:00:02 brd ff:ff:ff:ff:ff:ff

[root@swm-02 ~]# free -m
        total    used     free   shared  buff/cache  available
Mem:    64413    1873    61804        9         735      62062
Swap:   16383       0    16383

[root@swm-02 ~]# free -h
        total    used     free   shared  buff/cache  available
Mem:      62G    1.8G      60G     9.5M        735M        60G
Swap:     15G      0B      15G

[root@swm-02 ~]# ls
ls lsb_release lshw lslocks lsmod lspci lssubsys lsusb.py lsattr lscgroup lsinitrd lslogins lsns lss16toppm lstopo-no-graphics lsblk lscpu lsipc lsmem lsof lsscsi lsusb

[root@swm-02 ~]# lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                16
On-line CPU(s) list:   0-15
Thread(s) per core:    2
Core(s) per socket:    4
Socket(s):             2
NUMA node(s):          2
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 44
Model name:            Intel(R) Xeon(R) CPU X5672 @ 3.20GHz
Stepping:              2
CPU MHz:               3192.064
BogoMIPS:              6384.12
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              12288K
NUMA node0 CPU(s):     0,2,4,6,8,10,12,14
NUMA node1 CPU(s):     1,3,5,7,9,11,13,15
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 popcnt aes lahf_lm ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid dtherm ida arat spec_ctrl intel_stibp flush_l1d
[root@swm-02 ~]#
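The exact cron job was not shared in the mail; a minimal sketch of such a workaround (assuming brctl from bridge-utils and discovering bridge names via sysfs) might look like this:

    # /etc/cron.d/force-stp  (hypothetical file name)
    # Every minute, turn STP back on for every Linux bridge on the host,
    # undoing whatever switched it off in the meantime.
    * * * * * root for br in $(ls -d /sys/class/net/*/bridge 2>/dev/null | cut -d/ -f5); do /usr/sbin/brctl stp "$br" on; done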

I haven't used FC with oVirt, but the following shows the bridge options available and how to enable Ethtool and FCoE:
https://ovirt.org/documentation/admin-guide/appe-Custom_Network_Properties.h...

Regards,
Paul S.
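For concreteness: as far as I recall, those bridge options are applied per logical network from the Administration Portal (Setup Host Networks, edit the network, Custom Properties; the exact dialog path may differ slightly), with the bridge_opts value given as whitespace-separated key=value pairs. A sketch of an attempt to turn STP on that way, which the next message reports does not stick, would be:

    # Custom Properties entry on the network (illustrative only)
    bridge_opts: stp=on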

Hi Paul,

Thank you very much for the response. However, I have tried those options. Adding "stp=on" does not enable STP on the bridges.

Each node has a 1Gb Ethernet connection as well as a 10Gb FC connection; I've tried moving the VM migration network to each and the effect is the same. VDSM disables STP when it writes the ifcfg-em1 or ifcfg-p1p1 files in /etc/sysconfig/network-scripts. This behavior can be seen with the standard "brctl show" utility while logged directly into the host.

Can the templates that VDSM uses to write those files be modified, possibly?

Thank you!
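A quick way to see what VDSM ended up writing is shown below; the commands are standard, but the output lines are illustrative only (bridge IDs derived from the MACs in the earlier ip a output):

    # Is STP enabled on the oVirt-created bridges?
    brctl show
    # bridge name   bridge id           STP enabled   interfaces
    # ovirtmgmt     8000.90e2ba1e1480   no            p1p1 vnet0
    # test          8000.d4ae528d5048   no            em1

    # And the persisted ifcfg file VDSM wrote for the bridge
    grep -i stp /etc/sysconfig/network-scripts/ifcfg-ovirtmgmt
    # STP=off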

Maybe try changing it in /var/lib/vdsm/persistence/netconf/nets/ovirtmgmt.

I'm not a networking expert, but isn't the migration network just for traffic when transferring the VM between hosts? I think the problem is more that your VM network connection is transferred to a new host and port on your physical switch.

Regards,
Paul S.
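For anyone trying this: the persisted network definition is a small JSON file. A rough sketch of what it can look like on a 4.3 host is below; the exact key set is an assumption based on VDSM's setupNetworks attributes, so check the real file first. The "stp" key is the flag of interest, the addresses shown are just this host's ovirtmgmt values, and VDSM may rewrite the file on the next Setup Host Networks run:

    [root@swm-02 ~]# cat /var/lib/vdsm/persistence/netconf/nets/ovirtmgmt
    {
        "nic": "p1p1",
        "bridged": true,
        "stp": false,
        "mtu": 1500,
        "ipaddr": "10.15.28.31",
        "netmask": "255.255.255.0",
        "defaultRoute": true,
        "switch": "legacy"
    }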

e.albany,

STP is meant to block loops in layer 2. In basic operation, a root bridge is elected which is the root of the tree. This bridge sends, essentially, 'hello' messages as multicast packets. The switches then detect the loop in the network and block one of the links to prevent such things as a broadcast storm.

There are different flavours of STP, but "STP" usually means the hellos are sent over VLAN 1 (or no VLAN). Therefore, if you have multiple VLANs on your links, the hellos are still only sent over VLAN 1 and all VLANs are dealt with together, meaning that if a link is blocked then all VLANs are blocked on that link. Then came the different flavours, one of which is per-VLAN STP. This allows individual VLANs to be blocked and gives more flexibility.

After STP has dealt with the blocking, the link stays blocked until a change in the network is detected. This is detected by the absence of the STP packets, or by the presence of new STP packets where there shouldn't be any. When this happens, STP packets are flooded everywhere to discover the new network topology. Ultimately, the loop will be blocked again.

I think that you have two STP versions running in your network and it's causing the issue. An easy test would be to remove the loop manually in the network and leave STP off on the oVirt host. You can view the topology as seen by the network's STP devices by obtaining info from the devices, such as bridge priorities etc. What is your network hardware?

Regards,
Tony
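On the host side, the bridge's own view (bridge ID, designated root, per-port states and priorities) can be dumped without any extra tools; a minimal sketch, assuming the ovirtmgmt bridge from earlier in the thread:

    # Detailed STP state and timers for one bridge
    brctl showstp ovirtmgmt

    # The same information is exposed in sysfs
    cat /sys/class/net/ovirtmgmt/bridge/stp_state   # 0 = disabled, 1 = kernel STP enabled
    cat /sys/class/net/ovirtmgmt/bridge/bridge_id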

Hey Tony!

I only know the basics of Spanning Tree. At the moment, the only way to get migrations to work at all without breaking the whole oVirt cluster is to have it on. After changing it according to Paul's instructions, it works like it has never worked before: every migration event was successful. Before, and even at times with the cron job (when VDSM set STP to off between cron runs), the link would drop out and oVirt would say that the host was "unresponsive". It was, too - it wouldn't respond to SSH, ping, ARP requests... nothing. I never got a good idea of how long this lasted, but it would eventually go away and the link would come back online.

I have no access to the hardware. From using tcpdump to capture some CDP packets, I do know that they're Cisco switches, but the IT team here is completely unresponsive (they literally ignore our tickets) and the co-lo where our servers are hosted won't even pick up the phone for anyone but them....

Unfortunately, this is what I'm going to have to do. The cluster is very functional, though. I created around 15 VMs today and migrated them from host to host without any problem.

Anything else you'd like me to try? This is currently dev, so I can really do anything I want, and I can just IPMI reboot the nodes if it causes issues...

Thanks!
cecjr
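For reference, a capture along these lines is enough to pull CDP announcements off an uplink (the filter matches the Cisco CDP multicast MAC; the interface name is just an example from this host):

    # CDP frames are sent to 01:00:0c:cc:cc:cc, typically every 60 seconds
    tcpdump -nn -v -i em1 -s 1500 'ether dst 01:00:0c:cc:cc:cc' -c 1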

If there is some other configuration that I'm not seeing, please do let me know. I'd love to know more about the expected configuration that enables VM migration and how it's "supposed" to be set up.

I have 3 hosts, and each of them has two bridges. The bridges are named the same on each host, and each bridge has a different IP address corresponding to the NIC attached to the host on that network. It seemed like the only and obvious choice...

:)

They might be using Cisco's per-VLAN spanning tree on the network side. It is possible to capture the packets coming in from the network and confirm that; I've attached a Wireshark screenshot for reference.

Glad you're all working :)

Regards,
Tony Pearce
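One way to check this from the host itself, without Wireshark: standard 802.1D/RSTP BPDUs are sent to 01:80:c2:00:00:00, while Cisco per-VLAN (PVST+) BPDUs are sent to 01:00:0c:cc:cc:cd, so seeing which destination shows up on the uplink tells you which flavour the switch is speaking. A sketch, with the interface name only as an example:

    # Standard STP/RSTP BPDUs
    tcpdump -nn -e -i p1p1 'ether dst 01:80:c2:00:00:00'
    # Cisco per-VLAN (PVST+) BPDUs
    tcpdump -nn -e -i p1p1 'ether dst 01:00:0c:cc:cc:cd'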
Hey Tony!
I only know the basics of Spanning Tree. At the current moment the only way to get migrations to work at all without breaking the whole oVirt cluster is to have it on. After changing it according to Paul's instruction, it works like it has never worked before. Every migration event was successful. Whereas before and even at times with the cronjob (when vdsm set STP to off between cron runs) the link would drop out and oVirt would say that the host was "unresponsive".
It would be too - it wouldn't respond to SSH, ping, arp requests...nothing. I never got a good idea of how long this would be for, but it would, eventually go away and the link would come back online.
I have no access to the hardware. From using tcpdump to get some CDP packets, I do know that it's Cisco switches but the IT team here is completely unresponsive (they literally ignore our tickets) and the co-lo where our servers are hosted won't even pick up the phone for anyone but them....
Unfortunately, this is what I'm going to have to do. The cluster is very functional, though. I created around 15 VMs today and migrated them from host to host without any problem.
Anything else you'd like me to try? This is currently dev, so I can really do anything I want and I can just IPMI reboot the nodes if it causes issues...
Thanks! cecjr
On Mon, Aug 19, 2019 at 9:37 PM Tony Pearce <tonyppe@gmail.com> wrote:
e.albany,
STP is meant to block loops in layer 2. In basic operation, a root bridge is elected which is the root of the tree. This bridge sends, essentially 'hello' messages as multicast packets. The switches then detect the loop in the network and block one of the links to prevent such things as a broadcast storm.
There are different flavours of STP but "STP" usually means the hellos are sent over VLAN 1 (or no vlan). Therefore if you have multiple VLANs on links, the hellos are still only sent over VLAN 1 and all VLANs are dealt with that way. Meaning if a link is blocked then all VLANs are blocked on that link,
Then came the different flavours, one of which is per-vlan STP. This allows individual VLANs to be blocked and gives more flexibility.
After STP has dealt with the blocking, this link blocking will continue until a change in the network is detected. This is detected by the absence of the STP packets or the presence of new STP packets where there shouldnt be. When this happens, STP packets are flooded everywhere to discover the new network topology. Ultimately, the loop will be blocked again.
I think that you have two STP versions running in your network and it's causing the issue. An easy test would be to remove the loop manually in the network and leave STP off on the ovirt host. You can view the topology as-per the network STP devices by obtaining info from the devices such as bridge priorities etc. What is your network hardware?
Regards,
Tony
On Tue, 20 Aug 2019 at 08:22, Staniforth, Paul <P.Staniforth@leedsbeckett.ac.uk> wrote:
I haven't used FC with oVirt but in the following it shows the bridge options available and how to enable Ethtool and FCoE.
https://ovirt.org/documentation/admin-guide/appe-Custom_Network_Properties.h...
Regards, Paul S.

Cool, I can capture some packets tomorrow when I'm in the office and see how that compares... But yeah, it's a hassle to get them to respond IF they do, so the only real options I have are what I can do with my servers from the OS. No physical access. No nice DC guy to help me out.
On Mon, Aug 19, 2019 at 10:06 PM Tony Pearce <tonyppe@gmail.com> wrote:
:) They might be using Cisco's per-vlan spanning tree on the network side. It is possible to capture the packets coming in from the network and confirm that.
Attached screenshot of wireshark for you for reference.
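Even without the screenshot, a capture along these lines should show which flavour the switch is speaking (p1p1 is just the uplink from the earlier ip a output - adjust as needed):

    # IEEE STP/RSTP hellos go to 01:80:c2:00:00:00, Cisco PVST+ hellos to
    # 01:00:0c:cc:cc:cd, and CDP frames to 01:00:0c:cc:cc:cc.
    tcpdump -i p1p1 -nn -vv -c 10 'ether dst 01:80:c2:00:00:00'   # standard STP BPDUs
    tcpdump -i p1p1 -nn -vv -c 10 'ether dst 01:00:0c:cc:cc:cd'   # per-VLAN (PVST+) BPDUs
    tcpdump -i p1p1 -nn -vv -c 5 'ether dst 01:00:0c:cc:cc:cc'    # CDP, identifies the switch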
Glad you're all working :)
Regards
Tony Pearce

A couple of links I found helpful, thought I'd send them over:
http://therandomsecurityguy.com/openvswitch-cheat-sheet/
https://ovirt.org/develop/release-management/features/network/openvswitch/na...
With STP off, if the network detects a loop then it has to block a link. With STP on, I guess it allows the network to keep forwarding and the blocking to happen elsewhere. 👍
Tony Pearce

Tony,
Well, thank you for that, but I'm not using Open vSwitch - I'm just using regular Linux bridges. Are you suggesting that I should? From what I can see in the interface, the OVS option is marked "(experimental)", and we'd like to take this to production at some point.
None of my ports are trunking between VLANs - each carries a single VLAN in untagged mode - and there are only 2 VLANs here, one per port.
Thanks again! cecjr
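P.S. If it helps to double-check what kind of bridge vdsm actually created, this is quick to see on a host (the br-int/ovs-system devices only come from the OVN driver and aren't carrying VM traffic here):

    # Kernel bridges vdsm manages (ovirtmgmt, test, ...):
    brctl show

    # Open vSwitch view - on this setup it should only list the idle br-int from OVN:
    ovs-vsctl show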

No - no recommendations from me to use either. I took it that you were using an OVS bridge, as I wasn't aware of another bridge type. The only other option I knew of is what I am using: VLAN interfaces and kernel VLAN tags. If you could share a link to what you're using, I'd be keen to read up on it and learn more.
Tony Pearce
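P.S. By VLAN interfaces and kernel VLAN tags I just mean the usual 8021q sub-interfaces, roughly like this (interface names and the VLAN ID are made up for the example):

    # Tag VLAN 100 on em1 and hang a bridge off the sub-interface
    ip link add link em1 name em1.100 type vlan id 100
    ip link add name br100 type bridge
    ip link set em1.100 master br100
    ip link set em1.100 up
    ip link set br100 up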

Oh, I'm just using the standard Linux bridges - I believe in oVirt nomenclature they're called "legacy bridges". That isn't really a deliberate choice; it's just what gets created by default, I believe. In the first link you sent they call it the "Linux legacy networking model", though I only have 2 options when I create a new switch: the default, and (it says) "openvswitch (experimental)".

I can see, just through a series of searches, that you can configure oVirt to use a variety of things. I've configured many VMware clusters before, so having what Cisco calls a trunked port (what other people call a port in "tagged mode") would be nice, but like I was saying: I can't even speak directly to the co-lo people, much less ask them to set a port to carry multiple VLANs.

So, going at this just by trying to follow the docs, I simply add a "network" in the GUI and attach the port to it in "Setup Host Networks". On each host I named the "network" the same, so that it satisfies the requirement for live migration. So, 3 hosts, 2 networks per host. Network 1 is named "ovirtmgmt", the other is named "int-1gb-10.15.11". I thought this was required because a VM must attach to the same networks on every host if it is to migrate, whether for load balancing, failover, or a manual migration. So, two links, two networks - VMs can use the switch to access the local network and the wider network with the same configuration.

As you can imagine, having the ports become un-linked without STP, and having them work properly with it, seemed to indicate that STP was the issue. If STP is not the true culprit, then I really don't know what else it can be...but in either case, it seems like there really should be an option in the interface to enable or disable it, considering the bridge utilities literally have it as a first-level option and it's probably one of the more common networking configurations....

Again, if there's some other way to do this, please, please do let me know.

Thanks again, Tony!
cecjr
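For reference, a quick way to confirm whether vdsm has switched STP back off on the oVirt-created bridges is to read the kernel's own state (this is only a sketch; the bridge name is one from this setup):

# "STP enabled" column, one row per bridge
brctl show
# per-bridge kernel flag: 0 = STP off, non-zero = STP on
cat /sys/class/net/ovirtmgmt/bridge/stp_state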
On Tue, Aug 20, 2019 at 4:51 AM Curtis E. Combs Jr. <ej.albany@gmail.com> wrote:

Oh, I'm just using the standard Linux bridges. I believe in oVirt nomenclature they're called "legacy bridges". This is not by any real choice, but just what is created by default, I believe. In the first link you sent, they are calling it the "Linux legacy networking model" - though I only have 2 options when I make a new switch: the default and (it says) "openvswitch (experimental)".
On Tue, Aug 20, 2019 at 4:47 AM Tony Pearce <tonyppe@gmail.com> wrote:
No - no recommendations from me to use either. I took it that you were using an OVS bridge, as I was not aware of another bridge. The only other option I was aware of is what I am using: VLAN interfaces and kernel VLAN tags. If you could share a link to what you're using, I would be keen to read up on it.
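For reference, that kind of kernel VLAN tagging is typically set up along these lines (interface name, VLAN ID and address are only examples, not taken from this cluster):

# create a tagged sub-interface on em1 for VLAN 100 and bring it up
ip link add link em1 name em1.100 type vlan id 100
ip link set em1.100 up
ip addr add 192.0.2.10/24 dev em1.100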
Tony Pearce
On Tue, 20 Aug 2019 at 16:27, Curtis E. Combs Jr. <ej.albany@gmail.com> wrote:
Tony,
Well, thank you for that, but I'm not using Open vSwitch, I'm just using regular bridges. Are you suggesting that I do?
From what I can see in the interface, Open vSwitch is marked "(experimental)", and we'd like to run this in production at some point.
None of my ports are trunking between VLANs - each carries a single VLAN in untagged mode - and there are only 2 VLANs here, one per port.
Thanks again! cecjr
On Mon, Aug 19, 2019 at 10:36 PM Tony Pearce <tonyppe@gmail.com> wrote:
A couple of links I found helpful, thought I'd send them over http://therandomsecurityguy.com/openvswitch-cheat-sheet/
https://ovirt.org/develop/release-management/features/network/openvswitch/na...
With STP off, if the network is detecting a loop then it will have to block a link. With STP on I guess it's allowing the network to remain forwarding and the blocking to occur elsewhere. 👍
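To see which side actually ends up blocking, the per-port STP state of a Linux bridge can be inspected on the host with something like this (the bridge name is just an example from this setup):

# root bridge, bridge ID and per-port state (forwarding/blocking)
brctl showstp ovirtmgmt
# iproute2 equivalent, shows the state of each bridge member port
bridge -d link show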
Tony Pearce
On Tue, 20 Aug 2019 at 10:12, Curtis E. Combs Jr. <ej.albany@gmail.com> wrote:
Cool, I can capture some packets tomorrow when I'm in the office and see how that compares...
But, yea, it's a hassle to get them to respond IF they do, so the only real options I'm going to have are what I can do with my servers from
physical access. No nice DC guy to help me out.
On Mon, Aug 19, 2019 at 10:06 PM Tony Pearce <tonyppe@gmail.com> wrote:
:) They might be using Cisco's per-VLAN spanning tree on the network side. It is possible to capture the packets coming in from the network and confirm that.
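A capture along these lines should show whether the switch is sending plain IEEE STP/RSTP or Cisco per-VLAN (PVST+) BPDUs (the interface name is just an example):

# 01:80:c2:00:00:00 = standard STP/RSTP BPDUs, 01:00:0c:cc:cc:cd = Cisco PVST+ BPDUs
tcpdump -nn -e -i p1p1 'ether dst 01:80:c2:00:00:00 or ether dst 01:00:0c:cc:cc:cd'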
Attached screenshot of wireshark for you for reference.
Glad you're all working :)
Regards
Tony Pearce
On Tue, 20 Aug 2019 at 09:52, Curtis E. Combs Jr. <ej.albany@gmail.com> wrote:
>
> Hey Tony!
>
> I only know the basics of Spanning Tree. At the moment, the only way
> to get migrations to work at all without breaking the whole oVirt
> cluster is to have it on. After changing it according to Paul's
> instruction, it works like it has never worked before. Every migration
> event was successful. Whereas before, and even at times with the
> cronjob (when vdsm set STP to off between cron runs), the link would
> drop out and oVirt would say that the host was "unresponsive".
>
> It would be, too - it wouldn't respond to SSH, ping, ARP
> requests...nothing. I never got a good idea of how long this would
> last, but it would eventually go away and the link would come back
> online.
>
> I have no access to the hardware. From using tcpdump to get some CDP
> packets, I do know that they are Cisco switches, but the IT team here
> is completely unresponsive (they literally ignore our tickets) and the
> co-lo where our servers are hosted won't even pick up the phone for
> anyone but them....
>
> Unfortunately, this is what I'm going to have to do. The cluster is
> very functional, though. I created around 15 VMs today and migrated
> them from host to host without any problem.
>
> Anything else you'd like me to try? This is currently dev, so I can
> really do anything I want, and I can just IPMI reboot the nodes if it
> causes issues...
>
> Thanks!
> cecjr

On Sat, Aug 17, 2019 at 11:27 AM <ej.albany@gmail.com> wrote:
After entering a cronjob that enables stp on all bridges every 1 minute, the migration issue disappears....
Is there any way at all to do without this cronjob and set STP to be ON without having to resort to such a silly solution?
Vdsm exposes a per bridge STP knob that you can use for this. By default it is set to false, which is probably why you had to use this shenanigan. You can, for instance:

# show present state
[vagrant@vdsm ~]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 52:54:00:41:fb:37 brd ff:ff:ff:ff:ff:ff
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 52:54:00:83:5b:6f brd ff:ff:ff:ff:ff:ff
    inet 192.168.50.50/24 brd 192.168.50.255 scope global noprefixroute eth1
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fe83:5b6f/64 scope link
       valid_lft forever preferred_lft forever
19: ;vdsmdummy;: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 8e:5c:2e:87:fa:0b brd ff:ff:ff:ff:ff:ff

# show example bridge configuration - you're looking for the STP knob here.
[root@vdsm ~]$ cat bridged_net_with_stp
{
    "bondings": {},
    "networks": {
        "test-network": {
            "nic": "eth0",
            "switch": "legacy",
            "bridged": true,
            "stp": true
        }
    },
    "options": {
        "connectivityCheck": false
    }
}

# issue setup networks command:
[root@vdsm ~]$ vdsm-client -f bridged_net_with_stp Host setupNetworks
{
    "code": 0,
    "message": "Done"
}

# show bridges
[root@vdsm ~]$ brctl show
bridge name     bridge id               STP enabled     interfaces
;vdsmdummy;     8000.000000000000       no
test-network    8000.52540041fb37       yes             eth0

# show final state
[root@vdsm ~]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master test-network state UP group default qlen 1000
    link/ether 52:54:00:41:fb:37 brd ff:ff:ff:ff:ff:ff
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 52:54:00:83:5b:6f brd ff:ff:ff:ff:ff:ff
    inet 192.168.50.50/24 brd 192.168.50.255 scope global noprefixroute eth1
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fe83:5b6f/64 scope link
       valid_lft forever preferred_lft forever
19: ;vdsmdummy;: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 8e:5c:2e:87:fa:0b brd ff:ff:ff:ff:ff:ff
432: test-network: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 52:54:00:41:fb:37 brd ff:ff:ff:ff:ff:ff

I don't think this STP parameter is exposed via the engine UI; @Dominik Holler, could you confirm? What are our plans for it?
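If writing a parameter file is inconvenient, vdsm-client should also accept the same parameters inline as name=JSON arguments; this is an untested sketch using the same payload as above:

# same setupNetworks call, parameters passed inline instead of via -f
vdsm-client Host setupNetworks \
    networks='{"test-network": {"nic": "eth0", "switch": "legacy", "bridged": true, "stp": true}}' \
    bondings='{}' \
    options='{"connectivityCheck": false}'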

On Thu, Aug 22, 2019 at 1:08 PM Miguel Duarte de Mora Barroso <mdbarroso@redhat.com> wrote:
I don't think this STP parameter is exposed via the engine UI; @Dominik Holler, could you confirm? What are our plans for it?
STP is only available via the REST API; see
http://ovirt.github.io/ovirt-engine-api-model/4.3/#types/network
Please find an example of how to enable STP in
https://gist.github.com/dominikholler/4e70c9ef9929d93b6807f56d43a70b95
We have no plans to add STP to the web UI, but new feature requests are always welcome at
https://bugzilla.redhat.com/enter_bug.cgi?product=ovirt-engine
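For reference, the same thing expressed as plain REST calls would look roughly like this (engine hostname, credentials and network ID are placeholders; the gist above is the authoritative example). Hosts that already carry the network may need their networks re-synced afterwards for the bridge to pick up the change:

# find the network ID by name
curl -s -k -u 'admin@internal:PASSWORD' \
     'https://engine.example.com/ovirt-engine/api/networks?search=name%3Dovirtmgmt'

# enable STP on that network
curl -s -k -u 'admin@internal:PASSWORD' \
     -X PUT -H 'Content-Type: application/xml' \
     -d '<network><stp>true</stp></network>' \
     'https://engine.example.com/ovirt-engine/api/networks/NETWORK_ID'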

Hi Dominik, would you mind sharing the use case for STP via API only? I am keen to know this. Thanks

On Thu., 22 Aug. 2019, 19:24 Dominik Holler, <dholler@redhat.com> wrote:
STP is only available via REST-API, see http://ovirt.github.io/ovirt-engine-api-model/4.3/#types/network please find an example how to enable STP in https://gist.github.com/dominikholler/4e70c9ef9929d93b6807f56d43a70b95
We have no plans to add STP to the web ui, but new feature requests are always welcome on https://bugzilla.redhat.com/enter_bug.cgi?product=ovirt-engine

Seems like the STP options are so common and necessary that they would be a priority over the seldom-used bridge_opts. I know what STP is and I'm not even a networking guy - I had never even heard of half of the bridge_opts that have switches in the UI.

Anyway. I wanted to try the open vswitches, so I reinstalled all of my nodes and used "openvswitch (Technology Preview)" as the engine-setup option for the first host. I made a new cluster for my nodes, added them all to the new cluster, created a new "logical network" for the internal network and attached it to the internal network ports.

Now, when I go to create a new VM, I don't even have either the ovirtmgmt switch OR the internal switch as an option. The drop-down is empty, as if I don't have any vnic-profiles.

On Thu, Aug 22, 2019 at 3:06 PM Curtis E. Combs Jr. <ej.albany@gmail.com> wrote:
Now, when I go to create a new VM, I don't even have either the ovirtmgmt switch OR the internal switch as an option. The drop-down is empty, as if I don't have any vnic-profiles.
Openvswitch clusters are limited to OVN networks. You can create one as described in https://www.ovirt.org/documentation/admin-guide/chap-External_Providers.html...
On Thu, Aug 22, 2019 at 7:34 AM Tony Pearce <tonyppe@gmail.com> wrote:
Hi Dominik, would you mind sharing the use case for stp via API Only? I
am keen to know this.
Thanks
On Thu., 22 Aug. 2019, 19:24 Dominik Holler, <dholler@redhat.com> wrote:
On Thu, Aug 22, 2019 at 1:08 PM Miguel Duarte de Mora Barroso <
mdbarroso@redhat.com> wrote:
On Sat, Aug 17, 2019 at 11:27 AM <ej.albany@gmail.com> wrote:
Hello. I have been trying to figure out an issue for a very long
time.
That issue relates to the ethernet and 10gb fc links that I have on my cluster being disabled any time a migration occurs.
I believe this is because I need to have STP turned on in order to participate with the switch. However, there does not seem to be any way to tell oVirt to stop turning it off! Very frustrating.
After entering a cronjob that enables stp on all bridges every 1 minute, the migration issue disappears....
Is there any way at all to do without this cronjob and set STP to be ON without having to resort to such a silly solution?
Vdsm exposes a per bridge STP knob that you can use for this. By default it is set to false, which is probably why you had to use this shenanigan.
You can, for instance:
# show present state
[vagrant@vdsm ~]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 52:54:00:41:fb:37 brd ff:ff:ff:ff:ff:ff
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 52:54:00:83:5b:6f brd ff:ff:ff:ff:ff:ff
    inet 192.168.50.50/24 brd 192.168.50.255 scope global noprefixroute eth1
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fe83:5b6f/64 scope link
       valid_lft forever preferred_lft forever
19: ;vdsmdummy;: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 8e:5c:2e:87:fa:0b brd ff:ff:ff:ff:ff:ff

# show example bridge configuration - you're looking for the STP knob here.
[root@vdsm ~]$ cat bridged_net_with_stp
{
    "bondings": {},
    "networks": {
        "test-network": {
            "nic": "eth0",
            "switch": "legacy",
            "bridged": true,
            "stp": true
        }
    },
    "options": {
        "connectivityCheck": false
    }
}

# issue setup networks command:
[root@vdsm ~]$ vdsm-client -f bridged_net_with_stp Host setupNetworks
{
    "code": 0,
    "message": "Done"
}

# show bridges
[root@vdsm ~]$ brctl show
bridge name     bridge id           STP enabled     interfaces
;vdsmdummy;     8000.000000000000   no
test-network    8000.52540041fb37   yes             eth0

# show final state
[root@vdsm ~]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master test-network state UP group default qlen 1000
    link/ether 52:54:00:41:fb:37 brd ff:ff:ff:ff:ff:ff
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 52:54:00:83:5b:6f brd ff:ff:ff:ff:ff:ff
    inet 192.168.50.50/24 brd 192.168.50.255 scope global noprefixroute eth1
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fe83:5b6f/64 scope link
       valid_lft forever preferred_lft forever
19: ;vdsmdummy;: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 8e:5c:2e:87:fa:0b brd ff:ff:ff:ff:ff:ff
432: test-network: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 52:54:00:41:fb:37 brd ff:ff:ff:ff:ff:ff
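If you want to double-check the result outside of brctl, the same information is visible in sysfs (bridge name taken from the example above; a value of 1 means kernel STP is enabled):
# confirm STP is on for the bridge created by setupNetworks
[root@vdsm ~]$ cat /sys/class/net/test-network/bridge/stp_state
1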
I don't think this STP parameter is exposed via the engine UI; @Dominik Holler, could you confirm? What are our plans for it?
STP is only available via the REST API, see http://ovirt.github.io/ovirt-engine-api-model/4.3/#types/network - please find an example of how to enable STP in https://gist.github.com/dominikholler/4e70c9ef9929d93b6807f56d43a70b95
We have no plans to add STP to the web UI, but new feature requests are always welcome on https://bugzilla.redhat.com/enter_bug.cgi?product=ovirt-engine
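For reference, a minimal sketch of that REST call (the engine FQDN, credentials and network UUID below are placeholders, and the gist above remains the authoritative example): look up the network's ID with a GET on /ovirt-engine/api/networks, then update it:
# enable STP on an existing logical network (placeholders: engine.example.com, PASSWORD, NETWORK_UUID)
curl -k -u admin@internal:PASSWORD -H "Content-Type: application/xml" -X PUT \
     -d '<network><stp>true</stp></network>' \
     https://engine.example.com/ovirt-engine/api/networks/NETWORK_UUID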

Thanks, I'm just going to revert back to bridges.

Sorry to dead bump this, but I'm beginning to suspect that maybe it's not STP that's the problem. Two of my hosts just went down when a few VMs tried to migrate. Do any of you have any idea what might be going on here? I don't even know where to start. I'm going to include the dmesg in case it helps. This happens on both of the hosts whenever any migration attempts to start.

[68099.245833] bnx2 0000:01:00.0 em1: NIC Copper Link is Down
[68099.246055] internal: port 1(em1) entered disabled state
[68184.177343] ixgbe 0000:03:00.0 p1p1: NIC Link is Down
[68184.177789] ovirtmgmt: port 1(p1p1) entered disabled state
[68184.177856] ovirtmgmt: topology change detected, propagating
[68277.078671] INFO: task qemu-kvm:8888 blocked for more than 120 seconds.
[68277.078700] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[68277.078723] qemu-kvm D ffff9db40c359040 0 8888 1 0x000001a0
[68277.078727] Call Trace:
[68277.078738] [<ffffffff978fd2ac>] ? avc_has_perm_flags+0xdc/0x1c0
[68277.078743] [<ffffffff97d69f19>] schedule+0x29/0x70
[68277.078746] [<ffffffff9785f3d9>] inode_dio_wait+0xd9/0x100
[68277.078751] [<ffffffff976c4010>] ? wake_bit_function+0x40/0x40
[68277.078765] [<ffffffffc09d6dd6>] nfs_getattr+0x1b6/0x250 [nfs]
[68277.078768] [<ffffffff97848109>] vfs_getattr+0x49/0x80
[68277.078769] [<ffffffff97848185>] vfs_fstat+0x45/0x80
[68277.078771] [<ffffffff978486f4>] SYSC_newfstat+0x24/0x60
[68277.078774] [<ffffffff97d76d21>] ? system_call_after_swapgs+0xae/0x146
[68277.078778] [<ffffffff97739f34>] ? __audit_syscall_entry+0xb4/0x110
[68277.078782] [<ffffffff9763aaeb>] ? syscall_trace_enter+0x16b/0x220
[68277.078784] [<ffffffff97848ace>] SyS_newfstat+0xe/0x10
[68277.078786] [<ffffffff97d7706b>] tracesys+0xa3/0xc9
[68397.072384] INFO: task qemu-kvm:8888 blocked for more than 120 seconds.
[68397.072413] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[68397.072436] qemu-kvm D ffff9db40c359040 0 8888 1 0x000001a0
[68397.072439] Call Trace:
[68397.072453] [<ffffffff978fd2ac>] ? avc_has_perm_flags+0xdc/0x1c0
[68397.072458] [<ffffffff97d69f19>] schedule+0x29/0x70
[68397.072462] [<ffffffff9785f3d9>] inode_dio_wait+0xd9/0x100
[68397.072467] [<ffffffff976c4010>] ? wake_bit_function+0x40/0x40
[68397.072480] [<ffffffffc09d6dd6>] nfs_getattr+0x1b6/0x250 [nfs]
[68397.072485] [<ffffffff97848109>] vfs_getattr+0x49/0x80
[68397.072486] [<ffffffff97848185>] vfs_fstat+0x45/0x80
[68397.072488] [<ffffffff978486f4>] SYSC_newfstat+0x24/0x60
[68397.072491] [<ffffffff97d76d21>] ? system_call_after_swapgs+0xae/0x146
[68397.072495] [<ffffffff97739f34>] ? __audit_syscall_entry+0xb4/0x110
[68397.072498] [<ffffffff9763aaeb>] ? syscall_trace_enter+0x16b/0x220
[68397.072500] [<ffffffff97848ace>] SyS_newfstat+0xe/0x10
[68397.072502] [<ffffffff97d7706b>] tracesys+0xa3/0xc9
[68401.573141] bnx2 0000:01:00.0 em1: NIC Copper Link is Up, 1000 Mbps full duplex
[68401.573247] internal: port 1(em1) entered blocking state
[68401.573255] internal: port 1(em1) entered listening state
[68403.576985] internal: port 1(em1) entered learning state
[68405.580907] internal: port 1(em1) entered forwarding state
[68405.580916] internal: topology change detected, propagating
[68469.565589] nfs: server swm-01.hpc.moffitt.org not responding, timed out
[68469.565840] nfs: server swm-01.hpc.moffitt.org not responding, timed out
[68487.193932] ixgbe 0000:03:00.0 p1p1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
[68487.194105] ovirtmgmt: port 1(p1p1) entered blocking state
[68487.194114] ovirtmgmt: port 1(p1p1) entered listening state
[68489.196508] ovirtmgmt: port 1(p1p1) entered learning state
[68491.200400] ovirtmgmt: port 1(p1p1) entered forwarding state
[68491.200405] ovirtmgmt: topology change detected, sending tcn bpdu
[68493.672423] NFS: nfs4_reclaim_open_state: Lock reclaim failed!
[68494.777996] NFSD: client 10.15.28.22 testing state ID with incorrect client ID
[68494.778580] NFSD: client 10.15.28.22 testing state ID with incorrect client ID

Is your storage connected via NFS? Can you manually access the storage on the host?
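For example, a quick manual check from one of the affected hosts could look like this (the storage server name below is taken from the dmesg output earlier in the thread):
# is the NFS storage domain still mounted and responsive? vdsm mounts NFS domains under /rhev/data-center/mnt/
mount | grep nfs
ls /rhev/data-center/mnt/        # if this hangs, the mount is stale
# does the storage server still answer and export anything?
showmount -e swm-01.hpc.moffitt.org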

Hey Dominik, thanks for helping. I really want to try to use oVirt. When these events happen, I cannot even SSH to the nodes because the link is down. After a little while, the hosts come back...
Is you storage connected via NFS? Can you manually access the storage on the host?
On Fri, Aug 23, 2019 at 5:19 PM Curtis E. Combs Jr. <ej.albany@gmail.com> wrote:
Sorry to dead bump this, but I'm beginning to suspect that maybe it's not STP that's the problem.
2 of my hosts just went down when a few VMs tried to migrate.
Do any of you have any idea what might be going on here? I don't even know where to start. I'm going to include the dmesg in case it helps. This happens on both of the hosts whenever any migration attempts to start.
[68099.245833] bnx2 0000:01:00.0 em1: NIC Copper Link is Down [68099.246055] internal: port 1(em1) entered disabled state [68184.177343] ixgbe 0000:03:00.0 p1p1: NIC Link is Down [68184.177789] ovirtmgmt: port 1(p1p1) entered disabled state [68184.177856] ovirtmgmt: topology change detected, propagating [68277.078671] INFO: task qemu-kvm:8888 blocked for more than 120 seconds. [68277.078700] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [68277.078723] qemu-kvm D ffff9db40c359040 0 8888 1 0x000001a0 [68277.078727] Call Trace: [68277.078738] [<ffffffff978fd2ac>] ? avc_has_perm_flags+0xdc/0x1c0 [68277.078743] [<ffffffff97d69f19>] schedule+0x29/0x70 [68277.078746] [<ffffffff9785f3d9>] inode_dio_wait+0xd9/0x100 [68277.078751] [<ffffffff976c4010>] ? wake_bit_function+0x40/0x40 [68277.078765] [<ffffffffc09d6dd6>] nfs_getattr+0x1b6/0x250 [nfs] [68277.078768] [<ffffffff97848109>] vfs_getattr+0x49/0x80 [68277.078769] [<ffffffff97848185>] vfs_fstat+0x45/0x80 [68277.078771] [<ffffffff978486f4>] SYSC_newfstat+0x24/0x60 [68277.078774] [<ffffffff97d76d21>] ? system_call_after_swapgs+0xae/0x146 [68277.078778] [<ffffffff97739f34>] ? __audit_syscall_entry+0xb4/0x110 [68277.078782] [<ffffffff9763aaeb>] ? syscall_trace_enter+0x16b/0x220 [68277.078784] [<ffffffff97848ace>] SyS_newfstat+0xe/0x10 [68277.078786] [<ffffffff97d7706b>] tracesys+0xa3/0xc9 [68397.072384] INFO: task qemu-kvm:8888 blocked for more than 120 seconds. [68397.072413] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [68397.072436] qemu-kvm D ffff9db40c359040 0 8888 1 0x000001a0 [68397.072439] Call Trace: [68397.072453] [<ffffffff978fd2ac>] ? avc_has_perm_flags+0xdc/0x1c0 [68397.072458] [<ffffffff97d69f19>] schedule+0x29/0x70 [68397.072462] [<ffffffff9785f3d9>] inode_dio_wait+0xd9/0x100 [68397.072467] [<ffffffff976c4010>] ? wake_bit_function+0x40/0x40 [68397.072480] [<ffffffffc09d6dd6>] nfs_getattr+0x1b6/0x250 [nfs] [68397.072485] [<ffffffff97848109>] vfs_getattr+0x49/0x80 [68397.072486] [<ffffffff97848185>] vfs_fstat+0x45/0x80 [68397.072488] [<ffffffff978486f4>] SYSC_newfstat+0x24/0x60 [68397.072491] [<ffffffff97d76d21>] ? system_call_after_swapgs+0xae/0x146 [68397.072495] [<ffffffff97739f34>] ? __audit_syscall_entry+0xb4/0x110 [68397.072498] [<ffffffff9763aaeb>] ? syscall_trace_enter+0x16b/0x220 [68397.072500] [<ffffffff97848ace>] SyS_newfstat+0xe/0x10 [68397.072502] [<ffffffff97d7706b>] tracesys+0xa3/0xc9 [68401.573141] bnx2 0000:01:00.0 em1: NIC Copper Link is Up, 1000 Mbps full duplex
[68401.573247] internal: port 1(em1) entered blocking state [68401.573255] internal: port 1(em1) entered listening state [68403.576985] internal: port 1(em1) entered learning state [68405.580907] internal: port 1(em1) entered forwarding state [68405.580916] internal: topology change detected, propagating [68469.565589] nfs: server swm-01.hpc.moffitt.org not responding, timed out [68469.565840] nfs: server swm-01.hpc.moffitt.org not responding, timed out [68487.193932] ixgbe 0000:03:00.0 p1p1: NIC Link is Up 10 Gbps, Flow Control: RX/TX [68487.194105] ovirtmgmt: port 1(p1p1) entered blocking state [68487.194114] ovirtmgmt: port 1(p1p1) entered listening state [68489.196508] ovirtmgmt: port 1(p1p1) entered learning state [68491.200400] ovirtmgmt: port 1(p1p1) entered forwarding state [68491.200405] ovirtmgmt: topology change detected, sending tcn bpdu [68493.672423] NFS: nfs4_reclaim_open_state: Lock reclaim failed! [68494.777996] NFSD: client 10.15.28.22 testing state ID with incorrect client ID [68494.778580] NFSD: client 10.15.28.22 testing state ID with incorrect client ID
On Thu, Aug 22, 2019 at 2:53 PM Curtis E. Combs Jr. <ej.albany@gmail.com> wrote:
Thanks, I'm just going to revert back to bridges.
On Thu, Aug 22, 2019 at 11:50 AM Dominik Holler <dholler@redhat.com> wrote:
On Thu, Aug 22, 2019 at 3:06 PM Curtis E. Combs Jr. <ej.albany@gmail.com> wrote:
Seems like the STP options are so common and necessary that it would be a priority over seldom-used bridge_opts. I know what STP is and I'm not even a networking guy - never even heard of half of the bridge_opts that have switches in the UI.
Anyway. I wanted to try the openvswitches, so I reinstalled all of my nodes and used "openvswitch (Technology Preview)" as the engine-setup option for the first host. I made a new Cluster for my nodes, added them all to the new cluster, created a new "logical network" for the internal network and attached it to the internal network ports.
Now, when I go to create a new VM, I don't even have either the ovirtmgmt switch OR the internal switch as an option. The drop-down is empy as if I don't have any vnic-profiles.
openvswitch clusters are limited to ovn networks. You can create one like described in https://www.ovirt.org/documentation/admin-guide/chap-External_Providers.html...
On Thu, Aug 22, 2019 at 7:34 AM Tony Pearce <tonyppe@gmail.com> wrote:
Hi Dominik, would you mind sharing the use case for stp via API Only? I am keen to know this. Thanks
On Thu., 22 Aug. 2019, 19:24 Dominik Holler, <dholler@redhat.com> wrote: > > > > On Thu, Aug 22, 2019 at 1:08 PM Miguel Duarte de Mora Barroso <mdbarroso@redhat.com> wrote: >> >> On Sat, Aug 17, 2019 at 11:27 AM <ej.albany@gmail.com> wrote: >> > >> > Hello. I have been trying to figure out an issue for a very long time. >> > That issue relates to the ethernet and 10gb fc links that I have on my >> > cluster being disabled any time a migration occurs. >> > >> > I believe this is because I need to have STP turned on in order to >> > participate with the switch. However, there does not seem to be any >> > way to tell oVirt to stop turning it off! Very frustrating. >> > >> > After entering a cronjob that enables stp on all bridges every 1 >> > minute, the migration issue disappears.... >> > >> > Is there any way at all to do without this cronjob and set STP to be >> > ON without having to resort to such a silly solution? >> >> Vdsm exposes a per bridge STP knob that you can use for this. By >> default it is set to false, which is probably why you had to use this >> shenanigan. >> >> You can, for instance: >> >> # show present state >> [vagrant@vdsm ~]$ ip a >> 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN >> group default qlen 1000 >> link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 >> inet 127.0.0.1/8 scope host lo >> valid_lft forever preferred_lft forever >> 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast >> state UP group default qlen 1000 >> link/ether 52:54:00:41:fb:37 brd ff:ff:ff:ff:ff:ff >> 3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast >> state UP group default qlen 1000 >> link/ether 52:54:00:83:5b:6f brd ff:ff:ff:ff:ff:ff >> inet 192.168.50.50/24 brd 192.168.50.255 scope global noprefixroute eth1 >> valid_lft forever preferred_lft forever >> inet6 fe80::5054:ff:fe83:5b6f/64 scope link >> valid_lft forever preferred_lft forever >> 19: ;vdsmdummy;: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN >> group default qlen 1000 >> link/ether 8e:5c:2e:87:fa:0b brd ff:ff:ff:ff:ff:ff >> >> # show example bridge configuration - you're looking for the STP knob here. 
>> [root@vdsm ~]$ cat bridged_net_with_stp >> { >> "bondings": {}, >> "networks": { >> "test-network": { >> "nic": "eth0", >> "switch": "legacy", >> "bridged": true, >> "stp": true >> } >> }, >> "options": { >> "connectivityCheck": false >> } >> } >> >> # issue setup networks command: >> [root@vdsm ~]$ vdsm-client -f bridged_net_with_stp Host setupNetworks >> { >> "code": 0, >> "message": "Done" >> } >> >> # show bridges >> [root@vdsm ~]$ brctl show >> bridge name bridge id STP enabled interfaces >> ;vdsmdummy; 8000.000000000000 no >> test-network 8000.52540041fb37 yes eth0 >> >> # show final state >> [root@vdsm ~]$ ip a >> 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN >> group default qlen 1000 >> link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 >> inet 127.0.0.1/8 scope host lo >> valid_lft forever preferred_lft forever >> 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast >> master test-network state UP group default qlen 1000 >> link/ether 52:54:00:41:fb:37 brd ff:ff:ff:ff:ff:ff >> 3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast >> state UP group default qlen 1000 >> link/ether 52:54:00:83:5b:6f brd ff:ff:ff:ff:ff:ff >> inet 192.168.50.50/24 brd 192.168.50.255 scope global noprefixroute eth1 >> valid_lft forever preferred_lft forever >> inet6 fe80::5054:ff:fe83:5b6f/64 scope link >> valid_lft forever preferred_lft forever >> 19: ;vdsmdummy;: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN >> group default qlen 1000 >> link/ether 8e:5c:2e:87:fa:0b brd ff:ff:ff:ff:ff:ff >> 432: test-network: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc >> noqueue state UP group default qlen 1000 >> link/ether 52:54:00:41:fb:37 brd ff:ff:ff:ff:ff:ff >> >> I don't think this STP parameter is exposed via engine UI; @Dominik >> Holler , could you confirm ? What are our plans for it ? >> > > STP is only available via REST-API, see > http://ovirt.github.io/ovirt-engine-api-model/4.3/#types/network > please find an example how to enable STP in > https://gist.github.com/dominikholler/4e70c9ef9929d93b6807f56d43a70b95 > > We have no plans to add STP to the web ui, > but new feature requests are always welcome on > https://bugzilla.redhat.com/enter_bug.cgi?product=ovirt-engine > > >> >> > >> > Here are some details about my systems, if you need it. >> > >> > >> > selinux is disabled. 

Also, if it helps, the hosts will sit there, quietly, for hours or days before anything happens. They're up and working just fine. But then, when I manually migrate a VM from one host to another, they become completely inaccessible. These are vanilla-as-possible CentOS7 nodes. Very basic ovirt install and configuration.

On Fri, Aug 23, 2019 at 11:33 AM Curtis E. Combs Jr. <ej.albany@gmail.com> wrote:
Hey Dominik,
Thanks for helping. I really want to try to use ovirt.
When these events happen, I cannot even SSH to the nodes due to the link being down. After a little while, the hosts come back...
On Fri, Aug 23, 2019 at 11:30 AM Dominik Holler <dholler@redhat.com> wrote:
Is your storage connected via NFS? Can you manually access the storage on the host?
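(One way to check that manually from a host shell, as a rough sketch; the server name and export path are examples based on the NFS errors later in the thread, not confirmed values.)

# is the NFS server reachable and still exporting?
showmount -e swm-01.hpc.moffitt.org

# vdsm mounts NFS storage domains under /rhev/data-center/mnt/<server>:<escaped_path>
ls /rhev/data-center/mnt/
# a hanging or very slow listing here usually means the mount itself is stuck
time ls /rhev/data-center/mnt/swm-01.hpc.moffitt.org:_export_data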
On Fri, Aug 23, 2019 at 5:19 PM Curtis E. Combs Jr. <ej.albany@gmail.com> wrote:
Sorry to dead bump this, but I'm beginning to suspect that maybe it's not STP that's the problem.
2 of my hosts just went down when a few VMs tried to migrate.
Do any of you have any idea what might be going on here? I don't even know where to start. I'm going to include the dmesg in case it helps. This happens on both of the hosts whenever any migration attempts to start.
[68099.245833] bnx2 0000:01:00.0 em1: NIC Copper Link is Down
[68099.246055] internal: port 1(em1) entered disabled state
[68184.177343] ixgbe 0000:03:00.0 p1p1: NIC Link is Down
[68184.177789] ovirtmgmt: port 1(p1p1) entered disabled state
[68184.177856] ovirtmgmt: topology change detected, propagating
[68277.078671] INFO: task qemu-kvm:8888 blocked for more than 120 seconds.
[68277.078700] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[68277.078723] qemu-kvm D ffff9db40c359040 0 8888 1 0x000001a0
[68277.078727] Call Trace:
[68277.078738] [<ffffffff978fd2ac>] ? avc_has_perm_flags+0xdc/0x1c0
[68277.078743] [<ffffffff97d69f19>] schedule+0x29/0x70
[68277.078746] [<ffffffff9785f3d9>] inode_dio_wait+0xd9/0x100
[68277.078751] [<ffffffff976c4010>] ? wake_bit_function+0x40/0x40
[68277.078765] [<ffffffffc09d6dd6>] nfs_getattr+0x1b6/0x250 [nfs]
[68277.078768] [<ffffffff97848109>] vfs_getattr+0x49/0x80
[68277.078769] [<ffffffff97848185>] vfs_fstat+0x45/0x80
[68277.078771] [<ffffffff978486f4>] SYSC_newfstat+0x24/0x60
[68277.078774] [<ffffffff97d76d21>] ? system_call_after_swapgs+0xae/0x146
[68277.078778] [<ffffffff97739f34>] ? __audit_syscall_entry+0xb4/0x110
[68277.078782] [<ffffffff9763aaeb>] ? syscall_trace_enter+0x16b/0x220
[68277.078784] [<ffffffff97848ace>] SyS_newfstat+0xe/0x10
[68277.078786] [<ffffffff97d7706b>] tracesys+0xa3/0xc9
[68397.072384] INFO: task qemu-kvm:8888 blocked for more than 120 seconds.
[68397.072413] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[68397.072436] qemu-kvm D ffff9db40c359040 0 8888 1 0x000001a0
[68397.072439] Call Trace:
[68397.072453] [<ffffffff978fd2ac>] ? avc_has_perm_flags+0xdc/0x1c0
[68397.072458] [<ffffffff97d69f19>] schedule+0x29/0x70
[68397.072462] [<ffffffff9785f3d9>] inode_dio_wait+0xd9/0x100
[68397.072467] [<ffffffff976c4010>] ? wake_bit_function+0x40/0x40
[68397.072480] [<ffffffffc09d6dd6>] nfs_getattr+0x1b6/0x250 [nfs]
[68397.072485] [<ffffffff97848109>] vfs_getattr+0x49/0x80
[68397.072486] [<ffffffff97848185>] vfs_fstat+0x45/0x80
[68397.072488] [<ffffffff978486f4>] SYSC_newfstat+0x24/0x60
[68397.072491] [<ffffffff97d76d21>] ? system_call_after_swapgs+0xae/0x146
[68397.072495] [<ffffffff97739f34>] ? __audit_syscall_entry+0xb4/0x110
[68397.072498] [<ffffffff9763aaeb>] ? syscall_trace_enter+0x16b/0x220
[68397.072500] [<ffffffff97848ace>] SyS_newfstat+0xe/0x10
[68397.072502] [<ffffffff97d7706b>] tracesys+0xa3/0xc9
[68401.573141] bnx2 0000:01:00.0 em1: NIC Copper Link is Up, 1000 Mbps full duplex
[68401.573247] internal: port 1(em1) entered blocking state
[68401.573255] internal: port 1(em1) entered listening state
[68403.576985] internal: port 1(em1) entered learning state
[68405.580907] internal: port 1(em1) entered forwarding state
[68405.580916] internal: topology change detected, propagating
[68469.565589] nfs: server swm-01.hpc.moffitt.org not responding, timed out
[68469.565840] nfs: server swm-01.hpc.moffitt.org not responding, timed out
[68487.193932] ixgbe 0000:03:00.0 p1p1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
[68487.194105] ovirtmgmt: port 1(p1p1) entered blocking state
[68487.194114] ovirtmgmt: port 1(p1p1) entered listening state
[68489.196508] ovirtmgmt: port 1(p1p1) entered learning state
[68491.200400] ovirtmgmt: port 1(p1p1) entered forwarding state
[68491.200405] ovirtmgmt: topology change detected, sending tcn bpdu
[68493.672423] NFS: nfs4_reclaim_open_state: Lock reclaim failed!
[68494.777996] NFSD: client 10.15.28.22 testing state ID with incorrect client ID
[68494.778580] NFSD: client 10.15.28.22 testing state ID with incorrect client ID
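(The log above shows the bridge ports walking through the usual STP states after each link flap: blocking, listening, learning, forwarding. Together with the link drop itself that is long enough for NFS to time out. A rough sketch of how to inspect a bridge's STP settings on a host; the bridge name is taken from the log, and the sysfs values are typically reported in hundredths of a second.)

# kernel's view of STP on the management bridge
brctl showstp ovirtmgmt

# raw sysfs view of the same settings
cat /sys/class/net/ovirtmgmt/bridge/stp_state
cat /sys/class/net/ovirtmgmt/bridge/forward_delay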
On Thu, Aug 22, 2019 at 2:53 PM Curtis E. Combs Jr. <ej.albany@gmail.com> wrote:
Thanks, I'm just going to revert back to bridges.
On Thu, Aug 22, 2019 at 11:50 AM Dominik Holler <dholler@redhat.com> wrote:
On Thu, Aug 22, 2019 at 3:06 PM Curtis E. Combs Jr. <ej.albany@gmail.com> wrote:
Seems like the STP options are so common and necessary that they would be a priority over seldom-used bridge_opts. I know what STP is and I'm not even a networking guy - never even heard of half of the bridge_opts that have switches in the UI.
Anyway. I wanted to try the openvswitches, so I reinstalled all of my nodes and used "openvswitch (Technology Preview)" as the engine-setup option for the first host. I made a new Cluster for my nodes, added them all to the new cluster, created a new "logical network" for the internal network and attached it to the internal network ports.
Now, when I go to create a new VM, I don't even have either the ovirtmgmt switch OR the internal switch as an option. The drop-down is empty as if I don't have any vnic-profiles.
openvswitch clusters are limited to ovn networks. You can create one as described in https://www.ovirt.org/documentation/admin-guide/chap-External_Providers.html...
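(A hedged aside on the OVN point: networks created through the ovirt-provider-ovn external provider end up as OVN logical switches, so a rough way to confirm the OVN side is alive is to query it on the engine/provider host. These are standard OVN and systemd commands, not something shown in this thread.)

# list the OVN logical switches known to the northbound database
ovn-nbctl ls-list

# is the oVirt OVN provider service running on the engine host?
systemctl status ovirt-provider-ovn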
On Thu, Aug 22, 2019 at 7:34 AM Tony Pearce <tonyppe@gmail.com> wrote:
>
> Hi Dominik, would you mind sharing the use case for stp via API Only? I am keen to know this.
> Thanks

On Fri, Aug 23, 2019 at 5:41 PM Curtis E. Combs Jr. <ej.albany@gmail.com> wrote:
Also, if it helps, the hosts will sit there, quietly, for hours or days before anything happens. They're up and working just fine. But then, when I manually migrate a VM from one host to another, they become completely inaccessible.
Can you share some details about your storage? Maybe there is a feature used during live migration which triggers the issue.

Sure! Right now, I only have a 500GB partition on each node shared over NFS, added as storage domains. This is on each node - so, currently 3.

How can the storage cause a node to drop out?

On Fri, Aug 23, 2019 at 5:49 PM Curtis E. Combs Jr. <ej.albany@gmail.com> wrote:
Sure! Right now, I only have a 500gb partition on each node shared over NFS, added as storage domains. This is on each node - so, currently 3.
How can the storage cause a node to drop out?
Thanks, I got it. All three links go down on load, which causes NFS to fail.
Can you check in the switch port configuration if there is some kind of Ethernet flow control enabled? Can you try to modify the behavior by changing the settings of your host interfaces, e.g.
ethtool -A em1 autoneg off rx off tx off
or ethtool -A em1 autoneg on rx on tx on ?
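For reference, a quick way to see whether pause/flow control is currently negotiated on an interface before and after such a change (a sketch only; em1 is just the example NIC used above, and not every driver exposes pause counters):
# current pause settings (Autonegotiate / RX / TX)
ethtool -a em1
# pause-related counters, if the driver reports any
ethtool -S em1 | grep -i pause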
On Fri, Aug 23, 2019, 11:46 AM Dominik Holler <dholler@redhat.com> wrote:
On Fri, Aug 23, 2019 at 5:41 PM Curtis E. Combs Jr. <ej.albany@gmail.com> wrote:
Also, if it helps, the hosts will sit there, quietly, for hours or days before anything happens. They're up and working just fine. But then, when I manually migrate a VM from one host to another, they become completely inaccessible.
Can you share some details about your storage? Maybe there is a feature used during live migration which triggers the issue.
These are vanilla-as-possible CentOS7 nodes. Very basic ovirt install and configuration.
On Fri, Aug 23, 2019 at 11:33 AM Curtis E. Combs Jr. <ej.albany@gmail.com> wrote:
Hey Dominik,
Thanks for helping. I really want to try to use ovirt.
When these events happen, I cannot even SSH to the nodes due to the link being down. After a little while, the hosts come back...
On Fri, Aug 23, 2019 at 11:30 AM Dominik Holler <dholler@redhat.com> wrote:
Is your storage connected via NFS? Can you manually access the storage on the host?
On Fri, Aug 23, 2019 at 5:19 PM Curtis E. Combs Jr. <ej.albany@gmail.com> wrote:
Sorry to dead bump this, but I'm beginning to suspect that maybe it's not STP that's the problem.
2 of my hosts just went down when a few VMs tried to migrate.
Do any of you have any idea what might be going on here? I don't even know where to start. I'm going to include the dmesg in case it helps. This happens on both of the hosts whenever any migration attempts to start.
[68099.245833] bnx2 0000:01:00.0 em1: NIC Copper Link is Down
[68099.246055] internal: port 1(em1) entered disabled state
[68184.177343] ixgbe 0000:03:00.0 p1p1: NIC Link is Down
[68184.177789] ovirtmgmt: port 1(p1p1) entered disabled state
[68184.177856] ovirtmgmt: topology change detected, propagating
[68277.078671] INFO: task qemu-kvm:8888 blocked for more than 120 seconds.
[68277.078700] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[68277.078723] qemu-kvm D ffff9db40c359040 0 8888 1 0x000001a0
[68277.078727] Call Trace:
[68277.078738] [<ffffffff978fd2ac>] ? avc_has_perm_flags+0xdc/0x1c0
[68277.078743] [<ffffffff97d69f19>] schedule+0x29/0x70
[68277.078746] [<ffffffff9785f3d9>] inode_dio_wait+0xd9/0x100
[68277.078751] [<ffffffff976c4010>] ? wake_bit_function+0x40/0x40
[68277.078765] [<ffffffffc09d6dd6>] nfs_getattr+0x1b6/0x250 [nfs]
[68277.078768] [<ffffffff97848109>] vfs_getattr+0x49/0x80
[68277.078769] [<ffffffff97848185>] vfs_fstat+0x45/0x80
[68277.078771] [<ffffffff978486f4>] SYSC_newfstat+0x24/0x60
[68277.078774] [<ffffffff97d76d21>] ? system_call_after_swapgs+0xae/0x146
[68277.078778] [<ffffffff97739f34>] ? __audit_syscall_entry+0xb4/0x110
[68277.078782] [<ffffffff9763aaeb>] ? syscall_trace_enter+0x16b/0x220
[68277.078784] [<ffffffff97848ace>] SyS_newfstat+0xe/0x10
[68277.078786] [<ffffffff97d7706b>] tracesys+0xa3/0xc9
[68397.072384] INFO: task qemu-kvm:8888 blocked for more than 120 seconds.
[68397.072413] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[68397.072436] qemu-kvm D ffff9db40c359040 0 8888 1 0x000001a0
[68397.072439] Call Trace:
[68397.072453] [<ffffffff978fd2ac>] ? avc_has_perm_flags+0xdc/0x1c0
[68397.072458] [<ffffffff97d69f19>] schedule+0x29/0x70
[68397.072462] [<ffffffff9785f3d9>] inode_dio_wait+0xd9/0x100
[68397.072467] [<ffffffff976c4010>] ? wake_bit_function+0x40/0x40
[68397.072480] [<ffffffffc09d6dd6>] nfs_getattr+0x1b6/0x250 [nfs]
[68397.072485] [<ffffffff97848109>] vfs_getattr+0x49/0x80
[68397.072486] [<ffffffff97848185>] vfs_fstat+0x45/0x80
[68397.072488] [<ffffffff978486f4>] SYSC_newfstat+0x24/0x60
[68397.072491] [<ffffffff97d76d21>] ? system_call_after_swapgs+0xae/0x146
[68397.072495] [<ffffffff97739f34>] ? __audit_syscall_entry+0xb4/0x110
[68397.072498] [<ffffffff9763aaeb>] ? syscall_trace_enter+0x16b/0x220
[68397.072500] [<ffffffff97848ace>] SyS_newfstat+0xe/0x10
[68397.072502] [<ffffffff97d7706b>] tracesys+0xa3/0xc9
[68401.573141] bnx2 0000:01:00.0 em1: NIC Copper Link is Up, 1000 Mbps full duplex
[68401.573247] internal: port 1(em1) entered blocking state
[68401.573255] internal: port 1(em1) entered listening state
[68403.576985] internal: port 1(em1) entered learning state
[68405.580907] internal: port 1(em1) entered forwarding state
[68405.580916] internal: topology change detected, propagating
[68469.565589] nfs: server swm-01.hpc.moffitt.org not responding, timed out
[68469.565840] nfs: server swm-01.hpc.moffitt.org not responding, timed out
[68487.193932] ixgbe 0000:03:00.0 p1p1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
[68487.194105] ovirtmgmt: port 1(p1p1) entered blocking state
[68487.194114] ovirtmgmt: port 1(p1p1) entered listening state
[68489.196508] ovirtmgmt: port 1(p1p1) entered learning state
[68491.200400] ovirtmgmt: port 1(p1p1) entered forwarding state
[68491.200405] ovirtmgmt: topology change detected, sending tcn bpdu
[68493.672423] NFS: nfs4_reclaim_open_state: Lock reclaim failed!
[68494.777996] NFSD: client 10.15.28.22 testing state ID with incorrect client ID
[68494.778580] NFSD: client 10.15.28.22 testing state ID with incorrect client ID
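The disabled -> blocking -> listening -> learning -> forwarding lines above are the bridge ports walking through the STP port states after each link flap. A minimal sketch for watching this live on a host while a migration runs (assumes the bridges are named ovirtmgmt and internal as in this log; only standard iproute2/util-linux tools, nothing oVirt-specific):
# is STP enabled on the bridge (stp_state 1), and which state are its ports in?
ip -d link show ovirtmgmt | grep -o 'stp_state [01]'
bridge link show
# follow link flaps and STP transitions as they happen
dmesg --follow | grep -E 'Link is|entered|topology change'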

Unfortunately, I can't check on the switch. Trust me, I've tried. These servers are in a Co-Lo and I've put 5 tickets in asking about the port configuration. They just get ignored - but that's par for the course for IT here. Only about 2 out of 10 of our tickets get any response, and usually the response doesn't help. Then the system they use auto-closes the ticket. That was why I was suspecting STP before.
I can do ethtool. I do have root on these servers, though. Are you trying to get me to turn off link-speed auto-negotiation? Would you like me to try that?

On Fri, Aug 23, 2019 at 6:45 PM Curtis E. Combs Jr. <ej.albany@gmail.com> wrote:
Unfortunately, I can't check on the switch. Trust me, I've tried. These servers are in a Co-Lo and I've put 5 tickets in asking about the port configuration. They just get ignored - but that's par for the course for IT here. Only about 2 out of 10 of our tickets get any response, and usually the response doesn't help. Then the system they use auto-closes the ticket. That was why I was suspecting STP before.
I can do ethtool. I do have root on these servers, though. Are you trying to get me to turn off link-speed auto-negotiation? Would you like me to try that?
It is just a suspicion that the reason is pause frames. Let's start on a NIC which is not used for ovirtmgmt, I guess em1.
Does 'ethtool -S em1 | grep pause' show something? Does 'ethtool em1 | grep pause' indicate support for pause? The current config is shown by 'ethtool -a em1'.
'-A autoneg' "Specifies whether pause autonegotiation should be enabled." according to the ethtool documentation.
Assuming flow control is enabled by default, I would try to disable it via 'ethtool -A em1 autoneg off rx off tx off', check that it is applied via 'ethtool -a em1', and then check whether the behavior under load changes.
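Put together, the checks suggested above might look like this on one host (a sketch only, using em1 as the example NIC; the last two steps assume flow control turns out to be enabled):
# 1. does the driver report pause-frame counters / pause support?
ethtool -S em1 | grep pause
ethtool em1 | grep -i pause
# 2. current pause configuration
ethtool -a em1
# 3. disable pause autonegotiation and RX/TX pause frames
ethtool -A em1 autoneg off rx off tx off
# 4. confirm the change took effect, then retry a migration under load
ethtool -a em1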

This little cluster isn't in production or anything like that yet. So, I went ahead and used your ethtool commands to disable pause frames on both interfaces of each server. I then chose a few VMs to migrate around at random.
swm-02 and swm-03 both went out again. Unreachable. Can't ping, can't ssh, and the SSH session that I had open was unresponsive.
Any other ideas?

On Fri, Aug 23, 2019 at 1:50 PM Dominik Holler <dholler@redhat.com> wrote:
On Fri, Aug 23, 2019 at 6:45 PM Curtis E. Combs Jr. <ej.albany@gmail.com> wrote:
Unfortunately, I can't check on the switch. Trust me, I've tried. These servers are in a Co-Lo and I've put 5 tickets in asking about the port configuration. They just get ignored - but that's par for the course for IT here. Only about 2 out of 10 of our tickets get any response and usually the response doesn't help. Then the system they use auto-closes the ticket. That was why I was suspecting STP before.
I can do ethtool. I do have root on these servers, though. Are you trying to get me to turn off link-speed auto-negotiation? Would you like me to try that?
It is just a suspicion that the reason is pause frames. Let's start on a NIC which is not used for ovirtmgmt, I guess em1. Does 'ethtool -S em1 | grep pause' show something? Does 'ethtool em1 | grep pause' indicate support for pause? The current config is shown by 'ethtool -a em1'. '-A autoneg' "Specifies whether pause autonegotiation should be enabled," according to the ethtool documentation. Assuming flow control is enabled by default, I would try to disable it via 'ethtool -A em1 autoneg off rx off tx off', check that it is applied via 'ethtool -a em1', and check whether the behavior under load changes.
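Putting those commands together, a minimal check-and-disable sequence might look like the following (a sketch only; it assumes em1 is the non-ovirtmgmt NIC named above and that its driver accepts these ethtool options):

# show pause-frame counters, if the driver exposes any
ethtool -S em1 | grep -i pause
# show the current flow-control (pause) settings
ethtool -a em1
# disable pause autonegotiation and RX/TX flow control
ethtool -A em1 autoneg off rx off tx off
# confirm the new settings, then repeat the migration test under load
ethtool -a em1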
On Fri, Aug 23, 2019 at 12:24 PM Dominik Holler <dholler@redhat.com> wrote:
On Fri, Aug 23, 2019 at 5:49 PM Curtis E. Combs Jr. <ej.albany@gmail.com> wrote:
Sure! Right now, I only have a 500 GB partition on each node, shared over NFS and added as storage domains - so, currently 3.
How can the storage cause a node to drop out?
Thanks, I got it. All three links go down on load, which causes NFS to fail.
Can you check in the switch port configuration whether some kind of Ethernet flow control is enabled? Can you try to modify the behavior by changing the settings of your host interfaces, e.g.
ethtool -A em1 autoneg off rx off tx off
or ethtool -A em1 autoneg on rx on tx on ?
On Fri, Aug 23, 2019, 11:46 AM Dominik Holler <dholler@redhat.com> wrote:
On Fri, Aug 23, 2019 at 5:41 PM Curtis E. Combs Jr. <ej.albany@gmail.com> wrote:
Also, if it helps, the hosts will sit there, quietly, for hours or days before anything happens. They're up and working just fine. But then, when I manually migrate a VM from one host to another, they become completely inaccessible.
Can you share some details about your storage? Maybe there is a feature used during live migration, which triggers the issue.
These are vanilla-as-possible CentOS7 nodes. Very basic ovirt install and configuration.
On Fri, Aug 23, 2019 at 11:33 AM Curtis E. Combs Jr. <ej.albany@gmail.com> wrote: > > Hey Dominik, > > Thanks for helping. I really want to try to use ovirt. > > When these events happen, I cannot even SSH to the nodes due to the > link being down. After a little while, the hosts come back... > > On Fri, Aug 23, 2019 at 11:30 AM Dominik Holler <dholler@redhat.com> wrote: > > > > Is you storage connected via NFS? > > Can you manually access the storage on the host? > > > > > > On Fri, Aug 23, 2019 at 5:19 PM Curtis E. Combs Jr. <ej.albany@gmail.com> wrote: > >> > >> Sorry to dead bump this, but I'm beginning to suspect that maybe it's > >> not STP that's the problem. > >> > >> 2 of my hosts just went down when a few VMs tried to migrate. > >> > >> Do any of you have any idea what might be going on here? I don't even > >> know where to start. I'm going to include the dmesg in case it helps. > >> This happens on both of the hosts whenever any migration attempts to start. > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> [68099.245833] bnx2 0000:01:00.0 em1: NIC Copper Link is Down > >> [68099.246055] internal: port 1(em1) entered disabled state > >> [68184.177343] ixgbe 0000:03:00.0 p1p1: NIC Link is Down > >> [68184.177789] ovirtmgmt: port 1(p1p1) entered disabled state > >> [68184.177856] ovirtmgmt: topology change detected, propagating > >> [68277.078671] INFO: task qemu-kvm:8888 blocked for more than 120 seconds. > >> [68277.078700] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" > >> disables this message. > >> [68277.078723] qemu-kvm D ffff9db40c359040 0 8888 1 0x000001a0 > >> [68277.078727] Call Trace: > >> [68277.078738] [<ffffffff978fd2ac>] ? avc_has_perm_flags+0xdc/0x1c0 > >> [68277.078743] [<ffffffff97d69f19>] schedule+0x29/0x70 > >> [68277.078746] [<ffffffff9785f3d9>] inode_dio_wait+0xd9/0x100 > >> [68277.078751] [<ffffffff976c4010>] ? wake_bit_function+0x40/0x40 > >> [68277.078765] [<ffffffffc09d6dd6>] nfs_getattr+0x1b6/0x250 [nfs] > >> [68277.078768] [<ffffffff97848109>] vfs_getattr+0x49/0x80 > >> [68277.078769] [<ffffffff97848185>] vfs_fstat+0x45/0x80 > >> [68277.078771] [<ffffffff978486f4>] SYSC_newfstat+0x24/0x60 > >> [68277.078774] [<ffffffff97d76d21>] ? system_call_after_swapgs+0xae/0x146 > >> [68277.078778] [<ffffffff97739f34>] ? __audit_syscall_entry+0xb4/0x110 > >> [68277.078782] [<ffffffff9763aaeb>] ? syscall_trace_enter+0x16b/0x220 > >> [68277.078784] [<ffffffff97848ace>] SyS_newfstat+0xe/0x10 > >> [68277.078786] [<ffffffff97d7706b>] tracesys+0xa3/0xc9 > >> [68397.072384] INFO: task qemu-kvm:8888 blocked for more than 120 seconds. > >> [68397.072413] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" > >> disables this message. > >> [68397.072436] qemu-kvm D ffff9db40c359040 0 8888 1 0x000001a0 > >> [68397.072439] Call Trace: > >> [68397.072453] [<ffffffff978fd2ac>] ? avc_has_perm_flags+0xdc/0x1c0 > >> [68397.072458] [<ffffffff97d69f19>] schedule+0x29/0x70 > >> [68397.072462] [<ffffffff9785f3d9>] inode_dio_wait+0xd9/0x100 > >> [68397.072467] [<ffffffff976c4010>] ? wake_bit_function+0x40/0x40 > >> [68397.072480] [<ffffffffc09d6dd6>] nfs_getattr+0x1b6/0x250 [nfs] > >> [68397.072485] [<ffffffff97848109>] vfs_getattr+0x49/0x80 > >> [68397.072486] [<ffffffff97848185>] vfs_fstat+0x45/0x80 > >> [68397.072488] [<ffffffff978486f4>] SYSC_newfstat+0x24/0x60 > >> [68397.072491] [<ffffffff97d76d21>] ? system_call_after_swapgs+0xae/0x146 > >> [68397.072495] [<ffffffff97739f34>] ? __audit_syscall_entry+0xb4/0x110 > >> [68397.072498] [<ffffffff9763aaeb>] ? 
syscall_trace_enter+0x16b/0x220 > >> [68397.072500] [<ffffffff97848ace>] SyS_newfstat+0xe/0x10 > >> [68397.072502] [<ffffffff97d7706b>] tracesys+0xa3/0xc9 > >> [68401.573141] bnx2 0000:01:00.0 em1: NIC Copper Link is Up, 1000 Mbps > >> full duplex > >> > >> [68401.573247] internal: port 1(em1) entered blocking state > >> [68401.573255] internal: port 1(em1) entered listening state > >> [68403.576985] internal: port 1(em1) entered learning state > >> [68405.580907] internal: port 1(em1) entered forwarding state > >> [68405.580916] internal: topology change detected, propagating > >> [68469.565589] nfs: server swm-01.hpc.moffitt.org not responding, timed out > >> [68469.565840] nfs: server swm-01.hpc.moffitt.org not responding, timed out > >> [68487.193932] ixgbe 0000:03:00.0 p1p1: NIC Link is Up 10 Gbps, Flow > >> Control: RX/TX > >> [68487.194105] ovirtmgmt: port 1(p1p1) entered blocking state > >> [68487.194114] ovirtmgmt: port 1(p1p1) entered listening state > >> [68489.196508] ovirtmgmt: port 1(p1p1) entered learning state > >> [68491.200400] ovirtmgmt: port 1(p1p1) entered forwarding state > >> [68491.200405] ovirtmgmt: topology change detected, sending tcn bpdu > >> [68493.672423] NFS: nfs4_reclaim_open_state: Lock reclaim failed! > >> [68494.777996] NFSD: client 10.15.28.22 testing state ID with > >> incorrect client ID > >> [68494.778580] NFSD: client 10.15.28.22 testing state ID with > >> incorrect client ID > >> > >> > >> On Thu, Aug 22, 2019 at 2:53 PM Curtis E. Combs Jr. <ej.albany@gmail.com> wrote: > >> > > >> > Thanks, I'm just going to revert back to bridges. > >> > > >> > On Thu, Aug 22, 2019 at 11:50 AM Dominik Holler <dholler@redhat.com> wrote: > >> > > > >> > > > >> > > > >> > > On Thu, Aug 22, 2019 at 3:06 PM Curtis E. Combs Jr. <ej.albany@gmail.com> wrote: > >> > >> > >> > >> Seems like the STP options are so common and necessary that it would > >> > >> be a priority over seldom-used bridge_opts. I know what STP is and I'm > >> > >> not even a networking guy - never even heard of half of the > >> > >> bridge_opts that have switches in the UI. > >> > >> > >> > >> Anyway. I wanted to try the openvswitches, so I reinstalled all of my > >> > >> nodes and used "openvswitch (Technology Preview)" as the engine-setup > >> > >> option for the first host. I made a new Cluster for my nodes, added > >> > >> them all to the new cluster, created a new "logical network" for the > >> > >> internal network and attached it to the internal network ports. > >> > >> > >> > >> Now, when I go to create a new VM, I don't even have either the > >> > >> ovirtmgmt switch OR the internal switch as an option. The drop-down is > >> > >> empy as if I don't have any vnic-profiles. > >> > >> > >> > > > >> > > openvswitch clusters are limited to ovn networks. > >> > > You can create one like described in > >> > > https://www.ovirt.org/documentation/admin-guide/chap-External_Providers.html... > >> > > > >> > > > >> > >> > >> > >> On Thu, Aug 22, 2019 at 7:34 AM Tony Pearce <tonyppe@gmail.com> wrote: > >> > >> > > >> > >> > Hi Dominik, would you mind sharing the use case for stp via API Only? I am keen to know this. > >> > >> > Thanks > >> > >> > > >> > >> > > >> > >> > On Thu., 22 Aug. 
2019, 19:24 Dominik Holler, <dholler@redhat.com> wrote: > >> > >> >> > >> > >> >> > >> > >> >> > >> > >> >> On Thu, Aug 22, 2019 at 1:08 PM Miguel Duarte de Mora Barroso <mdbarroso@redhat.com> wrote: > >> > >> >>> > >> > >> >>> On Sat, Aug 17, 2019 at 11:27 AM <ej.albany@gmail.com> wrote: > >> > >> >>> > > >> > >> >>> > Hello. I have been trying to figure out an issue for a very long time. > >> > >> >>> > That issue relates to the ethernet and 10gb fc links that I have on my > >> > >> >>> > cluster being disabled any time a migration occurs. > >> > >> >>> > > >> > >> >>> > I believe this is because I need to have STP turned on in order to > >> > >> >>> > participate with the switch. However, there does not seem to be any > >> > >> >>> > way to tell oVirt to stop turning it off! Very frustrating. > >> > >> >>> > > >> > >> >>> > After entering a cronjob that enables stp on all bridges every 1 > >> > >> >>> > minute, the migration issue disappears.... > >> > >> >>> > > >> > >> >>> > Is there any way at all to do without this cronjob and set STP to be > >> > >> >>> > ON without having to resort to such a silly solution? > >> > >> >>> > >> > >> >>> Vdsm exposes a per bridge STP knob that you can use for this. By > >> > >> >>> default it is set to false, which is probably why you had to use this > >> > >> >>> shenanigan. > >> > >> >>> > >> > >> >>> You can, for instance: > >> > >> >>> > >> > >> >>> # show present state > >> > >> >>> [vagrant@vdsm ~]$ ip a > >> > >> >>> 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN > >> > >> >>> group default qlen 1000 > >> > >> >>> link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 > >> > >> >>> inet 127.0.0.1/8 scope host lo > >> > >> >>> valid_lft forever preferred_lft forever > >> > >> >>> 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast > >> > >> >>> state UP group default qlen 1000 > >> > >> >>> link/ether 52:54:00:41:fb:37 brd ff:ff:ff:ff:ff:ff > >> > >> >>> 3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast > >> > >> >>> state UP group default qlen 1000 > >> > >> >>> link/ether 52:54:00:83:5b:6f brd ff:ff:ff:ff:ff:ff > >> > >> >>> inet 192.168.50.50/24 brd 192.168.50.255 scope global noprefixroute eth1 > >> > >> >>> valid_lft forever preferred_lft forever > >> > >> >>> inet6 fe80::5054:ff:fe83:5b6f/64 scope link > >> > >> >>> valid_lft forever preferred_lft forever > >> > >> >>> 19: ;vdsmdummy;: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN > >> > >> >>> group default qlen 1000 > >> > >> >>> link/ether 8e:5c:2e:87:fa:0b brd ff:ff:ff:ff:ff:ff > >> > >> >>> > >> > >> >>> # show example bridge configuration - you're looking for the STP knob here. 
> >> > >> >>> [root@vdsm ~]$ cat bridged_net_with_stp > >> > >> >>> { > >> > >> >>> "bondings": {}, > >> > >> >>> "networks": { > >> > >> >>> "test-network": { > >> > >> >>> "nic": "eth0", > >> > >> >>> "switch": "legacy", > >> > >> >>> "bridged": true, > >> > >> >>> "stp": true > >> > >> >>> } > >> > >> >>> }, > >> > >> >>> "options": { > >> > >> >>> "connectivityCheck": false > >> > >> >>> } > >> > >> >>> } > >> > >> >>> > >> > >> >>> # issue setup networks command: > >> > >> >>> [root@vdsm ~]$ vdsm-client -f bridged_net_with_stp Host setupNetworks > >> > >> >>> { > >> > >> >>> "code": 0, > >> > >> >>> "message": "Done" > >> > >> >>> } > >> > >> >>> > >> > >> >>> # show bridges > >> > >> >>> [root@vdsm ~]$ brctl show > >> > >> >>> bridge name bridge id STP enabled interfaces > >> > >> >>> ;vdsmdummy; 8000.000000000000 no > >> > >> >>> test-network 8000.52540041fb37 yes eth0 > >> > >> >>> > >> > >> >>> # show final state > >> > >> >>> [root@vdsm ~]$ ip a > >> > >> >>> 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN > >> > >> >>> group default qlen 1000 > >> > >> >>> link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 > >> > >> >>> inet 127.0.0.1/8 scope host lo > >> > >> >>> valid_lft forever preferred_lft forever > >> > >> >>> 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast > >> > >> >>> master test-network state UP group default qlen 1000 > >> > >> >>> link/ether 52:54:00:41:fb:37 brd ff:ff:ff:ff:ff:ff > >> > >> >>> 3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast > >> > >> >>> state UP group default qlen 1000 > >> > >> >>> link/ether 52:54:00:83:5b:6f brd ff:ff:ff:ff:ff:ff > >> > >> >>> inet 192.168.50.50/24 brd 192.168.50.255 scope global noprefixroute eth1 > >> > >> >>> valid_lft forever preferred_lft forever > >> > >> >>> inet6 fe80::5054:ff:fe83:5b6f/64 scope link > >> > >> >>> valid_lft forever preferred_lft forever > >> > >> >>> 19: ;vdsmdummy;: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN > >> > >> >>> group default qlen 1000 > >> > >> >>> link/ether 8e:5c:2e:87:fa:0b brd ff:ff:ff:ff:ff:ff > >> > >> >>> 432: test-network: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc > >> > >> >>> noqueue state UP group default qlen 1000 > >> > >> >>> link/ether 52:54:00:41:fb:37 brd ff:ff:ff:ff:ff:ff > >> > >> >>> > >> > >> >>> I don't think this STP parameter is exposed via engine UI; @Dominik > >> > >> >>> Holler , could you confirm ? What are our plans for it ? > >> > >> >>> > >> > >> >> > >> > >> >> STP is only available via REST-API, see > >> > >> >> http://ovirt.github.io/ovirt-engine-api-model/4.3/#types/network > >> > >> >> please find an example how to enable STP in > >> > >> >> https://gist.github.com/dominikholler/4e70c9ef9929d93b6807f56d43a70b95 > >> > >> >> > >> > >> >> We have no plans to add STP to the web ui, > >> > >> >> but new feature requests are always welcome on > >> > >> >> https://bugzilla.redhat.com/enter_bug.cgi?product=ovirt-engine > >> > >> >> > >> > >> >> > >> > >> >>> > >> > >> >>> > > >> > >> >>> > Here are some details about my systems, if you need it. > >> > >> >>> > > >> > >> >>> > > >> > >> >>> > selinux is disabled. 
> >> > >> >>> > > >> > >> >>> > > >> > >> >>> > > >> > >> >>> > > >> > >> >>> > > >> > >> >>> > > >> > >> >>> > > >> > >> >>> > > >> > >> >>> > > >> > >> >>> > [root@swm-02 ~]# rpm -qa | grep ovirt > >> > >> >>> > ovirt-imageio-common-1.5.1-0.el7.x86_64 > >> > >> >>> > ovirt-release43-4.3.5.2-1.el7.noarch > >> > >> >>> > ovirt-imageio-daemon-1.5.1-0.el7.noarch > >> > >> >>> > ovirt-vmconsole-host-1.0.7-2.el7.noarch > >> > >> >>> > ovirt-hosted-engine-setup-2.3.11-1.el7.noarch > >> > >> >>> > ovirt-ansible-hosted-engine-setup-1.0.26-1.el7.noarch > >> > >> >>> > python2-ovirt-host-deploy-1.8.0-1.el7.noarch > >> > >> >>> > ovirt-ansible-engine-setup-1.1.9-1.el7.noarch > >> > >> >>> > python2-ovirt-setup-lib-1.2.0-1.el7.noarch > >> > >> >>> > cockpit-machines-ovirt-195.1-1.el7.noarch > >> > >> >>> > ovirt-hosted-engine-ha-2.3.3-1.el7.noarch > >> > >> >>> > ovirt-vmconsole-1.0.7-2.el7.noarch > >> > >> >>> > cockpit-ovirt-dashboard-0.13.5-1.el7.noarch > >> > >> >>> > ovirt-provider-ovn-driver-1.2.22-1.el7.noarch > >> > >> >>> > ovirt-host-deploy-common-1.8.0-1.el7.noarch > >> > >> >>> > ovirt-host-4.3.4-1.el7.x86_64 > >> > >> >>> > python-ovirt-engine-sdk4-4.3.2-2.el7.x86_64 > >> > >> >>> > ovirt-host-dependencies-4.3.4-1.el7.x86_64 > >> > >> >>> > ovirt-ansible-repositories-1.1.5-1.el7.noarch > >> > >> >>> > [root@swm-02 ~]# cat /etc/redhat-release > >> > >> >>> > CentOS Linux release 7.6.1810 (Core) > >> > >> >>> > [root@swm-02 ~]# uname -r > >> > >> >>> > 3.10.0-957.27.2.el7.x86_64 > >> > >> >>> > You have new mail in /var/spool/mail/root > >> > >> >>> > [root@swm-02 ~]# ip a > >> > >> >>> > 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN > >> > >> >>> > group default qlen 1000 > >> > >> >>> > link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 > >> > >> >>> > inet 127.0.0.1/8 scope host lo > >> > >> >>> > valid_lft forever preferred_lft forever > >> > >> >>> > 2: em1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master > >> > >> >>> > test state UP group default qlen 1000 > >> > >> >>> > link/ether d4:ae:52:8d:50:48 brd ff:ff:ff:ff:ff:ff > >> > >> >>> > 3: em2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group > >> > >> >>> > default qlen 1000 > >> > >> >>> > link/ether d4:ae:52:8d:50:49 brd ff:ff:ff:ff:ff:ff > >> > >> >>> > 4: p1p1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master > >> > >> >>> > ovirtmgmt state UP group default qlen 1000 > >> > >> >>> > link/ether 90:e2:ba:1e:14:80 brd ff:ff:ff:ff:ff:ff > >> > >> >>> > 5: p1p2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group > >> > >> >>> > default qlen 1000 > >> > >> >>> > link/ether 90:e2:ba:1e:14:81 brd ff:ff:ff:ff:ff:ff > >> > >> >>> > 6: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN > >> > >> >>> > group default qlen 1000 > >> > >> >>> > link/ether a2:b8:d6:e8:b3:d8 brd ff:ff:ff:ff:ff:ff > >> > >> >>> > 7: br-int: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group > >> > >> >>> > default qlen 1000 > >> > >> >>> > link/ether 96:a0:c1:4a:45:4b brd ff:ff:ff:ff:ff:ff > >> > >> >>> > 25: test: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue > >> > >> >>> > state UP group default qlen 1000 > >> > >> >>> > link/ether d4:ae:52:8d:50:48 brd ff:ff:ff:ff:ff:ff > >> > >> >>> > inet 10.15.11.21/24 brd 10.15.11.255 scope global test > >> > >> >>> > valid_lft forever preferred_lft forever > >> > >> >>> > 26: ovirtmgmt: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc > >> > >> >>> > noqueue state UP group default qlen 1000 > >> > >> >>> > link/ether 
90:e2:ba:1e:14:80 brd ff:ff:ff:ff:ff:ff > >> > >> >>> > inet 10.15.28.31/24 brd 10.15.28.255 scope global ovirtmgmt > >> > >> >>> > valid_lft forever preferred_lft forever > >> > >> >>> > 27: ;vdsmdummy;: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN > >> > >> >>> > group default qlen 1000 > >> > >> >>> > link/ether 62:e5:e5:07:99:eb brd ff:ff:ff:ff:ff:ff > >> > >> >>> > 29: vnet0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master > >> > >> >>> > ovirtmgmt state UNKNOWN group default qlen 1000 > >> > >> >>> > link/ether fe:6f:9c:95:00:02 brd ff:ff:ff:ff:ff:ff > >> > >> >>> > [root@swm-02 ~]# free -m > >> > >> >>> > total used free shared buff/cache available > >> > >> >>> > Mem: 64413 1873 61804 9 735 62062 > >> > >> >>> > Swap: 16383 0 16383 > >> > >> >>> > [root@swm-02 ~]# free -h > >> > >> >>> > total used free shared buff/cache available > >> > >> >>> > Mem: 62G 1.8G 60G 9.5M 735M 60G > >> > >> >>> > Swap: 15G 0B 15G > >> > >> >>> > [root@swm-02 ~]# ls > >> > >> >>> > ls lsb_release lshw lslocks > >> > >> >>> > lsmod lspci lssubsys > >> > >> >>> > lsusb.py > >> > >> >>> > lsattr lscgroup lsinitrd lslogins > >> > >> >>> > lsns lss16toppm lstopo-no-graphics > >> > >> >>> > lsblk lscpu lsipc lsmem > >> > >> >>> > lsof lsscsi lsusb > >> > >> >>> > [root@swm-02 ~]# lscpu > >> > >> >>> > Architecture: x86_64 > >> > >> >>> > CPU op-mode(s): 32-bit, 64-bit > >> > >> >>> > Byte Order: Little Endian > >> > >> >>> > CPU(s): 16 > >> > >> >>> > On-line CPU(s) list: 0-15 > >> > >> >>> > Thread(s) per core: 2 > >> > >> >>> > Core(s) per socket: 4 > >> > >> >>> > Socket(s): 2 > >> > >> >>> > NUMA node(s): 2 > >> > >> >>> > Vendor ID: GenuineIntel > >> > >> >>> > CPU family: 6 > >> > >> >>> > Model: 44 > >> > >> >>> > Model name: Intel(R) Xeon(R) CPU X5672 @ 3.20GHz > >> > >> >>> > Stepping: 2 > >> > >> >>> > CPU MHz: 3192.064 > >> > >> >>> > BogoMIPS: 6384.12 > >> > >> >>> > Virtualization: VT-x > >> > >> >>> > L1d cache: 32K > >> > >> >>> > L1i cache: 32K > >> > >> >>> > L2 cache: 256K > >> > >> >>> > L3 cache: 12288K > >> > >> >>> > NUMA node0 CPU(s): 0,2,4,6,8,10,12,14 > >> > >> >>> > NUMA node1 CPU(s): 1,3,5,7,9,11,13,15 > >> > >> >>> > Flags: fpu vme de pse tsc msr pae mce cx8 apic sep > >> > >> >>> > mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht > >> > >> >>> > tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts > >> > >> >>> > rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq > >> > >> >>> > dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca > >> > >> >>> > sse4_1 sse4_2 popcnt aes lahf_lm ssbd ibrs ibpb stibp tpr_shadow vnmi > >> > >> >>> > flexpriority ept vpid dtherm ida arat spec_ctrl intel_stibp flush_l1d > >> > >> >>> > [root@swm-02 ~]# > >> > >> >>> > _______________________________________________ > >> > >> >>> > Users mailing list -- users@ovirt.org > >> > >> >>> > To unsubscribe send an email to users-leave@ovirt.org > >> > >> >>> > Privacy Statement: https://www.ovirt.org/site/privacy-policy/ > >> > >> >>> > oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/ > >> > >> >>> > List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/MTMZ5MF4CF2VR2... 

On Fri, Aug 23, 2019 at 8:03 PM Curtis E. Combs Jr. <ej.albany@gmail.com> wrote:
This little cluster isn't in production or anything like that yet.
So, I went ahead and used your ethtool commands to disable pause frames on both interfaces of each server. I then chose a few VMs to migrate around at random.
swm-02 and swm-03 both went out again. Unreachable. Can't ping, can't ssh, and the SSH session that I had open was unresponsive.
Any other ideas?
Sorry, no. Looks like two different NICs with different drivers and firmware go down together. This is a strong indication that the root cause is related to the switch. Maybe you can get some information about the switch config by 'lldptool get-tlv -n -i em1'.
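For reference, a sketch of that query on both host NICs named in the dmesg output (this assumes the lldpad service is running on the host, since lldptool talks to it, and that the switch actually sends LLDP frames):

# neighbor TLVs advertised by the switch on the 1 Gb NIC (bnx2)
lldptool get-tlv -n -i em1
# and on the 10 Gb NIC carrying ovirtmgmt (ixgbe)
lldptool get-tlv -n -i p1p1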

On Fri, Aug 23, 2019 at 9:19 PM Dominik Holler <dholler@redhat.com> wrote:
On Fri, Aug 23, 2019 at 8:03 PM Curtis E. Combs Jr. <ej.albany@gmail.com> wrote:
This little cluster isn't in production or anything like that yet.
So, I went ahead and used your ethtool commands to disable pause frames on both interfaces of each server. I then chose a few VMs to migrate around at random.
swm-02 and swm-03 both went out again. Unreachable. Can't ping, can't ssh, and the SSH session that I had open was unresponsive.
Any other ideas?
Sorry, no. Looks like two different NICs with different drivers and firmware go down together. This is a strong indication that the root cause is related to the switch. Maybe you can get some information about the switch config by 'lldptool get-tlv -n -i em1'.
Another guess: after the optional 'lldptool get-tlv -n -i em1', try 'systemctl stop lldpad' and then another migration.
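Spelled out, the suggested test sequence on each affected host would be roughly (a sketch; lldpad may turn out not to be involved at all):

# optional: record what the switch advertises before changing anything
lldptool get-tlv -n -i em1
# stop the LLDP agent in case it interferes with the links
systemctl stop lldpad
# then trigger a VM migration again and watch whether the links stay up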
On Fri, Aug 23, 2019 at 1:50 PM Dominik Holler <dholler@redhat.com> wrote:
On Fri, Aug 23, 2019 at 6:45 PM Curtis E. Combs Jr. <
Unfortunately, I can't check on the switch. Trust me, I've tried. These servers are in a Co-Lo and I've put 5 tickets in asking about the port configuration. They just get ignored - but that's par for the coarse for IT here. Only about 2 out of 10 of our tickets get any response and usually the response doesn't help. Then the system they use auto-closes the ticket. That was why I was suspecting STP before.
I can do ethtool. I do have root on these servers, though. Are you trying to get me to turn off link-speed auto-negotiation? Would you like me to try that?
It is just a suspicion, that the reason is pause frames. Let's start on a NIC which is not used for ovirtmgmt, I guess em1. Does 'ethtool -S em1 | grep pause' show something? Does 'ethtool em1 | grep pause' indicates support for pause? The current config is shown by 'ethtool -a em1'. '-A autoneg' "Specifies whether pause autonegotiation should be enabled." according to ethtool doc. Assuming flow control is enabled by default, I would try to disable it via 'ethtool -A em1 autoneg off rx off tx off' and check if it is applied via 'ethtool -a em1' and check if the behavior under load changes.
On Fri, Aug 23, 2019 at 12:24 PM Dominik Holler <dholler@redhat.com>
wrote:
On Fri, Aug 23, 2019 at 5:49 PM Curtis E. Combs Jr. <
ej.albany@gmail.com> wrote:
Sure! Right now, I only have a 500gb partition on each node shared
over NFS, added as storage domains. This is on each node - so, currently 3.
How can the storage cause a node to drop out?
Thanks, I got it. All three links go down on load, which causes NFS to fail.
Can you check in the switch port configuration if there is some kind of Ethernet flow control enabled? Can you try to modify the behavior by changing the settings of your host interfaces, e.g.
ethtool -A em1 autoneg off rx off tx off
or ethtool -A em1 autoneg on rx on tx on ?

It took a while for my servers to come back on the network this time. I think it's due to ovirt continuing to try to migrate the VMs around like I requested. The 3 servers' names are "swm-01, swm-02 and swm-03". Eventually (about 2-3 minutes ago) they all came back online.

So I disabled and stopped the lldpad service.

Nope. Started some more migrations and swm-02 and swm-03 disappeared again. No ping, SSH hung, same as before - almost as soon as the migration started.

If you all have any ideas what switch-level setting might be enabled, let me know, cause I'm stumped. I can add it to the ticket that's requesting the port configurations. I've already added the port numbers and switch name that I got from CDP.

Thanks again, I really appreciate the help!
cecjr

On Fri, Aug 23, 2019 at 3:28 PM Dominik Holler <dholler@redhat.com> wrote:
On Fri, Aug 23, 2019 at 9:19 PM Dominik Holler <dholler@redhat.com> wrote:
On Fri, Aug 23, 2019 at 8:03 PM Curtis E. Combs Jr. <ej.albany@gmail.com> wrote:
This little cluster isn't in production or anything like that yet.
So, I went ahead and used your ethtool commands to disable pause frames on both interfaces of each server. I then chose a few VMs to migrate around at random.
swm-02 and swm-03 both went out again. Unreachable. Can't ping, can't ssh, and the SSH session that I had open was unresponsive.
Any other ideas?
Sorry, no. It looks like two different NICs with different drivers and firmware go down together. This is a strong indication that the root cause is related to the switch. Maybe you can get some information about the switch config by 'lldptool get-tlv -n -i em1'.
Another guess: after the optional 'lldptool get-tlv -n -i em1', run 'systemctl stop lldpad' and then try another migration.
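Put together as a short sequence (the commands are the ones named in this thread; whether 'lldptool' returns anything depends on the switch actually sending LLDP):

# ask the switch what it advertises on this port, if it talks LLDP at all
lldptool get-tlv -n -i em1
# stop the local LLDP agent, then retry a migration and watch the links
systemctl stop lldpad
# optionally keep it from coming back after a reboot
systemctl disable lldpad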
On Fri, Aug 23, 2019 at 1:50 PM Dominik Holler <dholler@redhat.com> wrote:
On Fri, Aug 23, 2019 at 6:45 PM Curtis E. Combs Jr. <ej.albany@gmail.com> wrote:
Unfortunately, I can't check on the switch. Trust me, I've tried. These servers are in a Co-Lo and I've put 5 tickets in asking about the port configuration. They just get ignored - but that's par for the course for IT here. Only about 2 out of 10 of our tickets get any response and usually the response doesn't help. Then the system they use auto-closes the ticket. That was why I was suspecting STP before.
I can do ethtool. I do have root on these servers, though. Are you trying to get me to turn off link-speed auto-negotiation? Would you like me to try that?
It is just a suspicion that the reason is pause frames. Let's start on a NIC which is not used for ovirtmgmt, I guess em1. Does 'ethtool -S em1 | grep pause' show something? Does 'ethtool em1 | grep pause' indicate support for pause? The current config is shown by 'ethtool -a em1'. '-A autoneg' "Specifies whether pause autonegotiation should be enabled," according to the ethtool doc. Assuming flow control is enabled by default, I would try to disable it via 'ethtool -A em1 autoneg off rx off tx off', check that it is applied via 'ethtool -a em1', and check whether the behavior under load changes.
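Condensed into one sequence (a sketch only; the pause counter names, and whether they exist at all, depend on the NIC driver):

# pause-frame counters, if the driver exposes them
ethtool -S em1 | grep pause
# does the NIC support/advertise pause frames?
ethtool em1 | grep pause
# current flow-control settings
ethtool -a em1
# disable pause autonegotiation and rx/tx pause frames
# (note: '-A autoneg' only concerns pause autonegotiation; link-speed autoneg is 'ethtool -s em1 autoneg ...')
ethtool -A em1 autoneg off rx off tx off
# confirm it was applied, then re-test a migration under load
ethtool -a em1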
On Fri, Aug 23, 2019 at 12:24 PM Dominik Holler <dholler@redhat.com> wrote:
On Fri, Aug 23, 2019 at 5:49 PM Curtis E. Combs Jr. <ej.albany@gmail.com> wrote:

Sure! Right now, I only have a 500gb partition on each node shared over NFS, added as storage domains. This is on each node - so, currently 3.

How can the storage cause a node to drop out?
Thanks, I got it. All three links go down on load, which causes NFS to fail.
Can you check in the switch port configuration if there is some kind of Ethernet flow control enabled? Can you try to modify the behavior by changing the settings of your host interfaces, e.g.
ethtool -A em1 autoneg off rx off tx off
or ethtool -A em1 autoneg on rx on tx on ?
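One way to watch that diagnosis ("all three links go down on load") live on the next test, sketched under the assumption that you have a console on the host that does not depend on the NICs in question (IPMI/serial, for example):

# in one session on the host: follow kernel messages for link flaps and NFS timeouts
dmesg -w | grep -Ei 'link|nfs'
# in another: note the current state of both NICs
ip -o link show | grep -E 'em1|p1p1'
# then trigger a single migration from the engine and check whether both links drop together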

Is the NIC to the network staying up, or going down for a period?

I'm just thinking: if the network has been configured to block unknown unicast traffic, the VM would need to send a layer 2 frame to the network before the network would send any frames to that switch port destined for the VM. After migration, could you use the VM console to send a packet and then see if you can SSH in? Is the default gateway for the VM on the network side? A ping to the gateway should be good enough in that case.

On Sat., 24 Aug. 2019, 04:20 Curtis E. Combs Jr., <ej.albany@gmail.com> wrote:
It took a while for my servers to come back on the network this time. I think it's due to ovirt continuing to try to migrate the VMs around like I requested. The 3 servers' names are "swm-01, swm-02 and swm-03". Eventually (about 2-3 minutes ago) they all came back online.
So I disabled and stopped the lldpad service.
Nope. Started some more migrations and swm-02 and swm-03 disappeared again. No ping, SSH hung, same as before - almost as soon as the migration started.
If you all have any ideas what switch-level setting might be enabled, let me know, cause I'm stumped. I can add it to the ticket that's requesting the port configurations. I've already added the port numbers and switch name that I got from CDP.
Thanks again, I really appreciate the help! cecjr
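A minimal sketch of the check Tony suggests above, run from the migrated VM's console in the web UI (the gateway address is a placeholder, not taken from this thread):

# make the VM transmit, so the switch learns its MAC on the new port
ping -c 3 <default-gateway-ip>
# confirm the gateway's MAC was resolved
ip neigh show
# then try to SSH into the VM from outside again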
>>>> > >> > >> >>> > >>>> > >> > >> >>> > >>>> > >> > >> >>> > >>>> > >> > >> >>> > >>>> > >> > >> >>> > >>>> > >> > >> >>> > >>>> > >> > >> >>> > >>>> > >> > >> >>> > >>>> > >> > >> >>> > >>>> > >> > >> >>> > [root@swm-02 ~]# rpm -qa | grep ovirt >>>> > >> > >> >>> > ovirt-imageio-common-1.5.1-0.el7.x86_64 >>>> > >> > >> >>> > ovirt-release43-4.3.5.2-1.el7.noarch >>>> > >> > >> >>> > ovirt-imageio-daemon-1.5.1-0.el7.noarch >>>> > >> > >> >>> > ovirt-vmconsole-host-1.0.7-2.el7.noarch >>>> > >> > >> >>> > ovirt-hosted-engine-setup-2.3.11-1.el7.noarch >>>> > >> > >> >>> > ovirt-ansible-hosted-engine-setup-1.0.26-1.el7.noarch >>>> > >> > >> >>> > python2-ovirt-host-deploy-1.8.0-1.el7.noarch >>>> > >> > >> >>> > ovirt-ansible-engine-setup-1.1.9-1.el7.noarch >>>> > >> > >> >>> > python2-ovirt-setup-lib-1.2.0-1.el7.noarch >>>> > >> > >> >>> > cockpit-machines-ovirt-195.1-1.el7.noarch >>>> > >> > >> >>> > ovirt-hosted-engine-ha-2.3.3-1.el7.noarch >>>> > >> > >> >>> > ovirt-vmconsole-1.0.7-2.el7.noarch >>>> > >> > >> >>> > cockpit-ovirt-dashboard-0.13.5-1.el7.noarch >>>> > >> > >> >>> > ovirt-provider-ovn-driver-1.2.22-1.el7.noarch >>>> > >> > >> >>> > ovirt-host-deploy-common-1.8.0-1.el7.noarch >>>> > >> > >> >>> > ovirt-host-4.3.4-1.el7.x86_64 >>>> > >> > >> >>> > python-ovirt-engine-sdk4-4.3.2-2.el7.x86_64 >>>> > >> > >> >>> > ovirt-host-dependencies-4.3.4-1.el7.x86_64 >>>> > >> > >> >>> > ovirt-ansible-repositories-1.1.5-1.el7.noarch >>>> > >> > >> >>> > [root@swm-02 ~]# cat /etc/redhat-release >>>> > >> > >> >>> > CentOS Linux release 7.6.1810 (Core) >>>> > >> > >> >>> > [root@swm-02 ~]# uname -r >>>> > >> > >> >>> > 3.10.0-957.27.2.el7.x86_64 >>>> > >> > >> >>> > You have new mail in /var/spool/mail/root >>>> > >> > >> >>> > [root@swm-02 ~]# ip a >>>> > >> > >> >>> > 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN >>>> > >> > >> >>> > group default qlen 1000 >>>> > >> > >> >>> > link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 >>>> > >> > >> >>> > inet 127.0.0.1/8 scope host lo >>>> > >> > >> >>> > valid_lft forever preferred_lft forever >>>> > >> > >> >>> > 2: em1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master >>>> > >> > >> >>> > test state UP group default qlen 1000 >>>> > >> > >> >>> > link/ether d4:ae:52:8d:50:48 brd ff:ff:ff:ff:ff:ff >>>> > >> > >> >>> > 3: em2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group >>>> > >> > >> >>> > default qlen 1000 >>>> > >> > >> >>> > link/ether d4:ae:52:8d:50:49 brd ff:ff:ff:ff:ff:ff >>>> > >> > >> >>> > 4: p1p1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master >>>> > >> > >> >>> > ovirtmgmt state UP group default qlen 1000 >>>> > >> > >> >>> > link/ether 90:e2:ba:1e:14:80 brd ff:ff:ff:ff:ff:ff >>>> > >> > >> >>> > 5: p1p2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group >>>> > >> > >> >>> > default qlen 1000 >>>> > >> > >> >>> > link/ether 90:e2:ba:1e:14:81 brd ff:ff:ff:ff:ff:ff >>>> > >> > >> >>> > 6: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN >>>> > >> > >> >>> > group default qlen 1000 >>>> > >> > >> >>> > link/ether a2:b8:d6:e8:b3:d8 brd ff:ff:ff:ff:ff:ff >>>> > >> > >> >>> > 7: br-int: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group >>>> > >> > >> >>> > default qlen 1000 >>>> > >> > >> >>> > link/ether 96:a0:c1:4a:45:4b brd ff:ff:ff:ff:ff:ff >>>> > >> > >> >>> > 25: test: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue >>>> > >> > >> >>> > state UP group default qlen 1000 >>>> > >> > >> >>> > link/ether d4:ae:52:8d:50:48 brd 
ff:ff:ff:ff:ff:ff >>>> > >> > >> >>> > inet 10.15.11.21/24 brd 10.15.11.255 scope global test >>>> > >> > >> >>> > valid_lft forever preferred_lft forever >>>> > >> > >> >>> > 26: ovirtmgmt: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc >>>> > >> > >> >>> > noqueue state UP group default qlen 1000 >>>> > >> > >> >>> > link/ether 90:e2:ba:1e:14:80 brd ff:ff:ff:ff:ff:ff >>>> > >> > >> >>> > inet 10.15.28.31/24 brd 10.15.28.255 scope global ovirtmgmt >>>> > >> > >> >>> > valid_lft forever preferred_lft forever >>>> > >> > >> >>> > 27: ;vdsmdummy;: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN >>>> > >> > >> >>> > group default qlen 1000 >>>> > >> > >> >>> > link/ether 62:e5:e5:07:99:eb brd ff:ff:ff:ff:ff:ff >>>> > >> > >> >>> > 29: vnet0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master >>>> > >> > >> >>> > ovirtmgmt state UNKNOWN group default qlen 1000 >>>> > >> > >> >>> > link/ether fe:6f:9c:95:00:02 brd ff:ff:ff:ff:ff:ff >>>> > >> > >> >>> > [root@swm-02 ~]# free -m >>>> > >> > >> >>> > total used free shared buff/cache available >>>> > >> > >> >>> > Mem: 64413 1873 61804 9 735 62062 >>>> > >> > >> >>> > Swap: 16383 0 16383 >>>> > >> > >> >>> > [root@swm-02 ~]# free -h >>>> > >> > >> >>> > total used free shared buff/cache available >>>> > >> > >> >>> > Mem: 62G 1.8G 60G 9.5M 735M 60G >>>> > >> > >> >>> > Swap: 15G 0B 15G >>>> > >> > >> >>> > [root@swm-02 ~]# ls >>>> > >> > >> >>> > ls lsb_release lshw lslocks >>>> > >> > >> >>> > lsmod lspci lssubsys >>>> > >> > >> >>> > lsusb.py >>>> > >> > >> >>> > lsattr lscgroup lsinitrd lslogins >>>> > >> > >> >>> > lsns lss16toppm lstopo-no-graphics >>>> > >> > >> >>> > lsblk lscpu lsipc lsmem >>>> > >> > >> >>> > lsof lsscsi lsusb >>>> > >> > >> >>> > [root@swm-02 ~]# lscpu >>>> > >> > >> >>> > Architecture: x86_64 >>>> > >> > >> >>> > CPU op-mode(s): 32-bit, 64-bit >>>> > >> > >> >>> > Byte Order: Little Endian >>>> > >> > >> >>> > CPU(s): 16 >>>> > >> > >> >>> > On-line CPU(s) list: 0-15 >>>> > >> > >> >>> > Thread(s) per core: 2 >>>> > >> > >> >>> > Core(s) per socket: 4 >>>> > >> > >> >>> > Socket(s): 2 >>>> > >> > >> >>> > NUMA node(s): 2 >>>> > >> > >> >>> > Vendor ID: GenuineIntel >>>> > >> > >> >>> > CPU family: 6 >>>> > >> > >> >>> > Model: 44 >>>> > >> > >> >>> > Model name: Intel(R) Xeon(R) CPU X5672 @ 3.20GHz >>>> > >> > >> >>> > Stepping: 2 >>>> > >> > >> >>> > CPU MHz: 3192.064 >>>> > >> > >> >>> > BogoMIPS: 6384.12 >>>> > >> > >> >>> > Virtualization: VT-x >>>> > >> > >> >>> > L1d cache: 32K >>>> > >> > >> >>> > L1i cache: 32K >>>> > >> > >> >>> > L2 cache: 256K >>>> > >> > >> >>> > L3 cache: 12288K >>>> > >> > >> >>> > NUMA node0 CPU(s): 0,2,4,6,8,10,12,14 >>>> > >> > >> >>> > NUMA node1 CPU(s): 1,3,5,7,9,11,13,15 >>>> > >> > >> >>> > Flags: fpu vme de pse tsc msr
>>>> > >> > >> >>> > pae mce cx8 apic sep
>>>> > >> > >> >>> > mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht >>>> > >> > >> >>> > tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts >>>> > >> > >> >>> > rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq >>>> > >> > >> >>> > dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca >>>> > >> > >> >>> > sse4_1 sse4_2 popcnt aes lahf_lm ssbd ibrs ibpb stibp tpr_shadow vnmi >>>> > >> > >> >>> > flexpriority ept vpid dtherm ida arat spec_ctrl intel_stibp flush_l1d >>>> > >> > >> >>> > [root@swm-02 ~]# >>>> > >> > >> >>> > _______________________________________________ >>>> > >> > >> >>> > Users mailing list -- users@ovirt.org >>>> > >> > >> >>> > To unsubscribe send an email to users-leave@ovirt.org >>>> > >> > >> >>> > Privacy Statement: https://www.ovirt.org/site/privacy-policy/ >>>> > >> > >> >>> > oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/ >>>> > >> > >> >>> > List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/MTMZ5MF4CF2VR2... >>>> > >> > >> >> >>>> > >> > >> >> _______________________________________________ >>>> > >> > >> >> Users mailing list -- users@ovirt.org >>>> > >> > >> >> To unsubscribe send an email to users-leave@ovirt.org >>>> > >> > >> >> Privacy Statement: https://www.ovirt.org/site/privacy-policy/ >>>> > >> > >> >> oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/ >>>> > >> > >> >> List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/QBA7NYKAJNREIV...

On Fri, Aug 23, 2019 at 8:50 PM Tony Pearce <tonyppe@gmail.com> wrote:
Is the nic to the network staying up or going down for a period?
Which nic? The one on the pserver or the virtual machine? For clarity, I've only ever referred to the one on the pserver. I can't even reach the VM when the pserver becomes unresponsive during a migration, so I can't reach the console for the VM even if the proxy is not involved in the link down event.
I'm just thinking, if the network has been configured to block unknown unicast traffic, I think the VM would need to send a layer 2 frame to the network before the network would send any frames to that switch port destined for the VM.
Please, any and all ideas are welcome, and I'll try all of them. I love this product, and I want to see it working well. I don't know that it's configured to block unknown unicast traffic, I don't know that it isn't - NOC doesn't tell me squat. Just inductively, I can, maybe, think that if the switch somehow recognizes a mac on more than one port, it shuts them both down - but that's just speculation on a good day; I have no evidence of that. I can test again to be sure, but I believe that both the sending and receiving migration server go down during this event. Even more curious is that it happens over the 10g FC link as well as the 1g copper, TP port. Would it be a good test of your theory to set up an indefinite ping on a VM when I can reach it and then migrate it to see if the outage happens with the VM migrating?
After migration, could you use the VM console to send a packet and then see if you can SSH in? Is the default Gateway for the VM on the network side? A ping to the Gateway should be good enough in that case.
During times when it's up after it's thrown its little fit, I can change the migration network to either the 1g or 10g networks. So I could set up a ping and let it go like I was saying before. It's worth a shot....
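A minimal sketch of that test, assuming a placeholder VM address of 10.15.11.50 and a placeholder gateway of 10.15.11.1 (replace both with the real values):

  # inside the VM console: send a few frames so the switch learns the VM's MAC,
  # and confirm the gateway answers
  ping -c 3 10.15.11.1

  # from a Linux box outside the cluster: timestamped ping against the VM,
  # left running while the migration is started, so the log shows exactly
  # when replies stop and when they resume
  ping -D -O 10.15.11.50 | tee vm-ping-during-migration.log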
On Sat., 24 Aug. 2019, 04:20 Curtis E. Combs Jr., <ej.albany@gmail.com> wrote:
It took a while for my servers to come back on the network this time. I think it's due to ovirt continuing to try to migrate the VMs around like I requested. The 3 servers' names are "swm-01, swm-02 and swm-03". Eventually (about 2-3 minutes ago) they all came back online.
So I disabled and stopped the lldpad service.
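For reference, a sketch of that step on a CentOS 7 host with systemd, run on every host in the cluster:

  systemctl stop lldpad.service      # stop the running LLDP agent
  systemctl disable lldpad.service   # keep it from starting again at boot
  systemctl status lldpad.service    # confirm it is inactive and disabled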
Nope. Started some more migrations and swm-02 and swm-03 disappeared again. No ping, SSH hung, same as before - almost as soon as the migration started.
If you all have any ideas what switch-level setting might be enabled, let me know, cause I'm stumped. I can add it to the ticket that's requesting the port configurations. I've already added the port numbers and switch name that I got from CDP.
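If the NOC stays silent, the same information can usually be read off the wire; a sketch, assuming em1 is the port of interest (Cisco switches send CDP announcements roughly once a minute, so the capture can take up to 60 seconds):

  # capture and decode a single CDP frame; the output includes the
  # Device-ID (switch name) and Port-ID TLVs
  tcpdump -nn -v -s 1500 -c 1 -i em1 'ether[20:2] == 0x2000'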
Thanks again, I really appreciate the help! cecjr
On Fri, Aug 23, 2019 at 3:28 PM Dominik Holler <dholler@redhat.com> wrote:
On Fri, Aug 23, 2019 at 9:19 PM Dominik Holler <dholler@redhat.com> wrote:
On Fri, Aug 23, 2019 at 8:03 PM Curtis E. Combs Jr. <ej.albany@gmail.com> wrote:
This little cluster isn't in production or anything like that yet.
So, I went ahead and used your ethtool commands to disable pause frames on both interfaces of each server. I then chose a few VMs to migrate around at random.
swm-02 and swm-03 both went out again. Unreachable. Can't ping, can't ssh, and the SSH session that I had open was unresponsive.
Any other ideas?
Sorry, no. Looks like two different NICs with different drivers and firmware go down together. This is a strong indication that the root cause is related to the switch. Maybe you can get some information about the switch config by 'lldptool get-tlv -n -i em1'.
Another guess: after the optional 'lldptool get-tlv -n -i em1', try 'systemctl stop lldpad' and then another migration.
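A sketch of that sequence on one host; p1p1 is included here only as an assumption, since it carries ovirtmgmt:

  lldptool get-tlv -n -i em1     # dump the TLVs the switch advertises on the 1G port
  lldptool get-tlv -n -i p1p1    # same for the 10G port
  systemctl stop lldpad          # then stop the agent and retry a migration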
On Fri, Aug 23, 2019 at 1:50 PM Dominik Holler <dholler@redhat.com> wrote:
On Fri, Aug 23, 2019 at 6:45 PM Curtis E. Combs Jr. <ej.albany@gmail.com> wrote: > > Unfortunately, I can't check on the switch. Trust me, I've tried. > These servers are in a Co-Lo and I've put 5 tickets in asking about > the port configuration. They just get ignored - but that's par for the > course for IT here. Only about 2 out of 10 of our tickets get any > response and usually the response doesn't help. Then the system they > use auto-closes the ticket. That was why I was suspecting STP before. > > I can do ethtool. I do have root on these servers, though. Are you > trying to get me to turn off link-speed auto-negotiation? Would you > like me to try that? >
It is just a suspicion that the reason is pause frames. Let's start on a NIC which is not used for ovirtmgmt, I guess em1. Does 'ethtool -S em1 | grep pause' show something? Does 'ethtool em1 | grep pause' indicate support for pause? The current config is shown by 'ethtool -a em1'. '-A autoneg' "Specifies whether pause autonegotiation should be enabled," according to the ethtool documentation. Assuming flow control is enabled by default, I would try to disable it via 'ethtool -A em1 autoneg off rx off tx off', check that it is applied via 'ethtool -a em1', and check whether the behavior under load changes.
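The checks above, collected into one sequence for em1 (repeat for p1p1):

  ethtool -S em1 | grep pause                 # driver statistics: any pause frames counted?
  ethtool em1 | grep -i pause                 # does the NIC support/advertise pause frames?
  ethtool -a em1                              # current flow-control settings
  ethtool -A em1 autoneg off rx off tx off    # disable pause autonegotiation and RX/TX pause
  ethtool -a em1                              # confirm the change was applied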
> > On Fri, Aug 23, 2019 at 12:24 PM Dominik Holler <dholler@redhat.com> wrote: > > > > > > > > On Fri, Aug 23, 2019 at 5:49 PM Curtis E. Combs Jr. <ej.albany@gmail.com> wrote: > >> > >> Sure! Right now, I only have a 500gb partition on each node shared over NFS, added as storage domains. This is on each node - so, currently 3. > >> > >> How can the storage cause a node to drop out? > >> > > > > Thanks, I got it. > > All three links go down on load, which causes NFS to fail. > > > > Can you check in the switch port configuration if there is some kind of Ethernet flow control enabled? > > Can you try to modify the behavior by changing the settings of your host interfaces, e.g. > > > > ethtool -A em1 autoneg off rx off tx off > > > > or > > ethtool -A em1 autoneg on rx on tx on > > ? > > > > > > > > > >> > >> On Fri, Aug 23, 2019, 11:46 AM Dominik Holler <dholler@redhat.com> wrote: > >>> > >>> > >>> > >>> On Fri, Aug 23, 2019 at 5:41 PM Curtis E. Combs Jr. <ej.albany@gmail.com> wrote: > >>>> > >>>> Also, if it helps, the hosts will sit there, quietly, for hours or > >>>> days before anything happens. They're up and working just fine. But > >>>> then, when I manually migrate a VM from one host to another, they > >>>> become completely inaccessible. > >>>> > >>> > >>> Can you share some details about your storage? > >>> Maybe there is a feature used during live migration, which triggers the issue. > >>> > >>> > >>>> > >>>> These are vanilla-as-possible CentOS7 nodes. Very basic ovirt install > >>>> and configuration. > >>>> > >>>> On Fri, Aug 23, 2019 at 11:33 AM Curtis E. Combs Jr. > >>>> <ej.albany@gmail.com> wrote: > >>>> > > >>>> > Hey Dominik, > >>>> > > >>>> > Thanks for helping. I really want to try to use ovirt. > >>>> > > >>>> > When these events happen, I cannot even SSH to the nodes due to the > >>>> > link being down. After a little while, the hosts come back... > >>>> > > >>>> > On Fri, Aug 23, 2019 at 11:30 AM Dominik Holler <dholler@redhat.com> wrote: > >>>> > > > >>>> > > Is you storage connected via NFS? > >>>> > > Can you manually access the storage on the host? > >>>> > > > >>>> > > > >>>> > > On Fri, Aug 23, 2019 at 5:19 PM Curtis E. Combs Jr. <ej.albany@gmail.com> wrote: > >>>> > >> > >>>> > >> Sorry to dead bump this, but I'm beginning to suspect that maybe it's > >>>> > >> not STP that's the problem. > >>>> > >> > >>>> > >> 2 of my hosts just went down when a few VMs tried to migrate. > >>>> > >> > >>>> > >> Do any of you have any idea what might be going on here? I don't even > >>>> > >> know where to start. I'm going to include the dmesg in case it helps. > >>>> > >> This happens on both of the hosts whenever any migration attempts to start. > >>>> > >> > >>>> > >> > >>>> > >> > >>>> > >> > >>>> > >> > >>>> > >> > >>>> > >> > >>>> > >> > >>>> > >> > >>>> > >> [68099.245833] bnx2 0000:01:00.0 em1: NIC Copper Link is Down > >>>> > >> [68099.246055] internal: port 1(em1) entered disabled state > >>>> > >> [68184.177343] ixgbe 0000:03:00.0 p1p1: NIC Link is Down > >>>> > >> [68184.177789] ovirtmgmt: port 1(p1p1) entered disabled state > >>>> > >> [68184.177856] ovirtmgmt: topology change detected, propagating > >>>> > >> [68277.078671] INFO: task qemu-kvm:8888 blocked for more than 120 seconds. > >>>> > >> [68277.078700] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" > >>>> > >> disables this message. > >>>> > >> [68277.078723] qemu-kvm D ffff9db40c359040 0 8888 1 0x000001a0 > >>>> > >> [68277.078727] Call Trace: > >>>> > >> [68277.078738] [<ffffffff978fd2ac>] ? 
avc_has_perm_flags+0xdc/0x1c0 > >>>> > >> [68277.078743] [<ffffffff97d69f19>] schedule+0x29/0x70 > >>>> > >> [68277.078746] [<ffffffff9785f3d9>] inode_dio_wait+0xd9/0x100 > >>>> > >> [68277.078751] [<ffffffff976c4010>] ? wake_bit_function+0x40/0x40 > >>>> > >> [68277.078765] [<ffffffffc09d6dd6>] nfs_getattr+0x1b6/0x250 [nfs] > >>>> > >> [68277.078768] [<ffffffff97848109>] vfs_getattr+0x49/0x80 > >>>> > >> [68277.078769] [<ffffffff97848185>] vfs_fstat+0x45/0x80 > >>>> > >> [68277.078771] [<ffffffff978486f4>] SYSC_newfstat+0x24/0x60 > >>>> > >> [68277.078774] [<ffffffff97d76d21>] ? system_call_after_swapgs+0xae/0x146 > >>>> > >> [68277.078778] [<ffffffff97739f34>] ? __audit_syscall_entry+0xb4/0x110 > >>>> > >> [68277.078782] [<ffffffff9763aaeb>] ? syscall_trace_enter+0x16b/0x220 > >>>> > >> [68277.078784] [<ffffffff97848ace>] SyS_newfstat+0xe/0x10 > >>>> > >> [68277.078786] [<ffffffff97d7706b>] tracesys+0xa3/0xc9 > >>>> > >> [68397.072384] INFO: task qemu-kvm:8888 blocked for more than 120 seconds. > >>>> > >> [68397.072413] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" > >>>> > >> disables this message. > >>>> > >> [68397.072436] qemu-kvm D ffff9db40c359040 0 8888 1 0x000001a0 > >>>> > >> [68397.072439] Call Trace: > >>>> > >> [68397.072453] [<ffffffff978fd2ac>] ? avc_has_perm_flags+0xdc/0x1c0 > >>>> > >> [68397.072458] [<ffffffff97d69f19>] schedule+0x29/0x70 > >>>> > >> [68397.072462] [<ffffffff9785f3d9>] inode_dio_wait+0xd9/0x100 > >>>> > >> [68397.072467] [<ffffffff976c4010>] ? wake_bit_function+0x40/0x40 > >>>> > >> [68397.072480] [<ffffffffc09d6dd6>] nfs_getattr+0x1b6/0x250 [nfs] > >>>> > >> [68397.072485] [<ffffffff97848109>] vfs_getattr+0x49/0x80 > >>>> > >> [68397.072486] [<ffffffff97848185>] vfs_fstat+0x45/0x80 > >>>> > >> [68397.072488] [<ffffffff978486f4>] SYSC_newfstat+0x24/0x60 > >>>> > >> [68397.072491] [<ffffffff97d76d21>] ? system_call_after_swapgs+0xae/0x146 > >>>> > >> [68397.072495] [<ffffffff97739f34>] ? __audit_syscall_entry+0xb4/0x110 > >>>> > >> [68397.072498] [<ffffffff9763aaeb>] ? syscall_trace_enter+0x16b/0x220 > >>>> > >> [68397.072500] [<ffffffff97848ace>] SyS_newfstat+0xe/0x10 > >>>> > >> [68397.072502] [<ffffffff97d7706b>] tracesys+0xa3/0xc9 > >>>> > >> [68401.573141] bnx2 0000:01:00.0 em1: NIC Copper Link is Up, 1000 Mbps > >>>> > >> full duplex > >>>> > >> > >>>> > >> [68401.573247] internal: port 1(em1) entered blocking state > >>>> > >> [68401.573255] internal: port 1(em1) entered listening state > >>>> > >> [68403.576985] internal: port 1(em1) entered learning state > >>>> > >> [68405.580907] internal: port 1(em1) entered forwarding state > >>>> > >> [68405.580916] internal: topology change detected, propagating > >>>> > >> [68469.565589] nfs: server swm-01.hpc.moffitt.org not responding, timed out > >>>> > >> [68469.565840] nfs: server swm-01.hpc.moffitt.org not responding, timed out > >>>> > >> [68487.193932] ixgbe 0000:03:00.0 p1p1: NIC Link is Up 10 Gbps, Flow > >>>> > >> Control: RX/TX > >>>> > >> [68487.194105] ovirtmgmt: port 1(p1p1) entered blocking state > >>>> > >> [68487.194114] ovirtmgmt: port 1(p1p1) entered listening state > >>>> > >> [68489.196508] ovirtmgmt: port 1(p1p1) entered learning state > >>>> > >> [68491.200400] ovirtmgmt: port 1(p1p1) entered forwarding state > >>>> > >> [68491.200405] ovirtmgmt: topology change detected, sending tcn bpdu > >>>> > >> [68493.672423] NFS: nfs4_reclaim_open_state: Lock reclaim failed! 
> >>>> > >> [68494.777996] NFSD: client 10.15.28.22 testing state ID with > >>>> > >> incorrect client ID > >>>> > >> [68494.778580] NFSD: client 10.15.28.22 testing state ID with > >>>> > >> incorrect client ID > >>>> > >> > >>>> > >> > >>>> > >> On Thu, Aug 22, 2019 at 2:53 PM Curtis E. Combs Jr. <ej.albany@gmail.com> wrote: > >>>> > >> > > >>>> > >> > Thanks, I'm just going to revert back to bridges. > >>>> > >> > > >>>> > >> > On Thu, Aug 22, 2019 at 11:50 AM Dominik Holler <dholler@redhat.com> wrote: > >>>> > >> > > > >>>> > >> > > > >>>> > >> > > > >>>> > >> > > On Thu, Aug 22, 2019 at 3:06 PM Curtis E. Combs Jr. <ej.albany@gmail.com> wrote: > >>>> > >> > >> > >>>> > >> > >> Seems like the STP options are so common and necessary that it would > >>>> > >> > >> be a priority over seldom-used bridge_opts. I know what STP is and I'm > >>>> > >> > >> not even a networking guy - never even heard of half of the > >>>> > >> > >> bridge_opts that have switches in the UI. > >>>> > >> > >> > >>>> > >> > >> Anyway. I wanted to try the openvswitches, so I reinstalled all of my > >>>> > >> > >> nodes and used "openvswitch (Technology Preview)" as the engine-setup > >>>> > >> > >> option for the first host. I made a new Cluster for my nodes, added > >>>> > >> > >> them all to the new cluster, created a new "logical network" for the > >>>> > >> > >> internal network and attached it to the internal network ports. > >>>> > >> > >> > >>>> > >> > >> Now, when I go to create a new VM, I don't even have either the > >>>> > >> > >> ovirtmgmt switch OR the internal switch as an option. The drop-down is > >>>> > >> > >> empy as if I don't have any vnic-profiles. > >>>> > >> > >> > >>>> > >> > > > >>>> > >> > > openvswitch clusters are limited to ovn networks. > >>>> > >> > > You can create one like described in > >>>> > >> > > https://www.ovirt.org/documentation/admin-guide/chap-External_Providers.html... > >>>> > >> > > > >>>> > >> > > > >>>> > >> > >> > >>>> > >> > >> On Thu, Aug 22, 2019 at 7:34 AM Tony Pearce <tonyppe@gmail.com> wrote: > >>>> > >> > >> > > >>>> > >> > >> > Hi Dominik, would you mind sharing the use case for stp via API Only? I am keen to know this. > >>>> > >> > >> > Thanks > >>>> > >> > >> > > >>>> > >> > >> > > >>>> > >> > >> > On Thu., 22 Aug. 2019, 19:24 Dominik Holler, <dholler@redhat.com> wrote: > >>>> > >> > >> >> > >>>> > >> > >> >> > >>>> > >> > >> >> > >>>> > >> > >> >> On Thu, Aug 22, 2019 at 1:08 PM Miguel Duarte de Mora Barroso <mdbarroso@redhat.com> wrote: > >>>> > >> > >> >>> > >>>> > >> > >> >>> On Sat, Aug 17, 2019 at 11:27 AM <ej.albany@gmail.com> wrote: > >>>> > >> > >> >>> > > >>>> > >> > >> >>> > Hello. I have been trying to figure out an issue for a very long time. > >>>> > >> > >> >>> > That issue relates to the ethernet and 10gb fc links that I have on my > >>>> > >> > >> >>> > cluster being disabled any time a migration occurs. > >>>> > >> > >> >>> > > >>>> > >> > >> >>> > I believe this is because I need to have STP turned on in order to > >>>> > >> > >> >>> > participate with the switch. However, there does not seem to be any > >>>> > >> > >> >>> > way to tell oVirt to stop turning it off! Very frustrating. > >>>> > >> > >> >>> > > >>>> > >> > >> >>> > After entering a cronjob that enables stp on all bridges every 1 > >>>> > >> > >> >>> > minute, the migration issue disappears.... 
> >>>> > >> > >> >>> > > >>>> > >> > >> >>> > Is there any way at all to do without this cronjob and set STP to be > >>>> > >> > >> >>> > ON without having to resort to such a silly solution? > >>>> > >> > >> >>> > >>>> > >> > >> >>> Vdsm exposes a per bridge STP knob that you can use for this. By > >>>> > >> > >> >>> default it is set to false, which is probably why you had to use this > >>>> > >> > >> >>> shenanigan. > >>>> > >> > >> >>> > >>>> > >> > >> >>> You can, for instance: > >>>> > >> > >> >>> > >>>> > >> > >> >>> # show present state > >>>> > >> > >> >>> [vagrant@vdsm ~]$ ip a > >>>> > >> > >> >>> 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN > >>>> > >> > >> >>> group default qlen 1000 > >>>> > >> > >> >>> link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 > >>>> > >> > >> >>> inet 127.0.0.1/8 scope host lo > >>>> > >> > >> >>> valid_lft forever preferred_lft forever > >>>> > >> > >> >>> 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast > >>>> > >> > >> >>> state UP group default qlen 1000 > >>>> > >> > >> >>> link/ether 52:54:00:41:fb:37 brd ff:ff:ff:ff:ff:ff > >>>> > >> > >> >>> 3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast > >>>> > >> > >> >>> state UP group default qlen 1000 > >>>> > >> > >> >>> link/ether 52:54:00:83:5b:6f brd ff:ff:ff:ff:ff:ff > >>>> > >> > >> >>> inet 192.168.50.50/24 brd 192.168.50.255 scope global noprefixroute eth1 > >>>> > >> > >> >>> valid_lft forever preferred_lft forever > >>>> > >> > >> >>> inet6 fe80::5054:ff:fe83:5b6f/64 scope link > >>>> > >> > >> >>> valid_lft forever preferred_lft forever > >>>> > >> > >> >>> 19: ;vdsmdummy;: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN > >>>> > >> > >> >>> group default qlen 1000 > >>>> > >> > >> >>> link/ether 8e:5c:2e:87:fa:0b brd ff:ff:ff:ff:ff:ff > >>>> > >> > >> >>> > >>>> > >> > >> >>> # show example bridge configuration - you're looking for the STP knob here. 
> >>>> > >> > >> >>> [root@vdsm ~]$ cat bridged_net_with_stp > >>>> > >> > >> >>> { > >>>> > >> > >> >>> "bondings": {}, > >>>> > >> > >> >>> "networks": { > >>>> > >> > >> >>> "test-network": { > >>>> > >> > >> >>> "nic": "eth0", > >>>> > >> > >> >>> "switch": "legacy", > >>>> > >> > >> >>> "bridged": true, > >>>> > >> > >> >>> "stp": true > >>>> > >> > >> >>> } > >>>> > >> > >> >>> }, > >>>> > >> > >> >>> "options": { > >>>> > >> > >> >>> "connectivityCheck": false > >>>> > >> > >> >>> } > >>>> > >> > >> >>> } > >>>> > >> > >> >>> > >>>> > >> > >> >>> # issue setup networks command: > >>>> > >> > >> >>> [root@vdsm ~]$ vdsm-client -f bridged_net_with_stp Host setupNetworks > >>>> > >> > >> >>> { > >>>> > >> > >> >>> "code": 0, > >>>> > >> > >> >>> "message": "Done" > >>>> > >> > >> >>> } > >>>> > >> > >> >>> > >>>> > >> > >> >>> # show bridges > >>>> > >> > >> >>> [root@vdsm ~]$ brctl show > >>>> > >> > >> >>> bridge name bridge id STP enabled interfaces > >>>> > >> > >> >>> ;vdsmdummy; 8000.000000000000 no > >>>> > >> > >> >>> test-network 8000.52540041fb37 yes eth0 > >>>> > >> > >> >>> > >>>> > >> > >> >>> # show final state > >>>> > >> > >> >>> [root@vdsm ~]$ ip a > >>>> > >> > >> >>> 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN > >>>> > >> > >> >>> group default qlen 1000 > >>>> > >> > >> >>> link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 > >>>> > >> > >> >>> inet 127.0.0.1/8 scope host lo > >>>> > >> > >> >>> valid_lft forever preferred_lft forever > >>>> > >> > >> >>> 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast > >>>> > >> > >> >>> master test-network state UP group default qlen 1000 > >>>> > >> > >> >>> link/ether 52:54:00:41:fb:37 brd ff:ff:ff:ff:ff:ff > >>>> > >> > >> >>> 3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast > >>>> > >> > >> >>> state UP group default qlen 1000 > >>>> > >> > >> >>> link/ether 52:54:00:83:5b:6f brd ff:ff:ff:ff:ff:ff > >>>> > >> > >> >>> inet 192.168.50.50/24 brd 192.168.50.255 scope global noprefixroute eth1 > >>>> > >> > >> >>> valid_lft forever preferred_lft forever > >>>> > >> > >> >>> inet6 fe80::5054:ff:fe83:5b6f/64 scope link > >>>> > >> > >> >>> valid_lft forever preferred_lft forever > >>>> > >> > >> >>> 19: ;vdsmdummy;: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN > >>>> > >> > >> >>> group default qlen 1000 > >>>> > >> > >> >>> link/ether 8e:5c:2e:87:fa:0b brd ff:ff:ff:ff:ff:ff > >>>> > >> > >> >>> 432: test-network: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc > >>>> > >> > >> >>> noqueue state UP group default qlen 1000 > >>>> > >> > >> >>> link/ether 52:54:00:41:fb:37 brd ff:ff:ff:ff:ff:ff > >>>> > >> > >> >>> > >>>> > >> > >> >>> I don't think this STP parameter is exposed via engine UI; @Dominik > >>>> > >> > >> >>> Holler , could you confirm ? What are our plans for it ? 
> >>>> > >> > >> >>> > >>>> > >> > >> >> > >>>> > >> > >> >> STP is only available via REST-API, see > >>>> > >> > >> >> http://ovirt.github.io/ovirt-engine-api-model/4.3/#types/network > >>>> > >> > >> >> please find an example how to enable STP in > >>>> > >> > >> >> https://gist.github.com/dominikholler/4e70c9ef9929d93b6807f56d43a70b95 > >>>> > >> > >> >> > >>>> > >> > >> >> We have no plans to add STP to the web ui, > >>>> > >> > >> >> but new feature requests are always welcome on > >>>> > >> > >> >> https://bugzilla.redhat.com/enter_bug.cgi?product=ovirt-engine > >>>> > >> > >> >> > >>>> > >> > >> >> > >>>> > >> > >> >>> > >>>> > >> > >> >>> > > >>>> > >> > >> >>> > Here are some details about my systems, if you need it. > >>>> > >> > >> >>> > > >>>> > >> > >> >>> > > >>>> > >> > >> >>> > selinux is disabled. > >>>> > >> > >> >>> > > >>>> > >> > >> >>> > > >>>> > >> > >> >>> > > >>>> > >> > >> >>> > > >>>> > >> > >> >>> > > >>>> > >> > >> >>> > > >>>> > >> > >> >>> > > >>>> > >> > >> >>> > > >>>> > >> > >> >>> > > >>>> > >> > >> >>> > [root@swm-02 ~]# rpm -qa | grep ovirt > >>>> > >> > >> >>> > ovirt-imageio-common-1.5.1-0.el7.x86_64 > >>>> > >> > >> >>> > ovirt-release43-4.3.5.2-1.el7.noarch > >>>> > >> > >> >>> > ovirt-imageio-daemon-1.5.1-0.el7.noarch > >>>> > >> > >> >>> > ovirt-vmconsole-host-1.0.7-2.el7.noarch > >>>> > >> > >> >>> > ovirt-hosted-engine-setup-2.3.11-1.el7.noarch > >>>> > >> > >> >>> > ovirt-ansible-hosted-engine-setup-1.0.26-1.el7.noarch > >>>> > >> > >> >>> > python2-ovirt-host-deploy-1.8.0-1.el7.noarch > >>>> > >> > >> >>> > ovirt-ansible-engine-setup-1.1.9-1.el7.noarch > >>>> > >> > >> >>> > python2-ovirt-setup-lib-1.2.0-1.el7.noarch > >>>> > >> > >> >>> > cockpit-machines-ovirt-195.1-1.el7.noarch > >>>> > >> > >> >>> > ovirt-hosted-engine-ha-2.3.3-1.el7.noarch > >>>> > >> > >> >>> > ovirt-vmconsole-1.0.7-2.el7.noarch > >>>> > >> > >> >>> > cockpit-ovirt-dashboard-0.13.5-1.el7.noarch > >>>> > >> > >> >>> > ovirt-provider-ovn-driver-1.2.22-1.el7.noarch > >>>> > >> > >> >>> > ovirt-host-deploy-common-1.8.0-1.el7.noarch > >>>> > >> > >> >>> > ovirt-host-4.3.4-1.el7.x86_64 > >>>> > >> > >> >>> > python-ovirt-engine-sdk4-4.3.2-2.el7.x86_64 > >>>> > >> > >> >>> > ovirt-host-dependencies-4.3.4-1.el7.x86_64 > >>>> > >> > >> >>> > ovirt-ansible-repositories-1.1.5-1.el7.noarch > >>>> > >> > >> >>> > [root@swm-02 ~]# cat /etc/redhat-release > >>>> > >> > >> >>> > CentOS Linux release 7.6.1810 (Core) > >>>> > >> > >> >>> > [root@swm-02 ~]# uname -r > >>>> > >> > >> >>> > 3.10.0-957.27.2.el7.x86_64 > >>>> > >> > >> >>> > You have new mail in /var/spool/mail/root > >>>> > >> > >> >>> > [root@swm-02 ~]# ip a > >>>> > >> > >> >>> > 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN > >>>> > >> > >> >>> > group default qlen 1000 > >>>> > >> > >> >>> > link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 > >>>> > >> > >> >>> > inet 127.0.0.1/8 scope host lo > >>>> > >> > >> >>> > valid_lft forever preferred_lft forever > >>>> > >> > >> >>> > 2: em1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master > >>>> > >> > >> >>> > test state UP group default qlen 1000 > >>>> > >> > >> >>> > link/ether d4:ae:52:8d:50:48 brd ff:ff:ff:ff:ff:ff > >>>> > >> > >> >>> > 3: em2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group > >>>> > >> > >> >>> > default qlen 1000 > >>>> > >> > >> >>> > link/ether d4:ae:52:8d:50:49 brd ff:ff:ff:ff:ff:ff > >>>> > >> > >> >>> > 4: p1p1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master > >>>> > >> > >> >>> > 
ovirtmgmt state UP group default qlen 1000 > >>>> > >> > >> >>> > link/ether 90:e2:ba:1e:14:80 brd ff:ff:ff:ff:ff:ff > >>>> > >> > >> >>> > 5: p1p2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group > >>>> > >> > >> >>> > default qlen 1000 > >>>> > >> > >> >>> > link/ether 90:e2:ba:1e:14:81 brd ff:ff:ff:ff:ff:ff > >>>> > >> > >> >>> > 6: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN > >>>> > >> > >> >>> > group default qlen 1000 > >>>> > >> > >> >>> > link/ether a2:b8:d6:e8:b3:d8 brd ff:ff:ff:ff:ff:ff > >>>> > >> > >> >>> > 7: br-int: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group > >>>> > >> > >> >>> > default qlen 1000 > >>>> > >> > >> >>> > link/ether 96:a0:c1:4a:45:4b brd ff:ff:ff:ff:ff:ff > >>>> > >> > >> >>> > 25: test: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue > >>>> > >> > >> >>> > state UP group default qlen 1000 > >>>> > >> > >> >>> > link/ether d4:ae:52:8d:50:48 brd ff:ff:ff:ff:ff:ff > >>>> > >> > >> >>> > inet 10.15.11.21/24 brd 10.15.11.255 scope global test > >>>> > >> > >> >>> > valid_lft forever preferred_lft forever > >>>> > >> > >> >>> > 26: ovirtmgmt: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc > >>>> > >> > >> >>> > noqueue state UP group default qlen 1000 > >>>> > >> > >> >>> > link/ether 90:e2:ba:1e:14:80 brd ff:ff:ff:ff:ff:ff > >>>> > >> > >> >>> > inet 10.15.28.31/24 brd 10.15.28.255 scope global ovirtmgmt > >>>> > >> > >> >>> > valid_lft forever preferred_lft forever > >>>> > >> > >> >>> > 27: ;vdsmdummy;: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN > >>>> > >> > >> >>> > group default qlen 1000 > >>>> > >> > >> >>> > link/ether 62:e5:e5:07:99:eb brd ff:ff:ff:ff:ff:ff > >>>> > >> > >> >>> > 29: vnet0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master > >>>> > >> > >> >>> > ovirtmgmt state UNKNOWN group default qlen 1000 > >>>> > >> > >> >>> > link/ether fe:6f:9c:95:00:02 brd ff:ff:ff:ff:ff:ff > >>>> > >> > >> >>> > [root@swm-02 ~]# free -m > >>>> > >> > >> >>> > total used free shared buff/cache available > >>>> > >> > >> >>> > Mem: 64413 1873 61804 9 735 62062 > >>>> > >> > >> >>> > Swap: 16383 0 16383 > >>>> > >> > >> >>> > [root@swm-02 ~]# free -h > >>>> > >> > >> >>> > total used free shared buff/cache available > >>>> > >> > >> >>> > Mem: 62G 1.8G 60G 9.5M 735M 60G > >>>> > >> > >> >>> > Swap: 15G 0B 15G > >>>> > >> > >> >>> > [root@swm-02 ~]# ls > >>>> > >> > >> >>> > ls lsb_release lshw lslocks > >>>> > >> > >> >>> > lsmod lspci lssubsys > >>>> > >> > >> >>> > lsusb.py > >>>> > >> > >> >>> > lsattr lscgroup lsinitrd lslogins > >>>> > >> > >> >>> > lsns lss16toppm lstopo-no-graphics > >>>> > >> > >> >>> > lsblk lscpu lsipc lsmem > >>>> > >> > >> >>> > lsof lsscsi lsusb > >>>> > >> > >> >>> > [root@swm-02 ~]# lscpu > >>>> > >> > >> >>> > Architecture: x86_64 > >>>> > >> > >> >>> > CPU op-mode(s): 32-bit, 64-bit > >>>> > >> > >> >>> > Byte Order: Little Endian > >>>> > >> > >> >>> > CPU(s): 16 > >>>> > >> > >> >>> > On-line CPU(s) list: 0-15 > >>>> > >> > >> >>> > Thread(s) per core: 2 > >>>> > >> > >> >>> > Core(s) per socket: 4 > >>>> > >> > >> >>> > Socket(s): 2 > >>>> > >> > >> >>> > NUMA node(s): 2 > >>>> > >> > >> >>> > Vendor ID: GenuineIntel > >>>> > >> > >> >>> > CPU family: 6 > >>>> > >> > >> >>> > Model: 44 > >>>> > >> > >> >>> > Model name: Intel(R) Xeon(R) CPU X5672 @ 3.20GHz > >>>> > >> > >> >>> > Stepping: 2 > >>>> > >> > >> >>> > CPU MHz: 3192.064 > >>>> > >> > >> >>> > BogoMIPS: 6384.12 > >>>> > >> > >> >>> > Virtualization: VT-x > >>>> > >> > >> >>> > L1d cache: 32K 
> >>>> > >> > >> >>> > L1i cache: 32K > >>>> > >> > >> >>> > L2 cache: 256K > >>>> > >> > >> >>> > L3 cache: 12288K > >>>> > >> > >> >>> > NUMA node0 CPU(s): 0,2,4,6,8,10,12,14 > >>>> > >> > >> >>> > NUMA node1 CPU(s): 1,3,5,7,9,11,13,15 > >>>> > >> > >> >>> > Flags: fpu vme de pse tsc msr pae mce cx8 apic sep > >>>> > >> > >> >>> > mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht > >>>> > >> > >> >>> > tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts > >>>> > >> > >> >>> > rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq > >>>> > >> > >> >>> > dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca > >>>> > >> > >> >>> > sse4_1 sse4_2 popcnt aes lahf_lm ssbd ibrs ibpb stibp tpr_shadow vnmi > >>>> > >> > >> >>> > flexpriority ept vpid dtherm ida arat spec_ctrl intel_stibp flush_l1d > >>>> > >> > >> >>> > [root@swm-02 ~]# > >>>> > >> > >> >>> > _______________________________________________ > >>>> > >> > >> >>> > Users mailing list -- users@ovirt.org > >>>> > >> > >> >>> > To unsubscribe send an email to users-leave@ovirt.org > >>>> > >> > >> >>> > Privacy Statement: https://www.ovirt.org/site/privacy-policy/ > >>>> > >> > >> >>> > oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/ > >>>> > >> > >> >>> > List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/MTMZ5MF4CF2VR2... > >>>> > >> > >> >> > >>>> > >> > >> >> _______________________________________________ > >>>> > >> > >> >> Users mailing list -- users@ovirt.org > >>>> > >> > >> >> To unsubscribe send an email to users-leave@ovirt.org > >>>> > >> > >> >> Privacy Statement: https://www.ovirt.org/site/privacy-policy/ > >>>> > >> > >> >> oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/ > >>>> > >> > >> >> List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/QBA7NYKAJNREIV...
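To round out the STP answer quoted above: a sketch of enabling STP through the REST-API, using curl; the engine URL, credentials and network id are placeholders, and the gist linked above shows the equivalent SDK-based approach:

  # find the network and its id (the XML response carries an <stp> element per network)
  curl -s -k -u 'admin@internal:PASSWORD' \
       'https://engine.example.com/ovirt-engine/api/networks?search=name%3Dtest'

  # flip the stp flag on that network
  curl -s -k -u 'admin@internal:PASSWORD' \
       -X PUT -H 'Content-Type: application/xml' \
       -d '<network><stp>true</stp></network>' \
       'https://engine.example.com/ovirt-engine/api/networks/NETWORK_ID'

The hosts may still need their networks synchronized from the Administration Portal before the bridges pick the setting up.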
participants (6)
- Curtis E. Combs Jr.
- Dominik Holler
- ej.albany@gmail.com
- Miguel Duarte de Mora Barroso
- Staniforth, Paul
- Tony Pearce