:) They might be using Cisco's per-vlan spanning tree on the network
side. It is possible to capture the packets coming in from the network
and confirm that.
I've attached a Wireshark screenshot for reference.
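
One way to check this from the host, as a minimal sketch: classic IEEE STP BPDUs are addressed to 01:80:c2:00:00:00, while Cisco's PVST+ BPDUs go to 01:00:0c:cc:cc:cd, so the destination MAC that tcpdump prints with -e tells the two apart. The helper name classify_bpdu_dst and the interface p1p1 in the comment are only illustrative.

```shell
#!/bin/sh
# Capture a few BPDUs first (needs root; the interface name is an assumption):
#   tcpdump -i p1p1 -nn -e -c 10 \
#     'ether dst 01:80:c2:00:00:00 or ether dst 01:00:0c:cc:cc:cd'

# classify_bpdu_dst MAC
# Hypothetical helper: map a destination MAC to the STP flavour that uses it.
classify_bpdu_dst() {
    # Normalise hex digits to lowercase so either spelling matches.
    mac=$(printf '%s' "$1" | tr 'A-F' 'a-f')
    case "$mac" in
        01:80:c2:00:00:00) echo "ieee-stp" ;;
        01:00:0c:cc:cc:cd) echo "cisco-pvst+" ;;
        *)                 echo "not-a-bpdu-address" ;;
    esac
}
```

If both addresses show up in the same capture, that would be consistent with two STP variants being present on the link.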
Glad it's all working :)
Regards
Tony Pearce
On Tue, 20 Aug 2019 at 09:52, Curtis E. Combs Jr. <ej.albany(a)gmail.com> wrote:
Hey Tony!
I only know the basics of Spanning Tree. At the current moment the
only way to get migrations to work at all without breaking the whole
oVirt cluster is to have it on. After changing it according to Paul's
instruction, it works like it has never worked before. Every migration
event was successful, whereas before, and even at times with the
cronjob (when vdsm set STP to off between cron runs), the link would
drop out and oVirt would say that the host was "unresponsive".
It would be, too: it wouldn't respond to SSH, ping, ARP
requests... nothing. I never got a good idea of how long this would
last, but it would eventually go away and the link would come back
online.
I have no access to the hardware. From using tcpdump to get some CDP
packets, I do know that it's Cisco switches but the IT team here is
completely unresponsive (they literally ignore our tickets) and the
co-lo where our servers are hosted won't even pick up the phone for
anyone but them....
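
For anyone wanting to repeat the CDP check, here is a sketch of a commonly used tcpdump invocation. CDP frames are sent to 01:00:0c:cc:cc:cc with SNAP protocol ID 0x2000, which is what the `ether[20:2] == 0x2000` test matches on untagged frames. The wrapper cdp_capture_cmd and the interface name are illustrative, and the function only prints the command rather than running it.

```shell
#!/bin/sh
# cdp_capture_cmd IFACE: print a tcpdump command line that waits for a single
# CDP frame on IFACE and decodes it verbosely (switch name, port, platform).
cdp_capture_cmd() {
    iface=${1:?usage: cdp_capture_cmd IFACE}
    printf "tcpdump -nn -v -i %s -s 1500 -c 1 'ether[20:2] == 0x2000'\n" "$iface"
}

# Show the command for one interface; run the printed line as root.
cdp_capture_cmd em1
```

CDP is normally sent about once a minute, so the capture may sit idle for a while before it prints anything.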
Unfortunately, this is what I'm going to have to do. The cluster is
very functional, though. I created around 15 VMs today and
migrated them from host to host without any problem.
Anything else you'd like me to try? This is currently dev, so I can
really do anything I want and I can just IPMI reboot the nodes if it
causes issues...
Thanks!
cecjr
On Mon, Aug 19, 2019 at 9:37 PM Tony Pearce <tonyppe(a)gmail.com> wrote:
>
> e.albany,
>
> STP is meant to block loops in layer 2. In basic operation, a root
> bridge is elected which is the root of the tree. This bridge sends
> what are essentially 'hello' messages as multicast frames. The switches
> use these to detect a loop in the network and block one of the links to
> prevent such things as a broadcast storm.
>
> There are different flavours of STP but "STP" usually means the hellos
> are sent over VLAN 1 (or no vlan). Therefore if you have multiple
> VLANs on links, the hellos are still only sent over VLAN 1 and all
> VLANs are dealt with that way. Meaning if a link is blocked then all
> VLANs are blocked on that link.
>
> Then came the different flavours, one of which is per-vlan STP. This
> allows individual VLANs to be blocked and gives more flexibility.
>
> After STP has dealt with the blocking, this link blocking will
> continue until a change in the network is detected. This is detected
> by the absence of the STP packets or the presence of new STP packets
> where there shouldn't be any. When this happens, STP packets are flooded
> everywhere to discover the new network topology. Ultimately, the loop
> will be blocked again.
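
The root election described above comes down to comparing bridge IDs: the bridge with the numerically lowest ID (priority first, then MAC address) becomes root. A minimal sketch, assuming IDs are written the way Linux prints them (four lowercase hex digits of priority, a dot, then the MAC without colons). The sample IDs are made up.

```shell
#!/bin/sh
# elect_root: read one bridge ID per line on stdin and print the winner.
# Because the format is fixed-width lowercase hex, a plain byte-wise sort
# orders the IDs numerically, so the first line after sorting is the root.
elect_root() {
    LC_ALL=C sort | head -n 1
}

# Made-up example: two bridges at the default priority 0x8000 and one whose
# priority was lowered to 0x6000.
printf '%s\n' \
    8000.0000aaaa0002 \
    6000.0000aaaa0003 \
    8000.0000aaaa0001 | elect_root
# prints 6000.0000aaaa0003
```

With equal priorities the lowest MAC wins, which is why an unplanned device can end up as root if nobody has set priorities on the switches.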
>
> I think that you have two STP versions running in your network and
> it's causing the issue. An easy test would be to remove the loop
> manually in the network and leave STP off on the oVirt host. You can
> view the topology as seen by the network's STP devices by obtaining
> info from the devices, such as bridge priorities. What is your network
> hardware?
>
> Regards,
>
> Tony
>
>
> On Tue, 20 Aug 2019 at 08:22, Staniforth, Paul
> <P.Staniforth(a)leedsbeckett.ac.uk> wrote:
> >
> > I haven't used FC with oVirt, but the following page shows the bridge
> > options available and how to enable Ethtool and FCoE.
> >
> >
> > https://ovirt.org/documentation/admin-guide/appe-Custom_Network_Propertie...
> >
> >
> > Regards,
> > Paul S.
> >
> > ________________________________________
> > From: ej.albany(a)gmail.com <ej.albany(a)gmail.com>
> > Sent: 17 August 2019 10:25
> > To: users(a)ovirt.org
> > Subject: [ovirt-users] Need to enable STP on ovirt bridges
> >
> > Hello. I have been trying to figure out an issue for a very long time.
> > That issue relates to the ethernet and 10gb fc links that I have on my
> > cluster being disabled any time a migration occurs.
> >
> > I believe this is because I need to have STP turned on in order to
> > participate with the switch. However, there does not seem to be any
> > way to tell oVirt to stop turning it off! Very frustrating.
> >
> > After entering a cronjob that enables stp on all bridges every 1
> > minute, the migration issue disappears....
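
For reference, a workaround of that kind could look roughly like the sketch below. It is not the poster's actual cronjob, which the message does not show. It assumes Linux bridges appear as /sys/class/net/<name>/bridge and uses `ip link set ... type bridge stp_state 1` to switch STP on. SYSFS_NET and DRY_RUN are hypothetical variables added so the logic can be tried without touching real interfaces.

```shell
#!/bin/sh
# Directory holding one entry per network device; overridable for testing.
SYSFS_NET=${SYSFS_NET:-/sys/class/net}

# list_bridges: print the name of every device that has a bridge/ subdirectory.
list_bridges() {
    for dev in "$SYSFS_NET"/*; do
        [ -d "$dev/bridge" ] && basename "$dev"
    done
    return 0
}

# enable_stp_all: turn STP on for each bridge found. With DRY_RUN=1 (the
# default in this sketch) the commands are only printed, not executed.
enable_stp_all() {
    list_bridges | while IFS= read -r br; do
        if [ "${DRY_RUN:-1}" = 1 ]; then
            echo "ip link set dev $br type bridge stp_state 1"
        else
            ip link set dev "$br" type bridge stp_state 1
        fi
    done
}
```

Calling enable_stp_all with DRY_RUN=0 as root would apply the change, but as the thread notes, vdsm may set the state back to off afterwards.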
> >
> > Is there any way at all to do without this cronjob and set STP to be
> > ON without having to resort to such a silly solution?
> >
> > Here are some details about my systems, if you need them.
> >
> > SELinux is disabled.
> >
> > [root@swm-02 ~]# rpm -qa | grep ovirt
> > ovirt-imageio-common-1.5.1-0.el7.x86_64
> > ovirt-release43-4.3.5.2-1.el7.noarch
> > ovirt-imageio-daemon-1.5.1-0.el7.noarch
> > ovirt-vmconsole-host-1.0.7-2.el7.noarch
> > ovirt-hosted-engine-setup-2.3.11-1.el7.noarch
> > ovirt-ansible-hosted-engine-setup-1.0.26-1.el7.noarch
> > python2-ovirt-host-deploy-1.8.0-1.el7.noarch
> > ovirt-ansible-engine-setup-1.1.9-1.el7.noarch
> > python2-ovirt-setup-lib-1.2.0-1.el7.noarch
> > cockpit-machines-ovirt-195.1-1.el7.noarch
> > ovirt-hosted-engine-ha-2.3.3-1.el7.noarch
> > ovirt-vmconsole-1.0.7-2.el7.noarch
> > cockpit-ovirt-dashboard-0.13.5-1.el7.noarch
> > ovirt-provider-ovn-driver-1.2.22-1.el7.noarch
> > ovirt-host-deploy-common-1.8.0-1.el7.noarch
> > ovirt-host-4.3.4-1.el7.x86_64
> > python-ovirt-engine-sdk4-4.3.2-2.el7.x86_64
> > ovirt-host-dependencies-4.3.4-1.el7.x86_64
> > ovirt-ansible-repositories-1.1.5-1.el7.noarch
> > [root@swm-02 ~]# cat /etc/redhat-release
> > CentOS Linux release 7.6.1810 (Core)
> > [root@swm-02 ~]# uname -r
> > 3.10.0-957.27.2.el7.x86_64
> > You have new mail in /var/spool/mail/root
> > [root@swm-02 ~]# ip a
> > 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
> >     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
> >     inet 127.0.0.1/8 scope host lo
> >        valid_lft forever preferred_lft forever
> > 2: em1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master test state UP group default qlen 1000
> >     link/ether d4:ae:52:8d:50:48 brd ff:ff:ff:ff:ff:ff
> > 3: em2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
> >     link/ether d4:ae:52:8d:50:49 brd ff:ff:ff:ff:ff:ff
> > 4: p1p1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovirtmgmt state UP group default qlen 1000
> >     link/ether 90:e2:ba:1e:14:80 brd ff:ff:ff:ff:ff:ff
> > 5: p1p2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
> >     link/ether 90:e2:ba:1e:14:81 brd ff:ff:ff:ff:ff:ff
> > 6: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
> >     link/ether a2:b8:d6:e8:b3:d8 brd ff:ff:ff:ff:ff:ff
> > 7: br-int: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
> >     link/ether 96:a0:c1:4a:45:4b brd ff:ff:ff:ff:ff:ff
> > 25: test: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
> >     link/ether d4:ae:52:8d:50:48 brd ff:ff:ff:ff:ff:ff
> >     inet 10.15.11.21/24 brd 10.15.11.255 scope global test
> >        valid_lft forever preferred_lft forever
> > 26: ovirtmgmt: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
> >     link/ether 90:e2:ba:1e:14:80 brd ff:ff:ff:ff:ff:ff
> >     inet 10.15.28.31/24 brd 10.15.28.255 scope global ovirtmgmt
> >        valid_lft forever preferred_lft forever
> > 27: ;vdsmdummy;: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
> >     link/ether 62:e5:e5:07:99:eb brd ff:ff:ff:ff:ff:ff
> > 29: vnet0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovirtmgmt state UNKNOWN group default qlen 1000
> >     link/ether fe:6f:9c:95:00:02 brd ff:ff:ff:ff:ff:ff
> > [root@swm-02 ~]# free -m
> >               total        used        free      shared  buff/cache   available
> > Mem:          64413        1873       61804           9         735       62062
> > Swap:         16383           0       16383
> > [root@swm-02 ~]# free -h
> >               total        used        free      shared  buff/cache   available
> > Mem:            62G        1.8G         60G        9.5M        735M         60G
> > Swap:           15G          0B         15G
> > [root@swm-02 ~]# ls
> > ls lsb_release lshw lslocks
> > lsmod lspci lssubsys
> > lsusb.py
> > lsattr lscgroup lsinitrd lslogins
> > lsns lss16toppm lstopo-no-graphics
> > lsblk lscpu lsipc lsmem
> > lsof lsscsi lsusb
> > [root@swm-02 ~]# lscpu
> > Architecture: x86_64
> > CPU op-mode(s): 32-bit, 64-bit
> > Byte Order: Little Endian
> > CPU(s): 16
> > On-line CPU(s) list: 0-15
> > Thread(s) per core: 2
> > Core(s) per socket: 4
> > Socket(s): 2
> > NUMA node(s): 2
> > Vendor ID: GenuineIntel
> > CPU family: 6
> > Model: 44
> > Model name: Intel(R) Xeon(R) CPU X5672 @ 3.20GHz
> > Stepping: 2
> > CPU MHz: 3192.064
> > BogoMIPS: 6384.12
> > Virtualization: VT-x
> > L1d cache: 32K
> > L1i cache: 32K
> > L2 cache: 256K
> > L3 cache: 12288K
> > NUMA node0 CPU(s): 0,2,4,6,8,10,12,14
> > NUMA node1 CPU(s): 1,3,5,7,9,11,13,15
> > Flags: fpu vme de pse tsc msr pae mce cx8 apic sep
> > mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht
> > tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts
> > rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq
> > dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca
> > sse4_1 sse4_2 popcnt aes lahf_lm ssbd ibrs ibpb stibp tpr_shadow vnmi
> > flexpriority ept vpid dtherm ida arat spec_ctrl intel_stibp flush_l1d
> > [root@swm-02 ~]#
> > _______________________________________________
> > Users mailing list -- users(a)ovirt.org
> > To unsubscribe send an email to users-leave(a)ovirt.org
> > Privacy Statement: https://eur02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.ovi...
> > oVirt Code of Conduct: https://eur02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.ovi...
> > List Archives: https://eur02.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.o...
> > To view the terms under which this email is distributed, please go to:-
> > http://leedsbeckett.ac.uk/disclaimer/email/