
Also, if it helps, the hosts will sit there, quietly, for hours or days before anything happens. They're up and working just fine. But then, when I manually migrate a VM from one host to another, they become completely inaccessible.
These are vanilla-as-possible CentOS7 nodes. Very basic ovirt install and configuration.
On Fri, Aug 23, 2019 at 11:33 AM Curtis E. Combs Jr. <ej.albany@gmail.com> wrote:
Hey Dominik,
Thanks for helping. I really want to try to use ovirt.
When these events happen, I cannot even SSH to the nodes due to the link being down. After a little while, the hosts come back...
On Fri, Aug 23, 2019 at 11:30 AM Dominik Holler <dholler@redhat.com> wrote:
Is your storage connected via NFS? Can you manually access the storage on the host?
On Fri, Aug 23, 2019 at 5:19 PM Curtis E. Combs Jr. <ej.albany@gmail.com> wrote:
Sorry to dead bump this, but I'm beginning to suspect that maybe it's not STP that's the problem.
2 of my hosts just went down when a few VMs tried to migrate.
Do any of you have any idea what might be going on here? I don't even know where to start. I'm going to include the dmesg in case it helps. This happens on both of the hosts whenever any migration attempts to start.
[68099.245833] bnx2 0000:01:00.0 em1: NIC Copper Link is Down
[68099.246055] internal: port 1(em1) entered disabled state
[68184.177343] ixgbe 0000:03:00.0 p1p1: NIC Link is Down
[68184.177789] ovirtmgmt: port 1(p1p1) entered disabled state
[68184.177856] ovirtmgmt: topology change detected, propagating
[68277.078671] INFO: task qemu-kvm:8888 blocked for more than 120 seconds.
[68277.078700] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[68277.078723] qemu-kvm D ffff9db40c359040 0 8888 1 0x000001a0
[68277.078727] Call Trace:
[68277.078738] [<ffffffff978fd2ac>] ? avc_has_perm_flags+0xdc/0x1c0
[68277.078743] [<ffffffff97d69f19>] schedule+0x29/0x70
[68277.078746] [<ffffffff9785f3d9>] inode_dio_wait+0xd9/0x100
[68277.078751] [<ffffffff976c4010>] ? wake_bit_function+0x40/0x40
[68277.078765] [<ffffffffc09d6dd6>] nfs_getattr+0x1b6/0x250 [nfs]
[68277.078768] [<ffffffff97848109>] vfs_getattr+0x49/0x80
[68277.078769] [<ffffffff97848185>] vfs_fstat+0x45/0x80
[68277.078771] [<ffffffff978486f4>] SYSC_newfstat+0x24/0x60
[68277.078774] [<ffffffff97d76d21>] ? system_call_after_swapgs+0xae/0x146
[68277.078778] [<ffffffff97739f34>] ? __audit_syscall_entry+0xb4/0x110
[68277.078782] [<ffffffff9763aaeb>] ? syscall_trace_enter+0x16b/0x220
[68277.078784] [<ffffffff97848ace>] SyS_newfstat+0xe/0x10
[68277.078786] [<ffffffff97d7706b>] tracesys+0xa3/0xc9
[68397.072384] INFO: task qemu-kvm:8888 blocked for more than 120 seconds.
[68397.072413] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[68397.072436] qemu-kvm D ffff9db40c359040 0 8888 1 0x000001a0
[68397.072439] Call Trace:
[68397.072453] [<ffffffff978fd2ac>] ? avc_has_perm_flags+0xdc/0x1c0
[68397.072458] [<ffffffff97d69f19>] schedule+0x29/0x70
[68397.072462] [<ffffffff9785f3d9>] inode_dio_wait+0xd9/0x100
[68397.072467] [<ffffffff976c4010>] ? wake_bit_function+0x40/0x40
[68397.072480] [<ffffffffc09d6dd6>] nfs_getattr+0x1b6/0x250 [nfs]
[68397.072485] [<ffffffff97848109>] vfs_getattr+0x49/0x80
[68397.072486] [<ffffffff97848185>] vfs_fstat+0x45/0x80
[68397.072488] [<ffffffff978486f4>] SYSC_newfstat+0x24/0x60
[68397.072491] [<ffffffff97d76d21>] ? system_call_after_swapgs+0xae/0x146
[68397.072495] [<ffffffff97739f34>] ? __audit_syscall_entry+0xb4/0x110
[68397.072498] [<ffffffff9763aaeb>] ? syscall_trace_enter+0x16b/0x220
[68397.072500] [<ffffffff97848ace>] SyS_newfstat+0xe/0x10
[68397.072502] [<ffffffff97d7706b>] tracesys+0xa3/0xc9
[68401.573141] bnx2 0000:01:00.0 em1: NIC Copper Link is Up, 1000 Mbps full duplex
[68401.573247] internal: port 1(em1) entered blocking state
[68401.573255] internal: port 1(em1) entered listening state
[68403.576985] internal: port 1(em1) entered learning state
[68405.580907] internal: port 1(em1) entered forwarding state
[68405.580916] internal: topology change detected, propagating
[68469.565589] nfs: server swm-01.hpc.moffitt.org not responding, timed out
[68469.565840] nfs: server swm-01.hpc.moffitt.org not responding, timed out
[68487.193932] ixgbe 0000:03:00.0 p1p1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
[68487.194105] ovirtmgmt: port 1(p1p1) entered blocking state
[68487.194114] ovirtmgmt: port 1(p1p1) entered listening state
[68489.196508] ovirtmgmt: port 1(p1p1) entered learning state
[68491.200400] ovirtmgmt: port 1(p1p1) entered forwarding state
[68491.200405] ovirtmgmt: topology change detected, sending tcn bpdu
[68493.672423] NFS: nfs4_reclaim_open_state: Lock reclaim failed!
[68494.777996] NFSD: client 10.15.28.22 testing state ID with incorrect client ID
[68494.778580] NFSD: client 10.15.28.22 testing state ID with incorrect client ID
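In case it helps anyone reading the dmesg output above: the link-down and link-up timestamps can be paired up to see how long each NIC was actually gone. Below is a rough sketch that does this. The function name link_outages and the regular expression are my own, written only against the message formats that appear in this log, so other drivers may word their messages differently. For the four link lines in this log it gives roughly 302 s for em1 and 303 s for p1p1, i.e. both links stayed down for about five minutes.

```python
import re

# Matches kernel link messages in the form seen in the log above, e.g.
#   [68099.245833] bnx2 0000:01:00.0 em1: NIC Copper Link is Down
#   [68487.193932] ixgbe 0000:03:00.0 p1p1: NIC Link is Up 10 Gbps, ...
# Assumption: other NIC drivers may use different wording.
LINK_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+\S+\s+\S+\s+(?P<nic>\S+): "
    r"NIC (?:Copper )?Link is (?P<state>Up|Down)"
)


def link_outages(dmesg_text):
    """Return a list of (nic, down_ts, up_ts, seconds) for each Down/Up pair."""
    down_at = {}   # nic name -> timestamp of the most recent "Down"
    outages = []
    for match in LINK_RE.finditer(dmesg_text):
        ts = float(match.group("ts"))
        nic = match.group("nic")
        if match.group("state") == "Down":
            down_at[nic] = ts
        elif nic in down_at:
            start = down_at.pop(nic)
            outages.append((nic, start, ts, round(ts - start, 3)))
    return outages


# The four link state lines copied from the dmesg output above.
SAMPLE = """\
[68099.245833] bnx2 0000:01:00.0 em1: NIC Copper Link is Down
[68184.177343] ixgbe 0000:03:00.0 p1p1: NIC Link is Down
[68401.573141] bnx2 0000:01:00.0 em1: NIC Copper Link is Up, 1000 Mbps full duplex
[68487.193932] ixgbe 0000:03:00.0 p1p1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
"""

if __name__ == "__main__":
    for nic, down, up, seconds in link_outages(SAMPLE):
        print(f"{nic}: link down for {seconds} s ({down} to {up})")
```

On a host, the same function could be fed the text of `dmesg` saved to a file; it only reads text and does not touch the network configuration.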
On Thu, Aug 22, 2019 at 2:53 PM Curtis E. Combs Jr. <ej.albany@gmail.com> wrote:
Thanks, I'm just going to revert back to bridges.
On Thu, Aug 22, 2019 at 11:50 AM Dominik Holler <dholler@redhat.com> wrote:
On Thu, Aug 22, 2019 at 3:06 PM Curtis E. Combs Jr. <ej.albany@gmail.com> wrote:
Seems like the STP options are so common and necessary that it would be a priority over seldom-used bridge_opts. I know what STP is and I'm not even a networking guy - never even heard of half of the bridge_opts that have switches in the UI.
Anyway. I wanted to try the openvswitches, so I reinstalled all of my nodes and used "openvswitch (Technology Preview)" as the engine-setup option for the first host. I made a new Cluster for my nodes, added them all to the new cluster, created a new "logical network" for the internal network and attached it to the internal network ports.
Now, when I go to create a new VM, I don't even have either the ovirtmgmt switch OR the internal switch as an option. The drop-down is empty, as if I don't have any vnic-profiles.
openvswitch clusters are limited to OVN networks. You can create one as described in https://www.ovirt.org/documentation/admin-guide/chap-External_Providers.html...
On Thu, Aug 22, 2019 at 7:34 AM Tony Pearce <tonyppe@gmail.com> wrote:
>
> Hi Dominik, would you mind sharing the use case for stp via API Only? I am keen to know this.
> Thanks
>
> On Thu., 22 Aug. 2019, 19:24 Dominik Holler, <dholler@redhat.com> wrote:
>>
>> On Thu, Aug 22, 2019 at 1:08 PM Miguel Duarte de Mora Barroso <mdbarroso@redhat.com> wrote:
>>>
>>> On Sat, Aug 17, 2019 at 11:27 AM <ej.albany@gmail.com> wrote:
>>> >
>>> > Hello. I have been trying to figure out an issue for a very long time.
>>> > That issue relates to the ethernet and 10gb fc links that I have on my
>>> > cluster being disabled any time a migration occurs.
>>> >
>>> > I believe this is because I need to have STP turned on in order to
>>> > participate with the switch. However, there does not seem to be any
>>> > way to tell oVirt to stop turning it off! Very frustrating.
>>> >
>>> > After entering a cronjob that enables stp on all bridges every 1
>>> > minute, the migration issue disappears....
>>> >
>>> > Is there any way at all to do without this cronjob and set STP to be
>>> > ON without having to resort to such a silly solution?
>>>
>>> Vdsm exposes a per bridge STP knob that you can use for this. By
>>> default it is set to false, which is probably why you had to use this
>>> shenanigan.
>>>
>>> You can, for instance:
>>>
>>> # show present state
>>> [vagrant@vdsm ~]$ ip a
>>> 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
>>> group default qlen 1000
>>>     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
>>>     inet 127.0.0.1/8 scope host lo
>>>        valid_lft forever preferred_lft forever
>>> 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast
>>> state UP group default qlen 1000
>>>     link/ether 52:54:00:41:fb:37 brd ff:ff:ff:ff:ff:ff
>>> 3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast
>>> state UP group default qlen 1000
>>>     link/ether 52:54:00:83:5b:6f brd ff:ff:ff:ff:ff:ff
>>>     inet 192.168.50.50/24 brd 192.168.50.255 scope global noprefixroute eth1
>>>        valid_lft forever preferred_lft forever
>>>     inet6 fe80::5054:ff:fe83:5b6f/64 scope link
>>>        valid_lft forever preferred_lft forever
>>> 19: ;vdsmdummy;: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN
>>> group default qlen 1000
>>>     link/ether 8e:5c:2e:87:fa:0b brd ff:ff:ff:ff:ff:ff
>>>
>>> # show example bridge configuration - you're looking for the STP knob here.
>>> [root@vdsm ~]$ cat bridged_net_with_stp
>>> {
>>>   "bondings": {},
>>>   "networks": {
>>>     "test-network": {
>>>       "nic": "eth0",
>>>       "switch": "legacy",
>>>       "bridged": true,
>>>       "stp": true
>>>     }
>>>   },
>>>   "options": {
>>>     "connectivityCheck": false
>>>   }
>>> }
>>>
>>> # issue setup networks command:
>>> [root@vdsm ~]$ vdsm-client -f bridged_net_with_stp Host setupNetworks
>>> {
>>>     "code": 0,
>>>     "message": "Done"
>>> }
>>>
>>> # show bridges
>>> [root@vdsm ~]$ brctl show
>>> bridge name     bridge id               STP enabled     interfaces
>>> ;vdsmdummy;     8000.000000000000       no
>>> test-network    8000.52540041fb37       yes             eth0
>>>
>>> # show final state
>>> [root@vdsm ~]$ ip a
>>> 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
>>> group default qlen 1000
>>>     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
>>>     inet 127.0.0.1/8 scope host lo
>>>        valid_lft forever preferred_lft forever
>>> 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast
>>> master test-network state UP group default qlen 1000
>>>     link/ether 52:54:00:41:fb:37 brd ff:ff:ff:ff:ff:ff
>>> 3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast
>>> state UP group default qlen 1000
>>>     link/ether 52:54:00:83:5b:6f brd ff:ff:ff:ff:ff:ff
>>>     inet 192.168.50.50/24 brd 192.168.50.255 scope global noprefixroute eth1
>>>        valid_lft forever preferred_lft forever
>>>     inet6 fe80::5054:ff:fe83:5b6f/64 scope link
>>>        valid_lft forever preferred_lft forever
>>> 19: ;vdsmdummy;: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN
>>> group default qlen 1000
>>>     link/ether 8e:5c:2e:87:fa:0b brd ff:ff:ff:ff:ff:ff
>>> 432: test-network: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc
>>> noqueue state UP group default qlen 1000
>>>     link/ether 52:54:00:41:fb:37 brd ff:ff:ff:ff:ff:ff
>>>
>>> I don't think this STP parameter is exposed via engine UI; @Dominik
>>> Holler , could you confirm ? What are our plans for it ?
>>>
>>
>> STP is only available via REST-API, see
>> http://ovirt.github.io/ovirt-engine-api-model/4.3/#types/network
>> please find an example how to enable STP in
>> https://gist.github.com/dominikholler/4e70c9ef9929d93b6807f56d43a70b95
>>
>> We have no plans to add STP to the web ui,
>> but new feature requests are always welcome on
>> https://bugzilla.redhat.com/enter_bug.cgi?product=ovirt-engine
>>
>>>
>>> >
>>> > Here are some details about my systems, if you need it.
>>> >
>>> > selinux is disabled.
>>> >
>>> > [root@swm-02 ~]# rpm -qa | grep ovirt
>>> > ovirt-imageio-common-1.5.1-0.el7.x86_64
>>> > ovirt-release43-4.3.5.2-1.el7.noarch
>>> > ovirt-imageio-daemon-1.5.1-0.el7.noarch
>>> > ovirt-vmconsole-host-1.0.7-2.el7.noarch
>>> > ovirt-hosted-engine-setup-2.3.11-1.el7.noarch
>>> > ovirt-ansible-hosted-engine-setup-1.0.26-1.el7.noarch
>>> > python2-ovirt-host-deploy-1.8.0-1.el7.noarch
>>> > ovirt-ansible-engine-setup-1.1.9-1.el7.noarch
>>> > python2-ovirt-setup-lib-1.2.0-1.el7.noarch
>>> > cockpit-machines-ovirt-195.1-1.el7.noarch
>>> > ovirt-hosted-engine-ha-2.3.3-1.el7.noarch
>>> > ovirt-vmconsole-1.0.7-2.el7.noarch
>>> > cockpit-ovirt-dashboard-0.13.5-1.el7.noarch
>>> > ovirt-provider-ovn-driver-1.2.22-1.el7.noarch
>>> > ovirt-host-deploy-common-1.8.0-1.el7.noarch
>>> > ovirt-host-4.3.4-1.el7.x86_64
>>> > python-ovirt-engine-sdk4-4.3.2-2.el7.x86_64
>>> > ovirt-host-dependencies-4.3.4-1.el7.x86_64
>>> > ovirt-ansible-repositories-1.1.5-1.el7.noarch
>>> > [root@swm-02 ~]# cat /etc/redhat-release
>>> > CentOS Linux release 7.6.1810 (Core)
>>> > [root@swm-02 ~]# uname -r
>>> > 3.10.0-957.27.2.el7.x86_64
>>> > You have new mail in /var/spool/mail/root
>>> > [root@swm-02 ~]# ip a
>>> > 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
>>> > group default qlen 1000
>>> >     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
>>> >     inet 127.0.0.1/8 scope host lo
>>> >        valid_lft forever preferred_lft forever
>>> > 2: em1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master
>>> > test state UP group default qlen 1000
>>> >     link/ether d4:ae:52:8d:50:48 brd ff:ff:ff:ff:ff:ff
>>> > 3: em2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group
>>> > default qlen 1000
>>> >     link/ether d4:ae:52:8d:50:49 brd ff:ff:ff:ff:ff:ff
>>> > 4: p1p1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master
>>> > ovirtmgmt state UP group default qlen 1000
>>> >     link/ether 90:e2:ba:1e:14:80 brd ff:ff:ff:ff:ff:ff
>>> > 5: p1p2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group
>>> > default qlen 1000
>>> >     link/ether 90:e2:ba:1e:14:81 brd ff:ff:ff:ff:ff:ff
>>> > 6: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN
>>> > group default qlen 1000
>>> >     link/ether a2:b8:d6:e8:b3:d8 brd ff:ff:ff:ff:ff:ff
>>> > 7: br-int: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group
>>> > default qlen 1000
>>> >     link/ether 96:a0:c1:4a:45:4b brd ff:ff:ff:ff:ff:ff
>>> > 25: test: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue
>>> > state UP group default qlen 1000
>>> >     link/ether d4:ae:52:8d:50:48 brd ff:ff:ff:ff:ff:ff
>>> >     inet 10.15.11.21/24 brd 10.15.11.255 scope global test
>>> >        valid_lft forever preferred_lft forever
>>> > 26: ovirtmgmt: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc
>>> > noqueue state UP group default qlen 1000
>>> >     link/ether 90:e2:ba:1e:14:80 brd ff:ff:ff:ff:ff:ff
>>> >     inet 10.15.28.31/24 brd 10.15.28.255 scope global ovirtmgmt
>>> >        valid_lft forever preferred_lft forever
>>> > 27: ;vdsmdummy;: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN
>>> > group default qlen 1000
>>> >     link/ether 62:e5:e5:07:99:eb brd ff:ff:ff:ff:ff:ff
>>> > 29: vnet0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master
>>> > ovirtmgmt state UNKNOWN group default qlen 1000
>>> >     link/ether fe:6f:9c:95:00:02 brd ff:ff:ff:ff:ff:ff
>>> > [root@swm-02 ~]# free -m
>>> >        total   used    free   shared  buff/cache  available
>>> > Mem:   64413   1873   61804        9         735      62062
>>> > Swap:  16383      0   16383
>>> > [root@swm-02 ~]# free -h
>>> >        total   used    free   shared  buff/cache  available
>>> > Mem:     62G   1.8G     60G     9.5M        735M        60G
>>> > Swap:    15G     0B     15G
>>> > [root@swm-02 ~]# ls
>>> > ls lsb_release lshw lslocks
>>> > lsmod lspci lssubsys
>>> > lsusb.py
>>> > lsattr lscgroup lsinitrd lslogins
>>> > lsns lss16toppm lstopo-no-graphics
>>> > lsblk lscpu lsipc lsmem
>>> > lsof lsscsi lsusb
>>> > [root@swm-02 ~]# lscpu
>>> > Architecture:          x86_64
>>> > CPU op-mode(s):        32-bit, 64-bit
>>> > Byte Order:            Little Endian
>>> > CPU(s):                16
>>> > On-line CPU(s) list:   0-15
>>> > Thread(s) per core:    2
>>> > Core(s) per socket:    4
>>> > Socket(s):             2
>>> > NUMA node(s):          2
>>> > Vendor ID:             GenuineIntel
>>> > CPU family:            6
>>> > Model:                 44
>>> > Model name:            Intel(R) Xeon(R) CPU X5672 @ 3.20GHz
>>> > Stepping:              2
>>> > CPU MHz:               3192.064
>>> > BogoMIPS:              6384.12
>>> > Virtualization:        VT-x
>>> > L1d cache:             32K
>>> > L1i cache:             32K
>>> > L2 cache:              256K
>>> > L3 cache:              12288K
>>> > NUMA node0 CPU(s):     0,2,4,6,8,10,12,14
>>> > NUMA node1 CPU(s):     1,3,5,7,9,11,13,15
>>> > Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep
>>> > mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht
>>> > tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts
>>> > rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq
>>> > dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca
>>> > sse4_1 sse4_2 popcnt aes lahf_lm ssbd ibrs ibpb stibp tpr_shadow vnmi
>>> > flexpriority ept vpid dtherm ida arat spec_ctrl intel_stibp flush_l1d
>>> > [root@swm-02 ~]#
>>> > _______________________________________________
>>> > Users mailing list -- users@ovirt.org
>>> > To unsubscribe send an email to users-leave@ovirt.org
>>> > Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>>> > oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
>>> > List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/MTMZ5MF4CF2VR2...
>>
>> _______________________________________________
>> Users mailing list -- users@ovirt.org
>> To unsubscribe send an email to users-leave@ovirt.org
>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>> oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
>> List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/QBA7NYKAJNREIV...
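
For anyone who wants to script the vdsm approach quoted earlier in this thread, here is a minimal sketch that builds the same kind of payload as the bridged_net_with_stp file for one or more networks. The function name build_setup_networks is made up for illustration; the keys and values ("switch": "legacy", "bridged", "stp", "connectivityCheck") are copied from the quoted example and nothing else is assumed about what vdsm accepts. It only produces JSON. Applying it would still be done with vdsm-client -f <file> Host setupNetworks as in that example, and the thread does not say whether the engine later overrides a setting applied this way.

```python
import json


def build_setup_networks(nic_by_network, stp=True):
    """Build a setupNetworks payload shaped like the quoted bridged_net_with_stp file.

    nic_by_network maps a network (bridge) name to the NIC it sits on,
    e.g. {"test-network": "eth0"}. Every network gets the same stp value.
    """
    networks = {
        name: {
            "nic": nic,
            "switch": "legacy",   # as in the quoted example
            "bridged": True,
            "stp": stp,           # the per-bridge STP knob discussed in the thread
        }
        for name, nic in nic_by_network.items()
    }
    return {
        "bondings": {},
        "networks": networks,
        "options": {"connectivityCheck": False},
    }


if __name__ == "__main__":
    # Same network and NIC names as the quoted example.
    payload = build_setup_networks({"test-network": "eth0"})
    print(json.dumps(payload, indent=4))
```

Redirecting the printed output to a file gives something that can be passed to vdsm-client with -f in the same way as the example file.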