Configuration and versions

The next email will include the log files.

 

2 sites

First site: Bayview

4 nodes, BL460 Gen9, each with 4 x 10G NICs.

Nodes 1-3 have not been changed since the 4.3.2 upgrade. These nodes have the network-sync issue and cannot migrate VMs.

OS Version: RHEL - 7 - 6.1810.2.el7.centos
OS Description: CentOS Linux 7 (Core)
Kernel Version: 3.10.0-957.10.1.el7.x86_64
KVM Version: 2.12.0-18.el7_6.3.1
LIBVIRT Version: libvirt-4.5.0-10.el7_6.6
VDSM Version: vdsm-4.30.11-1.el7
SPICE Version: 0.14.0-6.el7_6.1
CEPH Version: librbd1-10.2.5-4.el7
Open vSwitch Version: openvswitch-2.10.1-3.el7
Kernel Features: PTI: 1, IBRS: 0, RETP: 1
VNC Encryption: Disabled

 

I evacuated node 4 and updated it to 4.3.3. It still had issues, so I tried a clean oVirt Node ISO install; that node does not have the network-sync issue. However, I cannot upgrade the remaining nodes without scheduling cluster downtime.

OS Version: RHEL - 7 - 6.1810.2.el7.centos
OS Description: oVirt Node 4.3.3.1
Kernel Version: 3.10.0-957.10.1.el7.x86_64
KVM Version: 2.12.0-18.el7_6.3.1
LIBVIRT Version: libvirt-4.5.0-10.el7_6.6
VDSM Version: vdsm-4.30.13-1.el7
SPICE Version: 0.14.0-6.el7_6.1
CEPH Version: librbd1-10.2.5-4.el7
Open vSwitch Version: openvswitch-2.10.1-3.el7
Kernel Features: PTI: 1, IBRS: 0, RETP: 1, SSBD: 3
VNC Encryption: Disabled

 

1 engine, a DL360 Gen10, running 4.3.3.5-1.el7 on CentOS 7, patched up to today.

Storage: Dell Unity iSCSI

 

What I see in the engine is that the first 3 nodes all show the public network as out of sync with the DC. I can migrate VMs from node 4 to the other 3 nodes, but I cannot migrate off those 3 nodes, and I cannot sync the network on them either. There have been no recent network changes.

After the 4.3.3 upgrade I initially found some curious OVN errors in the log file on nodes 3 and 4; nodes 1 and 2 do not have these errors. The engine also had 2 extra OVN ports defined.

ovs-vsctl show
be10cd3d-85fe-4985-9635-f447bfbc5e25
    Bridge br-int
        fail_mode: secure
        Port "ovn-877214-0"
            Interface "ovn-877214-0"
                type: geneve
                options: {csum="true", key=flow, remote_ip="137.187.160.14"}
                error: "could not add network device ovn-877214-0 to ofproto (File exists)"
        Port "ovn-48e040-0"
            Interface "ovn-48e040-0"
                type: geneve
                options: {csum="true", key=flow, remote_ip="137.187.160.18"}
        Port "ovn-d6eaa1-0"
            Interface "ovn-d6eaa1-0"
                type: geneve
                options: {csum="true", key=flow, remote_ip="137.187.160.13"}
                error: "could not add network device ovn-d6eaa1-0 to ofproto (File exists)"
        Port br-int
            Interface br-int
                type: internal
        Port "ovn-f0f789-0"
            Interface "ovn-f0f789-0"
                type: geneve
                options: {csum="true", key=flow, remote_ip="137.187.160.13"}
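As a quick sanity check (a small self-contained sketch, not part of the diagnosis itself), the output above can be parsed to flag tunnel ports whose interface reported an error and remote_ips that appear on more than one geneve port. The scan() helper below is purely illustrative and only operates on the pasted text:

```python
# Sketch: parse `ovs-vsctl show` output and flag (a) ports whose interface
# reported an error and (b) remote_ips with more than one geneve tunnel port.
import re
from collections import defaultdict

OVS_OUTPUT = """\
be10cd3d-85fe-4985-9635-f447bfbc5e25
    Bridge br-int
        fail_mode: secure
        Port "ovn-877214-0"
            Interface "ovn-877214-0"
                type: geneve
                options: {csum="true", key=flow, remote_ip="137.187.160.14"}
                error: "could not add network device ovn-877214-0 to ofproto (File exists)"
        Port "ovn-48e040-0"
            Interface "ovn-48e040-0"
                type: geneve
                options: {csum="true", key=flow, remote_ip="137.187.160.18"}
        Port "ovn-d6eaa1-0"
            Interface "ovn-d6eaa1-0"
                type: geneve
                options: {csum="true", key=flow, remote_ip="137.187.160.13"}
                error: "could not add network device ovn-d6eaa1-0 to ofproto (File exists)"
        Port br-int
            Interface br-int
                type: internal
        Port "ovn-f0f789-0"
            Interface "ovn-f0f789-0"
                type: geneve
                options: {csum="true", key=flow, remote_ip="137.187.160.13"}
"""

def scan(output):
    """Return (error_ports, duplicate_remote_ips) from `ovs-vsctl show` text."""
    tunnels = defaultdict(list)   # remote_ip -> [port names]
    errors = []
    port = None
    for line in output.splitlines():
        m = re.match(r'\s*Port "?([^"\s]+)"?', line)
        if m:
            port = m.group(1)
            continue
        m = re.search(r'remote_ip="([^"]+)"', line)
        if m and port:
            tunnels[m.group(1)].append(port)
        if "error:" in line and port:
            errors.append(port)
    dupes = {ip: ports for ip, ports in tunnels.items() if len(ports) > 1}
    return errors, dupes

errors, dupes = scan(OVS_OUTPUT)
print("ports with errors:", errors)
print("duplicate tunnels:", dupes)
```

On this output it flags ovn-877214-0 and ovn-d6eaa1-0 as failed, and shows two tunnels to 137.187.160.13, which lines up with the 2 extra OVN ports the engine had defined.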

 

 

Second site: Harbor

I upgraded to 4.3.2 and have held off any further upgrades until I can deal with the migration issue and the need to restart vdsmd.

The nodes are 3 Supermicro 1U servers, each with 2 x 10G NICs.

All 3 nodes are identical and still on the 4.3.2 update.

OS Version: RHEL - 7 - 6.1810.2.el7.centos
OS Description: CentOS Linux 7 (Core)
Kernel Version: 3.10.0-957.10.1.el7.x86_64
KVM Version: 2.12.0-18.el7_6.3.1
LIBVIRT Version: libvirt-4.5.0-10.el7_6.6
VDSM Version: vdsm-4.30.11-1.el7
SPICE Version: 0.14.0-6.el7_6.1
GlusterFS Version: [N/A]
CEPH Version: librbd1-10.2.5-4.el7
Open vSwitch Version: openvswitch-2.10.1-3.el7
Kernel Features: PTI: 1, IBRS: 0, RETP: 1
VNC Encryption: Disabled

 

1 engine, a DL360 Gen10 with 2 x 10G NICs, running 4.3.2.1-1.el7.

Storage: Dell Unity iSCSI

 

The only issue at the Harbor site while running 4.3.2 is that when I migrate a VM off a node, the migration never finishes in the engine until I restart vdsmd on the original host. This issue did not occur under 4.2.x.