Hi all,
I'm building a home lab using oVirt+GlusterFS in a hyperconverged(ish) setup.
My setup consists of two nodes, each with an ASRock H110M-STX
motherboard, an Intel Pentium G4560 3.5 GHz CPU and 16 GB RAM. The
motherboard has integrated Intel Gigabit I219V LAN. At the moment I'm
using a Raspberry Pi as the Gluster arbiter node. The nodes are
connected to a basic "desktop switch" with no management features.
The hardware is nowhere near perfect, but it gets the job done and is
enough for playing around. However, I'm having problems getting OVN to
work properly and I have no idea where to look next.
oVirt is set up like this:
oVirt engine host oe / 10.0.1.101
oVirt hypervisor host o2 / 10.0.1.18
oVirt hypervisor host o3 / 10.0.1.21
OVN network 10.0.200.0/24
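As a quick sanity check of the addressing above, here is a minimal sketch using Python's standard ipaddress module. It assumes the hosts sit in 10.0.1.0/24 (the prefix used later in this post); the helper name network_of is only illustrative.

```python
import ipaddress

# Networks and host addresses copied from the list above.
# The /24 prefix on the host network is an assumption based on
# the 10.0.1.0/24 network mentioned elsewhere in the post.
MGMT_NET = ipaddress.ip_network("10.0.1.0/24")
OVN_NET = ipaddress.ip_network("10.0.200.0/24")

HOSTS = {
    "oe": "10.0.1.101",  # oVirt engine
    "o2": "10.0.1.18",   # hypervisor
    "o3": "10.0.1.21",   # hypervisor
}


def network_of(address):
    """Return which of the two networks an address belongs to."""
    addr = ipaddress.ip_address(address)
    if addr in MGMT_NET:
        return "management"
    if addr in OVN_NET:
        return "ovn"
    return "unknown"


if __name__ == "__main__":
    for name, address in HOSTS.items():
        print(name, address, network_of(address))
    # The OVN network should not overlap the network the hosts live in.
    print("networks overlap:", MGMT_NET.overlaps(OVN_NET))
```

This only confirms that the host network and the OVN network are distinct; it says nothing about how the tunnel between the hypervisors behaves.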
When I spin up VMs on o2 and o3 with IP addresses in the 10.0.1.0/24
network, everything works fine. The VMs can talk to each other without
any problems.
The problems show up when I try to use an OVN-based network between
virtual machines. If the virtual machines are on the same hypervisor,
everything seems to work OK. But if I have one virtual machine on
hypervisor o2 and another on hypervisor o3, TCP connections don't work
reliably. UDP seems to be OK, and it's possible to ping hosts, do DNS
and NTP queries and so on.
The problem with TCP is that, for example, when I open an SSH
connection to another host, the connection just hangs at some point,
and most of the time it's not even possible to log in before it hangs.
If I look at tcpdump at that point, it looks like the packets never
reach the destination. Also, if I have multiple connections open, all
of them hang at the same time.
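One way to reproduce the hang without SSH is a small TCP echo test between two VMs on the OVN network. The sketch below is only an illustration: the helper names start_echo_server and probe_tcp, the port and the payload sizes are all made up for this example and are not part of oVirt or OVN. It sends payloads of increasing size and records which ones come back within a timeout.

```python
import socket
import socketserver
import threading


class EchoHandler(socketserver.BaseRequestHandler):
    """Echo every received chunk back to the sender."""

    def handle(self):
        while True:
            data = self.request.recv(65536)
            if not data:
                break
            self.request.sendall(data)


def start_echo_server(host="0.0.0.0", port=0):
    """Start a threaded echo server in the background (hypothetical helper)."""
    server = socketserver.ThreadingTCPServer((host, port), EchoHandler)
    server.daemon_threads = True
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def probe_tcp(host, port, sizes=(64, 512, 1400, 1500, 4096), timeout=5.0):
    """Send one payload per size; map each size to True if it was echoed intact."""
    results = {}
    for size in sizes:
        payload = b"x" * size
        try:
            with socket.create_connection((host, port), timeout=timeout) as sock:
                sock.sendall(payload)
                received = b""
                while len(received) < size:
                    chunk = sock.recv(65536)
                    if not chunk:
                        break
                    received += chunk
            results[size] = received == payload
        except OSError:
            # Covers timeouts, resets and refused connections.
            results[size] = False
    return results


if __name__ == "__main__":
    # Local self-test; on real VMs, run the server on one VM and call
    # probe_tcp() from the other with that VM's OVN address and port.
    server = start_echo_server("127.0.0.1", 0)
    print(probe_tcp("127.0.0.1", server.server_address[1]))
    server.shutdown()
```

If small payloads come back and larger ones time out, that would hint that packet size plays a role, but that is only a guess at this point.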
I have tried switching off tx checksum and other similar settings, but
it didn't make any difference.
I suspect that the hardware is not good enough. Before looking into new
hardware, I'd like some confirmation that everything is set up
correctly.
When setting up oVirt/OVN I had to run the following undocumented command
to get it working at all: vdsm-tool ovn-config 10.0.1.101 10.0.1.21
(oVirt engine IP, hypervisor IP). This in particular makes me think that
I have missed some crucial part of the configuration.
On the oVirt engine, /var/log/openvswitch/ovsdb-server-nb.log contains
these messages:
2018-05-06T08:30:05.418Z|00913|stream_ssl|WARN|SSL_read: unexpected SSL connection close
2018-05-06T08:30:05.418Z|00914|jsonrpc|WARN|ssl:127.0.0.1:53152: receive error: Protocol error
2018-05-06T08:30:05.419Z|00915|reconnect|WARN|ssl:127.0.0.1:53152: connection dropped (Protocol error)
To be honest, I'm not sure what's causing those messages or whether
they are related. I found some bug reports stating that they are not
critical.
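To see how often these messages occur and which component logs them, the lines can be split on the "|" separator. This is a minimal sketch that assumes every line follows the timestamp|sequence|module|level|message layout seen in the three lines quoted above; the function name parse_line is only illustrative.

```python
from collections import Counter
from datetime import datetime, timezone

# The three lines quoted above, with the mail wrapping undone.
SAMPLE = """\
2018-05-06T08:30:05.418Z|00913|stream_ssl|WARN|SSL_read: unexpected SSL connection close
2018-05-06T08:30:05.418Z|00914|jsonrpc|WARN|ssl:127.0.0.1:53152: receive error: Protocol error
2018-05-06T08:30:05.419Z|00915|reconnect|WARN|ssl:127.0.0.1:53152: connection dropped (Protocol error)
"""


def parse_line(line):
    """Split one log line into its five fields (layout assumed from the sample)."""
    timestamp, sequence, module, level, message = line.rstrip("\n").split("|", 4)
    when = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%S.%fZ")
    return {
        "time": when.replace(tzinfo=timezone.utc),
        "sequence": int(sequence),
        "module": module,
        "level": level,
        "message": message,
    }


if __name__ == "__main__":
    entries = [parse_line(line) for line in SAMPLE.splitlines() if line]
    # Count messages per (module, level) pair.
    for key, count in Counter((e["module"], e["level"]) for e in entries).items():
        print(key, count)
```

Run against the full log file instead of SAMPLE, this would show whether the warnings line up in time with the TCP hangs or appear independently of them.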
Any ideas what to do next or should I just get better hardware? :)
Best regards,
Samuli Heinonen