When I try the above commands on the node hosts the following happens:

[root@ath01-ovirt01 certs]# ovn-nbctl get-ssl
Private key: /etc/pki/ovirt-engine/keys/ovn-ndb.key.nopass
Certificate: /etc/pki/ovirt-engine/certs/ovn-ndb.cer
CA Certificate: /etc/pki/ovirt-engine/ca.pem
Bootstrap: false
[root@ath01-ovirt01 certs]# ovn-nbctl get-connection
ptcp:6641
[root@ath01-ovirt01 certs]# ovn-sbctl get-ssl
Private key: /etc/pki/ovirt-engine/keys/ovn-sdb.key.nopass
Certificate: /etc/pki/ovirt-engine/certs/ovn-sdb.cer
CA Certificate: /etc/pki/ovirt-engine/ca.pem
Bootstrap: false
[root@ath01-ovirt01 certs]# ovn-sbctl get-connection
read-write role="" ptcp:6642
[root@ath01-ovirt01 certs]# ls -l /etc/pki/ovirt-engine/keys/ovn-*
-rw-r-----. 1 root hugetlbfs 1828 Jun 25 11:08 /etc/pki/ovirt-engine/keys/ovn-ndb.key.nopass
-rw-------. 1 root root 2893 Jun 25 11:08 /etc/pki/ovirt-engine/keys/ovn-ndb.p12
-rw-r-----. 1 root hugetlbfs 1828 Jun 25 11:08 /etc/pki/ovirt-engine/keys/ovn-sdb.key.nopass
-rw-------. 1 root root 2893 Jun 25 11:08 /etc/pki/ovirt-engine/keys/ovn-sdb.p12
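The part of the get-connection output above that matters for SSL is the listener prefix: ptcp: is a plaintext TCP listener, pssl: is an SSL listener. A minimal sketch of a check follows; classify_listener is a hypothetical helper name used only for illustration, not an OVN command.

```shell
#!/bin/sh
# classify_listener is a hypothetical helper, not part of OVN.
# It takes one connection target as printed by
# "ovn-nbctl get-connection" or "ovn-sbctl get-connection"
# and reports which transport the database listens on.
classify_listener() {
    case "$1" in
        *pssl:*) echo "ssl" ;;
        *ptcp:*) echo "plaintext" ;;
        *)       echo "unknown" ;;
    esac
}

# Example usage on the engine (requires OVN to be installed):
#   classify_listener "$(ovn-sbctl get-connection)"
```

The hosts in this thread use ssl:OVIRT_ENGINE_IP:6642 as ovn-remote, so a "plaintext" result would be consistent with the failed SSL handshakes. ovn-nbctl and ovn-sbctl each provide a set-connection command (for example, ovn-sbctl set-connection pssl:6642) that changes the listener; whether that is the appropriate fix for this setup is an assumption, not something the thread confirms at this point.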
2020-09-14T07:18:38.187Z|219636|reconnect|WARN|tcp:DC02-host01:33146: connection dropped (Protocol error)
2020-09-14T07:18:41.946Z|219637|reconnect|WARN|tcp:DC01-host01:51188: connection dropped (Protocol error)
2020-09-14T07:18:43.033Z|219638|reconnect|WARN|tcp:DC01-host02:37044: connection dropped (Protocol error)
2020-09-14T07:18:46.198Z|219639|reconnect|WARN|tcp:DC02-host01:33148: connection dropped (Protocol error)
2020-09-14T07:18:50.069Z|219640|jsonrpc|WARN|Dropped 4 log messages in last 12 seconds (most recently, 4 seconds ago) due to excessive rate
2020-09-14T07:18:50.069Z|219641|jsonrpc|WARN|tcp:DC01-host01:51190: error parsing stream: line 0, column 0, byte 0: invalid character U+0016
2020-09-14T07:18:50.069Z|219642|jsonrpc|WARN|Dropped 4 log messages in last 12 seconds (most recently, 4 seconds ago) due to excessive rate
2020-09-14T07:18:50.069Z|219643|jsonrpc|WARN|tcp:DC01-host01:51190: received SSL data on JSON-RPC channel
2020-09-14T07:18:50.070Z|219644|reconnect|WARN|tcp:DC01-host01:51190: connection dropped (Protocol error)
2020-09-14T07:18:51.147Z|219645|reconnect|WARN|tcp:DC01-host02:37046: connection dropped (Protocol error)
2020-09-14T07:18:54.209Z|219646|reconnect|WARN|tcp:DC02-host01:33150: connection dropped (Protocol error)
2020-09-14T07:18:58.192Z|219647|reconnect|WARN|tcp:DC01-host01:51192: connection dropped (Protocol error)
2020-09-14T07:18:59.262Z|219648|jsonrpc|WARN|Dropped 3 log messages in last 8 seconds (most recently, 1 seconds ago) due to excessive rate
2020-09-14T07:18:59.262Z|219649|jsonrpc|WARN|tcp:DC01-host02:37048: error parsing stream: line 0, column 0, byte 0: invalid character U+0016
2020-09-14T07:18:59.263Z|219650|jsonrpc|WARN|Dropped 3 log messages in last 8 seconds (most recently, 1 seconds ago) due to excessive rate
2020-09-14T07:18:59.263Z|219651|jsonrpc|WARN|tcp:DC01-host02:37048: received SSL data on JSON-RPC channel
2020-09-14T07:18:59.263Z|219652|reconnect|WARN|tcp:DC01-host02:37048: connection dropped (Protocol error)
2020-09-14T07:19:02.220Z|219653|reconnect|WARN|tcp:DC02-host01:33152: connection dropped (Protocol error)
2020-09-14T07:19:06.316Z|219654|reconnect|WARN|tcp:DC01-host01:51194: connection dropped (Protocol error)
2020-09-14T07:19:07.386Z|219655|reconnect|WARN|tcp:DC01-host02:37050: connection dropped (Protocol error)
2020-09-14T07:19:10.232Z|219656|reconnect|WARN|tcp:DC02-host01:33154: connection dropped (Protocol error)
2020-09-14T07:19:14.439Z|219657|jsonrpc|WARN|Dropped 4 log messages in last 12 seconds (most recently, 4 seconds ago) due to excessive rate
2020-09-14T07:19:14.439Z|219658|jsonrpc|WARN|tcp:DC01-host01:51196: error parsing stream: line 0, column 0, byte 0: invalid character U+0016
2020-09-14T07:19:14.439Z|219659|jsonrpc|WARN|Dropped 4 log messages in last 12 seconds (most recently, 4 seconds ago) due to excessive rate
2020-09-14T07:19:14.439Z|219660|jsonrpc|WARN|tcp:DC01-host01:51196: received SSL data on JSON-RPC channel
2020-09-14T07:19:14.440Z|219661|reconnect|WARN|tcp:DC01-host01:51196: connection dropped (Protocol error)
2020-09-14T07:19:15.505Z|219662|reconnect|WARN|tcp:DC01-host02:37052: connection dropped (Protocol error)

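The "received SSL data on JSON-RPC channel" warnings above indicate that clients are opening SSL connections to a listener that expects plain JSON-RPC. The following is a rough sketch for listing which peers do this, assuming the log format shown above; ssl_peers is a hypothetical helper name. Because ovsdb-server rate-limits these warnings, a peer missing from the result is not proof that it connects correctly.

```shell
#!/bin/sh
# ssl_peers is a hypothetical helper. It reads ovsdb-server log lines on
# stdin and prints each distinct peer host that sent SSL data to a
# plaintext (tcp:) JSON-RPC listener, one per line, sorted.
ssl_peers() {
    grep 'received SSL data on JSON-RPC channel' |
        sed -n 's/^.*|tcp:\(.*\):[0-9][0-9]*: received SSL data.*$/\1/p' |
        sort -u
}

# Example usage (log path as mentioned in the thread):
#   ssl_peers < /var/log/openvswitch/ovsdb-server-sb.log
```
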
It still looks like the ovn-controller on the host has problems communicating with ovn-southbound.

Are there any hints in /var/log/openvswitch/*.log, especially in /var/log/openvswitch/ovsdb-server-sb.log ?

Can you please check the output of

ovn-nbctl get-ssl
ovn-nbctl get-connection
ovn-sbctl get-ssl
ovn-sbctl get-connection
ls -l /etc/pki/ovirt-engine/keys/ovn-*

it should be similar to

[root@ovirt-43 ~]# ovn-nbctl get-ssl
Private key: /etc/pki/ovirt-engine/keys/ovn-ndb.key.nopass
Certificate: /etc/pki/ovirt-engine/certs/ovn-ndb.cer
CA Certificate: /etc/pki/ovirt-engine/ca.pem
Bootstrap: false
[root@ovirt-43 ~]# ovn-nbctl get-connection
pssl:6641:[::]
[root@ovirt-43 ~]# ovn-sbctl get-ssl
Private key: /etc/pki/ovirt-engine/keys/ovn-sdb.key.nopass
Certificate: /etc/pki/ovirt-engine/certs/ovn-sdb.cer
CA Certificate: /etc/pki/ovirt-engine/ca.pem
Bootstrap: false
[root@ovirt-43 ~]# ovn-sbctl get-connection
read-write role="" pssl:6642:[::]
[root@ovirt-43 ~]# ls -l /etc/pki/ovirt-engine/keys/ovn-*
-rw-r-----. 1 root hugetlbfs 1828 Oct 14 2019 /etc/pki/ovirt-engine/keys/ovn-ndb.key.nopass
-rw-------. 1 root root 2709 Oct 14 2019 /etc/pki/ovirt-engine/keys/ovn-ndb.p12
-rw-r-----. 1 root hugetlbfs 1828 Oct 14 2019 /etc/pki/ovirt-engine/keys/ovn-sdb.key.nopass
-rw-------. 1 root root 2709 Oct 14 2019 /etc/pki/ovirt-engine/keys/ovn-sdb.p12

On Fri, Sep 11, 2020 at 1:10 PM Konstantinos Betsis <k.betsis@gmail.com> wrote:

I restarted the ovn-controller; this is the output of ovn-controller.log:

2020-09-11T10:54:07.566Z|00001|vlog|INFO|opened log file /var/log/openvswitch/ovn-controller.log
2020-09-11T10:54:07.568Z|00002|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connecting...
2020-09-11T10:54:07.568Z|00003|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connected
2020-09-11T10:54:07.570Z|00004|main|INFO|OVS IDL reconnected, force recompute.
2020-09-11T10:54:07.571Z|00005|reconnect|INFO|ssl:OVIRT_ENGINE_IP:6642: connecting...
2020-09-11T10:54:07.571Z|00006|main|INFO|OVNSB IDL reconnected, force recompute.
2020-09-11T10:54:07.685Z|00007|stream_ssl|WARN|SSL_connect: unexpected SSL connection close
2020-09-11T10:54:07.685Z|00008|reconnect|INFO|ssl:OVIRT_ENGINE_IP:6642: connection attempt failed (Protocol error)
2020-09-11T10:54:08.685Z|00009|reconnect|INFO|ssl:OVIRT_ENGINE_IP:6642: connecting...
2020-09-11T10:54:08.800Z|00010|stream_ssl|WARN|SSL_connect: unexpected SSL connection close
2020-09-11T10:54:08.800Z|00011|reconnect|INFO|ssl:OVIRT_ENGINE_IP:6642: connection attempt failed (Protocol error)
2020-09-11T10:54:08.800Z|00012|reconnect|INFO|ssl:OVIRT_ENGINE_IP:6642: waiting 2 seconds before reconnect
2020-09-11T10:54:10.802Z|00013|reconnect|INFO|ssl:OVIRT_ENGINE_IP:6642: connecting...
2020-09-11T10:54:10.917Z|00014|stream_ssl|WARN|SSL_connect: unexpected SSL connection close
2020-09-11T10:54:10.917Z|00015|reconnect|INFO|ssl:OVIRT_ENGINE_IP:6642: connection attempt failed (Protocol error)
2020-09-11T10:54:10.917Z|00016|reconnect|INFO|ssl:OVIRT_ENGINE_IP:6642: waiting 4 seconds before reconnect
2020-09-11T10:54:14.921Z|00017|reconnect|INFO|ssl:OVIRT_ENGINE_IP:6642: connecting...
2020-09-11T10:54:15.036Z|00018|stream_ssl|WARN|SSL_connect: unexpected SSL connection close
2020-09-11T10:54:15.036Z|00019|reconnect|INFO|ssl:OVIRT_ENGINE_IP:6642: connection attempt failed (Protocol error)
2020-09-11T10:54:15.036Z|00020|reconnect|INFO|ssl:OVIRT_ENGINE_IP:6642: continuing to reconnect in the background but suppressing further logging

I have also run

vdsm-tool ovn-config OVIRT_ENGINE_IP OVIRTMGMT_NETWORK_DC

This is how the OVIRT_ENGINE_IP is provided to the ovn-controller; I can redo it if you want.

After the restart of the ovn-controller the oVirt Engine still shows only two geneve connections, one with DC01-host02 and one with DC02-host01.

Chassis "c4b23834-aec7-4bf8-8be7-aa94a50a6144"
    hostname: "dc02-host01"
    Encap geneve
        ip: "DC02-host01_IP"
        options: {csum="true"}
Chassis "be3abcc9-7358-4040-a37b-8d8a782f239c"
    hostname: "DC01-host02"
    Encap geneve
        ip: "DC01-host02"
        options: {csum="true"}

I've re-done the vdsm-tool command and nothing changed, again, with the same errors as after systemctl restart ovn-controller.

On Fri, Sep 11, 2020 at 1:49 PM Dominik Holler <dholler@redhat.com> wrote:

Please include ovirt-users list in your reply, to share the knowledge and experience with the community!

On Fri, Sep 11, 2020 at 12:12 PM Konstantinos Betsis <k.betsis@gmail.com> wrote:

Ok, below is the output per node and DC.

DC01

node01
[root@dc01-node01 ~]# ovs-vsctl --no-wait get open . external-ids:ovn-remote
"ssl:OVIRT_ENGINE_IP:6642"
[root@dc01-node01 ~]# ovs-vsctl --no-wait get open . external-ids:ovn-encap-type
geneve
[root@dc01-node01 ~]# ovs-vsctl --no-wait get open . external-ids:ovn-encap-ip
"OVIRTMGMT_IP_DC01-NODE01"

node02
[root@dc01-node02 ~]# ovs-vsctl --no-wait get open . external-ids:ovn-remote
"ssl:OVIRT_ENGINE_IP:6642"
[root@dc01-node02 ~]# ovs-vsctl --no-wait get open . external-ids:ovn-encap-type
geneve
[root@dc01-node02 ~]# ovs-vsctl --no-wait get open . external-ids:ovn-encap-ip
"OVIRTMGMT_IP_DC01-NODE02"

DC02

node01
[root@dc02-node01 ~]# ovs-vsctl --no-wait get open . external-ids:ovn-remote
"ssl:OVIRT_ENGINE_IP:6642"
[root@dc02-node01 ~]# ovs-vsctl --no-wait get open . external-ids:ovn-encap-type
geneve
[root@dc02-node01 ~]# ovs-vsctl --no-wait get open . external-ids:ovn-encap-ip
"OVIRTMGMT_IP_DC02-NODE01"

Looks good.

DC01 node01 and node02 share the same VM networks, and VMs deployed on top of them cannot talk to VMs on the other hypervisor.

Maybe there is a hint in ovn-controller.log on dc01-node02?
Maybe restarting ovn-controller creates more helpful log messages?

You can also try to restart the ovn configuration on all hosts by executing

vdsm-tool ovn-config OVIRT_ENGINE_IP LOCAL_OVIRTMGMT_IP

on each host, this would trigger internally.

So I would expect to see the same output for node01 to have a geneve tunnel to node02 and vice versa.

Me too.

On Fri, Sep 11, 2020 at 12:14 PM Dominik Holler <dholler@redhat.com> wrote:

On Fri, Sep 11, 2020 at 10:53 AM Konstantinos Betsis <k.betsis@gmail.com> wrote:

Hi Dominik

OVN is selected as the default network provider on the clusters and the hosts.

Sounds good.
This configuration is already required when the host is added to oVirt Engine, because OVN is configured during this step.

The "ovn-sbctl show" works on the ovirt engine and shows only two hosts, 1 per DC.

Chassis "c4b23834-aec7-4bf8-8be7-aa94a50a6144"
    hostname: "dc01-node02"
    Encap geneve
        ip: "X.X.X.X"
        options: {csum="true"}
Chassis "be3abcc9-7358-4040-a37b-8d8a782f239c"
    hostname: "dc02-node1"
    Encap geneve
        ip: "A.A.A.A"
        options: {csum="true"}

The new node is not listed (dc01-node1).

When executed on the nodes, the same command (ovn-sbctl show) times out on all nodes.

The /var/log/openvswitch/ovn-controller.log lists in all logs:

2020-09-11T08:46:55.197Z|07361|stream_ssl|WARN|SSL_connect: unexpected SSL connection close

Can you please compare the output of

ovs-vsctl --no-wait get open . external-ids:ovn-remote
ovs-vsctl --no-wait get open . external-ids:ovn-encap-type
ovs-vsctl --no-wait get open . external-ids:ovn-encap-ip

of the working hosts, e.g. dc01-node02, and the failing host dc01-node1?
This should point us to the relevant difference in the configuration.

Please include ovirt-users list in your reply, to share the knowledge and experience with the community.

Thank you
Best regards
Konstantinos Betsis

On Fri, Sep 11, 2020 at 11:01 AM Dominik Holler <dholler@redhat.com> wrote:

On Thu, Sep 10, 2020 at 6:26 PM Konstantinos B <k.betsis@gmail.com> wrote:

Hi all
We have a small installation based on OVIRT 4.3.
1 Cluster is based on Centos 7 and the other on OVIRT NG Node image.
The environment was stable till an upgrade took place a couple of months ago.
As such we had to re-install one of the Centos 7 nodes and start from scratch.

To trigger the automatic configuration of the host, it is required to configure ovirt-provider-ovn as the default network provider for the cluster before adding the host to oVirt.

Even though the installation completed successfully and VMs are created, the following are not working as expected:
1. ovn geneve tunnels are not established with the other Centos 7 node in the cluster.
2. Centos 7 node is configured by ovirt engine however no geneve tunnel is established when "ovn-sbctl show" is issued on the engine.

Does "ovn-sbctl show" list the hosts?

3. no flows are shown on the engine on port 6642 for the ovs db.
Does anyone have any experience on how to troubleshoot OVN on ovirt?
/var/log/openvswitch/ovn-controller.log on the host should contain a helpful hint.

Thank you
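
The comparison of the three external-ids keys between a working and a failing host, as requested earlier in the thread, can be sketched as a small script. dump_ovn_ids and the OVS_VSCTL override variable are hypothetical names introduced here for illustration; only the ovs-vsctl invocation itself is taken from the thread.

```shell
#!/bin/sh
# dump_ovn_ids is a hypothetical helper. It prints the three external-ids
# keys discussed in this thread as key=value lines, so that the output of
# a working host and a failing host can be compared with diff.
# OVS_VSCTL is an assumed override hook (useful for testing); by default
# the real ovs-vsctl binary is used.
dump_ovn_ids() {
    for key in ovn-remote ovn-encap-type ovn-encap-ip; do
        value=$("${OVS_VSCTL:-ovs-vsctl}" --no-wait get open . "external-ids:$key")
        printf '%s=%s\n' "$key" "$value"
    done
}

# Example usage on each host:
#   dump_ovn_ids > "/tmp/ovn-ids.$(hostname).txt"
# then copy the files to one machine and diff them.
```

ovn-encap-ip is expected to differ between hosts, since it is each host's own tunnel address; ovn-remote and ovn-encap-type would normally match.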
_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-leave@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/LBVGLQJBWJF3EKFITPR72LBPA5A43WWW/