It still looks like the ovn-controller on the host has problems communicating
with ovn-southbound.
Are there any hints in /var/log/openvswitch/*.log,
especially in /var/log/openvswitch/ovsdb-server-sb.log?
Can you please check the output of
ovn-nbctl get-ssl
ovn-nbctl get-connection
ovn-sbctl get-ssl
ovn-sbctl get-connection
ls -l /etc/pki/ovirt-engine/keys/ovn-*
The output should be similar to:
[root@ovirt-43 ~]# ovn-nbctl get-ssl
Private key: /etc/pki/ovirt-engine/keys/ovn-ndb.key.nopass
Certificate: /etc/pki/ovirt-engine/certs/ovn-ndb.cer
CA Certificate: /etc/pki/ovirt-engine/ca.pem
Bootstrap: false
[root@ovirt-43 ~]# ovn-nbctl get-connection
pssl:6641:[::]
[root@ovirt-43 ~]# ovn-sbctl get-ssl
Private key: /etc/pki/ovirt-engine/keys/ovn-sdb.key.nopass
Certificate: /etc/pki/ovirt-engine/certs/ovn-sdb.cer
CA Certificate: /etc/pki/ovirt-engine/ca.pem
Bootstrap: false
[root@ovirt-43 ~]# ovn-sbctl get-connection
read-write role="" pssl:6642:[::]
[root@ovirt-43 ~]# ls -l /etc/pki/ovirt-engine/keys/ovn-*
-rw-r-----. 1 root hugetlbfs 1828 Oct 14 2019 /etc/pki/ovirt-engine/keys/ovn-ndb.key.nopass
-rw-------. 1 root root 2709 Oct 14 2019 /etc/pki/ovirt-engine/keys/ovn-ndb.p12
-rw-r-----. 1 root hugetlbfs 1828 Oct 14 2019 /etc/pki/ovirt-engine/keys/ovn-sdb.key.nopass
-rw-------. 1 root root 2709 Oct 14 2019 /etc/pki/ovirt-engine/keys/ovn-sdb.p12
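
To make the ownership and mode of the key files easier to compare with the listing above, something like the following could be used on the engine. This is only a sketch: show_key_perms is a made-up helper name (not part of OVN or oVirt), and it assumes GNU stat is available.

```shell
#!/bin/sh
# show_key_perms is a hypothetical helper, not an OVN or oVirt tool.
# For each path given, print: octal mode, owner, group, file name.
# Paths that do not exist are reported as "missing <path>".
show_key_perms() {
    for f in "$@"; do
        if [ -e "$f" ]; then
            # GNU stat format: %a = octal mode, %U = owner, %G = group, %n = name
            stat -c '%a %U %G %n' "$f"
        else
            echo "missing $f"
        fi
    done
}
```

It could be called as show_key_perms /etc/pki/ovirt-engine/keys/ovn-* and should print one line per file, e.g. a mode of 640 for the *.key.nopass files if the setup matches the listing above.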
On Fri, Sep 11, 2020 at 1:10 PM Konstantinos Betsis <k.betsis(a)gmail.com>
wrote:
I restarted the ovn-controller; this is the output of
ovn-controller.log:
2020-09-11T10:54:07.566Z|00001|vlog|INFO|opened log file /var/log/openvswitch/ovn-controller.log
2020-09-11T10:54:07.568Z|00002|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connecting...
2020-09-11T10:54:07.568Z|00003|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connected
2020-09-11T10:54:07.570Z|00004|main|INFO|OVS IDL reconnected, force recompute.
2020-09-11T10:54:07.571Z|00005|reconnect|INFO|ssl:OVIRT_ENGINE_IP:6642: connecting...
2020-09-11T10:54:07.571Z|00006|main|INFO|OVNSB IDL reconnected, force recompute.
2020-09-11T10:54:07.685Z|00007|stream_ssl|WARN|SSL_connect: unexpected SSL connection close
2020-09-11T10:54:07.685Z|00008|reconnect|INFO|ssl:OVIRT_ENGINE_IP:6642: connection attempt failed (Protocol error)
2020-09-11T10:54:08.685Z|00009|reconnect|INFO|ssl:OVIRT_ENGINE_IP:6642: connecting...
2020-09-11T10:54:08.800Z|00010|stream_ssl|WARN|SSL_connect: unexpected SSL connection close
2020-09-11T10:54:08.800Z|00011|reconnect|INFO|ssl:OVIRT_ENGINE_IP:6642: connection attempt failed (Protocol error)
2020-09-11T10:54:08.800Z|00012|reconnect|INFO|ssl:OVIRT_ENGINE_IP:6642: waiting 2 seconds before reconnect
2020-09-11T10:54:10.802Z|00013|reconnect|INFO|ssl:OVIRT_ENGINE_IP:6642: connecting...
2020-09-11T10:54:10.917Z|00014|stream_ssl|WARN|SSL_connect: unexpected SSL connection close
2020-09-11T10:54:10.917Z|00015|reconnect|INFO|ssl:OVIRT_ENGINE_IP:6642: connection attempt failed (Protocol error)
2020-09-11T10:54:10.917Z|00016|reconnect|INFO|ssl:OVIRT_ENGINE_IP:6642: waiting 4 seconds before reconnect
2020-09-11T10:54:14.921Z|00017|reconnect|INFO|ssl:OVIRT_ENGINE_IP:6642: connecting...
2020-09-11T10:54:15.036Z|00018|stream_ssl|WARN|SSL_connect: unexpected SSL connection close
2020-09-11T10:54:15.036Z|00019|reconnect|INFO|ssl:OVIRT_ENGINE_IP:6642: connection attempt failed (Protocol error)
2020-09-11T10:54:15.036Z|00020|reconnect|INFO|ssl:OVIRT_ENGINE_IP:6642: continuing to reconnect in the background but suppressing further logging
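
For longer logs, the warnings can be pulled out of the noise with a small filter. The following is a rough sketch, assuming the log file itself has one entry per line in the layout shown above (timestamp|sequence|module|level|message); the function names are made up for illustration.

```shell
#!/bin/sh
# ovs_log_problems and ovs_log_summary are hypothetical helper names,
# not OVS or OVN tools. Assumed log layout, one entry per line:
#   timestamp|sequence|module|level|message

# Print "module: message" for every WARN or ERR entry in the given file.
ovs_log_problems() {
    awk -F'|' '$4 == "WARN" || $4 == "ERR" { print $3 ": " $5 }' "$1"
}

# Count how often each distinct problem occurs, most frequent first.
ovs_log_summary() {
    ovs_log_problems "$1" | sort | uniq -c | sort -rn
}
```

For example, ovs_log_summary /var/log/openvswitch/ovn-controller.log would group the repeated SSL_connect warnings into a single counted line.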
I have also run vdsm-tool ovn-config OVIRT_ENGINE_IP
OVIRTMGMT_NETWORK_DC
This is how the OVIRT_ENGINE_IP is provided to the ovn-controller; I can
redo it if you want.
After the restart of the ovn-controller, the OVIRT ENGINE still shows only
two geneve connections, one with DC01-host02 and one with DC02-host01.
Chassis "c4b23834-aec7-4bf8-8be7-aa94a50a6144"
    hostname: "dc02-host01"
    Encap geneve
        ip: "DC02-host01_IP"
        options: {csum="true"}
Chassis "be3abcc9-7358-4040-a37b-8d8a782f239c"
    hostname: "DC01-host02"
    Encap geneve
        ip: "DC01-host02"
        options: {csum="true"}
I've re-run the vdsm-tool command and nothing changed; I again get the same
errors as after systemctl restart ovn-controller.
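
One more check that could be done from the host is a bare TLS handshake against the southbound port, independent of ovn-controller. The sketch below is only an illustration: the helper names are made up, it assumes openssl is installed on the host, it handles a single ovn-remote entry only, and it presents no client certificate, so the server may still drop the connection for that reason.

```shell
#!/bin/sh
# remote_to_hostport and probe_sb are hypothetical helper names.

# Turn an ovn-remote value such as "ssl:192.0.2.10:6642" (quotes included,
# as printed by ovs-vsctl) into 192.0.2.10:6642.
remote_to_hostport() {
    printf '%s\n' "$1" | sed -e 's/^"//' -e 's/"$//' -e 's/^ssl://'
}

# Read the configured ovn-remote from the local OVS database and attempt a
# TLS handshake with it. No client certificate is presented here; the point
# is only to see whether the server answers and which certificate it offers.
probe_sb() {
    remote=$(ovs-vsctl --no-wait get open . external-ids:ovn-remote) || return 1
    openssl s_client -connect "$(remote_to_hostport "$remote")" </dev/null
}
```

Running probe_sb on a working host and on the failing host and comparing the two outputs might show whether the handshake fails before or after the server certificate is sent.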
On Fri, Sep 11, 2020 at 1:49 PM Dominik Holler <dholler(a)redhat.com> wrote:
> Please include ovirt-users list in your reply, to share the knowledge and
> experience with the community!
>
> On Fri, Sep 11, 2020 at 12:12 PM Konstantinos Betsis <k.betsis(a)gmail.com>
> wrote:
>
>> Ok below the output per node and DC
>> DC01
>> node01
>>
>> [root@dc01-node01 ~]# ovs-vsctl --no-wait get open . external-ids:ovn-remote
>> "ssl:*OVIRT_ENGINE_IP*:6642"
>> [root@dc01-node01 ~]# ovs-vsctl --no-wait get open . external-ids:ovn-encap-type
>> geneve
>> [root@dc01-node01 ~]# ovs-vsctl --no-wait get open . external-ids:ovn-encap-ip
>> "*OVIRTMGMT_IP_DC01-NODE01*"
>>
>> node02
>>
>> [root@dc01-node02 ~]# ovs-vsctl --no-wait get open . external-ids:ovn-remote
>> "ssl:*OVIRT_ENGINE_IP*:6642"
>> [root@dc01-node02 ~]# ovs-vsctl --no-wait get open . external-ids:ovn-encap-type
>> geneve
>> [root@dc01-node02 ~]# ovs-vsctl --no-wait get open . external-ids:ovn-encap-ip
>> "*OVIRTMGMT_IP_DC01-NODE02*"
>>
>> DC02
>> node01
>>
>> [root@dc02-node01 ~]# ovs-vsctl --no-wait get open . external-ids:ovn-remote
>> "ssl:*OVIRT_ENGINE_IP*:6642"
>> [root@dc02-node01 ~]# ovs-vsctl --no-wait get open . external-ids:ovn-encap-type
>> geneve
>> [root@dc02-node01 ~]# ovs-vsctl --no-wait get open . external-ids:ovn-encap-ip
>> "*OVIRTMGMT_IP_DC02-NODE01*"
>>
>>
> Looks good.
>
>
>> DC01 node01 and node02 share the same VM networks and VMs deployed on
>> top of them cannot talk to VM on the other hypervisor.
>>
>
> Maybe there is a hint in ovn-controller.log on dc01-node02? Maybe
> restarting ovn-controller creates more helpful log messages?
>
> You can also try to redo the OVN configuration on all hosts by executing
> vdsm-tool ovn-config OVIRT_ENGINE_IP LOCAL_OVIRTMGMT_IP
> on each host; this would trigger
>
> https://github.com/oVirt/ovirt-provider-ovn/blob/master/driver/scripts/se...
> internally.
>
>
>> So I would expect to see the same output for node01, with a geneve
>> tunnel to node02 and vice versa.
>>
>>
> Me too.
>
>
>> On Fri, Sep 11, 2020 at 12:14 PM Dominik Holler <dholler(a)redhat.com>
>> wrote:
>>
>>>
>>>
>>> On Fri, Sep 11, 2020 at 10:53 AM Konstantinos Betsis <
>>> k.betsis(a)gmail.com> wrote:
>>>
>>>> Hi Dominik
>>>>
>>>> OVN is selected as the default network provider on the clusters and
>>>> the hosts.
>>>>
>>>>
>>> sounds good.
>>> This configuration is already required when the host is added to
>>> oVirt Engine, because OVN is configured during this step.
>>>
>>>
>>>> The "ovn-sbctl show" works on the ovirt engine and shows only two
>>>> hosts, 1 per DC.
>>>>
>>>> Chassis "c4b23834-aec7-4bf8-8be7-aa94a50a6144"
>>>>     hostname: "dc01-node02"
>>>>     Encap geneve
>>>>         ip: "X.X.X.X"
>>>>         options: {csum="true"}
>>>> Chassis "be3abcc9-7358-4040-a37b-8d8a782f239c"
>>>>     hostname: "dc02-node1"
>>>>     Encap geneve
>>>>         ip: "A.A.A.A"
>>>>         options: {csum="true"}
>>>>
>>>>
>>>> The new node is not listed (dc01-node1).
>>>>
>>>> When executed on the nodes, the same command (ovn-sbctl show) times out
>>>> on all nodes.
>>>>
>>>> On all nodes, /var/log/openvswitch/ovn-controller.log shows
>>>> the following:
>>>>
>>>> 2020-09-11T08:46:55.197Z|07361|stream_ssl|WARN|SSL_connect: unexpected
>>>> SSL connection close
>>>>
>>>>
>>>>
>>> Can you please compare the output of
>>>
>>> ovs-vsctl --no-wait get open . external-ids:ovn-remote
>>> ovs-vsctl --no-wait get open . external-ids:ovn-encap-type
>>> ovs-vsctl --no-wait get open . external-ids:ovn-encap-ip
>>>
>>> of the working hosts, e.g. dc01-node02, and the failing host dc01-node1?
>>> This should point us to the relevant difference in the configuration.
>>>
>>> Please include ovirt-users list in your reply, to share the knowledge
>>> and experience with the community.
>>>
>>>
>>>
>>>> Thank you
>>>> Best regards
>>>> Konstantinos Betsis
>>>>
>>>>
>>>> On Fri, Sep 11, 2020 at 11:01 AM Dominik Holler <dholler(a)redhat.com>
>>>> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Thu, Sep 10, 2020 at 6:26 PM Konstantinos B <k.betsis(a)gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi all
>>>>>>
>>>>>> We have a small installation based on OVIRT 4.3.
>>>>>> One cluster is based on Centos 7 and the other on the OVIRT NG Node image.
>>>>>>
>>>>>> The environment was stable until an upgrade took place a couple of
>>>>>> months ago.
>>>>>> As a result, we had to re-install one of the Centos 7 nodes and start
>>>>>> from scratch.
>>>>>>
>>>>>
>>>>> To trigger the automatic configuration of the host, it is required to
>>>>> configure ovirt-provider-ovn as the default network provider for the
>>>>> cluster before adding the host to oVirt.
>>>>>
>>>>>
>>>>>> Even though the installation completed successfully and VMs are
>>>>>> created, the following are not working as expected:
>>>>>> 1. ovn geneve tunnels are not established with the other Centos 7
>>>>>> node in the cluster.
>>>>>> 2. Centos 7 node is configured by ovirt engine however no geneve
>>>>>> tunnel is established when "ovn-sbctl show" is issued on the engine.
>>>>>>
>>>>>
>>>>> Does "ovn-sbctl show" list the hosts?
>>>>>
>>>>>
>>>>>> 3. no flows are shown on the engine on port 6642 for the ovs db.
>>>>>>
>>>>>> Does anyone have any experience on how to troubleshoot OVN on ovirt?
>>>>>>
>>>>>>
>>>>> /var/log/openvswitch/ovn-controller.log on the host should contain a
>>>>> helpful hint.
>>>>>
>>>>>
>>>>>
>>>>>> Thank you
>>>>>