Hosts not coming back into oVirt

Hi all,

We recently deployed oVirt version 4.3.1 in a self-hosted engine environment. We used the steps via cockpit to install the engine, and were able to add the rest of the oVirt nodes without any specific problems.

We tested the HA of the hosted engine without a problem, and at one point turned off the machine that was hosting the engine, to mimic a failure and see how it goes. The VM was able to move over successfully, but some of the oVirt hosts started to go into Unassigned. From a total of 6 oVirt hosts, I have 4 of them in this state.

Clicking on a host, I see the following message in the events. I can get to the hosts via the engine and ping the machines, so I am not sure why they are no longer working:

VDSM <snip> command Get Host Capabilities failed: Message timeout which can be caused by communication issues

Mind you, I have been trying to resolve this issue since Monday, and have tried various things, like rebooting and re-installing the oVirt hosts, without much luck.

So any assistance on this would be appreciated; maybe I've missed something really simple and am overlooking it.

-- 
regards,
Arif Ali

On Thu, Mar 21, 2019 at 3:47 PM Arif Ali <mail@arif-ali.co.uk> wrote:
Can you please check that VDSM is correctly running on those nodes? Are you able to correctly reach those nodes from the engine VM?
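For anyone following the thread, a minimal sketch of those checks (the host name below is a placeholder), run on an affected host and from the engine VM respectively:

  # on the affected host: is VDSM (and its helper service) actually running?
  systemctl status vdsmd supervdsmd
  # query VDSM directly; this should return JSON rather than time out
  vdsm-client Host getCapabilities

  # from the engine VM: basic reachability of the host and of the VDSM port (54321/tcp)
  ping -c 3 <host-fqdn>
  nc -zv <host-fqdn> 54321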

Hi,

I have exactly the same issue after an upgrade from 4.2.8 to 4.3.2. I can reach the host from the SHE, but VDSM is constantly failing to start on the host after the upgrade.

On Fri, Mar 22, 2019 at 10:06 AM Artem Tambovskiy <artem.tambovskiy@gmail.com> wrote:
Can you please attach the output of systemctl status vdsmd and your /var/log/vdsm/vdsm.log?
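For reference, one way to capture exactly that on the affected host (the time window is just an example):

  systemctl status vdsmd -l --no-pager
  journalctl -u vdsmd --since "1 hour ago" --no-pager
  tail -n 500 /var/log/vdsm/vdsm.log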

Simone,

Here it is:

● vdsmd.service - Virtual Desktop Server Manager
   Loaded: loaded (/usr/lib/systemd/system/vdsmd.service; enabled; vendor preset: enabled)
   Active: activating (start-pre) since Fri 2019-03-22 12:46:48 MSK; 3s ago
  Process: 56712 ExecStopPost=/usr/libexec/vdsm/vdsmd_init_common.sh --post-stop (code=exited, status=0/SUCCESS)
  Control: 116050 (vdsmd_init_comm)
    Tasks: 3
   CGroup: /system.slice/vdsmd.service
           └─control
             ├─116050 /bin/sh /usr/libexec/vdsm/vdsmd_init_common.sh --pre-start
             └─116066 /usr/bin/python2 /usr/libexec/vdsm/wait_for_ipv4s

Mar 22 12:46:48 ovirt2.domain.org systemd[1]: Starting Virtual Desktop Server Manager...
Mar 22 12:46:48 ovirt2.domain.org vdsmd_init_common.sh[116050]: vdsm: Running mkdirs
Mar 22 12:46:49 ovirt2.domain.org vdsmd_init_common.sh[116050]: vdsm: Running configure_coredump
Mar 22 12:46:49 ovirt2.domain.org vdsmd_init_common.sh[116050]: vdsm: Running configure_vdsm_logs
Mar 22 12:46:49 ovirt2.domain.org vdsmd_init_common.sh[116050]: vdsm: Running wait_for_network

vdsm.log attached.
-- Regards, Artem
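The status output above shows vdsmd stuck in its start-pre step, with /usr/libexec/vdsm/wait_for_ipv4s still running, i.e. the pre-start script is waiting for IPv4 addresses to appear on the networks VDSM manages. A quick check of whether the management network actually has an address on that host (assuming the default network name, ovirtmgmt) would be something like:

  # does the management bridge exist, and does it hold an IPv4 address?
  ip -4 addr show ovirtmgmt
  # what the current boot has logged for the repeated vdsmd start attempts
  journalctl -u vdsmd -b --no-pager | tail -n 50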

On 21-03-2019 17:47, Simone Tiraboschi wrote:
Can you please check that VDSM is correctly running on those nodes? Are you able to correctly reach those nodes from the engine VM?
So, I have gone back and re-installed the whole solution, now with 4.3.2, and I have the same issue again. Checking the vdsm logs, I see the error below. The host is either Unassigned or Connecting, and I don't have the option to Activate it or put it into Maintenance mode. I have tried rebooting the node with no luck.

Mar 22 10:53:27 scvirt02 vdsm[32481]: WARN Worker blocked: <Worker name=periodic/2 running <Task <Operation action=<vdsm.virt.sampling.HostMonitor object at 0x7efed4180610> at 0x7efed4180650> timeout=15, duration=30.00 at 0x7efed4180810> task#=2 at 0x7efef41987d0>, traceback:
File: "/usr/lib64/python2.7/threading.py", line 785, in __bootstrap
  self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 812, in __bootstrap_inner
  self.run()
File: "/usr/lib64/python2.7/threading.py", line 765, in run
  self.__target(*self.__args, **self.__kwargs)
File: "/usr/lib/python2.7/site-packages/vdsm/common/concurrent.py", line 195, in run
  ret = func(*args, **kwargs)
File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 301, in _run
  self._execute_task()
File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 315, in _execute_task
  task()
File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 391, in __call__
  self._callable()
File: "/usr/lib/python2.7/site-packages/vdsm/virt/periodic.py", line 186, in __call__
  self._func()
File: "/usr/lib/python2.7/site-packages/vdsm/virt/sampling.py", line 481, in __call__
  stats = hostapi.get_stats(self._cif, self._samples.stats())
File: "/usr/lib/python2.7/site-packages/vdsm/host/api.py", line 79, in get_stats
  ret['haStats'] = _getHaInfo()
File: "/usr/lib/python2.7/site-packages/vdsm/host/api.py", line 177, in _getHaInfo
  stats = instance.get_all_stats()
File: "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line 94, in get_all_stats
  stats = broker.get_stats_from_storage()
File: "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 143, in get_stats_from_storage
  result = self._proxy.get_stats()
File: "/usr/lib64/python2.7/xmlrpclib.py", line 1233, in __call__
  return self.__send(self.__name, args)
File: "/usr/lib64/python2.7/xmlrpclib.py", line 1591, in __request
  verbose=self.__verbose
File: "/usr/lib64/python2.7/xmlrpclib.py", line 1273, in request
  return self.single_request(host, handler, request_body, verbose)
File: "/usr/lib64/python2.7/xmlrpclib.py", line 1303, in single_request
  response = h.getresponse(buffering=True)
File: "/usr/lib64/python2.7/httplib.py", line 1113, in getresponse
  response.begin()
File: "/usr/lib64/python2.7/httplib.py", line 444, in begin
  version, status, reason = self._read_status()
File: "/usr/lib64/python2.7/httplib.py", line 400, in _read_status
  line = self.fp.readline(_MAXLINE + 1)
File: "/usr/lib64/python2.7/socket.py", line 476, in readline
  data = self._sock.recv(self._rbufsize)

On the engine host, I continuously get the following messages too:

Mar 22 11:02:32 <snip> ovsdb-server[4724]: ovs|01900|jsonrpc|WARN|Dropped 3 log messages in last 14 seconds (most recently, 7 seconds ago) due to excessive rate
Mar 22 11:02:32 <snip> ovsdb-server[4724]: ovs|01901|jsonrpc|WARN|ssl:[::ffff:192.168.203.205]:55658: send error: Protocol error
Mar 22 11:02:32 <snip> ovsdb-server[4724]: ovs|01902|reconnect|WARN|ssl:[::ffff:192.168.203.205]:55658: connection dropped (Protocol error)
Mar 22 11:02:34 <snip> ovsdb-server[4724]: ovs|01903|stream_ssl|WARN|SSL_accept: unexpected SSL connection close
Mar 22 11:02:34 <snip> ovsdb-server[4724]: ovs|01904|reconnect|WARN|ssl:[::ffff:192.168.203.202]:49504: connection dropped (Protocol error)
Mar 22 11:02:40 <snip> ovsdb-server[4724]: ovs|01905|stream_ssl|WARN|SSL_accept: unexpected SSL connection close
Mar 22 11:02:40 <snip> ovsdb-server[4724]: ovs|01906|jsonrpc|WARN|Dropped 1 log messages in last 5 seconds (most recently, 5 seconds ago) due to excessive rate
Mar 22 11:02:40 <snip> ovsdb-server[4724]: ovs|01907|jsonrpc|WARN|ssl:[::ffff:192.168.203.203]:34114: send error: Protocol error
Mar 22 11:02:40 <snip> ovsdb-server[4724]: ovs|01908|reconnect|WARN|ssl:[::ffff:192.168.203.203]:34114: connection dropped (Protocol error)
Mar 22 11:02:41 <snip> ovsdb-server[4724]: ovs|01909|reconnect|WARN|ssl:[::ffff:192.168.203.204]:52034: connection dropped (Protocol error)
Mar 22 11:02:48 <snip> ovsdb-server[4724]: ovs|01910|stream_ssl|WARN|Dropped 1 log messages in last 7 seconds (most recently, 7 seconds ago) due to excessive rate
Mar 22 11:02:48 <snip> ovsdb-server[4724]: ovs|01911|stream_ssl|WARN|SSL_accept: unexpected SSL connection close
Mar 22 11:02:48 <snip> ovsdb-server[4724]: ovs|01912|reconnect|WARN|ssl:[::ffff:192.168.203.205]:55660: connection dropped (Protocol error)
Mar 22 11:02:50 <snip> ovsdb-server[4724]: ovs|01913|stream_ssl|WARN|SSL_accept: unexpected SSL connection close
Mar 22 11:02:50 <snip> ovsdb-server[4724]: ovs|01914|jsonrpc|WARN|Dropped 2 log messages in last 9 seconds (most recently, 2 seconds ago) due to excessive rate
Mar 22 11:02:50 <snip> ovsdb-server[4724]: ovs|01915|jsonrpc|WARN|ssl:[::ffff:192.168.203.202]:49506: send error: Protocol error
Mar 22 11:02:50 <snip> ovsdb-server[4724]: ovs|01916|reconnect|WARN|ssl:[::ffff:192.168.203.202]:49506: connection dropped (Protocol error)
Mar 22 11:02:56 <snip> ovsdb-server[4724]: ovs|01917|stream_ssl|WARN|SSL_accept: unexpected SSL connection close
Mar 22 11:02:56 <snip> ovsdb-server[4724]: ovs|01918|reconnect|WARN|ssl:[::ffff:192.168.203.203]:34116: connection dropped (Protocol error)
Mar 22 11:02:57 <snip> ovsdb-server[4724]: ovs|01919|reconnect|WARN|ssl:[::ffff:192.168.203.204]:52036: connection dropped (Protocol error)
Mar 22 11:03:04 <snip> ovsdb-server[4724]: ovs|01920|stream_ssl|WARN|Dropped 1 log messages in last 7 seconds (most recently, 7 seconds ago) due to excessive rate
Mar 22 11:03:04 <snip> ovsdb-server[4724]: ovs|01921|stream_ssl|WARN|SSL_accept: unexpected SSL connection close
Mar 22 11:03:04 <snip> ovsdb-server[4724]: ovs|01922|jsonrpc|WARN|Dropped 2 log messages in last 9 seconds (most recently, 7 seconds ago) due to excessive rate
Mar 22 11:03:04 <snip> ovsdb-server[4724]: ovs|01923|jsonrpc|WARN|ssl:[::ffff:192.168.203.205]:55662: send error: Protocol error
Mar 22 11:03:04 <snip> ovsdb-server[4724]: ovs|01924|reconnect|WARN|ssl:[::ffff:192.168.203.205]:55662: connection dropped (Protocol error)
Mar 22 11:03:06 <snip> ovsdb-server[4724]: ovs|01925|reconnect|WARN|ssl:[::ffff:192.168.203.202]:49508: connection dropped (Protocol error)
Mar 22 11:03:12 <snip> ovsdb-server[4724]: ovs|01926|stream_ssl|WARN|Dropped 1 log messages in last 5 seconds (most recently, 5 seconds ago) due to excessive rate
Mar 22 11:03:12 <snip> ovsdb-server[4724]: ovs|01927|stream_ssl|WARN|SSL_accept: unexpected SSL connection close
Mar 22 11:03:12 <snip> ovsdb-server[4724]: ovs|01928|reconnect|WARN|ssl:[::ffff:192.168.203.203]:34118: connection dropped (Protocol error)
Mar 22 11:03:13 <snip> ovsdb-server[4724]: ovs|01929|reconnect|WARN|ssl:[::ffff:192.168.203.204]:52038: connection dropped (Protocol error)

-- 
regards, Arif Ali
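The blocked worker in the traceback above is the periodic HostMonitor task, waiting on an XML-RPC call into the hosted-engine HA broker (get_stats_from_storage), so the HA services on the host are worth a look as well; a possible set of checks, assuming a standard hosted-engine host:

  # are the hosted-engine HA agent and broker healthy?
  systemctl status ovirt-ha-agent ovirt-ha-broker
  # what the broker itself is logging
  tail -n 100 /var/log/ovirt-hosted-engine-ha/broker.log
  # overall hosted-engine view from this host
  hosted-engine --vm-status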

On 22-03-2019 12:04, Arif Ali wrote:
I found my issue and managed to resolve it; there was nothing wrong with oVirt. The ovirtmgmt network is 10G, and by default I set the MTU to 9000 as I normally would for this type of network, but I found out later that the network team at this site was not supporting 9000. Back to 1500, and everything worked without a problem.

Thanks to all for everyone's assistance.

-- 
regards,
Arif Ali
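For reference, an MTU mismatch like this can be confirmed before touching the configuration by sending large, non-fragmentable pings between two hosts on the ovirtmgmt network; with a genuine end-to-end 9000-byte MTU the first command should succeed (8972 = 9000 minus 28 bytes of IP/ICMP headers; the host name is a placeholder):

  # large packet with the do-not-fragment bit set; fails if any hop only supports 1500
  ping -M do -s 8972 -c 3 <other-host>
  # the MTU currently configured on the management bridge of this host
  ip link show ovirtmgmt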
participants (3)
- Arif Ali
- Artem Tambovskiy
- Simone Tiraboschi