[ovirt-users] vdsm issues between engine and host
Piotr Kliczewski
piotr.kliczewski at gmail.com
Tue Feb 21 08:59:02 UTC 2017
On Mon, Feb 20, 2017 at 9:47 PM, cmc <iucounu at gmail.com> wrote:
> Hi,
>
> Due to networking and DNS issues, our engine went offline (it is
> currently a physical machine; we will convert it to a VM in the
> future when time allows). When service was restored, I noticed that
> all the VMs were listed as being in an unknown state on one host. The
> VMs were fine, but the engine could not ascertain their status as the
> host itself was in an unknown state. vdsm was reporting errors and was
> not running on the engine (or at least was in status 'failed' in
> systemd). I tried starting vdsmd on the engine but it would not start.
> I decided to try to restart vdsmd on the host and that did allow the
> state of the VMs to be discovered, and the engine listed the host as
> up again. However, there are still errors with vdsmd on both the host
> and the engine, and the engine cannot start vdsmd. I guess it is able
> to monitor the hosts in a limited way as it says they are both up.
> There are communication errors between one of the hosts and the
> engine: the host is refusing connections, by the look of it.
>
> from the engine log:
>
> 2017-02-20 18:41:51,226Z ERROR
> [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand]
> (DefaultQuartzScheduler2) [f8aa18b3-97b9-48e2-a681-cf3aaed330a5]
> Command 'GetCapabilitiesVDSCommand(HostName = kvm-ldn-01,
> VdsIdAndVdsVDSCommandParametersBase:{runAsync='true',
> hostId='e050c27f-8709-404c-b03e-59c0167a824b',
> vds='Host[kvm-ldn-01,e050c27f-8709-404c-b03e-59c0167a824b]'})'
> execution failed: java.net.ConnectException: Connection refused
> 2017-02-20 18:41:51,226Z ERROR
> [org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring]
> (DefaultQuartzScheduler2) [f8aa18b3-97b9-48e2-a681-cf3aaed330a5]
> Failure to refresh host 'kvm-ldn-01' runtime info:
> java.net.ConnectException: Connection refused
> 2017-02-20 18:41:52,772Z ERROR
> [org.ovirt.engine.core.vdsbroker.vdsbroker.GetAllVmStatsVDSCommand]
> (DefaultQuartzScheduler6) [f8aa18b3-97b9-48e2-a681-cf3aaed330a5]
> Command 'GetAllVmStatsVDSCommand(HostName = kvm-ldn-01,
> VdsIdVDSCommandParametersBase:{runAsync='true',
> hostId='e050c27f-8709-404c-b03e-59c0167a824b'})' execution failed:
> VDSGenericException: VDSNetworkException: Connection reset by peer
> 2017-02-20 18:41:54,256Z ERROR
> [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand]
> (DefaultQuartzScheduler7) [f8aa18b3-97b9-48e2-a681-cf3aaed330a5]
> Command 'GetCapabilitiesVDSCommand(HostName = kvm-ldn-01,
> VdsIdAndVdsVDSCommandParametersBase:{runAsync='true',
> hostId='e050c27f-8709-404c-b03e-59c0167a824b',
> vds='Host[kvm-ldn-01,e050c27f-8709-404c-b03e-59c0167a824b]'})'
> execution failed: java.net.ConnectException: Connection refused
>
I checked your engine logs and saw DNS issues, but much later than the errors above:
2017-02-20 19:47:56,516Z ERROR
[org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring]
(DefaultQuartzScheduler6) [f8aa18b3-97b9-48e2-a681-cf3aaed330a5]
Failure to refresh host 'kvm-ldn-01' runtime info:
java.net.UnknownHostException: kvm-ldn-01
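
To see when each kind of failure first shows up in the engine log, a
rough sketch like the one below may help. It is only an illustration:
the function name first_seen is made up here, and the timestamp pattern
and exception names are taken from the excerpts quoted in this thread,
so they may need adjusting for other log lines.

```python
import re
from collections import OrderedDict

# An entry starts with a timestamp and level, e.g. "2017-02-20 18:41:51,226Z ERROR".
# (Pattern assumed from the excerpts in this thread.)
ENTRY_START = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3}Z\s+(\w+)")

# Exception class names that appear in the quoted excerpts.
EXCEPTION = re.compile(
    r"\b(UnknownHostException|ConnectException|VDSNetworkException)\b"
)


def first_seen(lines):
    """Map each exception name to the timestamp of the first ERROR entry
    that mentions it. Continuation lines belong to the preceding entry."""
    seen = OrderedDict()
    current_ts = None
    for line in lines:
        match = ENTRY_START.match(line)
        if match:
            # Only track ERROR entries; other levels reset the context.
            current_ts = match.group(1) if match.group(2) == "ERROR" else None
        if current_ts is None:
            continue
        for name in EXCEPTION.findall(line):
            seen.setdefault(name, current_ts)
    return seen
```

It could be fed the lines of the engine log, for example
first_seen(open(path)) where path points at your engine.log. If
ConnectException appears well before UnknownHostException, the refused
connections started before name resolution broke.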
> from the vdsm.log on the host:
>
>
> Feb 20 18:44:20 kvm-ldn-01 vdsm[42308]: vdsm vds.dispatcher ERROR SSL
> error receiving from <yajsonrpc.betterAsyncore.Dispatcher connected
> ('::ffff:172.16.75.16', 38350, 0, 0) at 0x33b9bd8>: unexpected eof
> Feb 20 18:44:24 kvm-ldn-01 vdsm[42308]: vdsm jsonrpc.JsonRpcServer
> ERROR Internal server error
> Traceback (most recent call last):
> File "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py",
> line 547, in _handle_request...
>
> Any ideas what might be going on here?
I see that ~13 VMs were moved to the Up state.
Can you please say which host is causing the issues and provide its logs?
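
One way to narrow down which host is affected is to probe each host
from the engine machine and tell a DNS failure apart from a refused
connection. The sketch below is only a suggestion: the probe function
is a made-up helper, and it assumes vdsm listens on its default port
54321, which may differ in your setup. It only tests that the TCP port
accepts connections; it does not attempt the SSL handshake.

```python
import socket

VDSM_PORT = 54321  # assumed default vdsm port; change if configured otherwise


def probe(host, port=VDSM_PORT, timeout=3.0):
    """Return 'dns-failure', 'refused', 'timeout', 'error' or 'open'."""
    try:
        # Resolve first so a name-resolution problem is reported separately.
        socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "dns-failure"
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused"
    except socket.timeout:
        return "timeout"
    except OSError:
        return "error"
```

For example, probe("kvm-ldn-01") returning "refused" would match the
ConnectException in the engine log, while "dns-failure" would match the
later UnknownHostException.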
>
> Thanks,
>
> Cam
>
> _______________________________________________
> Users mailing list
> Users at ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>