[ovirt-users] Non-responsive host, VM's are still running - how to resolve?

Piotr Kliczewski piotr.kliczewski at gmail.com
Tue Nov 14 19:19:57 UTC 2017


On Tue, Nov 14, 2017 at 7:09 PM, Artem Tambovskiy
<artem.tambovskiy at gmail.com> wrote:
> Thanks, Darrell!
>
> I restarted vdsmd, but it didn't help.
> systemctl status vdsmd -l shows the following:
>
> ● vdsmd.service - Virtual Desktop Server Manager
>    Loaded: loaded (/usr/lib/systemd/system/vdsmd.service; enabled; vendor
> preset: enabled)
>    Active: active (running) since Tue 2017-11-14 21:01:31 MSK; 4min 53s ago
>   Process: 54674 ExecStopPost=/usr/libexec/vdsm/vdsmd_init_common.sh
> --post-stop (code=exited, status=0/SUCCESS)
>   Process: 54677 ExecStartPre=/usr/libexec/vdsm/vdsmd_init_common.sh
> --pre-start (code=exited, status=0/SUCCESS)
>  Main PID: 54971 (vdsm)
>    CGroup: /system.slice/vdsmd.service
>            ├─54971 /usr/bin/python2 /usr/share/vdsm/vdsm
>            └─55099 /usr/libexec/ioprocess --read-pipe-fd 84 --write-pipe-fd
> 83 --max-threads 10 --max-queued-requests 10
>
> Nov 14 21:01:33 ovirt2.telia.ru vdsm[54971]: vdsm vds WARN Not ready yet,
> ignoring event u'|virt|VM_status|e0970bbf-11d8-4517-acff-0f8dccbb10a9'
> args={u'e0970bbf-11d8-4517-acff-0f8dccbb10a9': {'status': 'Up',
> 'displayInfo': [{'tlsPort': '5901', 'ipAddress': '80.239.162.106', 'type':
> u'spice', 'port': '-1'}], 'hash': '-6982259661244130819', 'displayIp':
> '80.239.162.106', 'displayPort': '-1', 'displaySecurePort': '5901',
> 'timeOffset': u'0', 'pauseCode': 'NOERR', 'vcpuQuota': '-1', 'cpuUser':
> '0.00', 'monitorResponse': '0', 'elapsedTime': '370019', 'displayType':
> 'qxl', 'cpuSys': '0.00', 'clientIp': '172.16.11.6', 'vcpuPeriod': 100000L}}
> Nov 14 21:01:33 ovirt2.telia.ru vdsm[54971]: vdsm vds WARN Not ready yet,
> ignoring event u'|virt|VM_status|b366e466-b0ea-4a09-866b-d0248d7523a6'
> args={u'b366e466-b0ea-4a09-866b-d0248d7523a6': {'status': 'Up',
> 'displayInfo': [{'tlsPort': '5900', 'ipAddress': '0', 'type': u'spice',
> 'port': '-1'}], 'hash': '1858968312777883492', 'displayIp': '0',
> 'displayPort': '-1', 'displaySecurePort': '5900', 'timeOffset': '0',
> 'pauseCode': 'NOERR', 'vcpuQuota': '-1', 'cpuUser': '0.00',
> 'monitorResponse': '0', 'elapsedTime': '453444', 'displayType': 'qxl',
> 'cpuSys': '0.00', 'clientIp': '', 'vcpuPeriod': 100000L}}
> Nov 14 21:01:33 ovirt2.telia.ru vdsm[54971]: vdsm vds WARN Not ready yet,
> ignoring event u'|virt|VM_status|ca2815c5-f815-469d-869d-a8fe1cb8c2e7'
> args={u'ca2815c5-f815-469d-869d-a8fe1cb8c2e7': {'status': 'Up',
> 'displayInfo': [{'tlsPort': '5904', 'ipAddress': '80.239.162.106', 'type':
> u'spice', 'port': '-1'}], 'hash': '1149212890076264321', 'displayIp':
> '80.239.162.106', 'displayPort': '-1', 'displaySecurePort': '5904',
> 'timeOffset': u'0', 'pauseCode': 'NOERR', 'vcpuQuota': '-1', 'cpuUser':
> '0.00', 'monitorResponse': '0', 'elapsedTime': '105160', 'displayType':
> 'qxl', 'cpuSys': '0.00', 'clientIp': '172.16.11.6', 'vcpuPeriod': 100000L}}
> Nov 14 21:01:33 ovirt2.telia.ru vdsm[54971]: vdsm vds WARN Not ready yet,
> ignoring event u'|virt|VM_status|a083da47-3e39-458c-8822-459af3d2d93a'
> args={u'a083da47-3e39-458c-8822-459af3d2d93a': {'status': 'Up',
> 'displayInfo': [{'tlsPort': '5902', 'ipAddress': '80.239.162.106', 'type':
> u'spice', 'port': '-1'}], 'hash': '5529949835126538749', 'displayIp':
> '80.239.162.106', 'displayPort': '-1', 'displaySecurePort': '5902',
> 'timeOffset': u'0', 'pauseCode': 'NOERR', 'vcpuQuota': '-1', 'cpuUser':
> '0.00', 'monitorResponse': '0', 'elapsedTime': '365326', 'displayType':
> 'qxl', 'cpuSys': '0.00', 'clientIp': '', 'vcpuPeriod': 100000L}}
> Nov 14 21:01:33 ovirt2.telia.ru vdsm[54971]: vdsm vds WARN Not ready yet,
> ignoring event u'|virt|VM_status|0b7d02df-0286-4e0e-a50b-1d02915ba81c'
> args={u'0b7d02df-0286-4e0e-a50b-1d02915ba81c': {'status': 'Up',
> 'displayInfo': [{'tlsPort': '5903', 'ipAddress': '80.239.162.106', 'type':
> u'spice', 'port': '-1'}], 'hash': '3267121054607612619', 'displayIp':
> '80.239.162.106', 'displayPort': '-1', 'displaySecurePort': '5903',
> 'timeOffset': '-1', 'pauseCode': 'NOERR', 'vcpuQuota': '-1', 'cpuUser':
> '0.00', 'monitorResponse': '0', 'elapsedTime': '275708', 'displayType':
> 'qxl', 'cpuSys': '0.00', 'clientIp': '', 'vcpuPeriod': 100000L}}
> Nov 14 21:01:33 ovirt2.telia.ru vdsm[54971]: vdsm throttled WARN MOM not
> available.
> Nov 14 21:01:33 ovirt2.telia.ru vdsm[54971]: vdsm throttled WARN MOM not
> available, KSM stats will be missing.
> Nov 14 21:01:34 ovirt2.telia.ru vdsm[54971]: vdsm vds WARN Not ready yet,
> ignoring event u'|virt|VM_status|0b7d02df-0286-4e0e-a50b-1d02915ba81c'
> args={u'0b7d02df-0286-4e0e-a50b-1d02915ba81c': {'status': 'Up', 'username':
> 'Unknown', 'memUsage': '36', 'guestFQDN': '', 'memoryStats': {u'swap_out':
> '0', u'majflt': '0', u'swap_usage': '0', u'mem_cached': '548192',
> u'mem_free': '2679664', u'mem_buffers': '231016', u'swap_in': '0',
> u'swap_total': '786428', u'pageflt': '4346', u'mem_total': '3922564',
> u'mem_unused': '1900456'}, 'session': 'Unknown', 'netIfaces': [],
> 'guestCPUCount': -1, 'appsList': (), 'guestIPs': '', 'disksUsage': []}}
> Nov 14 21:01:34 ovirt2.telia.ru vdsm[54971]: vdsm vds WARN Not ready yet,
> ignoring event u'|virt|VM_status|a083da47-3e39-458c-8822-459af3d2d93a'
> args={u'a083da47-3e39-458c-8822-459af3d2d93a': {'status': 'Up', 'username':
> 'Unknown', 'memUsage': '49', 'guestFQDN': '', 'memoryStats': {u'swap_out':
> '0', u'majflt': '0', u'swap_usage': '0', u'mem_cached': '549844',
> u'mem_free': '1054040', u'mem_buffers': '2080', u'swap_in': '0',
> u'swap_total': '4064252', u'pageflt': '148', u'mem_total': '1815524',
> u'mem_unused': '502116'}, 'session': 'Unknown', 'netIfaces': [],
> 'guestCPUCount': -1, 'appsList': (), 'guestIPs': '', 'disksUsage': []}}
> Nov 14 21:01:34 ovirt2.telia.ru vdsm[54971]: vdsm vds WARN Not ready yet,
> ignoring event u'|virt|VM_status|ca2815c5-f815-469d-869d-a8fe1cb8c2e7'
> args={u'ca2815c5-f815-469d-869d-a8fe1cb8c2e7': {'status': 'Up', 'username':
> 'Unknown', 'memUsage': '14', 'guestFQDN': '', 'memoryStats': {u'swap_out':
> '0', u'majflt': '0', u'swap_usage': '0', u'mem_cached': '497136',
> u'mem_free': '1801440', u'mem_buffers': '102108', u'swap_in': '0',
> u'swap_total': '1046524', u'pageflt': '64', u'mem_total': '2046116',
> u'mem_unused': '1202196'}, 'session': 'Unknown', 'netIfaces': [],
> 'guestCPUCount': -1, 'appsList': (), 'guestIPs': '', 'disksUsage': []}}

The logs above show that vdsm has no connection from the engine, so
events are not being sent.
Can you share the engine logs?
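
Since ping and ssh work but the engine reports "No route to host", it may
help to test the vdsm port itself from the engine VM. Below is a minimal
sketch, not an oVirt tool: it assumes vdsm listens on its usual TCP port
54321, and the function name probe_tcp is made up for illustration. A
"no route to host" result on that port while ssh works often points to a
firewall on the host rejecting the connection, but that is only one
possible cause.

```python
import errno
import socket

# Assumption: vdsm's management port is the commonly used default, 54321.
VDSM_PORT = 54321


def probe_tcp(host, port=VDSM_PORT, timeout=3.0):
    """Try a plain TCP connect and classify the outcome (hypothetical helper)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        # No answer at all within the timeout (packets silently dropped).
        return "timeout"
    except ConnectionRefusedError:
        # Host reachable, but nothing is listening on the port.
        return "refused"
    except OSError as exc:
        if exc.errno == errno.EHOSTUNREACH:
            # Same condition Java reports as NoRouteToHostException.
            return "no route to host"
        return "error: {}".format(exc)


if __name__ == "__main__":
    # Host name taken from the logs in this thread.
    print(probe_tcp("ovirt2.telia.ru"))
```

Run it on the hosted-engine VM. If it prints "open", the TCP path is fine
and the problem is more likely at the TLS or vdsm level.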

>
> On Tue, Nov 14, 2017 at 8:49 PM, Darrell Budic <budic at onholyground.com>
> wrote:
>>
>> Try restarting vdsmd from the shell, “systemctl restart vdsmd”.
>>
>>
>> ________________________________
>> From: Artem Tambovskiy <artem.tambovskiy at gmail.com>
>> Subject: [ovirt-users] Non-responsive host, VM's are still running - how
>> to resolve?
>> Date: November 14, 2017 at 11:23:32 AM CST
>> To: users
>>
>> Apparently, I lost the host that was running hosted-engine and 4 other
>> VMs exactly during the migration of the second host from bare metal to a
>> second host in the cluster. For some reason the first host entered the
>> "Non Responsive" state. The interesting thing is that hosted-engine and
>> all the other VMs are up and running, so it looks like a communication
>> problem between hosted-engine and the host.
>>
>> The engine.log on hosted-engine is full of the following messages:
>>
>> 2017-11-14 17:06:43,158Z INFO
>> [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor)
>> [] Connecting to ovirt2/80.239.162.106
>> 2017-11-14 17:06:43,159Z ERROR
>> [org.ovirt.engine.core.vdsbroker.vdsbroker.GetAllVmStatsVDSCommand]
>> (DefaultQuartzScheduler9) [50938c3] Command
>> 'GetAllVmStatsVDSCommand(HostName = ovirt2.telia.ru,
>> VdsIdVDSCommandParametersBase:{runAsync='true',
>> hostId='3970247c-69eb-4bd8-b263-9100703a8243'})' execution failed:
>> java.net.NoRouteToHostException: No route to host
>> 2017-11-14 17:06:43,159Z INFO
>> [org.ovirt.engine.core.vdsbroker.monitoring.PollVmStatsRefresher]
>> (DefaultQuartzScheduler9) [50938c3] Failed to fetch vms info for host
>> 'ovirt2.telia.ru' - skipping VMs monitoring.
>> 2017-11-14 17:06:45,929Z INFO
>> [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor)
>> [] Connecting to ovirt2/80.239.162.106
>> 2017-11-14 17:06:45,930Z ERROR
>> [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand]
>> (DefaultQuartzScheduler2) [6080f1cc] Command
>> 'GetCapabilitiesVDSCommand(HostName = ovirt2.telia.ru,
>> VdsIdAndVdsVDSCommandParametersBase:{runAsync='true',
>> hostId='3970247c-69eb-4bd8-b263-9100703a8243',
>> vds='Host[ovirt2.telia.ru,3970247c-69eb-4bd8-b263-9100703a8243]'})'
>> execution failed: java.net.NoRouteToHostException: No route to host
>> 2017-11-14 17:06:45,930Z ERROR
>> [org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring]
>> (DefaultQuartzScheduler2) [6080f1cc] Failure to refresh host
>> 'ovirt2.telia.ru' runtime info: java.net.NoRouteToHostException: No route to
>> host
>> 2017-11-14 17:06:48,933Z INFO
>> [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor)
>> [] Connecting to ovirt2/80.239.162.106
>> 2017-11-14 17:06:48,934Z ERROR
>> [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand]
>> (DefaultQuartzScheduler6) [1a64dfea] Command
>> 'GetCapabilitiesVDSCommand(HostName = ovirt2.telia.ru,
>> VdsIdAndVdsVDSCommandParametersBase:{runAsync='true',
>> hostId='3970247c-69eb-4bd8-b263-9100703a8243',
>> vds='Host[ovirt2.telia.ru,3970247c-69eb-4bd8-b263-9100703a8243]'})'
>> execution failed: java.net.NoRouteToHostException: No route to host
>> 2017-11-14 17:06:48,934Z ERROR
>> [org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring]
>> (DefaultQuartzScheduler6) [1a64dfea] Failure to refresh host
>> 'ovirt2.telia.ru' runtime info: java.net.NoRouteToHostException: No route to
>> host
>> 2017-11-14 17:06:50,931Z INFO
>> [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor)
>> [] Connecting to ovirt2/80.239.162.106
>> 2017-11-14 17:06:50,932Z ERROR
>> [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStatusVDSCommand]
>> (DefaultQuartzScheduler4) [6b19d168] Command 'SpmStatusVDSCommand(HostName =
>> ovirt2.telia.ru, SpmStatusVDSCommandParameters:{runAsync='true',
>> hostId='3970247c-69eb-4bd8-b263-9100703a8243',
>> storagePoolId='5a044257-02ec-0382-0243-0000000001f2'})' execution failed:
>> java.net.NoRouteToHostException: No route to host
>> 2017-11-14 17:06:50,939Z INFO
>> [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor)
>> [] Connecting to ovirt2/80.239.162.106
>> 2017-11-14 17:06:50,940Z ERROR
>> [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
>> (DefaultQuartzScheduler4) [6b19d168]
>> IrsBroker::Failed::GetStoragePoolInfoVDS
>> 2017-11-14 17:06:50,940Z ERROR
>> [org.ovirt.engine.core.vdsbroker.irsbroker.GetStoragePoolInfoVDSCommand]
>> (DefaultQuartzScheduler4) [6b19d168] Command 'GetStoragePoolInfoVDSCommand(
>> GetStoragePoolInfoVDSCommandParameters:{runAsync='true',
>> storagePoolId='5a044257-02ec-0382-0243-0000000001f2',
>> ignoreFailoverLimit='true'})' execution failed: IRSProtocolException:
>> 2017-11-14 17:06:51,937Z INFO
>> [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor)
>> [] Connecting to ovirt2/80.239.162.106
>> 2017-11-14 17:06:51,938Z ERROR
>> [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand]
>> (DefaultQuartzScheduler7) [7f23a3bd] Command
>> 'GetCapabilitiesVDSCommand(HostName = ovirt2.telia.ru,
>> VdsIdAndVdsVDSCommandParametersBase:{runAsync='true',
>> hostId='3970247c-69eb-4bd8-b263-9100703a8243',
>> vds='Host[ovirt2.telia.ru,3970247c-69eb-4bd8-b263-9100703a8243]'})'
>> execution failed: java.net.NoRouteToHostException: No route to host
>> 2017-11-14 17:06:51,938Z ERROR
>> [org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring]
>> (DefaultQuartzScheduler7) [7f23a3bd] Failure to refresh host
>> 'ovirt2.telia.ru' runtime info: java.net.NoRouteToHostException: No route to
>> host
>> 2017-11-14 17:06:54,941Z INFO
>> [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor)
>> [] Connecting to ovirt2/80.239.162.106
>> 2017-11-14 17:06:54,942Z ERROR
>> [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand]
>> (DefaultQuartzScheduler2) [7a769f6c] Command
>> 'GetCapabilitiesVDSCommand(HostName = ovirt2.telia.ru,
>> VdsIdAndVdsVDSCommandParametersBase:{runAsync='true',
>> hostId='3970247c-69eb-4bd8-b263-9100703a8243',
>> vds='Host[ovirt2.telia.ru,3970247c-69eb-4bd8-b263-9100703a8243]'})'
>> execution failed: java.net.NoRouteToHostException: No route to host
>> 2017-11-14 17:06:54,942Z ERROR
>> [org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring]
>> (DefaultQuartzScheduler2) [7a769f6c] Failure to refresh host
>> 'ovirt2.telia.ru' runtime info: java.net.NoRouteToHostException: No route to
>> host
>>
>> It's a bit weird, since I can ping the host and log in to it via ssh from
>> hosted-engine with no problem. I have added a second host to the cluster,
>> but it is not running hosted-engine. Any suggestions for further steps?
>> Just reboot the host and hope for the best?
>>
>> Regards,
>> Artem
>> _______________________________________________
>> Users mailing list
>> Users at ovirt.org
>> http://lists.ovirt.org/mailman/listinfo/users
>>
>>
>
>

