[ovirt-users] Non-responsive host, VM's are still running - how to resolve?
Artem Tambovskiy
artem.tambovskiy at gmail.com
Tue Nov 14 20:11:07 UTC 2017
Piotr,
full engine log attached.
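
Since ping and ssh from the engine to the host work while the engine still logs NoRouteToHostException, it may be worth testing the exact TCP port the engine connects to instead of the host in general. Below is a minimal sketch of such a check, not a confirmed diagnosis. It assumes VDSM listens on its default port 54321 (the port is not stated anywhere in this thread), and probe_tcp is a hypothetical helper name. A "no-route" result on that port while ssh works would hint at a firewall rule rejecting the connection rather than a real routing problem.

```python
import errno
import socket

# Assumption: 54321 is the default VDSM port; the thread does not state it.
VDSM_PORT = 54321


def probe_tcp(host, port=VDSM_PORT, timeout=3.0):
    """Attempt a plain TCP connect and return a short label for the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        # No answer at all within the timeout (e.g. packets silently dropped).
        return "timeout"
    except ConnectionRefusedError:
        # Host reachable, but nothing is listening on that port.
        return "refused"
    except OSError as exc:
        # EHOSTUNREACH is what Java reports as NoRouteToHostException.
        if exc.errno in (errno.EHOSTUNREACH, errno.ENETUNREACH):
            return "no-route"
        return "error: %s" % exc
```

Run from the hosted-engine VM, for example probe_tcp("ovirt2.telia.ru"), and compare the result with probe_tcp("ovirt2.telia.ru", 22), which should report "open" given that ssh works.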
On Tue, Nov 14, 2017 at 10:19 PM, Piotr Kliczewski <
piotr.kliczewski at gmail.com> wrote:
> On Tue, Nov 14, 2017 at 7:09 PM, Artem Tambovskiy
> <artem.tambovskiy at gmail.com> wrote:
> > Thanks, Darrell!
> >
> > Restarted vdsmd but it didn't help.
> > systemctl status vdsmd -l shows the following:
> >
> > ● vdsmd.service - Virtual Desktop Server Manager
> > Loaded: loaded (/usr/lib/systemd/system/vdsmd.service; enabled; vendor preset: enabled)
> > Active: active (running) since Tue 2017-11-14 21:01:31 MSK; 4min 53s ago
> > Process: 54674 ExecStopPost=/usr/libexec/vdsm/vdsmd_init_common.sh --post-stop (code=exited, status=0/SUCCESS)
> > Process: 54677 ExecStartPre=/usr/libexec/vdsm/vdsmd_init_common.sh --pre-start (code=exited, status=0/SUCCESS)
> > Main PID: 54971 (vdsm)
> > CGroup: /system.slice/vdsmd.service
> > ├─54971 /usr/bin/python2 /usr/share/vdsm/vdsm
> > └─55099 /usr/libexec/ioprocess --read-pipe-fd 84 --write-pipe-fd 83 --max-threads 10 --max-queued-requests 10
> >
> > Nov 14 21:01:33 ovirt2.telia.ru vdsm[54971]: vdsm vds WARN Not ready
> yet,
> > ignoring event u'|virt|VM_status|e0970bbf-11d8-4517-acff-0f8dccbb10a9'
> > args={u'e0970bbf-11d8-4517-acff-0f8dccbb10a9': {'status': 'Up',
> > 'displayInfo': [{'tlsPort': '5901', 'ipAddress': '80.239.162.106',
> 'type':
> > u'spice', 'port': '-1'}], 'hash': '-6982259661244130819', 'displayIp':
> > '80.239.162.106', 'displayPort': '-1', 'displaySecurePort': '5901',
> > 'timeOffset': u'0', 'pauseCode': 'NOERR', 'vcpuQuota': '-1', 'cpuUser':
> > '0.00', 'monitorResponse': '0', 'elapsedTime': '370019', 'displayType':
> > 'qxl', 'cpuSys': '0.00', 'clientIp': '172.16.11.6', 'vcpuPeriod':
> 100000L}}
> > Nov 14 21:01:33 ovirt2.telia.ru vdsm[54971]: vdsm vds WARN Not ready
> yet,
> > ignoring event u'|virt|VM_status|b366e466-b0ea-4a09-866b-d0248d7523a6'
> > args={u'b366e466-b0ea-4a09-866b-d0248d7523a6': {'status': 'Up',
> > 'displayInfo': [{'tlsPort': '5900', 'ipAddress': '0', 'type': u'spice',
> > 'port': '-1'}], 'hash': '1858968312777883492', 'displayIp': '0',
> > 'displayPort': '-1', 'displaySecurePort': '5900', 'timeOffset': '0',
> > 'pauseCode': 'NOERR', 'vcpuQuota': '-1', 'cpuUser': '0.00',
> > 'monitorResponse': '0', 'elapsedTime': '453444', 'displayType': 'qxl',
> > 'cpuSys': '0.00', 'clientIp': '', 'vcpuPeriod': 100000L}}
> > Nov 14 21:01:33 ovirt2.telia.ru vdsm[54971]: vdsm vds WARN Not ready
> yet,
> > ignoring event u'|virt|VM_status|ca2815c5-f815-469d-869d-a8fe1cb8c2e7'
> > args={u'ca2815c5-f815-469d-869d-a8fe1cb8c2e7': {'status': 'Up',
> > 'displayInfo': [{'tlsPort': '5904', 'ipAddress': '80.239.162.106',
> 'type':
> > u'spice', 'port': '-1'}], 'hash': '1149212890076264321', 'displayIp':
> > '80.239.162.106', 'displayPort': '-1', 'displaySecurePort': '5904',
> > 'timeOffset': u'0', 'pauseCode': 'NOERR', 'vcpuQuota': '-1', 'cpuUser':
> > '0.00', 'monitorResponse': '0', 'elapsedTime': '105160', 'displayType':
> > 'qxl', 'cpuSys': '0.00', 'clientIp': '172.16.11.6', 'vcpuPeriod':
> 100000L}}
> > Nov 14 21:01:33 ovirt2.telia.ru vdsm[54971]: vdsm vds WARN Not ready
> yet,
> > ignoring event u'|virt|VM_status|a083da47-3e39-458c-8822-459af3d2d93a'
> > args={u'a083da47-3e39-458c-8822-459af3d2d93a': {'status': 'Up',
> > 'displayInfo': [{'tlsPort': '5902', 'ipAddress': '80.239.162.106',
> 'type':
> > u'spice', 'port': '-1'}], 'hash': '5529949835126538749', 'displayIp':
> > '80.239.162.106', 'displayPort': '-1', 'displaySecurePort': '5902',
> > 'timeOffset': u'0', 'pauseCode': 'NOERR', 'vcpuQuota': '-1', 'cpuUser':
> > '0.00', 'monitorResponse': '0', 'elapsedTime': '365326', 'displayType':
> > 'qxl', 'cpuSys': '0.00', 'clientIp': '', 'vcpuPeriod': 100000L}}
> > Nov 14 21:01:33 ovirt2.telia.ru vdsm[54971]: vdsm vds WARN Not ready
> yet,
> > ignoring event u'|virt|VM_status|0b7d02df-0286-4e0e-a50b-1d02915ba81c'
> > args={u'0b7d02df-0286-4e0e-a50b-1d02915ba81c': {'status': 'Up',
> > 'displayInfo': [{'tlsPort': '5903', 'ipAddress': '80.239.162.106',
> 'type':
> > u'spice', 'port': '-1'}], 'hash': '3267121054607612619', 'displayIp':
> > '80.239.162.106', 'displayPort': '-1', 'displaySecurePort': '5903',
> > 'timeOffset': '-1', 'pauseCode': 'NOERR', 'vcpuQuota': '-1', 'cpuUser':
> > '0.00', 'monitorResponse': '0', 'elapsedTime': '275708', 'displayType':
> > 'qxl', 'cpuSys': '0.00', 'clientIp': '', 'vcpuPeriod': 100000L}}
> > Nov 14 21:01:33 ovirt2.telia.ru vdsm[54971]: vdsm throttled WARN MOM not
> > available.
> > Nov 14 21:01:33 ovirt2.telia.ru vdsm[54971]: vdsm throttled WARN MOM not
> > available, KSM stats will be missing.
> > Nov 14 21:01:34 ovirt2.telia.ru vdsm[54971]: vdsm vds WARN Not ready
> yet,
> > ignoring event u'|virt|VM_status|0b7d02df-0286-4e0e-a50b-1d02915ba81c'
> > args={u'0b7d02df-0286-4e0e-a50b-1d02915ba81c': {'status': 'Up',
> 'username':
> > 'Unknown', 'memUsage': '36', 'guestFQDN': '', 'memoryStats':
> {u'swap_out':
> > '0', u'majflt': '0', u'swap_usage': '0', u'mem_cached': '548192',
> > u'mem_free': '2679664', u'mem_buffers': '231016', u'swap_in': '0',
> > u'swap_total': '786428', u'pageflt': '4346', u'mem_total': '3922564',
> > u'mem_unused': '1900456'}, 'session': 'Unknown', 'netIfaces': [],
> > 'guestCPUCount': -1, 'appsList': (), 'guestIPs': '', 'disksUsage': []}}
> > Nov 14 21:01:34 ovirt2.telia.ru vdsm[54971]: vdsm vds WARN Not ready
> yet,
> > ignoring event u'|virt|VM_status|a083da47-3e39-458c-8822-459af3d2d93a'
> > args={u'a083da47-3e39-458c-8822-459af3d2d93a': {'status': 'Up',
> 'username':
> > 'Unknown', 'memUsage': '49', 'guestFQDN': '', 'memoryStats':
> {u'swap_out':
> > '0', u'majflt': '0', u'swap_usage': '0', u'mem_cached': '549844',
> > u'mem_free': '1054040', u'mem_buffers': '2080', u'swap_in': '0',
> > u'swap_total': '4064252', u'pageflt': '148', u'mem_total': '1815524',
> > u'mem_unused': '502116'}, 'session': 'Unknown', 'netIfaces': [],
> > 'guestCPUCount': -1, 'appsList': (), 'guestIPs': '', 'disksUsage': []}}
> > Nov 14 21:01:34 ovirt2.telia.ru vdsm[54971]: vdsm vds WARN Not ready
> yet,
> > ignoring event u'|virt|VM_status|ca2815c5-f815-469d-869d-a8fe1cb8c2e7'
> > args={u'ca2815c5-f815-469d-869d-a8fe1cb8c2e7': {'status': 'Up',
> 'username':
> > 'Unknown', 'memUsage': '14', 'guestFQDN': '', 'memoryStats':
> {u'swap_out':
> > '0', u'majflt': '0', u'swap_usage': '0', u'mem_cached': '497136',
> > u'mem_free': '1801440', u'mem_buffers': '102108', u'swap_in': '0',
> > u'swap_total': '1046524', u'pageflt': '64', u'mem_total': '2046116',
> > u'mem_unused': '1202196'}, 'session': 'Unknown', 'netIfaces': [],
> > 'guestCPUCount': -1, 'appsList': (), 'guestIPs': '', 'disksUsage': []}}
>
> The logs above show that there is no connection from the engine, so
> events won't be sent.
> Can you share the engine logs?
>
> >
> > On Tue, Nov 14, 2017 at 8:49 PM, Darrell Budic <budic at onholyground.com>
> > wrote:
> >>
> >> Try restarting vdsmd from the shell, “systemctl restart vdsmd”.
> >>
> >>
> >> ________________________________
> >> From: Artem Tambovskiy <artem.tambovskiy at gmail.com>
> >> Subject: [ovirt-users] Non-responsive host, VM's are still running - how
> >> to resolve?
> >> Date: November 14, 2017 at 11:23:32 AM CST
> >> To: users
> >>
> >> Apparently, I lost the host that was running hosted-engine and 4 other
> >> VMs exactly while migrating the second machine from bare metal to
> >> become the second host in the cluster. For some reason the first host
> >> entered the "Non responsive" state. The interesting thing is that
> >> hosted-engine and all the other VMs are up and running, so it looks
> >> like a communication problem between hosted-engine and the host.
> >>
> >> The engine.log on hosted-engine is full of the following messages:
> >>
> >> 2017-11-14 17:06:43,158Z INFO
> >> [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp
> Reactor)
> >> [] Connecting to ovirt2/80.239.162.106
> >> 2017-11-14 17:06:43,159Z ERROR
> >> [org.ovirt.engine.core.vdsbroker.vdsbroker.GetAllVmStatsVDSCommand]
> >> (DefaultQuartzScheduler9) [50938c3] Command
> >> 'GetAllVmStatsVDSCommand(HostName = ovirt2.telia.ru,
> >> VdsIdVDSCommandParametersBase:{runAsync='true',
> >> hostId='3970247c-69eb-4bd8-b263-9100703a8243'})' execution failed:
> >> java.net.NoRouteToHostException: No route to host
> >> 2017-11-14 17:06:43,159Z INFO
> >> [org.ovirt.engine.core.vdsbroker.monitoring.PollVmStatsRefresher]
> >> (DefaultQuartzScheduler9) [50938c3] Failed to fetch vms info for host
> >> 'ovirt2.telia.ru' - skipping VMs monitoring.
> >> 2017-11-14 17:06:45,929Z INFO
> >> [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp
> Reactor)
> >> [] Connecting to ovirt2/80.239.162.106
> >> 2017-11-14 17:06:45,930Z ERROR
> >> [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand]
> >> (DefaultQuartzScheduler2) [6080f1cc] Command
> >> 'GetCapabilitiesVDSCommand(HostName = ovirt2.telia.ru,
> >> VdsIdAndVdsVDSCommandParametersBase:{runAsync='true',
> >> hostId='3970247c-69eb-4bd8-b263-9100703a8243',
> >> vds='Host[ovirt2.telia.ru,3970247c-69eb-4bd8-b263-9100703a8243]'})'
> >> execution failed: java.net.NoRouteToHostException: No route to host
> >> 2017-11-14 17:06:45,930Z ERROR
> >> [org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring]
> >> (DefaultQuartzScheduler2) [6080f1cc] Failure to refresh host
> >> 'ovirt2.telia.ru' runtime info: java.net.NoRouteToHostException: No
> route to
> >> host
> >> 2017-11-14 17:06:48,933Z INFO
> >> [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp
> Reactor)
> >> [] Connecting to ovirt2/80.239.162.106
> >> 2017-11-14 17:06:48,934Z ERROR
> >> [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand]
> >> (DefaultQuartzScheduler6) [1a64dfea] Command
> >> 'GetCapabilitiesVDSCommand(HostName = ovirt2.telia.ru,
> >> VdsIdAndVdsVDSCommandParametersBase:{runAsync='true',
> >> hostId='3970247c-69eb-4bd8-b263-9100703a8243',
> >> vds='Host[ovirt2.telia.ru,3970247c-69eb-4bd8-b263-9100703a8243]'})'
> >> execution failed: java.net.NoRouteToHostException: No route to host
> >> 2017-11-14 17:06:48,934Z ERROR
> >> [org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring]
> >> (DefaultQuartzScheduler6) [1a64dfea] Failure to refresh host
> >> 'ovirt2.telia.ru' runtime info: java.net.NoRouteToHostException: No
> route to
> >> host
> >> 2017-11-14 17:06:50,931Z INFO
> >> [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp
> Reactor)
> >> [] Connecting to ovirt2/80.239.162.106
> >> 2017-11-14 17:06:50,932Z ERROR
> >> [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStatusVDSCommand]
> >> (DefaultQuartzScheduler4) [6b19d168] Command
> 'SpmStatusVDSCommand(HostName =
> >> ovirt2.telia.ru, SpmStatusVDSCommandParameters:{runAsync='true',
> >> hostId='3970247c-69eb-4bd8-b263-9100703a8243',
> >> storagePoolId='5a044257-02ec-0382-0243-0000000001f2'})' execution
> failed:
> >> java.net.NoRouteToHostException: No route to host
> >> 2017-11-14 17:06:50,939Z INFO
> >> [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp
> Reactor)
> >> [] Connecting to ovirt2/80.239.162.106
> >> 2017-11-14 17:06:50,940Z ERROR
> >> [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
> >> (DefaultQuartzScheduler4) [6b19d168]
> >> IrsBroker::Failed::GetStoragePoolInfoVDS
> >> 2017-11-14 17:06:50,940Z ERROR
> >> [org.ovirt.engine.core.vdsbroker.irsbroker.
> GetStoragePoolInfoVDSCommand]
> >> (DefaultQuartzScheduler4) [6b19d168] Command
> 'GetStoragePoolInfoVDSCommand(
> >> GetStoragePoolInfoVDSCommandParameters:{runAsync='true',
> >> storagePoolId='5a044257-02ec-0382-0243-0000000001f2',
> >> ignoreFailoverLimit='true'})' execution failed: IRSProtocolException:
> >> 2017-11-14 17:06:51,937Z INFO
> >> [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp
> Reactor)
> >> [] Connecting to ovirt2/80.239.162.106
> >> 2017-11-14 17:06:51,938Z ERROR
> >> [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand]
> >> (DefaultQuartzScheduler7) [7f23a3bd] Command
> >> 'GetCapabilitiesVDSCommand(HostName = ovirt2.telia.ru,
> >> VdsIdAndVdsVDSCommandParametersBase:{runAsync='true',
> >> hostId='3970247c-69eb-4bd8-b263-9100703a8243',
> >> vds='Host[ovirt2.telia.ru,3970247c-69eb-4bd8-b263-9100703a8243]'})'
> >> execution failed: java.net.NoRouteToHostException: No route to host
> >> 2017-11-14 17:06:51,938Z ERROR
> >> [org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring]
> >> (DefaultQuartzScheduler7) [7f23a3bd] Failure to refresh host
> >> 'ovirt2.telia.ru' runtime info: java.net.NoRouteToHostException: No
> route to
> >> host
> >> 2017-11-14 17:06:54,941Z INFO
> >> [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp
> Reactor)
> >> [] Connecting to ovirt2/80.239.162.106
> >> 2017-11-14 17:06:54,942Z ERROR
> >> [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand]
> >> (DefaultQuartzScheduler2) [7a769f6c] Command
> >> 'GetCapabilitiesVDSCommand(HostName = ovirt2.telia.ru,
> >> VdsIdAndVdsVDSCommandParametersBase:{runAsync='true',
> >> hostId='3970247c-69eb-4bd8-b263-9100703a8243',
> >> vds='Host[ovirt2.telia.ru,3970247c-69eb-4bd8-b263-9100703a8243]'})'
> >> execution failed: java.net.NoRouteToHostException: No route to host
> >> 2017-11-14 17:06:54,942Z ERROR
> >> [org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring]
> >> (DefaultQuartzScheduler2) [7a769f6c] Failure to refresh host
> >> 'ovirt2.telia.ru' runtime info: java.net.NoRouteToHostException: No
> route to
> >> host
> >>
> >> It's a bit weird, since I can ping the host and log in to it via ssh
> >> from hosted-engine with no problem. I have added the second host to the
> >> cluster, but it is not running hosted-engine. Any suggestions for
> >> further steps? Just reboot the host and hope for the best?
> >>
> >> Regards,
> >> Artem
> >> _______________________________________________
> >> Users mailing list
> >> Users at ovirt.org
> >> http://lists.ovirt.org/mailman/listinfo/users
-------------- next part --------------
A non-text attachment was scrubbed...
Name: engine.log.bz2
Type: application/x-bzip2
Size: 748650 bytes
Desc: not available
URL: <http://lists.ovirt.org/pipermail/users/attachments/20171114/b3ae3fb5/attachment.bz2>