Hi,
I have been seeing random failures of tests in different projects
caused by vdsm monitoring.
I need someone from vdsm to please help debug this issue.
Petr, is this the same problem like yesterday in
[ovirt-devel] failure in ost test - Invalid argument - help in debugging issue
From what I can see, the test suspend/resume vm failed because we
could not query the status of the vm on the host.
you can see full log from failed tests here:
https://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/10208/arti...
Here are the errors that I can see in the vdsm which seem to suggest
that there is an issue getting stats on the vm which was suspended
and caused the failure of the test:
2018-09-11 00:40:05,896-0400 INFO (monitor/c1fe6e6)
[storage.StorageDomain] Removing remnants of deleted images []
(fileSD:734) 2018-09-11 00:40:07,957-0400 DEBUG (qgapoller/1) [vds]
Not sending QEMU-GA command 'guest-get-users' to
vm_id='8214433a-f233-4aaa-aeda-2ce1d31c78dc', command is not
supported (qemuguestagent:192) 2018-09-11 00:40:08,068-0400 DEBUG
(periodic/3) [virt.sampling.VMBulkstatsMonitor] sampled timestamp
4296118.49 elapsed 0.010 acquired True domains all (sampling:443)
2018-09-11 00:40:08,271-0400 DEBUG (jsonrpc/1) [jsonrpc.JsonRpcServer]
Calling 'Image.prepare' in bridge with {u'allowIllegal': True,
u'storagepoolID': u'e80a56d9-74da-498a-b010-4a9df287f11d',
u'imageID':
u'd4c831e6-02d2-4d89-b516-0ec4597
5e024', u'volumeID': u'15b07af1-625b-42e3-b62a-8e7c7a120a56',
u'storagedomainID': u'f1744940-41b6-4d35-b7bf-870c4e07d995'}
(__init__:329)
2018-09-11 00:40:10,846-0400 DEBUG (vmchannels) [virt.vm]
(vmId='8214433a-f233-4aaa-aeda-2ce1d31c78dc') Guest connection timed
out (guestagent:556)
2018-09-11 00:40:11,637-0400 DEBUG (jsonrpc/5) [jsonrpc.JsonRpcServer]
Calling 'Host.getStats' in bridge with {} (__init__:329)
2018-09-11 00:40:11,637-0400 INFO (jsonrpc/5) [api.host] START
getStats() from=::ffff:192.168.201.4,49184 (api:47)
2018-09-11 00:40:11,643-0400 DEBUG (jsonrpc/5) [root] cannot read eth0
speed (nic:42)
2018-09-11 00:40:11,645-0400 DEBUG (jsonrpc/5) [root] cannot read eth1
speed (nic:42)
2018-09-11 00:40:11,647-0400 DEBUG (jsonrpc/5) [root] cannot read eth2
speed (nic:42)
2018-09-11 00:40:11,649-0400 DEBUG (jsonrpc/5) [root] cannot read eth3
speed (nic:42)
2018-09-11 00:40:11,667-0400 INFO (jsonrpc/5) [api.host] FINISH
getStats error=[Errno 22] Invalid argument
from=::ffff:192.168.201.4,49184 (api:51) 2018-09-11 00:40:11,667-0400
ERROR (jsonrpc/5) [jsonrpc.JsonRpcServer] Internal server error
(__init__:350) Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line
345, in _handle_request
res = method(**params)
File "/usr/lib/python2.7/site-packages/vdsm/rpc/Bridge.py", line
202, in _dynamicMethod
result = fn(*methodArgs)
File "<string>", line 2, in getStats
File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line
49, in method
ret = func(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/vdsm/API.py", line 1407, in
getStats
multipath=True)}
File "/usr/lib/python2.7/site-packages/vdsm/host/api.py", line 49,
in get_stats
decStats = stats.produce(first_sample, last_sample)
File "/usr/lib/python2.7/site-packages/vdsm/host/stats.py", line
71, in produce
stats.update(get_interfaces_stats())
File "/usr/lib/python2.7/site-packages/vdsm/host/stats.py", line
153, in get_interfaces_stats
return net_api.network_stats()
File "/usr/lib/python2.7/site-packages/vdsm/network/api.py", line
63, in network_stats
return netstats.report()
File "/usr/lib/python2.7/site-packages/vdsm/network/netstats.py",
line 31, in report
stats = link_stats.report()
File "/usr/lib/python2.7/site-packages/vdsm/network/link/stats.py",
line 41, in report
speed = vlan.speed(i.device)
File "/usr/lib/python2.7/site-packages/vdsm/network/link/vlan.py",
line 36, in speed
dev_speed = nic.read_speed_using_sysfs(dev_name)
File "/usr/lib/python2.7/site-packages/vdsm/network/link/nic.py",
line 48, in read_speed_using_sysfs
s = int(f.read())
IOError: [Errno 22] Invalid argument
2018-09-11 00:40:11,669-0400 INFO (jsonrpc/5)
[jsonrpc.JsonRpcServer] RPC call Host.getStats failed (error -32603)
in 0.03 seconds (__init__:312) 2018-09-11 00:40:14,656-0400 DEBUG
(jsonrpc/2) [jsonrpc.JsonRpcServer] Calling 'Host.getAllVmStats' in
bridge with {} (__init__:329) 2018-09-11 00:40:14,657-0400 INFO
(jsonrpc/2) [api.host] START getAllVmStats() from=::1,46772 (api:47)
2018-09-11 00:40:14,659-0400 INFO (jsonrpc/2) [api.host] FINISH
getAllVmStats return={'status': {'message': 'Done',
'code': 0},
'statsList': (suppressed)} from=::1,46772 (api:53)
2018-09-11 00:40:14,660-0400 DEBUG (jsonrpc/2) [jsonrpc.JsonRpcServer]
Return 'Host.getAllVmStats' in bridge with (suppressed) (__init__:356)
2018-09-11 00:40:14,661-0400 INFO (jsonrpc/2)
[jsonrpc.JsonRpcServer] RPC call Host.getAllVmStats succeeded in 0.00
seconds (__init__:312) 2018-09-11 00:40:14,674-0400 DEBUG (jsonrpc/1)
[jsonrpc.JsonRpcServer] Calling 'Host.getAllVmIoTunePolicies' in
bridge with {} (__init__:329) 2018-09-11 00:40:14,675-0400 INFO
(jsonrpc/1) [api.host] START getAllVmIoTunePolicies() from=::1,46772
(api:47) 2018-09-11 00:40:14,675-0400 INFO (jsonrpc/1) [api.host]
FINISH getAllVmIoTunePolicies return={'status': {'message':
'Done',
'code': 0}, 'io_tune_policies_dict':
{'8214433a-f233-4aaa-aeda-2ce1d31c78dc': {'policy': [],
'current_values': [{'ioTune': {'write_bytes_sec': 0L,
'total_iops_sec': 0L, 'read_iops_sec': 0L, 'read_bytes_sec': 0L,
'write_iops_sec': 0L, 'total_bytes_sec': 0L}, 'path':
'/rhev/data-center/mnt/blockSD/f1744940-41b6-4d35-b7bf-870c4e07d995/images/9a3659dc-339e-4b3f-a1d1-6ee3debb6168/e2570635-a798-4e9e-a0ea-88436d5b296f',
'name': 'vda'}]}}} from=::1,46772 (api:53) 2018-09-11
00:40:14,675-0400 DEBUG (jsonrpc/1) [jsonrpc.JsonRpcServer] Return
'Host.getAllVmIoTunePolicies' in bridge with
{'8214433a-f233-4aaa-aeda-2ce1d31c78dc': {'policy': [],
'current_values': [{'ioTune': {'write_bytes_sec': 0L,
'total_iops_sec': 0L, 'read_iops_sec': 0L, 'read_bytes_sec': 0L,
'write_iops_sec': 0L, 'total_bytes_sec': 0L}, 'path':
'/rhev/data-center/mnt/blockSD/f1744940-41b6-4d35-b7bf-870c4e07d995/images/9a3659dc-339e-4b3f-a1d1-6ee3debb6168/e2570635-a798-4e9e-a0ea-88436d5b296f',
'name': 'vda'}]}} (__init__:356) 2018-09-11 00:40:14,675-0400 INFO
(jsonrpc/1) [jsonrpc.JsonRpcServer] RPC call
Host.getAllVmIoTunePolicies succeeded in 0.00 seconds (__init__:312)
2018-09-11 00:40:16,118-0400 DEBUG (jsonrpc/7)
[jsonrpc.JsonRpcServer] Calling 'Host.getAllVmStats' in bridge with
{} (__init__:329) 2018-09-11 00:40:16,119-0400 INFO (jsonrpc/7)
[api.host] START getAllVmStats() from=::ffff:192.168.201.4,49184
(api:47) 2018-09-11 00:40:16,121-0400 INFO (jsonrpc/7) [api.host]
FINISH getAllVmStats return={'status': {'message': 'Done',
'code':
0}, 'statsList': (suppressed)} from=::ffff:192.168.201.4,49184
(api:53) 2018-09-11 00:40:16,122-0400 DEBUG (jsonrpc/7)
[jsonrpc.JsonRpcServer] Return 'Host.getAllVmStats' in bridge with
(suppressed) (__init__:356) 2018-09-11 00:40:16,122-0400 INFO
(jsonrpc/7) [jsonrpc.JsonRpcServer] RPC call Host.getAllVmStats
succeeded in 0.01 seconds (__init__:312) 2018-09-11 00:40:17,956-0400
DEBUG (qgapoller/2) [vds] Not sending QEMU-GA command
'guest-get-users' to vm_id='8214433a-f233-4aaa-aeda-2ce1d31c78dc',
command is not supported (qemuguestagent:192) 2018-09-11
00:40:18,025-0400 DEBUG (periodic/3) [virt.periodic] Looking for
stale paused VMs (periodic:388)
I can also see this in Super vdsm which may be related but I an not
100% sure its related to current issue:
MainProcess|jsonrpc/2::DEBUG::2018-09-11
00:33:20,012::cmdutils::159::root::(exec_cmd) SUCCESS: <err> = '';
<rc> = 0 ifup/on039d2d88d8ee4::DEBUG::2018-09-11
00:33:20,238::cmdutils::159::root::(exec_cmd) FAILED: <err> = 'Running
scope as unit
2f1f417d-85b9-46c0-b7d7-6082836daaae.scope.\n/etc/sysconfig/network-scripts/ifup-eth:
line 321: 20715 Terminated
/sbin/dhclient ${DHCLIENTARGS} ${DEVICE}\nDevice
"on039d2d88d8ee4" does not exist.\nCannot find device
"on039d2d88d8ee4"\nDevice "on039d2d88d8ee4" does not exist.\nDevice
"on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d
8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not
exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice
"on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not
exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevic
e "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does
not
exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice
"on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not
exist.\nDevice "on039d2d88d8ee4" does not
exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice
"on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not
exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice
"on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d
8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not
exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice
"on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not
exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevic
e "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does
not
exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice
"on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not
exist.\nDevice "on039d2d88d8ee4" does not
exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice
"on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not
exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice
"on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d
8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not
exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice
"on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not
exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevic
e "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does
not
exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice
"on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not
exist.\nDevice "on039d2d88d8ee4" does not
exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice
"on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not
exist.\n'; <rc> = 1 ifup/on039d2d88d8ee4::ERROR::2018-09-11
00:33:20,238::concurrent::202::root::(run) FINISH thread
<Thread(ifup/on039d2d88d8ee4, stopped daemon 140281550731008)> failed
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/vdsm/common/concurrent.py",
line 195, in run
ret = func(*args, **kwargs)
File
"/usr/lib/python2.7/site-packages/vdsm/network/configurators/ifcfg.py",
line 948, in _exec_ifup
_exec_ifup_by_name(iface.name, cgroup)
File
"/usr/lib/python2.7/site-packages/vdsm/network/configurators/ifcfg.py",
line 934, in _exec_ifup_by_name
raise ConfigNetworkError(ERR_FAILED_IFUP, out[-1] if out else '')
ConfigNetworkError: (29, '\n')
thanks,
Dafna