
Hi, I have been seeing random failures of tests in different projects caused by vdsm monitoring. I need someone from vdsm to please help debug this issue. From what I can see, the test suspend/resume vm failed because we could not query the status of the vm on the host. you can see full log from failed tests here: https://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/10208/artifac... Here are the errors that I can see in the vdsm which seem to suggest that there is an issue getting stats on the vm which was suspended and caused the failure of the test: 2018-09-11 00:40:05,896-0400 INFO (monitor/c1fe6e6) [storage.StorageDomain] Removing remnants of deleted images [] (fileSD:734) 2018-09-11 00:40:07,957-0400 DEBUG (qgapoller/1) [vds] Not sending QEMU-GA command 'guest-get-users' to vm_id='8214433a-f233-4aaa-aeda-2ce1d31c78dc', command is not supported (qemuguestagent:192) 2018-09-11 00:40:08,068-0400 DEBUG (periodic/3) [virt.sampling.VMBulkstatsMonitor] sampled timestamp 4296118.49 elapsed 0.010 acquired True domains all (sampling:443) 2018-09-11 00:40:08,271-0400 DEBUG (jsonrpc/1) [jsonrpc.JsonRpcServer] Calling 'Image.prepare' in bridge with {u'allowIllegal': True, u'storagepoolID': u'e80a56d9-74da-498a-b010-4a9df287f11d', u'imageID': u'd4c831e6-02d2-4d89-b516-0ec4597 5e024', u'volumeID': u'15b07af1-625b-42e3-b62a-8e7c7a120a56', u'storagedomainID': u'f1744940-41b6-4d35-b7bf-870c4e07d995'} (__init__:329) 2018-09-11 00:40:10,846-0400 DEBUG (vmchannels) [virt.vm] (vmId='8214433a-f233-4aaa-aeda-2ce1d31c78dc') Guest connection timed out (guestagent:556) 2018-09-11 00:40:11,637-0400 DEBUG (jsonrpc/5) [jsonrpc.JsonRpcServer] Calling 'Host.getStats' in bridge with {} (__init__:329) 2018-09-11 00:40:11,637-0400 INFO (jsonrpc/5) [api.host] START getStats() from=::ffff:192.168.201.4,49184 (api:47) 2018-09-11 00:40:11,643-0400 DEBUG (jsonrpc/5) [root] cannot read eth0 speed (nic:42) 2018-09-11 00:40:11,645-0400 DEBUG (jsonrpc/5) [root] cannot read eth1 speed (nic:42) 2018-09-11 00:40:11,647-0400 DEBUG (jsonrpc/5) [root] cannot read eth2 speed (nic:42) 2018-09-11 00:40:11,649-0400 DEBUG (jsonrpc/5) [root] cannot read eth3 speed (nic:42) 2018-09-11 00:40:11,667-0400 INFO (jsonrpc/5) [api.host] FINISH getStats error=[Errno 22] Invalid argument from=::ffff:192.168.201.4,49184 (api:51) 2018-09-11 00:40:11,667-0400 ERROR (jsonrpc/5) [jsonrpc.JsonRpcServer] Internal server error (__init__:350) Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 345, in _handle_request res = method(**params) File "/usr/lib/python2.7/site-packages/vdsm/rpc/Bridge.py", line 202, in _dynamicMethod result = fn(*methodArgs) File "<string>", line 2, in getStats File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 49, in method ret = func(*args, **kwargs) File "/usr/lib/python2.7/site-packages/vdsm/API.py", line 1407, in getStats multipath=True)} File "/usr/lib/python2.7/site-packages/vdsm/host/api.py", line 49, in get_stats decStats = stats.produce(first_sample, last_sample) File "/usr/lib/python2.7/site-packages/vdsm/host/stats.py", line 71, in produce stats.update(get_interfaces_stats()) File "/usr/lib/python2.7/site-packages/vdsm/host/stats.py", line 153, in get_interfaces_stats return net_api.network_stats() File "/usr/lib/python2.7/site-packages/vdsm/network/api.py", line 63, in network_stats return netstats.report() File "/usr/lib/python2.7/site-packages/vdsm/network/netstats.py", line 31, in report stats = link_stats.report() File "/usr/lib/python2.7/site-packages/vdsm/network/link/stats.py", line 41, in report speed = vlan.speed(i.device) File "/usr/lib/python2.7/site-packages/vdsm/network/link/vlan.py", line 36, in speed dev_speed = nic.read_speed_using_sysfs(dev_name) File "/usr/lib/python2.7/site-packages/vdsm/network/link/nic.py", line 48, in read_speed_using_sysfs s = int(f.read()) IOError: [Errno 22] Invalid argument 2018-09-11 00:40:11,669-0400 INFO (jsonrpc/5) [jsonrpc.JsonRpcServer] RPC call Host.getStats failed (error -32603) in 0.03 seconds (__init__:312) 2018-09-11 00:40:14,656-0400 DEBUG (jsonrpc/2) [jsonrpc.JsonRpcServer] Calling 'Host.getAllVmStats' in bridge with {} (__init__:329) 2018-09-11 00:40:14,657-0400 INFO (jsonrpc/2) [api.host] START getAllVmStats() from=::1,46772 (api:47) 2018-09-11 00:40:14,659-0400 INFO (jsonrpc/2) [api.host] FINISH getAllVmStats return={'status': {'message': 'Done', 'code': 0}, 'statsList': (suppressed)} from=::1,46772 (api:53) 2018-09-11 00:40:14,660-0400 DEBUG (jsonrpc/2) [jsonrpc.JsonRpcServer] Return 'Host.getAllVmStats' in bridge with (suppressed) (__init__:356) 2018-09-11 00:40:14,661-0400 INFO (jsonrpc/2) [jsonrpc.JsonRpcServer] RPC call Host.getAllVmStats succeeded in 0.00 seconds (__init__:312) 2018-09-11 00:40:14,674-0400 DEBUG (jsonrpc/1) [jsonrpc.JsonRpcServer] Calling 'Host.getAllVmIoTunePolicies' in bridge with {} (__init__:329) 2018-09-11 00:40:14,675-0400 INFO (jsonrpc/1) [api.host] START getAllVmIoTunePolicies() from=::1,46772 (api:47) 2018-09-11 00:40:14,675-0400 INFO (jsonrpc/1) [api.host] FINISH getAllVmIoTunePolicies return={'status': {'message': 'Done', 'code': 0}, 'io_tune_policies_dict': {'8214433a-f233-4aaa-aeda-2ce1d31c78dc': {'policy': [], 'current_values': [{'ioTune': {'write_bytes_sec': 0L, 'total_iops_sec': 0L, 'read_iops_sec': 0L, 'read_bytes_sec': 0L, 'write_iops_sec': 0L, 'total_bytes_sec': 0L}, 'path': '/rhev/data-center/mnt/blockSD/f1744940-41b6-4d35-b7bf-870c4e07d995/images/9a3659dc-339e-4b3f-a1d1-6ee3debb6168/e2570635-a798-4e9e-a0ea-88436d5b296f', 'name': 'vda'}]}}} from=::1,46772 (api:53) 2018-09-11 00:40:14,675-0400 DEBUG (jsonrpc/1) [jsonrpc.JsonRpcServer] Return 'Host.getAllVmIoTunePolicies' in bridge with {'8214433a-f233-4aaa-aeda-2ce1d31c78dc': {'policy': [], 'current_values': [{'ioTune': {'write_bytes_sec': 0L, 'total_iops_sec': 0L, 'read_iops_sec': 0L, 'read_bytes_sec': 0L, 'write_iops_sec': 0L, 'total_bytes_sec': 0L}, 'path': '/rhev/data-center/mnt/blockSD/f1744940-41b6-4d35-b7bf-870c4e07d995/images/9a3659dc-339e-4b3f-a1d1-6ee3debb6168/e2570635-a798-4e9e-a0ea-88436d5b296f', 'name': 'vda'}]}} (__init__:356) 2018-09-11 00:40:14,675-0400 INFO (jsonrpc/1) [jsonrpc.JsonRpcServer] RPC call Host.getAllVmIoTunePolicies succeeded in 0.00 seconds (__init__:312) 2018-09-11 00:40:16,118-0400 DEBUG (jsonrpc/7) [jsonrpc.JsonRpcServer] Calling 'Host.getAllVmStats' in bridge with {} (__init__:329) 2018-09-11 00:40:16,119-0400 INFO (jsonrpc/7) [api.host] START getAllVmStats() from=::ffff:192.168.201.4,49184 (api:47) 2018-09-11 00:40:16,121-0400 INFO (jsonrpc/7) [api.host] FINISH getAllVmStats return={'status': {'message': 'Done', 'code': 0}, 'statsList': (suppressed)} from=::ffff:192.168.201.4,49184 (api:53) 2018-09-11 00:40:16,122-0400 DEBUG (jsonrpc/7) [jsonrpc.JsonRpcServer] Return 'Host.getAllVmStats' in bridge with (suppressed) (__init__:356) 2018-09-11 00:40:16,122-0400 INFO (jsonrpc/7) [jsonrpc.JsonRpcServer] RPC call Host.getAllVmStats succeeded in 0.01 seconds (__init__:312) 2018-09-11 00:40:17,956-0400 DEBUG (qgapoller/2) [vds] Not sending QEMU-GA command 'guest-get-users' to vm_id='8214433a-f233-4aaa-aeda-2ce1d31c78dc', command is not supported (qemuguestagent:192) 2018-09-11 00:40:18,025-0400 DEBUG (periodic/3) [virt.periodic] Looking for stale paused VMs (periodic:388) I can also see this in Super vdsm which may be related but I an not 100% sure its related to current issue: MainProcess|jsonrpc/2::DEBUG::2018-09-11 00:33:20,012::cmdutils::159::root::(exec_cmd) SUCCESS: <err> = ''; <rc> = 0 ifup/on039d2d88d8ee4::DEBUG::2018-09-11 00:33:20,238::cmdutils::159::root::(exec_cmd) FAILED: <err> = 'Running scope as unit 2f1f417d-85b9-46c0-b7d7-6082836daaae.scope.\n/etc/sysconfig/network-scripts/ifup-eth: line 321: 20715 Terminated /sbin/dhclient ${DHCLIENTARGS} ${DEVICE}\nDevice "on039d2d88d8ee4" does not exist.\nCannot find device "on039d2d88d8ee4"\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d 8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevic e "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d 8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevic e "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d 8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevic e "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\n'; <rc> = 1 ifup/on039d2d88d8ee4::ERROR::2018-09-11 00:33:20,238::concurrent::202::root::(run) FINISH thread <Thread(ifup/on039d2d88d8ee4, stopped daemon 140281550731008)> failed Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/vdsm/common/concurrent.py", line 195, in run ret = func(*args, **kwargs) File "/usr/lib/python2.7/site-packages/vdsm/network/configurators/ifcfg.py", line 948, in _exec_ifup _exec_ifup_by_name(iface.name, cgroup) File "/usr/lib/python2.7/site-packages/vdsm/network/configurators/ifcfg.py", line 934, in _exec_ifup_by_name raise ConfigNetworkError(ERR_FAILED_IFUP, out[-1] if out else '') ConfigNetworkError: (29, '\n') thanks, Dafna

On Tue, 11 Sep 2018 10:26:02 +0100 Dafna Ron <dron@redhat.com> wrote:
Hi,
I have been seeing random failures of tests in different projects caused by vdsm monitoring.
I need someone from vdsm to please help debug this issue.
Petr, is this the same problem like yesterday in [ovirt-devel] failure in ost test - Invalid argument - help in debugging issue
From what I can see, the test suspend/resume vm failed because we could not query the status of the vm on the host.
you can see full log from failed tests here:
https://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/10208/artifac...
Here are the errors that I can see in the vdsm which seem to suggest that there is an issue getting stats on the vm which was suspended and caused the failure of the test:
2018-09-11 00:40:05,896-0400 INFO (monitor/c1fe6e6) [storage.StorageDomain] Removing remnants of deleted images [] (fileSD:734) 2018-09-11 00:40:07,957-0400 DEBUG (qgapoller/1) [vds] Not sending QEMU-GA command 'guest-get-users' to vm_id='8214433a-f233-4aaa-aeda-2ce1d31c78dc', command is not supported (qemuguestagent:192) 2018-09-11 00:40:08,068-0400 DEBUG (periodic/3) [virt.sampling.VMBulkstatsMonitor] sampled timestamp 4296118.49 elapsed 0.010 acquired True domains all (sampling:443) 2018-09-11 00:40:08,271-0400 DEBUG (jsonrpc/1) [jsonrpc.JsonRpcServer] Calling 'Image.prepare' in bridge with {u'allowIllegal': True, u'storagepoolID': u'e80a56d9-74da-498a-b010-4a9df287f11d', u'imageID': u'd4c831e6-02d2-4d89-b516-0ec4597 5e024', u'volumeID': u'15b07af1-625b-42e3-b62a-8e7c7a120a56', u'storagedomainID': u'f1744940-41b6-4d35-b7bf-870c4e07d995'} (__init__:329)
2018-09-11 00:40:10,846-0400 DEBUG (vmchannels) [virt.vm] (vmId='8214433a-f233-4aaa-aeda-2ce1d31c78dc') Guest connection timed out (guestagent:556) 2018-09-11 00:40:11,637-0400 DEBUG (jsonrpc/5) [jsonrpc.JsonRpcServer] Calling 'Host.getStats' in bridge with {} (__init__:329) 2018-09-11 00:40:11,637-0400 INFO (jsonrpc/5) [api.host] START getStats() from=::ffff:192.168.201.4,49184 (api:47) 2018-09-11 00:40:11,643-0400 DEBUG (jsonrpc/5) [root] cannot read eth0 speed (nic:42) 2018-09-11 00:40:11,645-0400 DEBUG (jsonrpc/5) [root] cannot read eth1 speed (nic:42) 2018-09-11 00:40:11,647-0400 DEBUG (jsonrpc/5) [root] cannot read eth2 speed (nic:42) 2018-09-11 00:40:11,649-0400 DEBUG (jsonrpc/5) [root] cannot read eth3 speed (nic:42) 2018-09-11 00:40:11,667-0400 INFO (jsonrpc/5) [api.host] FINISH getStats error=[Errno 22] Invalid argument from=::ffff:192.168.201.4,49184 (api:51) 2018-09-11 00:40:11,667-0400 ERROR (jsonrpc/5) [jsonrpc.JsonRpcServer] Internal server error (__init__:350) Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 345, in _handle_request res = method(**params) File "/usr/lib/python2.7/site-packages/vdsm/rpc/Bridge.py", line 202, in _dynamicMethod result = fn(*methodArgs) File "<string>", line 2, in getStats File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 49, in method ret = func(*args, **kwargs) File "/usr/lib/python2.7/site-packages/vdsm/API.py", line 1407, in getStats multipath=True)} File "/usr/lib/python2.7/site-packages/vdsm/host/api.py", line 49, in get_stats decStats = stats.produce(first_sample, last_sample) File "/usr/lib/python2.7/site-packages/vdsm/host/stats.py", line 71, in produce stats.update(get_interfaces_stats()) File "/usr/lib/python2.7/site-packages/vdsm/host/stats.py", line 153, in get_interfaces_stats return net_api.network_stats() File "/usr/lib/python2.7/site-packages/vdsm/network/api.py", line 63, in network_stats return netstats.report() File "/usr/lib/python2.7/site-packages/vdsm/network/netstats.py", line 31, in report stats = link_stats.report() File "/usr/lib/python2.7/site-packages/vdsm/network/link/stats.py", line 41, in report speed = vlan.speed(i.device) File "/usr/lib/python2.7/site-packages/vdsm/network/link/vlan.py", line 36, in speed dev_speed = nic.read_speed_using_sysfs(dev_name) File "/usr/lib/python2.7/site-packages/vdsm/network/link/nic.py", line 48, in read_speed_using_sysfs s = int(f.read()) IOError: [Errno 22] Invalid argument 2018-09-11 00:40:11,669-0400 INFO (jsonrpc/5) [jsonrpc.JsonRpcServer] RPC call Host.getStats failed (error -32603) in 0.03 seconds (__init__:312) 2018-09-11 00:40:14,656-0400 DEBUG (jsonrpc/2) [jsonrpc.JsonRpcServer] Calling 'Host.getAllVmStats' in bridge with {} (__init__:329) 2018-09-11 00:40:14,657-0400 INFO (jsonrpc/2) [api.host] START getAllVmStats() from=::1,46772 (api:47) 2018-09-11 00:40:14,659-0400 INFO (jsonrpc/2) [api.host] FINISH getAllVmStats return={'status': {'message': 'Done', 'code': 0}, 'statsList': (suppressed)} from=::1,46772 (api:53) 2018-09-11 00:40:14,660-0400 DEBUG (jsonrpc/2) [jsonrpc.JsonRpcServer] Return 'Host.getAllVmStats' in bridge with (suppressed) (__init__:356) 2018-09-11 00:40:14,661-0400 INFO (jsonrpc/2) [jsonrpc.JsonRpcServer] RPC call Host.getAllVmStats succeeded in 0.00 seconds (__init__:312) 2018-09-11 00:40:14,674-0400 DEBUG (jsonrpc/1) [jsonrpc.JsonRpcServer] Calling 'Host.getAllVmIoTunePolicies' in bridge with {} (__init__:329) 2018-09-11 00:40:14,675-0400 INFO (jsonrpc/1) [api.host] START getAllVmIoTunePolicies() from=::1,46772 (api:47) 2018-09-11 00:40:14,675-0400 INFO (jsonrpc/1) [api.host] FINISH getAllVmIoTunePolicies return={'status': {'message': 'Done', 'code': 0}, 'io_tune_policies_dict': {'8214433a-f233-4aaa-aeda-2ce1d31c78dc': {'policy': [], 'current_values': [{'ioTune': {'write_bytes_sec': 0L, 'total_iops_sec': 0L, 'read_iops_sec': 0L, 'read_bytes_sec': 0L, 'write_iops_sec': 0L, 'total_bytes_sec': 0L}, 'path': '/rhev/data-center/mnt/blockSD/f1744940-41b6-4d35-b7bf-870c4e07d995/images/9a3659dc-339e-4b3f-a1d1-6ee3debb6168/e2570635-a798-4e9e-a0ea-88436d5b296f', 'name': 'vda'}]}}} from=::1,46772 (api:53) 2018-09-11 00:40:14,675-0400 DEBUG (jsonrpc/1) [jsonrpc.JsonRpcServer] Return 'Host.getAllVmIoTunePolicies' in bridge with {'8214433a-f233-4aaa-aeda-2ce1d31c78dc': {'policy': [], 'current_values': [{'ioTune': {'write_bytes_sec': 0L, 'total_iops_sec': 0L, 'read_iops_sec': 0L, 'read_bytes_sec': 0L, 'write_iops_sec': 0L, 'total_bytes_sec': 0L}, 'path': '/rhev/data-center/mnt/blockSD/f1744940-41b6-4d35-b7bf-870c4e07d995/images/9a3659dc-339e-4b3f-a1d1-6ee3debb6168/e2570635-a798-4e9e-a0ea-88436d5b296f', 'name': 'vda'}]}} (__init__:356) 2018-09-11 00:40:14,675-0400 INFO (jsonrpc/1) [jsonrpc.JsonRpcServer] RPC call Host.getAllVmIoTunePolicies succeeded in 0.00 seconds (__init__:312) 2018-09-11 00:40:16,118-0400 DEBUG (jsonrpc/7) [jsonrpc.JsonRpcServer] Calling 'Host.getAllVmStats' in bridge with {} (__init__:329) 2018-09-11 00:40:16,119-0400 INFO (jsonrpc/7) [api.host] START getAllVmStats() from=::ffff:192.168.201.4,49184 (api:47) 2018-09-11 00:40:16,121-0400 INFO (jsonrpc/7) [api.host] FINISH getAllVmStats return={'status': {'message': 'Done', 'code': 0}, 'statsList': (suppressed)} from=::ffff:192.168.201.4,49184 (api:53) 2018-09-11 00:40:16,122-0400 DEBUG (jsonrpc/7) [jsonrpc.JsonRpcServer] Return 'Host.getAllVmStats' in bridge with (suppressed) (__init__:356) 2018-09-11 00:40:16,122-0400 INFO (jsonrpc/7) [jsonrpc.JsonRpcServer] RPC call Host.getAllVmStats succeeded in 0.01 seconds (__init__:312) 2018-09-11 00:40:17,956-0400 DEBUG (qgapoller/2) [vds] Not sending QEMU-GA command 'guest-get-users' to vm_id='8214433a-f233-4aaa-aeda-2ce1d31c78dc', command is not supported (qemuguestagent:192) 2018-09-11 00:40:18,025-0400 DEBUG (periodic/3) [virt.periodic] Looking for stale paused VMs (periodic:388)
I can also see this in Super vdsm which may be related but I an not 100% sure its related to current issue:
MainProcess|jsonrpc/2::DEBUG::2018-09-11 00:33:20,012::cmdutils::159::root::(exec_cmd) SUCCESS: <err> = ''; <rc> = 0 ifup/on039d2d88d8ee4::DEBUG::2018-09-11 00:33:20,238::cmdutils::159::root::(exec_cmd) FAILED: <err> = 'Running scope as unit 2f1f417d-85b9-46c0-b7d7-6082836daaae.scope.\n/etc/sysconfig/network-scripts/ifup-eth: line 321: 20715 Terminated /sbin/dhclient ${DHCLIENTARGS} ${DEVICE}\nDevice "on039d2d88d8ee4" does not exist.\nCannot find device "on039d2d88d8ee4"\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d 8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevic e "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d 8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevic e "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d 8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevic e "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\n'; <rc> = 1 ifup/on039d2d88d8ee4::ERROR::2018-09-11 00:33:20,238::concurrent::202::root::(run) FINISH thread <Thread(ifup/on039d2d88d8ee4, stopped daemon 140281550731008)> failed Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/vdsm/common/concurrent.py", line 195, in run ret = func(*args, **kwargs) File "/usr/lib/python2.7/site-packages/vdsm/network/configurators/ifcfg.py", line 948, in _exec_ifup _exec_ifup_by_name(iface.name, cgroup) File "/usr/lib/python2.7/site-packages/vdsm/network/configurators/ifcfg.py", line 934, in _exec_ifup_by_name raise ConfigNetworkError(ERR_FAILED_IFUP, out[-1] if out else '') ConfigNetworkError: (29, '\n')
thanks, Dafna

vdsm.log is the same issue, supervdsm.log seems unrelated. 2018-09-11 11:59 GMT+02:00 Dominik Holler <dholler@redhat.com>:
On Tue, 11 Sep 2018 10:26:02 +0100 Dafna Ron <dron@redhat.com> wrote:
Hi,
I have been seeing random failures of tests in different projects caused by vdsm monitoring.
I need someone from vdsm to please help debug this issue.
Petr, is this the same problem like yesterday in
[ovirt-devel] failure in ost test - Invalid argument - help in debugging issue
From what I can see, the test suspend/resume vm failed because we could not query the status of the vm on the host.
you can see full log from failed tests here:
https://jenkins.ovirt.org/job/ovirt-master_change-queue- tester/10208/artifact/basic-suite.el7.x86_64/test_logs/ basic-suite-master/post-004_basic_sanity.py/
Here are the errors that I can see in the vdsm which seem to suggest that there is an issue getting stats on the vm which was suspended and caused the failure of the test:
2018-09-11 00:40:05,896-0400 INFO (monitor/c1fe6e6) [storage.StorageDomain] Removing remnants of deleted images [] (fileSD:734) 2018-09-11 00:40:07,957-0400 DEBUG (qgapoller/1) [vds] Not sending QEMU-GA command 'guest-get-users' to vm_id='8214433a-f233-4aaa-aeda-2ce1d31c78dc', command is not supported (qemuguestagent:192) 2018-09-11 00:40:08,068-0400 DEBUG (periodic/3) [virt.sampling.VMBulkstatsMonitor] sampled timestamp 4296118.49 elapsed 0.010 acquired True domains all (sampling:443) 2018-09-11 00:40:08,271-0400 DEBUG (jsonrpc/1) [jsonrpc.JsonRpcServer] Calling 'Image.prepare' in bridge with {u'allowIllegal': True, u'storagepoolID': u'e80a56d9-74da-498a-b010-4a9df287f11d', u'imageID': u'd4c831e6-02d2-4d89-b516-0ec4597 5e024', u'volumeID': u'15b07af1-625b-42e3-b62a-8e7c7a120a56', u'storagedomainID': u'f1744940-41b6-4d35-b7bf-870c4e07d995'} (__init__:329)
2018-09-11 00:40:10,846-0400 DEBUG (vmchannels) [virt.vm] (vmId='8214433a-f233-4aaa-aeda-2ce1d31c78dc') Guest connection timed out (guestagent:556) 2018-09-11 00:40:11,637-0400 DEBUG (jsonrpc/5) [jsonrpc.JsonRpcServer] Calling 'Host.getStats' in bridge with {} (__init__:329) 2018-09-11 00:40:11,637-0400 INFO (jsonrpc/5) [api.host] START getStats() from=::ffff:192.168.201.4,49184 (api:47) 2018-09-11 00:40:11,643-0400 DEBUG (jsonrpc/5) [root] cannot read eth0 speed (nic:42) 2018-09-11 00:40:11,645-0400 DEBUG (jsonrpc/5) [root] cannot read eth1 speed (nic:42) 2018-09-11 00:40:11,647-0400 DEBUG (jsonrpc/5) [root] cannot read eth2 speed (nic:42) 2018-09-11 00:40:11,649-0400 DEBUG (jsonrpc/5) [root] cannot read eth3 speed (nic:42) 2018-09-11 00:40:11,667-0400 INFO (jsonrpc/5) [api.host] FINISH getStats error=[Errno 22] Invalid argument from=::ffff:192.168.201.4,49184 (api:51) 2018-09-11 00:40:11,667-0400 ERROR (jsonrpc/5) [jsonrpc.JsonRpcServer] Internal server error (__init__:350) Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 345, in _handle_request res = method(**params) File "/usr/lib/python2.7/site-packages/vdsm/rpc/Bridge.py", line 202, in _dynamicMethod result = fn(*methodArgs) File "<string>", line 2, in getStats File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 49, in method ret = func(*args, **kwargs) File "/usr/lib/python2.7/site-packages/vdsm/API.py", line 1407, in getStats multipath=True)} File "/usr/lib/python2.7/site-packages/vdsm/host/api.py", line 49, in get_stats decStats = stats.produce(first_sample, last_sample) File "/usr/lib/python2.7/site-packages/vdsm/host/stats.py", line 71, in produce stats.update(get_interfaces_stats()) File "/usr/lib/python2.7/site-packages/vdsm/host/stats.py", line 153, in get_interfaces_stats return net_api.network_stats() File "/usr/lib/python2.7/site-packages/vdsm/network/api.py", line 63, in network_stats return netstats.report() File "/usr/lib/python2.7/site-packages/vdsm/network/netstats.py", line 31, in report stats = link_stats.report() File "/usr/lib/python2.7/site-packages/vdsm/network/link/stats.py", line 41, in report speed = vlan.speed(i.device) File "/usr/lib/python2.7/site-packages/vdsm/network/link/vlan.py", line 36, in speed dev_speed = nic.read_speed_using_sysfs(dev_name) File "/usr/lib/python2.7/site-packages/vdsm/network/link/nic.py", line 48, in read_speed_using_sysfs s = int(f.read()) IOError: [Errno 22] Invalid argument 2018-09-11 00:40:11,669-0400 INFO (jsonrpc/5) [jsonrpc.JsonRpcServer] RPC call Host.getStats failed (error -32603) in 0.03 seconds (__init__:312) 2018-09-11 00:40:14,656-0400 DEBUG (jsonrpc/2) [jsonrpc.JsonRpcServer] Calling 'Host.getAllVmStats' in bridge with {} (__init__:329) 2018-09-11 00:40:14,657-0400 INFO (jsonrpc/2) [api.host] START getAllVmStats() from=::1,46772 (api:47) 2018-09-11 00:40:14,659-0400 INFO (jsonrpc/2) [api.host] FINISH getAllVmStats return={'status': {'message': 'Done', 'code': 0}, 'statsList': (suppressed)} from=::1,46772 (api:53) 2018-09-11 00:40:14,660-0400 DEBUG (jsonrpc/2) [jsonrpc.JsonRpcServer] Return 'Host.getAllVmStats' in bridge with (suppressed) (__init__:356) 2018-09-11 00:40:14,661-0400 INFO (jsonrpc/2) [jsonrpc.JsonRpcServer] RPC call Host.getAllVmStats succeeded in 0.00 seconds (__init__:312) 2018-09-11 00:40:14,674-0400 DEBUG (jsonrpc/1) [jsonrpc.JsonRpcServer] Calling 'Host.getAllVmIoTunePolicies' in bridge with {} (__init__:329) 2018-09-11 00:40:14,675-0400 INFO (jsonrpc/1) [api.host] START getAllVmIoTunePolicies() from=::1,46772 (api:47) 2018-09-11 00:40:14,675-0400 INFO (jsonrpc/1) [api.host] FINISH getAllVmIoTunePolicies return={'status': {'message': 'Done', 'code': 0}, 'io_tune_policies_dict': {'8214433a-f233-4aaa-aeda-2ce1d31c78dc': {'policy': [], 'current_values': [{'ioTune': {'write_bytes_sec': 0L, 'total_iops_sec': 0L, 'read_iops_sec': 0L, 'read_bytes_sec': 0L, 'write_iops_sec': 0L, 'total_bytes_sec': 0L}, 'path': '/rhev/data-center/mnt/blockSD/f1744940-41b6-4d35- b7bf-870c4e07d995/images/9a3659dc-339e-4b3f-a1d1- 6ee3debb6168/e2570635-a798-4e9e-a0ea-88436d5b296f', 'name': 'vda'}]}}} from=::1,46772 (api:53) 2018-09-11 00:40:14,675-0400 DEBUG (jsonrpc/1) [jsonrpc.JsonRpcServer] Return 'Host.getAllVmIoTunePolicies' in bridge with {'8214433a-f233-4aaa-aeda-2ce1d31c78dc': {'policy': [], 'current_values': [{'ioTune': {'write_bytes_sec': 0L, 'total_iops_sec': 0L, 'read_iops_sec': 0L, 'read_bytes_sec': 0L, 'write_iops_sec': 0L, 'total_bytes_sec': 0L}, 'path': '/rhev/data-center/mnt/blockSD/f1744940-41b6-4d35- b7bf-870c4e07d995/images/9a3659dc-339e-4b3f-a1d1- 6ee3debb6168/e2570635-a798-4e9e-a0ea-88436d5b296f', 'name': 'vda'}]}} (__init__:356) 2018-09-11 00:40:14,675-0400 INFO (jsonrpc/1) [jsonrpc.JsonRpcServer] RPC call Host.getAllVmIoTunePolicies succeeded in 0.00 seconds (__init__:312) 2018-09-11 00:40:16,118-0400 DEBUG (jsonrpc/7) [jsonrpc.JsonRpcServer] Calling 'Host.getAllVmStats' in bridge with {} (__init__:329) 2018-09-11 00:40:16,119-0400 INFO (jsonrpc/7) [api.host] START getAllVmStats() from=::ffff:192.168.201.4,49184 (api:47) 2018-09-11 00:40:16,121-0400 INFO (jsonrpc/7) [api.host] FINISH getAllVmStats return={'status': {'message': 'Done', 'code': 0}, 'statsList': (suppressed)} from=::ffff:192.168.201.4,49184 (api:53) 2018-09-11 00:40:16,122-0400 DEBUG (jsonrpc/7) [jsonrpc.JsonRpcServer] Return 'Host.getAllVmStats' in bridge with (suppressed) (__init__:356) 2018-09-11 00:40:16,122-0400 INFO (jsonrpc/7) [jsonrpc.JsonRpcServer] RPC call Host.getAllVmStats succeeded in 0.01 seconds (__init__:312) 2018-09-11 00:40:17,956-0400 DEBUG (qgapoller/2) [vds] Not sending QEMU-GA command 'guest-get-users' to vm_id='8214433a-f233-4aaa-aeda-2ce1d31c78dc', command is not supported (qemuguestagent:192) 2018-09-11 00:40:18,025-0400 DEBUG (periodic/3) [virt.periodic] Looking for stale paused VMs (periodic:388)
I can also see this in Super vdsm which may be related but I an not 100% sure its related to current issue:
MainProcess|jsonrpc/2::DEBUG::2018-09-11 00:33:20,012::cmdutils::159::root::(exec_cmd) SUCCESS: <err> = ''; <rc> = 0 ifup/on039d2d88d8ee4::DEBUG::2018-09-11 00:33:20,238::cmdutils::159::root::(exec_cmd) FAILED: <err> = 'Running scope as unit 2f1f417d-85b9-46c0-b7d7-6082836daaae.scope.\n/etc/ sysconfig/network-scripts/ifup-eth: line 321: 20715 Terminated /sbin/dhclient ${DHCLIENTARGS} ${DEVICE}\nDevice "on039d2d88d8ee4" does not exist.\nCannot find device "on039d2d88d8ee4"\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d 8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevic e "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d 8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevic e "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d 8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevic e "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\n'; <rc> = 1 ifup/on039d2d88d8ee4::ERROR::2018-09-11 00:33:20,238::concurrent::202::root::(run) FINISH thread <Thread(ifup/on039d2d88d8ee4, stopped daemon 140281550731008)> failed Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/vdsm/common/concurrent.py", line 195, in run ret = func(*args, **kwargs) File "/usr/lib/python2.7/site-packages/vdsm/network/configurators/ifcfg.py", line 948, in _exec_ifup _exec_ifup_by_name(iface.name, cgroup) File "/usr/lib/python2.7/site-packages/vdsm/network/configurators/ifcfg.py", line 934, in _exec_ifup_by_name raise ConfigNetworkError(ERR_FAILED_IFUP, out[-1] if out else '') ConfigNetworkError: (29, '\n')
thanks, Dafna

Can someone take ownership to fix it? On Tue, Sep 11, 2018 at 12:04 PM, Petr Horacek <phoracek@redhat.com> wrote:
vdsm.log is the same issue, supervdsm.log seems unrelated.
2018-09-11 11:59 GMT+02:00 Dominik Holler <dholler@redhat.com>:
On Tue, 11 Sep 2018 10:26:02 +0100 Dafna Ron <dron@redhat.com> wrote:
Hi,
I have been seeing random failures of tests in different projects caused by vdsm monitoring.
I need someone from vdsm to please help debug this issue.
Petr, is this the same problem like yesterday in
[ovirt-devel] failure in ost test - Invalid argument - help in debugging issue
From what I can see, the test suspend/resume vm failed because we could not query the status of the vm on the host.
you can see full log from failed tests here:
https://jenkins.ovirt.org/job/ovirt-master_change-queue-test er/10208/artifact/basic-suite.el7.x86_64/test_logs/basic- suite-master/post-004_basic_sanity.py/
Here are the errors that I can see in the vdsm which seem to suggest that there is an issue getting stats on the vm which was suspended and caused the failure of the test:
2018-09-11 00:40:05,896-0400 INFO (monitor/c1fe6e6) [storage.StorageDomain] Removing remnants of deleted images [] (fileSD:734) 2018-09-11 00:40:07,957-0400 DEBUG (qgapoller/1) [vds] Not sending QEMU-GA command 'guest-get-users' to vm_id='8214433a-f233-4aaa-aeda-2ce1d31c78dc', command is not supported (qemuguestagent:192) 2018-09-11 00:40:08,068-0400 DEBUG (periodic/3) [virt.sampling.VMBulkstatsMonitor] sampled timestamp 4296118.49 elapsed 0.010 acquired True domains all (sampling:443) 2018-09-11 00:40:08,271-0400 DEBUG (jsonrpc/1) [jsonrpc.JsonRpcServer] Calling 'Image.prepare' in bridge with {u'allowIllegal': True, u'storagepoolID': u'e80a56d9-74da-498a-b010-4a9df287f11d', u'imageID': u'd4c831e6-02d2-4d89-b516-0ec4597 5e024', u'volumeID': u'15b07af1-625b-42e3-b62a-8e7c7a120a56', u'storagedomainID': u'f1744940-41b6-4d35-b7bf-870c4e07d995'} (__init__:329)
2018-09-11 00:40:10,846-0400 DEBUG (vmchannels) [virt.vm] (vmId='8214433a-f233-4aaa-aeda-2ce1d31c78dc') Guest connection timed out (guestagent:556) 2018-09-11 00:40:11,637-0400 DEBUG (jsonrpc/5) [jsonrpc.JsonRpcServer] Calling 'Host.getStats' in bridge with {} (__init__:329) 2018-09-11 00:40:11,637-0400 INFO (jsonrpc/5) [api.host] START getStats() from=::ffff:192.168.201.4,49184 (api:47) 2018-09-11 00:40:11,643-0400 DEBUG (jsonrpc/5) [root] cannot read eth0 speed (nic:42) 2018-09-11 00:40:11,645-0400 DEBUG (jsonrpc/5) [root] cannot read eth1 speed (nic:42) 2018-09-11 00:40:11,647-0400 DEBUG (jsonrpc/5) [root] cannot read eth2 speed (nic:42) 2018-09-11 00:40:11,649-0400 DEBUG (jsonrpc/5) [root] cannot read eth3 speed (nic:42) 2018-09-11 00:40:11,667-0400 INFO (jsonrpc/5) [api.host] FINISH getStats error=[Errno 22] Invalid argument from=::ffff:192.168.201.4,49184 (api:51) 2018-09-11 00:40:11,667-0400 ERROR (jsonrpc/5) [jsonrpc.JsonRpcServer] Internal server error (__init__:350) Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 345, in _handle_request res = method(**params) File "/usr/lib/python2.7/site-packages/vdsm/rpc/Bridge.py", line 202, in _dynamicMethod result = fn(*methodArgs) File "<string>", line 2, in getStats File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 49, in method ret = func(*args, **kwargs) File "/usr/lib/python2.7/site-packages/vdsm/API.py", line 1407, in getStats multipath=True)} File "/usr/lib/python2.7/site-packages/vdsm/host/api.py", line 49, in get_stats decStats = stats.produce(first_sample, last_sample) File "/usr/lib/python2.7/site-packages/vdsm/host/stats.py", line 71, in produce stats.update(get_interfaces_stats()) File "/usr/lib/python2.7/site-packages/vdsm/host/stats.py", line 153, in get_interfaces_stats return net_api.network_stats() File "/usr/lib/python2.7/site-packages/vdsm/network/api.py", line 63, in network_stats return netstats.report() File "/usr/lib/python2.7/site-packages/vdsm/network/netstats.py", line 31, in report stats = link_stats.report() File "/usr/lib/python2.7/site-packages/vdsm/network/link/stats.py", line 41, in report speed = vlan.speed(i.device) File "/usr/lib/python2.7/site-packages/vdsm/network/link/vlan.py", line 36, in speed dev_speed = nic.read_speed_using_sysfs(dev_name) File "/usr/lib/python2.7/site-packages/vdsm/network/link/nic.py", line 48, in read_speed_using_sysfs s = int(f.read()) IOError: [Errno 22] Invalid argument 2018-09-11 00:40:11,669-0400 INFO (jsonrpc/5) [jsonrpc.JsonRpcServer] RPC call Host.getStats failed (error -32603) in 0.03 seconds (__init__:312) 2018-09-11 00:40:14,656-0400 DEBUG (jsonrpc/2) [jsonrpc.JsonRpcServer] Calling 'Host.getAllVmStats' in bridge with {} (__init__:329) 2018-09-11 00:40:14,657-0400 INFO (jsonrpc/2) [api.host] START getAllVmStats() from=::1,46772 (api:47) 2018-09-11 00:40:14,659-0400 INFO (jsonrpc/2) [api.host] FINISH getAllVmStats return={'status': {'message': 'Done', 'code': 0}, 'statsList': (suppressed)} from=::1,46772 (api:53) 2018-09-11 00:40:14,660-0400 DEBUG (jsonrpc/2) [jsonrpc.JsonRpcServer] Return 'Host.getAllVmStats' in bridge with (suppressed) (__init__:356) 2018-09-11 00:40:14,661-0400 INFO (jsonrpc/2) [jsonrpc.JsonRpcServer] RPC call Host.getAllVmStats succeeded in 0.00 seconds (__init__:312) 2018-09-11 00:40:14,674-0400 DEBUG (jsonrpc/1) [jsonrpc.JsonRpcServer] Calling 'Host.getAllVmIoTunePolicies' in bridge with {} (__init__:329) 2018-09-11 00:40:14,675-0400 INFO (jsonrpc/1) [api.host] START getAllVmIoTunePolicies() from=::1,46772 (api:47) 2018-09-11 00:40:14,675-0400 INFO (jsonrpc/1) [api.host] FINISH getAllVmIoTunePolicies return={'status': {'message': 'Done', 'code': 0}, 'io_tune_policies_dict': {'8214433a-f233-4aaa-aeda-2ce1d31c78dc': {'policy': [], 'current_values': [{'ioTune': {'write_bytes_sec': 0L, 'total_iops_sec': 0L, 'read_iops_sec': 0L, 'read_bytes_sec': 0L, 'write_iops_sec': 0L, 'total_bytes_sec': 0L}, 'path': '/rhev/data-center/mnt/blockSD/f1744940-41b6-4d35-b7bf- 870c4e07d995/images/9a3659dc-339e-4b3f-a1d1-6ee3debb6168/ e2570635-a798-4e9e-a0ea-88436d5b296f', 'name': 'vda'}]}}} from=::1,46772 (api:53) 2018-09-11 00:40:14,675-0400 DEBUG (jsonrpc/1) [jsonrpc.JsonRpcServer] Return 'Host.getAllVmIoTunePolicies' in bridge with {'8214433a-f233-4aaa-aeda-2ce1d31c78dc': {'policy': [], 'current_values': [{'ioTune': {'write_bytes_sec': 0L, 'total_iops_sec': 0L, 'read_iops_sec': 0L, 'read_bytes_sec': 0L, 'write_iops_sec': 0L, 'total_bytes_sec': 0L}, 'path': '/rhev/data-center/mnt/blockSD/f1744940-41b6-4d35-b7bf- 870c4e07d995/images/9a3659dc-339e-4b3f-a1d1-6ee3debb6168/ e2570635-a798-4e9e-a0ea-88436d5b296f', 'name': 'vda'}]}} (__init__:356) 2018-09-11 00:40:14,675-0400 INFO (jsonrpc/1) [jsonrpc.JsonRpcServer] RPC call Host.getAllVmIoTunePolicies succeeded in 0.00 seconds (__init__:312) 2018-09-11 00:40:16,118-0400 DEBUG (jsonrpc/7) [jsonrpc.JsonRpcServer] Calling 'Host.getAllVmStats' in bridge with {} (__init__:329) 2018-09-11 00:40:16,119-0400 INFO (jsonrpc/7) [api.host] START getAllVmStats() from=::ffff:192.168.201.4,49184 (api:47) 2018-09-11 00:40:16,121-0400 INFO (jsonrpc/7) [api.host] FINISH getAllVmStats return={'status': {'message': 'Done', 'code': 0}, 'statsList': (suppressed)} from=::ffff:192.168.201.4,49184 (api:53) 2018-09-11 00:40:16,122-0400 DEBUG (jsonrpc/7) [jsonrpc.JsonRpcServer] Return 'Host.getAllVmStats' in bridge with (suppressed) (__init__:356) 2018-09-11 00:40:16,122-0400 INFO (jsonrpc/7) [jsonrpc.JsonRpcServer] RPC call Host.getAllVmStats succeeded in 0.01 seconds (__init__:312) 2018-09-11 00:40:17,956-0400 DEBUG (qgapoller/2) [vds] Not sending QEMU-GA command 'guest-get-users' to vm_id='8214433a-f233-4aaa-aeda-2ce1d31c78dc', command is not supported (qemuguestagent:192) 2018-09-11 00:40:18,025-0400 DEBUG (periodic/3) [virt.periodic] Looking for stale paused VMs (periodic:388)
I can also see this in Super vdsm which may be related but I an not 100% sure its related to current issue:
MainProcess|jsonrpc/2::DEBUG::2018-09-11 00:33:20,012::cmdutils::159::root::(exec_cmd) SUCCESS: <err> = ''; <rc> = 0 ifup/on039d2d88d8ee4::DEBUG::2018-09-11 00:33:20,238::cmdutils::159::root::(exec_cmd) FAILED: <err> = 'Running scope as unit 2f1f417d-85b9-46c0-b7d7-6082836daaae.scope.\n/etc/sysconfig/ network-scripts/ifup-eth: line 321: 20715 Terminated /sbin/dhclient ${DHCLIENTARGS} ${DEVICE}\nDevice "on039d2d88d8ee4" does not exist.\nCannot find device "on039d2d88d8ee4"\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d 8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevic e "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d 8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevic e "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d 8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevic e "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\n'; <rc> = 1 ifup/on039d2d88d8ee4::ERROR::2018-09-11 00:33:20,238::concurrent::202::root::(run) FINISH thread <Thread(ifup/on039d2d88d8ee4, stopped daemon 140281550731008)> failed Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/vdsm/common/concurrent.py", line 195, in run ret = func(*args, **kwargs) File "/usr/lib/python2.7/site-packages/vdsm/network/configurators/ifcfg.py", line 948, in _exec_ifup _exec_ifup_by_name(iface.name, cgroup) File "/usr/lib/python2.7/site-packages/vdsm/network/configurators/ifcfg.py", line 934, in _exec_ifup_by_name raise ConfigNetworkError(ERR_FAILED_IFUP, out[-1] if out else '') ConfigNetworkError: (29, '\n')
thanks, Dafna

On Tue, 11 Sep 2018 12:22:21 +0100 Dafna Ron <dron@redhat.com> wrote:
Can someone take ownership to fix it?
I will take ownership and track the fix.
On Tue, Sep 11, 2018 at 12:04 PM, Petr Horacek <phoracek@redhat.com> wrote:
vdsm.log is the same issue, supervdsm.log seems unrelated.
2018-09-11 11:59 GMT+02:00 Dominik Holler <dholler@redhat.com>:
On Tue, 11 Sep 2018 10:26:02 +0100 Dafna Ron <dron@redhat.com> wrote:
Hi,
I have been seeing random failures of tests in different projects caused by vdsm monitoring.
I need someone from vdsm to please help debug this issue.
Petr, is this the same problem like yesterday in
[ovirt-devel] failure in ost test - Invalid argument - help in debugging issue
From what I can see, the test suspend/resume vm failed because we could not query the status of the vm on the host.
you can see full log from failed tests here:
https://jenkins.ovirt.org/job/ovirt-master_change-queue-test er/10208/artifact/basic-suite.el7.x86_64/test_logs/basic- suite-master/post-004_basic_sanity.py/
Here are the errors that I can see in the vdsm which seem to suggest that there is an issue getting stats on the vm which was suspended and caused the failure of the test:
2018-09-11 00:40:05,896-0400 INFO (monitor/c1fe6e6) [storage.StorageDomain] Removing remnants of deleted images [] (fileSD:734) 2018-09-11 00:40:07,957-0400 DEBUG (qgapoller/1) [vds] Not sending QEMU-GA command 'guest-get-users' to vm_id='8214433a-f233-4aaa-aeda-2ce1d31c78dc', command is not supported (qemuguestagent:192) 2018-09-11 00:40:08,068-0400 DEBUG (periodic/3) [virt.sampling.VMBulkstatsMonitor] sampled timestamp 4296118.49 elapsed 0.010 acquired True domains all (sampling:443) 2018-09-11 00:40:08,271-0400 DEBUG (jsonrpc/1) [jsonrpc.JsonRpcServer] Calling 'Image.prepare' in bridge with {u'allowIllegal': True, u'storagepoolID': u'e80a56d9-74da-498a-b010-4a9df287f11d', u'imageID': u'd4c831e6-02d2-4d89-b516-0ec4597 5e024', u'volumeID': u'15b07af1-625b-42e3-b62a-8e7c7a120a56', u'storagedomainID': u'f1744940-41b6-4d35-b7bf-870c4e07d995'} (__init__:329)
2018-09-11 00:40:10,846-0400 DEBUG (vmchannels) [virt.vm] (vmId='8214433a-f233-4aaa-aeda-2ce1d31c78dc') Guest connection timed out (guestagent:556) 2018-09-11 00:40:11,637-0400 DEBUG (jsonrpc/5) [jsonrpc.JsonRpcServer] Calling 'Host.getStats' in bridge with {} (__init__:329) 2018-09-11 00:40:11,637-0400 INFO (jsonrpc/5) [api.host] START getStats() from=::ffff:192.168.201.4,49184 (api:47) 2018-09-11 00:40:11,643-0400 DEBUG (jsonrpc/5) [root] cannot read eth0 speed (nic:42) 2018-09-11 00:40:11,645-0400 DEBUG (jsonrpc/5) [root] cannot read eth1 speed (nic:42) 2018-09-11 00:40:11,647-0400 DEBUG (jsonrpc/5) [root] cannot read eth2 speed (nic:42) 2018-09-11 00:40:11,649-0400 DEBUG (jsonrpc/5) [root] cannot read eth3 speed (nic:42) 2018-09-11 00:40:11,667-0400 INFO (jsonrpc/5) [api.host] FINISH getStats error=[Errno 22] Invalid argument from=::ffff:192.168.201.4,49184 (api:51) 2018-09-11 00:40:11,667-0400 ERROR (jsonrpc/5) [jsonrpc.JsonRpcServer] Internal server error (__init__:350) Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 345, in _handle_request res = method(**params) File "/usr/lib/python2.7/site-packages/vdsm/rpc/Bridge.py", line 202, in _dynamicMethod result = fn(*methodArgs) File "<string>", line 2, in getStats File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 49, in method ret = func(*args, **kwargs) File "/usr/lib/python2.7/site-packages/vdsm/API.py", line 1407, in getStats multipath=True)} File "/usr/lib/python2.7/site-packages/vdsm/host/api.py", line 49, in get_stats decStats = stats.produce(first_sample, last_sample) File "/usr/lib/python2.7/site-packages/vdsm/host/stats.py", line 71, in produce stats.update(get_interfaces_stats()) File "/usr/lib/python2.7/site-packages/vdsm/host/stats.py", line 153, in get_interfaces_stats return net_api.network_stats() File "/usr/lib/python2.7/site-packages/vdsm/network/api.py", line 63, in network_stats return netstats.report() File "/usr/lib/python2.7/site-packages/vdsm/network/netstats.py", line 31, in report stats = link_stats.report() File "/usr/lib/python2.7/site-packages/vdsm/network/link/stats.py", line 41, in report speed = vlan.speed(i.device) File "/usr/lib/python2.7/site-packages/vdsm/network/link/vlan.py", line 36, in speed dev_speed = nic.read_speed_using_sysfs(dev_name) File "/usr/lib/python2.7/site-packages/vdsm/network/link/nic.py", line 48, in read_speed_using_sysfs s = int(f.read()) IOError: [Errno 22] Invalid argument 2018-09-11 00:40:11,669-0400 INFO (jsonrpc/5) [jsonrpc.JsonRpcServer] RPC call Host.getStats failed (error -32603) in 0.03 seconds (__init__:312) 2018-09-11 00:40:14,656-0400 DEBUG (jsonrpc/2) [jsonrpc.JsonRpcServer] Calling 'Host.getAllVmStats' in bridge with {} (__init__:329) 2018-09-11 00:40:14,657-0400 INFO (jsonrpc/2) [api.host] START getAllVmStats() from=::1,46772 (api:47) 2018-09-11 00:40:14,659-0400 INFO (jsonrpc/2) [api.host] FINISH getAllVmStats return={'status': {'message': 'Done', 'code': 0}, 'statsList': (suppressed)} from=::1,46772 (api:53) 2018-09-11 00:40:14,660-0400 DEBUG (jsonrpc/2) [jsonrpc.JsonRpcServer] Return 'Host.getAllVmStats' in bridge with (suppressed) (__init__:356) 2018-09-11 00:40:14,661-0400 INFO (jsonrpc/2) [jsonrpc.JsonRpcServer] RPC call Host.getAllVmStats succeeded in 0.00 seconds (__init__:312) 2018-09-11 00:40:14,674-0400 DEBUG (jsonrpc/1) [jsonrpc.JsonRpcServer] Calling 'Host.getAllVmIoTunePolicies' in bridge with {} (__init__:329) 2018-09-11 00:40:14,675-0400 INFO (jsonrpc/1) [api.host] START getAllVmIoTunePolicies() from=::1,46772 (api:47) 2018-09-11 00:40:14,675-0400 INFO (jsonrpc/1) [api.host] FINISH getAllVmIoTunePolicies return={'status': {'message': 'Done', 'code': 0}, 'io_tune_policies_dict': {'8214433a-f233-4aaa-aeda-2ce1d31c78dc': {'policy': [], 'current_values': [{'ioTune': {'write_bytes_sec': 0L, 'total_iops_sec': 0L, 'read_iops_sec': 0L, 'read_bytes_sec': 0L, 'write_iops_sec': 0L, 'total_bytes_sec': 0L}, 'path': '/rhev/data-center/mnt/blockSD/f1744940-41b6-4d35-b7bf- 870c4e07d995/images/9a3659dc-339e-4b3f-a1d1-6ee3debb6168/ e2570635-a798-4e9e-a0ea-88436d5b296f', 'name': 'vda'}]}}} from=::1,46772 (api:53) 2018-09-11 00:40:14,675-0400 DEBUG (jsonrpc/1) [jsonrpc.JsonRpcServer] Return 'Host.getAllVmIoTunePolicies' in bridge with {'8214433a-f233-4aaa-aeda-2ce1d31c78dc': {'policy': [], 'current_values': [{'ioTune': {'write_bytes_sec': 0L, 'total_iops_sec': 0L, 'read_iops_sec': 0L, 'read_bytes_sec': 0L, 'write_iops_sec': 0L, 'total_bytes_sec': 0L}, 'path': '/rhev/data-center/mnt/blockSD/f1744940-41b6-4d35-b7bf- 870c4e07d995/images/9a3659dc-339e-4b3f-a1d1-6ee3debb6168/ e2570635-a798-4e9e-a0ea-88436d5b296f', 'name': 'vda'}]}} (__init__:356) 2018-09-11 00:40:14,675-0400 INFO (jsonrpc/1) [jsonrpc.JsonRpcServer] RPC call Host.getAllVmIoTunePolicies succeeded in 0.00 seconds (__init__:312) 2018-09-11 00:40:16,118-0400 DEBUG (jsonrpc/7) [jsonrpc.JsonRpcServer] Calling 'Host.getAllVmStats' in bridge with {} (__init__:329) 2018-09-11 00:40:16,119-0400 INFO (jsonrpc/7) [api.host] START getAllVmStats() from=::ffff:192.168.201.4,49184 (api:47) 2018-09-11 00:40:16,121-0400 INFO (jsonrpc/7) [api.host] FINISH getAllVmStats return={'status': {'message': 'Done', 'code': 0}, 'statsList': (suppressed)} from=::ffff:192.168.201.4,49184 (api:53) 2018-09-11 00:40:16,122-0400 DEBUG (jsonrpc/7) [jsonrpc.JsonRpcServer] Return 'Host.getAllVmStats' in bridge with (suppressed) (__init__:356) 2018-09-11 00:40:16,122-0400 INFO (jsonrpc/7) [jsonrpc.JsonRpcServer] RPC call Host.getAllVmStats succeeded in 0.01 seconds (__init__:312) 2018-09-11 00:40:17,956-0400 DEBUG (qgapoller/2) [vds] Not sending QEMU-GA command 'guest-get-users' to vm_id='8214433a-f233-4aaa-aeda-2ce1d31c78dc', command is not supported (qemuguestagent:192) 2018-09-11 00:40:18,025-0400 DEBUG (periodic/3) [virt.periodic] Looking for stale paused VMs (periodic:388)
I can also see this in Super vdsm which may be related but I an not 100% sure its related to current issue:
MainProcess|jsonrpc/2::DEBUG::2018-09-11 00:33:20,012::cmdutils::159::root::(exec_cmd) SUCCESS: <err> = ''; <rc> = 0 ifup/on039d2d88d8ee4::DEBUG::2018-09-11 00:33:20,238::cmdutils::159::root::(exec_cmd) FAILED: <err> = 'Running scope as unit 2f1f417d-85b9-46c0-b7d7-6082836daaae.scope.\n/etc/sysconfig/ network-scripts/ifup-eth: line 321: 20715 Terminated /sbin/dhclient ${DHCLIENTARGS} ${DEVICE}\nDevice "on039d2d88d8ee4" does not exist.\nCannot find device "on039d2d88d8ee4"\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d 8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevic e "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d 8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevic e "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d 8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevic e "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\n'; <rc> = 1 ifup/on039d2d88d8ee4::ERROR::2018-09-11 00:33:20,238::concurrent::202::root::(run) FINISH thread <Thread(ifup/on039d2d88d8ee4, stopped daemon 140281550731008)> failed Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/vdsm/common/concurrent.py", line 195, in run ret = func(*args, **kwargs) File "/usr/lib/python2.7/site-packages/vdsm/network/configurators/ifcfg.py", line 948, in _exec_ifup _exec_ifup_by_name(iface.name, cgroup) File "/usr/lib/python2.7/site-packages/vdsm/network/configurators/ifcfg.py", line 934, in _exec_ifup_by_name raise ConfigNetworkError(ERR_FAILED_IFUP, out[-1] if out else '') ConfigNetworkError: (29, '\n')
thanks, Dafna

On Tue, 11 Sep 2018 17:18:15 +0200 Dominik Holler <dholler@redhat.com> wrote:
On Tue, 11 Sep 2018 12:22:21 +0100 Dafna Ron <dron@redhat.com> wrote:
Can someone take ownership to fix it?
I will take ownership and track the fix.
On Tue, Sep 11, 2018 at 12:04 PM, Petr Horacek <phoracek@redhat.com> wrote:
vdsm.log is the same issue, supervdsm.log seems unrelated.
2018-09-11 11:59 GMT+02:00 Dominik Holler <dholler@redhat.com>:
On Tue, 11 Sep 2018 10:26:02 +0100 Dafna Ron <dron@redhat.com> wrote:
Hi,
I have been seeing random failures of tests in different projects caused by vdsm monitoring.
I need someone from vdsm to please help debug this issue.
Petr, is this the same problem like yesterday in
[ovirt-devel] failure in ost test - Invalid argument - help in debugging issue
From what I can see, the test suspend/resume vm failed because we could not query the status of the vm on the host.
you can see full log from failed tests here:
https://jenkins.ovirt.org/job/ovirt-master_change-queue-test er/10208/artifact/basic-suite.el7.x86_64/test_logs/basic- suite-master/post-004_basic_sanity.py/
Here are the errors that I can see in the vdsm which seem to suggest that there is an issue getting stats on the vm which was suspended and caused the failure of the test:
2018-09-11 00:40:05,896-0400 INFO (monitor/c1fe6e6) [storage.StorageDomain] Removing remnants of deleted images [] (fileSD:734) 2018-09-11 00:40:07,957-0400 DEBUG (qgapoller/1) [vds] Not sending QEMU-GA command 'guest-get-users' to vm_id='8214433a-f233-4aaa-aeda-2ce1d31c78dc', command is not supported (qemuguestagent:192) 2018-09-11 00:40:08,068-0400 DEBUG (periodic/3) [virt.sampling.VMBulkstatsMonitor] sampled timestamp 4296118.49 elapsed 0.010 acquired True domains all (sampling:443) 2018-09-11 00:40:08,271-0400 DEBUG (jsonrpc/1) [jsonrpc.JsonRpcServer] Calling 'Image.prepare' in bridge with {u'allowIllegal': True, u'storagepoolID': u'e80a56d9-74da-498a-b010-4a9df287f11d', u'imageID': u'd4c831e6-02d2-4d89-b516-0ec4597 5e024', u'volumeID': u'15b07af1-625b-42e3-b62a-8e7c7a120a56', u'storagedomainID': u'f1744940-41b6-4d35-b7bf-870c4e07d995'} (__init__:329)
2018-09-11 00:40:10,846-0400 DEBUG (vmchannels) [virt.vm] (vmId='8214433a-f233-4aaa-aeda-2ce1d31c78dc') Guest connection timed out (guestagent:556) 2018-09-11 00:40:11,637-0400 DEBUG (jsonrpc/5) [jsonrpc.JsonRpcServer] Calling 'Host.getStats' in bridge with {} (__init__:329) 2018-09-11 00:40:11,637-0400 INFO (jsonrpc/5) [api.host] START getStats() from=::ffff:192.168.201.4,49184 (api:47) 2018-09-11 00:40:11,643-0400 DEBUG (jsonrpc/5) [root] cannot read eth0 speed (nic:42) 2018-09-11 00:40:11,645-0400 DEBUG (jsonrpc/5) [root] cannot read eth1 speed (nic:42) 2018-09-11 00:40:11,647-0400 DEBUG (jsonrpc/5) [root] cannot read eth2 speed (nic:42) 2018-09-11 00:40:11,649-0400 DEBUG (jsonrpc/5) [root] cannot read eth3 speed (nic:42) 2018-09-11 00:40:11,667-0400 INFO (jsonrpc/5) [api.host] FINISH getStats error=[Errno 22] Invalid argument from=::ffff:192.168.201.4,49184 (api:51) 2018-09-11 00:40:11,667-0400 ERROR (jsonrpc/5) [jsonrpc.JsonRpcServer] Internal server error (__init__:350) Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 345, in _handle_request res = method(**params) File "/usr/lib/python2.7/site-packages/vdsm/rpc/Bridge.py", line 202, in _dynamicMethod result = fn(*methodArgs) File "<string>", line 2, in getStats File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 49, in method ret = func(*args, **kwargs) File "/usr/lib/python2.7/site-packages/vdsm/API.py", line 1407, in getStats multipath=True)} File "/usr/lib/python2.7/site-packages/vdsm/host/api.py", line 49, in get_stats decStats = stats.produce(first_sample, last_sample) File "/usr/lib/python2.7/site-packages/vdsm/host/stats.py", line 71, in produce stats.update(get_interfaces_stats()) File "/usr/lib/python2.7/site-packages/vdsm/host/stats.py", line 153, in get_interfaces_stats return net_api.network_stats() File "/usr/lib/python2.7/site-packages/vdsm/network/api.py", line 63, in network_stats return netstats.report() File "/usr/lib/python2.7/site-packages/vdsm/network/netstats.py", line 31, in report stats = link_stats.report() File "/usr/lib/python2.7/site-packages/vdsm/network/link/stats.py", line 41, in report speed = vlan.speed(i.device) File "/usr/lib/python2.7/site-packages/vdsm/network/link/vlan.py", line 36, in speed dev_speed = nic.read_speed_using_sysfs(dev_name) File "/usr/lib/python2.7/site-packages/vdsm/network/link/nic.py", line 48, in read_speed_using_sysfs s = int(f.read()) IOError: [Errno 22] Invalid argument 2018-09-11 00:40:11,669-0400 INFO (jsonrpc/5) [jsonrpc.JsonRpcServer] RPC call Host.getStats failed (error -32603) in 0.03 seconds (__init__:312) 2018-09-11 00:40:14,656-0400 DEBUG (jsonrpc/2) [jsonrpc.JsonRpcServer] Calling 'Host.getAllVmStats' in bridge with {} (__init__:329) 2018-09-11 00:40:14,657-0400 INFO (jsonrpc/2) [api.host] START getAllVmStats() from=::1,46772 (api:47) 2018-09-11 00:40:14,659-0400 INFO (jsonrpc/2) [api.host] FINISH getAllVmStats return={'status': {'message': 'Done', 'code': 0}, 'statsList': (suppressed)} from=::1,46772 (api:53) 2018-09-11 00:40:14,660-0400 DEBUG (jsonrpc/2) [jsonrpc.JsonRpcServer] Return 'Host.getAllVmStats' in bridge with (suppressed) (__init__:356) 2018-09-11 00:40:14,661-0400 INFO (jsonrpc/2) [jsonrpc.JsonRpcServer] RPC call Host.getAllVmStats succeeded in 0.00 seconds (__init__:312) 2018-09-11 00:40:14,674-0400 DEBUG (jsonrpc/1) [jsonrpc.JsonRpcServer] Calling 'Host.getAllVmIoTunePolicies' in bridge with {} (__init__:329) 2018-09-11 00:40:14,675-0400 INFO (jsonrpc/1) [api.host] START getAllVmIoTunePolicies() from=::1,46772 (api:47) 2018-09-11 00:40:14,675-0400 INFO (jsonrpc/1) [api.host] FINISH getAllVmIoTunePolicies return={'status': {'message': 'Done', 'code': 0}, 'io_tune_policies_dict': {'8214433a-f233-4aaa-aeda-2ce1d31c78dc': {'policy': [], 'current_values': [{'ioTune': {'write_bytes_sec': 0L, 'total_iops_sec': 0L, 'read_iops_sec': 0L, 'read_bytes_sec': 0L, 'write_iops_sec': 0L, 'total_bytes_sec': 0L}, 'path': '/rhev/data-center/mnt/blockSD/f1744940-41b6-4d35-b7bf- 870c4e07d995/images/9a3659dc-339e-4b3f-a1d1-6ee3debb6168/ e2570635-a798-4e9e-a0ea-88436d5b296f', 'name': 'vda'}]}}} from=::1,46772 (api:53) 2018-09-11 00:40:14,675-0400 DEBUG (jsonrpc/1) [jsonrpc.JsonRpcServer] Return 'Host.getAllVmIoTunePolicies' in bridge with {'8214433a-f233-4aaa-aeda-2ce1d31c78dc': {'policy': [], 'current_values': [{'ioTune': {'write_bytes_sec': 0L, 'total_iops_sec': 0L, 'read_iops_sec': 0L, 'read_bytes_sec': 0L, 'write_iops_sec': 0L, 'total_bytes_sec': 0L}, 'path': '/rhev/data-center/mnt/blockSD/f1744940-41b6-4d35-b7bf- 870c4e07d995/images/9a3659dc-339e-4b3f-a1d1-6ee3debb6168/ e2570635-a798-4e9e-a0ea-88436d5b296f', 'name': 'vda'}]}} (__init__:356) 2018-09-11 00:40:14,675-0400 INFO (jsonrpc/1) [jsonrpc.JsonRpcServer] RPC call Host.getAllVmIoTunePolicies succeeded in 0.00 seconds (__init__:312) 2018-09-11 00:40:16,118-0400 DEBUG (jsonrpc/7) [jsonrpc.JsonRpcServer] Calling 'Host.getAllVmStats' in bridge with {} (__init__:329) 2018-09-11 00:40:16,119-0400 INFO (jsonrpc/7) [api.host] START getAllVmStats() from=::ffff:192.168.201.4,49184 (api:47) 2018-09-11 00:40:16,121-0400 INFO (jsonrpc/7) [api.host] FINISH getAllVmStats return={'status': {'message': 'Done', 'code': 0}, 'statsList': (suppressed)} from=::ffff:192.168.201.4,49184 (api:53) 2018-09-11 00:40:16,122-0400 DEBUG (jsonrpc/7) [jsonrpc.JsonRpcServer] Return 'Host.getAllVmStats' in bridge with (suppressed) (__init__:356) 2018-09-11 00:40:16,122-0400 INFO (jsonrpc/7) [jsonrpc.JsonRpcServer] RPC call Host.getAllVmStats succeeded in 0.01 seconds (__init__:312) 2018-09-11 00:40:17,956-0400 DEBUG (qgapoller/2) [vds] Not sending QEMU-GA command 'guest-get-users' to vm_id='8214433a-f233-4aaa-aeda-2ce1d31c78dc', command is not supported (qemuguestagent:192) 2018-09-11 00:40:18,025-0400 DEBUG (periodic/3) [virt.periodic] Looking for stale paused VMs (periodic:388)
I can also see this in Super vdsm which may be related but I an not 100% sure its related to current issue:
MainProcess|jsonrpc/2::DEBUG::2018-09-11 00:33:20,012::cmdutils::159::root::(exec_cmd) SUCCESS: <err> = ''; <rc> = 0 ifup/on039d2d88d8ee4::DEBUG::2018-09-11 00:33:20,238::cmdutils::159::root::(exec_cmd) FAILED: <err> = 'Running scope as unit 2f1f417d-85b9-46c0-b7d7-6082836daaae.scope.\n/etc/sysconfig/ network-scripts/ifup-eth: line 321: 20715 Terminated /sbin/dhclient ${DHCLIENTARGS} ${DEVICE}\nDevice "on039d2d88d8ee4" does not exist.\nCannot find device "on039d2d88d8ee4"\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d 8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevic e "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d 8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevic e "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d 8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevic e "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\n'; <rc> = 1 ifup/on039d2d88d8ee4::ERROR::2018-09-11 00:33:20,238::concurrent::202::root::(run) FINISH thread <Thread(ifup/on039d2d88d8ee4, stopped daemon 140281550731008)> failed Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/vdsm/common/concurrent.py", line 195, in run ret = func(*args, **kwargs) File "/usr/lib/python2.7/site-packages/vdsm/network/configurators/ifcfg.py", line 948, in _exec_ifup _exec_ifup_by_name(iface.name, cgroup) File "/usr/lib/python2.7/site-packages/vdsm/network/configurators/ifcfg.py", line 934, in _exec_ifup_by_name raise ConfigNetworkError(ERR_FAILED_IFUP, out[-1] if out else '') ConfigNetworkError: (29, '\n')
What happens in basic-suite-master/test-scenarios/003_basic_networking.py : modify_host_0_ip_to_dhcp set VM_NETWORK to use DHCP, so vdsm is calling ifup, which tries to configure via DHCP in the background. The next test detach_vm_network_from_host_0 removes the network, while ifup still hangs in dhcp configuration, because in this VLAN there is no DHCP server. So from my point of view this error in supervdsm.log is legitimate, because VDSM was advised to configure via DHCP, which fails because the interface is removed during dhcp configuration. I did not recognize that this triggeres any chained errors.
thanks, Dafna

Thanks Dominic. Can you please review the network tests to fix this race between the two tests? On Tue, Sep 11, 2018 at 9:53 PM, Dominik Holler <dholler@redhat.com> wrote:
On Tue, 11 Sep 2018 17:18:15 +0200 Dominik Holler <dholler@redhat.com> wrote:
On Tue, 11 Sep 2018 12:22:21 +0100 Dafna Ron <dron@redhat.com> wrote:
Can someone take ownership to fix it?
I will take ownership and track the fix.
On Tue, Sep 11, 2018 at 12:04 PM, Petr Horacek <phoracek@redhat.com> wrote:
vdsm.log is the same issue, supervdsm.log seems unrelated.
2018-09-11 11:59 GMT+02:00 Dominik Holler <dholler@redhat.com>:
On Tue, 11 Sep 2018 10:26:02 +0100 Dafna Ron <dron@redhat.com> wrote:
Hi,
I have been seeing random failures of tests in different projects caused by vdsm monitoring.
I need someone from vdsm to please help debug this issue.
Petr, is this the same problem like yesterday in
[ovirt-devel] failure in ost test - Invalid argument - help in debugging issue
From what I can see, the test suspend/resume vm failed because we could not query the status of the vm on the host.
you can see full log from failed tests here:
https://jenkins.ovirt.org/job/ovirt-master_change-queue-test er/10208/artifact/basic-suite.el7.x86_64/test_logs/basic- suite-master/post-004_basic_sanity.py/
Here are the errors that I can see in the vdsm which seem to suggest that there is an issue getting stats on the vm which was suspended and caused the failure of the test:
2018-09-11 00:40:05,896-0400 INFO (monitor/c1fe6e6) [storage.StorageDomain] Removing remnants of deleted images [] (fileSD:734) 2018-09-11 00:40:07,957-0400 DEBUG (qgapoller/1) [vds] Not sending QEMU-GA command 'guest-get-users' to vm_id='8214433a-f233-4aaa-aeda-2ce1d31c78dc', command is not supported (qemuguestagent:192) 2018-09-11 00:40:08,068-0400 DEBUG (periodic/3) [virt.sampling.VMBulkstatsMonitor] sampled timestamp 4296118.49 elapsed 0.010 acquired True domains all (sampling:443) 2018-09-11 00:40:08,271-0400 DEBUG (jsonrpc/1) [jsonrpc.JsonRpcServer] Calling 'Image.prepare' in bridge with {u'allowIllegal': True, u'storagepoolID': u'e80a56d9-74da-498a-b010-4a9df287f11d', u'imageID': u'd4c831e6-02d2-4d89-b516-0ec4597 5e024', u'volumeID': u'15b07af1-625b-42e3-b62a-8e7c7a120a56', u'storagedomainID': u'f1744940-41b6-4d35-b7bf-870c4e07d995'} (__init__:329)
2018-09-11 00:40:10,846-0400 DEBUG (vmchannels) [virt.vm] (vmId='8214433a-f233-4aaa-aeda-2ce1d31c78dc') Guest connection timed out (guestagent:556) 2018-09-11 00:40:11,637-0400 DEBUG (jsonrpc/5) [jsonrpc.JsonRpcServer] Calling 'Host.getStats' in bridge with {} (__init__:329) 2018-09-11 00:40:11,637-0400 INFO (jsonrpc/5) [api.host] START getStats() from=::ffff:192.168.201.4,49184 (api:47) 2018-09-11 00:40:11,643-0400 DEBUG (jsonrpc/5) [root] cannot read eth0 speed (nic:42) 2018-09-11 00:40:11,645-0400 DEBUG (jsonrpc/5) [root] cannot read eth1 speed (nic:42) 2018-09-11 00:40:11,647-0400 DEBUG (jsonrpc/5) [root] cannot read eth2 speed (nic:42) 2018-09-11 00:40:11,649-0400 DEBUG (jsonrpc/5) [root] cannot read eth3 speed (nic:42) 2018-09-11 00:40:11,667-0400 INFO (jsonrpc/5) [api.host] FINISH getStats error=[Errno 22] Invalid argument from=::ffff:192.168.201.4,49184 (api:51) 2018-09-11 00:40:11,667-0400 ERROR (jsonrpc/5) [jsonrpc.JsonRpcServer] Internal server error (__init__:350) Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 345, in _handle_request res = method(**params) File "/usr/lib/python2.7/site-packages/vdsm/rpc/Bridge.py", line 202, in _dynamicMethod result = fn(*methodArgs) File "<string>", line 2, in getStats File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 49, in method ret = func(*args, **kwargs) File "/usr/lib/python2.7/site-packages/vdsm/API.py", line 1407, in getStats multipath=True)} File "/usr/lib/python2.7/site-packages/vdsm/host/api.py", line 49, in get_stats decStats = stats.produce(first_sample, last_sample) File "/usr/lib/python2.7/site-packages/vdsm/host/stats.py", line 71, in produce stats.update(get_interfaces_stats()) File "/usr/lib/python2.7/site-packages/vdsm/host/stats.py", line 153, in get_interfaces_stats return net_api.network_stats() File "/usr/lib/python2.7/site-packages/vdsm/network/api.py", line 63, in network_stats return netstats.report() File "/usr/lib/python2.7/site-packages/vdsm/network/netstats.py", line 31, in report stats = link_stats.report() File "/usr/lib/python2.7/site-packages/vdsm/network/link/stats.py", line 41, in report speed = vlan.speed(i.device) File "/usr/lib/python2.7/site-packages/vdsm/network/link/vlan.py", line 36, in speed dev_speed = nic.read_speed_using_sysfs(dev_name) File "/usr/lib/python2.7/site-packages/vdsm/network/link/nic.py", line 48, in read_speed_using_sysfs s = int(f.read()) IOError: [Errno 22] Invalid argument 2018-09-11 00:40:11,669-0400 INFO (jsonrpc/5) [jsonrpc.JsonRpcServer] RPC call Host.getStats failed (error -32603) in 0.03 seconds (__init__:312) 2018-09-11 00:40:14,656-0400 DEBUG (jsonrpc/2) [jsonrpc.JsonRpcServer] Calling 'Host.getAllVmStats' in bridge with {} (__init__:329) 2018-09-11 00:40:14,657-0400 INFO (jsonrpc/2) [api.host] START getAllVmStats() from=::1,46772 (api:47) 2018-09-11 00:40:14,659-0400 INFO (jsonrpc/2) [api.host] FINISH getAllVmStats return={'status': {'message': 'Done', 'code': 0}, 'statsList': (suppressed)} from=::1,46772 (api:53) 2018-09-11 00:40:14,660-0400 DEBUG (jsonrpc/2) [jsonrpc.JsonRpcServer] Return 'Host.getAllVmStats' in bridge with (suppressed) (__init__:356) 2018-09-11 00:40:14,661-0400 INFO (jsonrpc/2) [jsonrpc.JsonRpcServer] RPC call Host.getAllVmStats succeeded in 0.00 seconds (__init__:312) 2018-09-11 00:40:14,674-0400 DEBUG (jsonrpc/1) [jsonrpc.JsonRpcServer] Calling 'Host.getAllVmIoTunePolicies' in bridge with {} (__init__:329) 2018-09-11 00:40:14,675-0400 INFO (jsonrpc/1) [api.host] START getAllVmIoTunePolicies() from=::1,46772 (api:47) 2018-09-11 00:40:14,675-0400 INFO (jsonrpc/1) [api.host] FINISH getAllVmIoTunePolicies return={'status': {'message': 'Done', 'code': 0}, 'io_tune_policies_dict': {'8214433a-f233-4aaa-aeda-2ce1d31c78dc': {'policy': [], 'current_values': [{'ioTune': {'write_bytes_sec': 0L, 'total_iops_sec': 0L, 'read_iops_sec': 0L, 'read_bytes_sec': 0L, 'write_iops_sec': 0L, 'total_bytes_sec': 0L}, 'path': '/rhev/data-center/mnt/blockSD/f1744940-41b6-4d35-b7bf- 870c4e07d995/images/9a3659dc-339e-4b3f-a1d1-6ee3debb6168/ e2570635-a798-4e9e-a0ea-88436d5b296f', 'name': 'vda'}]}}} from=::1,46772 (api:53) 2018-09-11 00:40:14,675-0400 DEBUG (jsonrpc/1) [jsonrpc.JsonRpcServer] Return 'Host.getAllVmIoTunePolicies' in bridge with {'8214433a-f233-4aaa-aeda-2ce1d31c78dc': {'policy': [], 'current_values': [{'ioTune': {'write_bytes_sec': 0L, 'total_iops_sec': 0L, 'read_iops_sec': 0L, 'read_bytes_sec': 0L, 'write_iops_sec': 0L, 'total_bytes_sec': 0L}, 'path': '/rhev/data-center/mnt/blockSD/f1744940-41b6-4d35-b7bf- 870c4e07d995/images/9a3659dc-339e-4b3f-a1d1-6ee3debb6168/ e2570635-a798-4e9e-a0ea-88436d5b296f', 'name': 'vda'}]}} (__init__:356) 2018-09-11 00:40:14,675-0400 INFO (jsonrpc/1) [jsonrpc.JsonRpcServer] RPC call Host.getAllVmIoTunePolicies succeeded in 0.00 seconds (__init__:312) 2018-09-11 00:40:16,118-0400 DEBUG (jsonrpc/7) [jsonrpc.JsonRpcServer] Calling 'Host.getAllVmStats' in bridge with {} (__init__:329) 2018-09-11 00:40:16,119-0400 INFO (jsonrpc/7) [api.host] START getAllVmStats() from=::ffff:192.168.201.4,49184 (api:47) 2018-09-11 00:40:16,121-0400 INFO (jsonrpc/7) [api.host] FINISH getAllVmStats return={'status': {'message': 'Done', 'code': 0}, 'statsList': (suppressed)} from=::ffff:192.168.201.4,49184 (api:53) 2018-09-11 00:40:16,122-0400 DEBUG (jsonrpc/7) [jsonrpc.JsonRpcServer] Return 'Host.getAllVmStats' in bridge with (suppressed) (__init__:356) 2018-09-11 00:40:16,122-0400 INFO (jsonrpc/7) [jsonrpc.JsonRpcServer] RPC call Host.getAllVmStats succeeded in 0.01 seconds (__init__:312) 2018-09-11 00:40:17,956-0400 DEBUG (qgapoller/2) [vds] Not sending QEMU-GA command 'guest-get-users' to vm_id='8214433a-f233-4aaa-aeda-2ce1d31c78dc', command is not supported (qemuguestagent:192) 2018-09-11 00:40:18,025-0400 DEBUG (periodic/3) [virt.periodic] Looking for stale paused VMs (periodic:388)
I can also see this in Super vdsm which may be related but I an not 100% sure its related to current issue:
MainProcess|jsonrpc/2::DEBUG::2018-09-11 00:33:20,012::cmdutils::159::root::(exec_cmd) SUCCESS: <err> = ''; <rc> = 0 ifup/on039d2d88d8ee4::DEBUG::2018-09-11 00:33:20,238::cmdutils::159::root::(exec_cmd) FAILED: <err> = 'Running scope as unit 2f1f417d-85b9-46c0-b7d7-6082836daaae.scope.\n/etc/sysconfig/ network-scripts/ifup-eth: line 321: 20715 Terminated /sbin/dhclient ${DHCLIENTARGS} ${DEVICE}\nDevice "on039d2d88d8ee4" does not exist.\nCannot find device "on039d2d88d8ee4"\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d 8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevic e "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d 8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevic e "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d 8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevic e "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not exist.\n'; <rc> = 1 ifup/on039d2d88d8ee4::ERROR::2018-09-11 00:33:20,238::concurrent::202::root::(run) FINISH thread <Thread(ifup/on039d2d88d8ee4, stopped daemon 140281550731008)> failed Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/vdsm/common/concurrent.py", line 195, in run ret = func(*args, **kwargs) File "/usr/lib/python2.7/site-packages/vdsm/network/ configurators/ifcfg.py", line 948, in _exec_ifup _exec_ifup_by_name(iface.name, cgroup) File "/usr/lib/python2.7/site-packages/vdsm/network/ configurators/ifcfg.py", line 934, in _exec_ifup_by_name raise ConfigNetworkError(ERR_FAILED_IFUP, out[-1] if out else '') ConfigNetworkError: (29, '\n')
What happens in basic-suite-master/test-scenarios/003_basic_networking.py : modify_host_0_ip_to_dhcp set VM_NETWORK to use DHCP, so vdsm is calling ifup, which tries to configure via DHCP in the background. The next test detach_vm_network_from_host_0 removes the network, while ifup still hangs in dhcp configuration, because in this VLAN there is no DHCP server. So from my point of view this error in supervdsm.log is legitimate, because VDSM was advised to configure via DHCP, which fails because the interface is removed during dhcp configuration. I did not recognize that this triggeres any chained errors.
thanks, Dafna

On Wed, 12 Sep 2018 08:17:46 +0100 Dafna Ron <dron@redhat.com> wrote:
Thanks Dominic. Can you please review the network tests to fix this race between the two tests?
Dafna, why do you think the current behavior should be changed, or how the current behavior should be improved?
On Tue, Sep 11, 2018 at 9:53 PM, Dominik Holler <dholler@redhat.com> wrote:
On Tue, 11 Sep 2018 17:18:15 +0200 Dominik Holler <dholler@redhat.com> wrote:
On Tue, 11 Sep 2018 12:22:21 +0100 Dafna Ron <dron@redhat.com> wrote:
Can someone take ownership to fix it?
I will take ownership and track the fix.
On Tue, Sep 11, 2018 at 12:04 PM, Petr Horacek <phoracek@redhat.com> wrote:
vdsm.log is the same issue, supervdsm.log seems unrelated.
2018-09-11 11:59 GMT+02:00 Dominik Holler <dholler@redhat.com>:
On Tue, 11 Sep 2018 10:26:02 +0100 Dafna Ron <dron@redhat.com> wrote:
> Hi, > > I have been seeing random failures of tests in different > projects caused by vdsm monitoring. > > I need someone from vdsm to please help debug this issue. >
Petr, is this the same problem like yesterday in
[ovirt-devel] failure in ost test - Invalid argument - help in debugging issue
> From what I can see, the test suspend/resume vm failed > because we could not query the status of the vm on the > host. > > you can see full log from failed tests here: > > https://jenkins.ovirt.org/job/ovirt-master_change-queue-test er/10208/artifact/basic-suite.el7.x86_64/test_logs/basic- suite-master/post-004_basic_sanity.py/ > > Here are the errors that I can see in the vdsm which seem > to suggest that there is an issue getting stats on the vm > which was suspended and caused the failure of the test: > > > > 2018-09-11 00:40:05,896-0400 INFO (monitor/c1fe6e6) > [storage.StorageDomain] Removing remnants of deleted > images [] (fileSD:734) 2018-09-11 00:40:07,957-0400 DEBUG > (qgapoller/1) [vds] Not sending QEMU-GA command > 'guest-get-users' to > vm_id='8214433a-f233-4aaa-aeda-2ce1d31c78dc', command is > not supported (qemuguestagent:192) 2018-09-11 > 00:40:08,068-0400 DEBUG (periodic/3) > [virt.sampling.VMBulkstatsMonitor] sampled timestamp > 4296118.49 elapsed 0.010 acquired True domains all > (sampling:443) 2018-09-11 00:40:08,271-0400 DEBUG > (jsonrpc/1) [jsonrpc.JsonRpcServer] Calling > 'Image.prepare' in bridge with {u'allowIllegal': True, > u'storagepoolID': u'e80a56d9-74da-498a-b010-4a9df287f11d', > u'imageID': u'd4c831e6-02d2-4d89-b516-0ec4597 5e024', > u'volumeID': u'15b07af1-625b-42e3-b62a-8e7c7a120a56', > u'storagedomainID': > u'f1744940-41b6-4d35-b7bf-870c4e07d995'} (__init__:329) > > > 2018-09-11 00:40:10,846-0400 DEBUG (vmchannels) [virt.vm] > (vmId='8214433a-f233-4aaa-aeda-2ce1d31c78dc') Guest > connection timed out (guestagent:556) > 2018-09-11 00:40:11,637-0400 DEBUG (jsonrpc/5) > [jsonrpc.JsonRpcServer] Calling 'Host.getStats' in bridge > with {} (__init__:329) 2018-09-11 00:40:11,637-0400 INFO > (jsonrpc/5) [api.host] START getStats() > from=::ffff:192.168.201.4,49184 (api:47) 2018-09-11 > 00:40:11,643-0400 DEBUG (jsonrpc/5) [root] cannot read eth0 > speed (nic:42) 2018-09-11 00:40:11,645-0400 DEBUG > (jsonrpc/5) [root] cannot read eth1 speed (nic:42) > 2018-09-11 00:40:11,647-0400 DEBUG (jsonrpc/5) [root] > cannot read eth2 speed (nic:42) > 2018-09-11 00:40:11,649-0400 DEBUG (jsonrpc/5) [root] > cannot read eth3 speed (nic:42) > 2018-09-11 00:40:11,667-0400 INFO (jsonrpc/5) [api.host] > FINISH getStats error=[Errno 22] Invalid argument > from=::ffff:192.168.201.4,49184 (api:51) 2018-09-11 > 00:40:11,667-0400 ERROR (jsonrpc/5) [jsonrpc.JsonRpcServer] > Internal server error (__init__:350) Traceback (most recent > call last): File > "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", > line 345, in _handle_request res = method(**params) > File > "/usr/lib/python2.7/site-packages/vdsm/rpc/Bridge.py", > line 202, in _dynamicMethod result = fn(*methodArgs) > File "<string>", line 2, in getStats > File > "/usr/lib/python2.7/site-packages/vdsm/common/api.py", > line 49, in method ret = func(*args, **kwargs) > File "/usr/lib/python2.7/site-packages/vdsm/API.py", line > 1407, in getStats > multipath=True)} > File "/usr/lib/python2.7/site-packages/vdsm/host/api.py", > line 49, in get_stats > decStats = stats.produce(first_sample, last_sample) > File > "/usr/lib/python2.7/site-packages/vdsm/host/stats.py", > line 71, in produce stats.update(get_interfaces_stats()) > File > "/usr/lib/python2.7/site-packages/vdsm/host/stats.py", > line 153, in get_interfaces_stats return > net_api.network_stats() File > "/usr/lib/python2.7/site-packages/vdsm/network/api.py", > line 63, in network_stats return netstats.report() > File > "/usr/lib/python2.7/site-packages/vdsm/network/netstats.py", > line 31, in report stats = link_stats.report() > File > "/usr/lib/python2.7/site-packages/vdsm/network/link/stats.py", > line 41, in report speed = vlan.speed(i.device) > File > "/usr/lib/python2.7/site-packages/vdsm/network/link/vlan.py", > line 36, in speed dev_speed = > nic.read_speed_using_sysfs(dev_name) File > "/usr/lib/python2.7/site-packages/vdsm/network/link/nic.py", > line 48, in read_speed_using_sysfs s = int(f.read()) > IOError: [Errno 22] Invalid argument > 2018-09-11 00:40:11,669-0400 INFO (jsonrpc/5) > [jsonrpc.JsonRpcServer] RPC call Host.getStats failed > (error -32603) in 0.03 seconds (__init__:312) 2018-09-11 > 00:40:14,656-0400 DEBUG (jsonrpc/2) [jsonrpc.JsonRpcServer] > Calling 'Host.getAllVmStats' in bridge with {} > (__init__:329) 2018-09-11 00:40:14,657-0400 INFO > (jsonrpc/2) [api.host] START getAllVmStats() > from=::1,46772 (api:47) 2018-09-11 00:40:14,659-0400 INFO > (jsonrpc/2) [api.host] FINISH getAllVmStats > return={'status': {'message': 'Done', 'code': 0}, > 'statsList': (suppressed)} from=::1,46772 (api:53) > 2018-09-11 00:40:14,660-0400 DEBUG (jsonrpc/2) > [jsonrpc.JsonRpcServer] Return 'Host.getAllVmStats' in > bridge with (suppressed) (__init__:356) 2018-09-11 > 00:40:14,661-0400 INFO (jsonrpc/2) > [jsonrpc.JsonRpcServer] RPC call Host.getAllVmStats > succeeded in 0.00 seconds (__init__:312) 2018-09-11 > 00:40:14,674-0400 DEBUG (jsonrpc/1) > [jsonrpc.JsonRpcServer] Calling > 'Host.getAllVmIoTunePolicies' in bridge with {} > (__init__:329) 2018-09-11 00:40:14,675-0400 INFO > (jsonrpc/1) [api.host] START getAllVmIoTunePolicies() > from=::1,46772 (api:47) 2018-09-11 00:40:14,675-0400 INFO > (jsonrpc/1) [api.host] FINISH getAllVmIoTunePolicies > return={'status': {'message': 'Done', 'code': 0}, > 'io_tune_policies_dict': > {'8214433a-f233-4aaa-aeda-2ce1d31c78dc': {'policy': [], > 'current_values': [{'ioTune': {'write_bytes_sec': 0L, > 'total_iops_sec': 0L, 'read_iops_sec': 0L, > 'read_bytes_sec': 0L, 'write_iops_sec': 0L, > 'total_bytes_sec': 0L}, 'path': > '/rhev/data-center/mnt/blockSD/f1744940-41b6-4d35-b7bf- 870c4e07d995/images/9a3659dc-339e-4b3f-a1d1-6ee3debb6168/ e2570635-a798-4e9e-a0ea-88436d5b296f', > 'name': 'vda'}]}}} from=::1,46772 (api:53) 2018-09-11 > 00:40:14,675-0400 DEBUG (jsonrpc/1) [jsonrpc.JsonRpcServer] > Return 'Host.getAllVmIoTunePolicies' in bridge with > {'8214433a-f233-4aaa-aeda-2ce1d31c78dc': {'policy': [], > 'current_values': [{'ioTune': {'write_bytes_sec': 0L, > 'total_iops_sec': 0L, 'read_iops_sec': 0L, > 'read_bytes_sec': 0L, 'write_iops_sec': 0L, > 'total_bytes_sec': 0L}, 'path': > '/rhev/data-center/mnt/blockSD/f1744940-41b6-4d35-b7bf- 870c4e07d995/images/9a3659dc-339e-4b3f-a1d1-6ee3debb6168/ e2570635-a798-4e9e-a0ea-88436d5b296f', > 'name': 'vda'}]}} (__init__:356) 2018-09-11 > 00:40:14,675-0400 INFO (jsonrpc/1) [jsonrpc.JsonRpcServer] > RPC call Host.getAllVmIoTunePolicies succeeded in 0.00 > seconds (__init__:312) 2018-09-11 00:40:16,118-0400 DEBUG > (jsonrpc/7) [jsonrpc.JsonRpcServer] Calling > 'Host.getAllVmStats' in bridge with {} (__init__:329) > 2018-09-11 00:40:16,119-0400 INFO (jsonrpc/7) [api.host] > START getAllVmStats() from=::ffff:192.168.201.4,49184 > (api:47) 2018-09-11 00:40:16,121-0400 INFO (jsonrpc/7) > [api.host] FINISH getAllVmStats return={'status': > {'message': 'Done', 'code': 0}, 'statsList': (suppressed)} > from=::ffff:192.168.201.4,49184 (api:53) 2018-09-11 > 00:40:16,122-0400 DEBUG (jsonrpc/7) > [jsonrpc.JsonRpcServer] Return 'Host.getAllVmStats' in > bridge with (suppressed) (__init__:356) 2018-09-11 > 00:40:16,122-0400 INFO (jsonrpc/7) [jsonrpc.JsonRpcServer] > RPC call Host.getAllVmStats succeeded in 0.01 seconds > (__init__:312) 2018-09-11 00:40:17,956-0400 DEBUG > (qgapoller/2) [vds] Not sending QEMU-GA command > 'guest-get-users' to > vm_id='8214433a-f233-4aaa-aeda-2ce1d31c78dc', command is > not supported (qemuguestagent:192) 2018-09-11 > 00:40:18,025-0400 DEBUG (periodic/3) [virt.periodic] > Looking for stale paused VMs (periodic:388) > > I can also see this in Super vdsm which may be related but > I an not 100% sure its related to current issue: > > MainProcess|jsonrpc/2::DEBUG::2018-09-11 > 00:33:20,012::cmdutils::159::root::(exec_cmd) SUCCESS: > <err> = ''; <rc> = 0 > ifup/on039d2d88d8ee4::DEBUG::2018-09-11 > 00:33:20,238::cmdutils::159::root::(exec_cmd) FAILED: > <err> = 'Running scope as unit > 2f1f417d-85b9-46c0-b7d7-6082836daaae.scope.\n/etc/sysconfig/ network-scripts/ifup-eth: > line 321: 20715 Terminated > /sbin/dhclient ${DHCLIENTARGS} > ${DEVICE}\nDevice "on039d2d88d8ee4" does not > exist.\nCannot find device "on039d2d88d8ee4"\nDevice > "on039d2d88d8ee4" does not exist.\nDevice > "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d > 8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not > exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice > "on039d2d88d8ee4" does not exist.\nDevice > "on039d2d88d8ee4" does not exist.\nDevice > "on039d2d88d8ee4" does not exist.\nDevic e > "on039d2d88d8ee4" does not exist.\nDevice > "on039d2d88d8ee4" does not exist.\nDevice > "on039d2d88d8ee4" does not exist.\nDevice > "on039d2d88d8ee4" does not exist.\nDevice > "on039d2d88d8ee4" does not exist.\nDevice > "on039d2d88d8ee4" does not exist.\nDevice > "on039d2d88d8ee4" does not exist.\nDevice > "on039d2d88d8ee4" does not exist.\nDevice > "on039d2d88d8ee4" does not exist.\nDevice > "on039d2d88d8ee4" does not exist.\nDevice > "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d > 8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not > exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice > "on039d2d88d8ee4" does not exist.\nDevice > "on039d2d88d8ee4" does not exist.\nDevice > "on039d2d88d8ee4" does not exist.\nDevic e > "on039d2d88d8ee4" does not exist.\nDevice > "on039d2d88d8ee4" does not exist.\nDevice > "on039d2d88d8ee4" does not exist.\nDevice > "on039d2d88d8ee4" does not exist.\nDevice > "on039d2d88d8ee4" does not exist.\nDevice > "on039d2d88d8ee4" does not exist.\nDevice > "on039d2d88d8ee4" does not exist.\nDevice > "on039d2d88d8ee4" does not exist.\nDevice > "on039d2d88d8ee4" does not exist.\nDevice > "on039d2d88d8ee4" does not exist.\nDevice > "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d > 8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not > exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice > "on039d2d88d8ee4" does not exist.\nDevice > "on039d2d88d8ee4" does not exist.\nDevice > "on039d2d88d8ee4" does not exist.\nDevic e > "on039d2d88d8ee4" does not exist.\nDevice > "on039d2d88d8ee4" does not exist.\nDevice > "on039d2d88d8ee4" does not exist.\nDevice > "on039d2d88d8ee4" does not exist.\nDevice > "on039d2d88d8ee4" does not exist.\nDevice > "on039d2d88d8ee4" does not exist.\nDevice > "on039d2d88d8ee4" does not exist.\nDevice > "on039d2d88d8ee4" does not exist.\nDevice > "on039d2d88d8ee4" does not exist.\n'; <rc> = 1 > ifup/on039d2d88d8ee4::ERROR::2018-09-11 > 00:33:20,238::concurrent::202::root::(run) FINISH thread > <Thread(ifup/on039d2d88d8ee4, stopped daemon > 140281550731008)> failed Traceback (most recent call > last): File > "/usr/lib/python2.7/site-packages/vdsm/common/concurrent.py", > line 195, in run ret = func(*args, **kwargs) File > "/usr/lib/python2.7/site-packages/vdsm/network/ configurators/ifcfg.py", > line 948, in _exec_ifup _exec_ifup_by_name(iface.name, > cgroup) File > "/usr/lib/python2.7/site-packages/vdsm/network/ configurators/ifcfg.py", > line 934, in _exec_ifup_by_name > raise ConfigNetworkError(ERR_FAILED_IFUP, out[-1] if > out else '') ConfigNetworkError: (29, '\n') > >
What happens in basic-suite-master/test-scenarios/003_basic_networking.py : modify_host_0_ip_to_dhcp set VM_NETWORK to use DHCP, so vdsm is calling ifup, which tries to configure via DHCP in the background. The next test detach_vm_network_from_host_0 removes the network, while ifup still hangs in dhcp configuration, because in this VLAN there is no DHCP server. So from my point of view this error in supervdsm.log is legitimate, because VDSM was advised to configure via DHCP, which fails because the interface is removed during dhcp configuration. I did not recognize that this triggeres any chained errors.
> thanks, > Dafna

Hi Dominic, should detach_vm_network_from_host_0 be running while modify_host_0_ip_to_dhcp set VM_NETWORK is running? On Thu, Sep 13, 2018 at 8:05 AM, Dominik Holler <dholler@redhat.com> wrote:
On Wed, 12 Sep 2018 08:17:46 +0100 Dafna Ron <dron@redhat.com> wrote:
Thanks Dominic. Can you please review the network tests to fix this race between the two tests?
Dafna, why do you think the current behavior should be changed, or how the current behavior should be improved?
On Tue, Sep 11, 2018 at 9:53 PM, Dominik Holler <dholler@redhat.com> wrote:
On Tue, 11 Sep 2018 17:18:15 +0200 Dominik Holler <dholler@redhat.com> wrote:
On Tue, 11 Sep 2018 12:22:21 +0100 Dafna Ron <dron@redhat.com> wrote:
Can someone take ownership to fix it?
I will take ownership and track the fix.
On Tue, Sep 11, 2018 at 12:04 PM, Petr Horacek <phoracek@redhat.com> wrote:
vdsm.log is the same issue, supervdsm.log seems unrelated.
2018-09-11 11:59 GMT+02:00 Dominik Holler <dholler@redhat.com>: > On Tue, 11 Sep 2018 10:26:02 +0100 > Dafna Ron <dron@redhat.com> wrote: > > > Hi, > > > > I have been seeing random failures of tests in different > > projects caused by vdsm monitoring. > > > > I need someone from vdsm to please help debug this issue. > > > > Petr, is this the same problem like yesterday in > > [ovirt-devel] failure in ost test - Invalid argument - help > in debugging issue > > > From what I can see, the test suspend/resume vm failed > > because we could not query the status of the vm on the > > host. > > > > you can see full log from failed tests here: > > > > https://jenkins.ovirt.org/job/ovirt-master_change-queue-test
> er/10208/artifact/basic-suite.el7.x86_64/test_logs/basic- > suite-master/post-004_basic_sanity.py/ > > > > Here are the errors that I can see in the vdsm which seem > > to suggest that there is an issue getting stats on the vm > > which was suspended and caused the failure of the test: > > > > > > > > 2018-09-11 00:40:05,896-0400 INFO (monitor/c1fe6e6) > > [storage.StorageDomain] Removing remnants of deleted > > images [] (fileSD:734) 2018-09-11 00:40:07,957-0400 DEBUG > > (qgapoller/1) [vds] Not sending QEMU-GA command > > 'guest-get-users' to > > vm_id='8214433a-f233-4aaa-aeda-2ce1d31c78dc', command is > > not supported (qemuguestagent:192) 2018-09-11 > > 00:40:08,068-0400 DEBUG (periodic/3) > > [virt.sampling.VMBulkstatsMonitor] sampled timestamp > > 4296118.49 elapsed 0.010 acquired True domains all > > (sampling:443) 2018-09-11 00:40:08,271-0400 DEBUG > > (jsonrpc/1) [jsonrpc.JsonRpcServer] Calling > > 'Image.prepare' in bridge with {u'allowIllegal': True, > > u'storagepoolID': u'e80a56d9-74da-498a-b010-4a9df287f11d', > > u'imageID': u'd4c831e6-02d2-4d89-b516-0ec4597 5e024', > > u'volumeID': u'15b07af1-625b-42e3-b62a-8e7c7a120a56', > > u'storagedomainID': > > u'f1744940-41b6-4d35-b7bf-870c4e07d995'} (__init__:329) > > > > > > 2018-09-11 00:40:10,846-0400 DEBUG (vmchannels) [virt.vm] > > (vmId='8214433a-f233-4aaa-aeda-2ce1d31c78dc') Guest > > connection timed out (guestagent:556) > > 2018-09-11 00:40:11,637-0400 DEBUG (jsonrpc/5) > > [jsonrpc.JsonRpcServer] Calling 'Host.getStats' in bridge > > with {} (__init__:329) 2018-09-11 00:40:11,637-0400 INFO > > (jsonrpc/5) [api.host] START getStats() > > from=::ffff:192.168.201.4,49184 (api:47) 2018-09-11 > > 00:40:11,643-0400 DEBUG (jsonrpc/5) [root] cannot read eth0 > > speed (nic:42) 2018-09-11 00:40:11,645-0400 DEBUG > > (jsonrpc/5) [root] cannot read eth1 speed (nic:42) > > 2018-09-11 00:40:11,647-0400 DEBUG (jsonrpc/5) [root] > > cannot read eth2 speed (nic:42) > > 2018-09-11 00:40:11,649-0400 DEBUG (jsonrpc/5) [root] > > cannot read eth3 speed (nic:42) > > 2018-09-11 00:40:11,667-0400 INFO (jsonrpc/5) [api.host] > > FINISH getStats error=[Errno 22] Invalid argument > > from=::ffff:192.168.201.4,49184 (api:51) 2018-09-11 > > 00:40:11,667-0400 ERROR (jsonrpc/5) [jsonrpc.JsonRpcServer] > > Internal server error (__init__:350) Traceback (most recent > > call last): File > > "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", > > line 345, in _handle_request res = method(**params) > > File > > "/usr/lib/python2.7/site-packages/vdsm/rpc/Bridge.py", > > line 202, in _dynamicMethod result = fn(*methodArgs) > > File "<string>", line 2, in getStats > > File > > "/usr/lib/python2.7/site-packages/vdsm/common/api.py", > > line 49, in method ret = func(*args, **kwargs) > > File "/usr/lib/python2.7/site-packages/vdsm/API.py", line > > 1407, in getStats > > multipath=True)} > > File "/usr/lib/python2.7/site-packages/vdsm/host/api.py", > > line 49, in get_stats > > decStats = stats.produce(first_sample, last_sample) > > File > > "/usr/lib/python2.7/site-packages/vdsm/host/stats.py", > > line 71, in produce stats.update(get_interfaces_stats()) > > File > > "/usr/lib/python2.7/site-packages/vdsm/host/stats.py", > > line 153, in get_interfaces_stats return > > net_api.network_stats() File > > "/usr/lib/python2.7/site-packages/vdsm/network/api.py", > > line 63, in network_stats return netstats.report() > > File > > "/usr/lib/python2.7/site-packages/vdsm/network/netstats.py", > > line 31, in report stats = link_stats.report() > > File > > "/usr/lib/python2.7/site-packages/vdsm/network/link/ stats.py", > > line 41, in report speed = vlan.speed(i.device) > > File > > "/usr/lib/python2.7/site-packages/vdsm/network/link/vlan.py", > > line 36, in speed dev_speed = > > nic.read_speed_using_sysfs(dev_name) File > > "/usr/lib/python2.7/site-packages/vdsm/network/link/nic.py", > > line 48, in read_speed_using_sysfs s = int(f.read()) > > IOError: [Errno 22] Invalid argument > > 2018-09-11 00:40:11,669-0400 INFO (jsonrpc/5) > > [jsonrpc.JsonRpcServer] RPC call Host.getStats failed > > (error -32603) in 0.03 seconds (__init__:312) 2018-09-11 > > 00:40:14,656-0400 DEBUG (jsonrpc/2) [jsonrpc.JsonRpcServer] > > Calling 'Host.getAllVmStats' in bridge with {} > > (__init__:329) 2018-09-11 00:40:14,657-0400 INFO > > (jsonrpc/2) [api.host] START getAllVmStats() > > from=::1,46772 (api:47) 2018-09-11 00:40:14,659-0400 INFO > > (jsonrpc/2) [api.host] FINISH getAllVmStats > > return={'status': {'message': 'Done', 'code': 0}, > > 'statsList': (suppressed)} from=::1,46772 (api:53) > > 2018-09-11 00:40:14,660-0400 DEBUG (jsonrpc/2) > > [jsonrpc.JsonRpcServer] Return 'Host.getAllVmStats' in > > bridge with (suppressed) (__init__:356) 2018-09-11 > > 00:40:14,661-0400 INFO (jsonrpc/2) > > [jsonrpc.JsonRpcServer] RPC call Host.getAllVmStats > > succeeded in 0.00 seconds (__init__:312) 2018-09-11 > > 00:40:14,674-0400 DEBUG (jsonrpc/1) > > [jsonrpc.JsonRpcServer] Calling > > 'Host.getAllVmIoTunePolicies' in bridge with {} > > (__init__:329) 2018-09-11 00:40:14,675-0400 INFO > > (jsonrpc/1) [api.host] START getAllVmIoTunePolicies() > > from=::1,46772 (api:47) 2018-09-11 00:40:14,675-0400 INFO > > (jsonrpc/1) [api.host] FINISH getAllVmIoTunePolicies > > return={'status': {'message': 'Done', 'code': 0}, > > 'io_tune_policies_dict': > > {'8214433a-f233-4aaa-aeda-2ce1d31c78dc': {'policy': [], > > 'current_values': [{'ioTune': {'write_bytes_sec': 0L, > > 'total_iops_sec': 0L, 'read_iops_sec': 0L, > > 'read_bytes_sec': 0L, 'write_iops_sec': 0L, > > 'total_bytes_sec': 0L}, 'path': > > '/rhev/data-center/mnt/blockSD/f1744940-41b6-4d35-b7bf- > 870c4e07d995/images/9a3659dc-339e-4b3f-a1d1-6ee3debb6168/ > e2570635-a798-4e9e-a0ea-88436d5b296f', > > 'name': 'vda'}]}}} from=::1,46772 (api:53) 2018-09-11 > > 00:40:14,675-0400 DEBUG (jsonrpc/1) [jsonrpc.JsonRpcServer] > > Return 'Host.getAllVmIoTunePolicies' in bridge with > > {'8214433a-f233-4aaa-aeda-2ce1d31c78dc': {'policy': [], > > 'current_values': [{'ioTune': {'write_bytes_sec': 0L, > > 'total_iops_sec': 0L, 'read_iops_sec': 0L, > > 'read_bytes_sec': 0L, 'write_iops_sec': 0L, > > 'total_bytes_sec': 0L}, 'path': > > '/rhev/data-center/mnt/blockSD/f1744940-41b6-4d35-b7bf- > 870c4e07d995/images/9a3659dc-339e-4b3f-a1d1-6ee3debb6168/ > e2570635-a798-4e9e-a0ea-88436d5b296f', > > 'name': 'vda'}]}} (__init__:356) 2018-09-11 > > 00:40:14,675-0400 INFO (jsonrpc/1) [jsonrpc.JsonRpcServer] > > RPC call Host.getAllVmIoTunePolicies succeeded in 0.00 > > seconds (__init__:312) 2018-09-11 00:40:16,118-0400 DEBUG > > (jsonrpc/7) [jsonrpc.JsonRpcServer] Calling > > 'Host.getAllVmStats' in bridge with {} (__init__:329) > > 2018-09-11 00:40:16,119-0400 INFO (jsonrpc/7) [api.host] > > START getAllVmStats() from=::ffff:192.168.201.4,49184 > > (api:47) 2018-09-11 00:40:16,121-0400 INFO (jsonrpc/7) > > [api.host] FINISH getAllVmStats return={'status': > > {'message': 'Done', 'code': 0}, 'statsList': (suppressed)} > > from=::ffff:192.168.201.4,49184 (api:53) 2018-09-11 > > 00:40:16,122-0400 DEBUG (jsonrpc/7) > > [jsonrpc.JsonRpcServer] Return 'Host.getAllVmStats' in > > bridge with (suppressed) (__init__:356) 2018-09-11 > > 00:40:16,122-0400 INFO (jsonrpc/7) [jsonrpc.JsonRpcServer] > > RPC call Host.getAllVmStats succeeded in 0.01 seconds > > (__init__:312) 2018-09-11 00:40:17,956-0400 DEBUG > > (qgapoller/2) [vds] Not sending QEMU-GA command > > 'guest-get-users' to > > vm_id='8214433a-f233-4aaa-aeda-2ce1d31c78dc', command is > > not supported (qemuguestagent:192) 2018-09-11 > > 00:40:18,025-0400 DEBUG (periodic/3) [virt.periodic] > > Looking for stale paused VMs (periodic:388) > > > > I can also see this in Super vdsm which may be related but > > I an not 100% sure its related to current issue: > > > > MainProcess|jsonrpc/2::DEBUG::2018-09-11 > > 00:33:20,012::cmdutils::159::root::(exec_cmd) SUCCESS: > > <err> = ''; <rc> = 0 > > ifup/on039d2d88d8ee4::DEBUG::2018-09-11 > > 00:33:20,238::cmdutils::159::root::(exec_cmd) FAILED: > > <err> = 'Running scope as unit > > 2f1f417d-85b9-46c0-b7d7-6082836daaae.scope.\n/etc/sysconfig/
> network-scripts/ifup-eth: > > line 321: 20715 Terminated > > /sbin/dhclient ${DHCLIENTARGS} > > ${DEVICE}\nDevice "on039d2d88d8ee4" does not > > exist.\nCannot find device "on039d2d88d8ee4"\nDevice > > "on039d2d88d8ee4" does not exist.\nDevice > > "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d > > 8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not > > exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice > > "on039d2d88d8ee4" does not exist.\nDevice > > "on039d2d88d8ee4" does not exist.\nDevice > > "on039d2d88d8ee4" does not exist.\nDevic e > > "on039d2d88d8ee4" does not exist.\nDevice > > "on039d2d88d8ee4" does not exist.\nDevice > > "on039d2d88d8ee4" does not exist.\nDevice > > "on039d2d88d8ee4" does not exist.\nDevice > > "on039d2d88d8ee4" does not exist.\nDevice > > "on039d2d88d8ee4" does not exist.\nDevice > > "on039d2d88d8ee4" does not exist.\nDevice > > "on039d2d88d8ee4" does not exist.\nDevice > > "on039d2d88d8ee4" does not exist.\nDevice > > "on039d2d88d8ee4" does not exist.\nDevice > > "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d > > 8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not > > exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice > > "on039d2d88d8ee4" does not exist.\nDevice > > "on039d2d88d8ee4" does not exist.\nDevice > > "on039d2d88d8ee4" does not exist.\nDevic e > > "on039d2d88d8ee4" does not exist.\nDevice > > "on039d2d88d8ee4" does not exist.\nDevice > > "on039d2d88d8ee4" does not exist.\nDevice > > "on039d2d88d8ee4" does not exist.\nDevice > > "on039d2d88d8ee4" does not exist.\nDevice > > "on039d2d88d8ee4" does not exist.\nDevice > > "on039d2d88d8ee4" does not exist.\nDevice > > "on039d2d88d8ee4" does not exist.\nDevice > > "on039d2d88d8ee4" does not exist.\nDevice > > "on039d2d88d8ee4" does not exist.\nDevice > > "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d > > 8ee4" does not exist.\nDevice "on039d2d88d8ee4" does not > > exist.\nDevice "on039d2d88d8ee4" does not exist.\nDevice > > "on039d2d88d8ee4" does not exist.\nDevice > > "on039d2d88d8ee4" does not exist.\nDevice > > "on039d2d88d8ee4" does not exist.\nDevic e > > "on039d2d88d8ee4" does not exist.\nDevice > > "on039d2d88d8ee4" does not exist.\nDevice > > "on039d2d88d8ee4" does not exist.\nDevice > > "on039d2d88d8ee4" does not exist.\nDevice > > "on039d2d88d8ee4" does not exist.\nDevice > > "on039d2d88d8ee4" does not exist.\nDevice > > "on039d2d88d8ee4" does not exist.\nDevice > > "on039d2d88d8ee4" does not exist.\nDevice > > "on039d2d88d8ee4" does not exist.\n'; <rc> = 1 > > ifup/on039d2d88d8ee4::ERROR::2018-09-11 > > 00:33:20,238::concurrent::202::root::(run) FINISH thread > > <Thread(ifup/on039d2d88d8ee4, stopped daemon > > 140281550731008)> failed Traceback (most recent call > > last): File > > "/usr/lib/python2.7/site-packages/vdsm/common/concurrent.py", > > line 195, in run ret = func(*args, **kwargs) File > > "/usr/lib/python2.7/site-packages/vdsm/network/ configurators/ifcfg.py", > > line 948, in _exec_ifup _exec_ifup_by_name(iface.name, > > cgroup) File > > "/usr/lib/python2.7/site-packages/vdsm/network/ configurators/ifcfg.py", > > line 934, in _exec_ifup_by_name > > raise ConfigNetworkError(ERR_FAILED_IFUP, out[-1] if > > out else '') ConfigNetworkError: (29, '\n') > > > >
What happens in basic-suite-master/test-scenarios/003_basic_networking.py : modify_host_0_ip_to_dhcp set VM_NETWORK to use DHCP, so vdsm is calling ifup, which tries to configure via DHCP in the background. The next test detach_vm_network_from_host_0 removes the network, while ifup still hangs in dhcp configuration, because in this VLAN there is no DHCP server. So from my point of view this error in supervdsm.log is legitimate, because VDSM was advised to configure via DHCP, which fails because the interface is removed during dhcp configuration. I did not recognize that this triggeres any chained errors.
> > thanks, > > Dafna > >

On Thu, 13 Sep 2018 10:32:12 +0100 Dafna Ron <dron@redhat.com> wrote:
Hi Dominic,
should detach_vm_network_from_host_0 be running while modify_host_0_ip_to_dhcp set VM_NETWORK is running?
modify_host_0_ip_to_dhcp just initiates the async dhcp client in the background on the host and returns in the test environment.
On Thu, Sep 13, 2018 at 8:05 AM, Dominik Holler <dholler@redhat.com> wrote:
On Wed, 12 Sep 2018 08:17:46 +0100 Dafna Ron <dron@redhat.com> wrote:
Thanks Dominic. Can you please review the network tests to fix this race between the two tests?
Dafna, why do you think the current behavior should be changed, or how the current behavior should be improved?
On Tue, Sep 11, 2018 at 9:53 PM, Dominik Holler <dholler@redhat.com> wrote:
On Tue, 11 Sep 2018 17:18:15 +0200 Dominik Holler <dholler@redhat.com> wrote:
On Tue, 11 Sep 2018 12:22:21 +0100 Dafna Ron <dron@redhat.com> wrote:
Can someone take ownership to fix it?
I will take ownership and track the fix.
On Tue, Sep 11, 2018 at 12:04 PM, Petr Horacek <phoracek@redhat.com> wrote:
> vdsm.log is the same issue, supervdsm.log seems unrelated. > > 2018-09-11 11:59 GMT+02:00 Dominik Holler > <dholler@redhat.com>: >> On Tue, 11 Sep 2018 10:26:02 +0100 >> Dafna Ron <dron@redhat.com> wrote: >> >> > Hi, >> > >> > I have been seeing random failures of tests in >> > different projects caused by vdsm monitoring. >> > >> > I need someone from vdsm to please help debug this >> > issue. >> >> Petr, is this the same problem like yesterday in >> >> [ovirt-devel] failure in ost test - Invalid argument - >> help in debugging issue >> >> > From what I can see, the test suspend/resume vm failed >> > because we could not query the status of the vm on the >> > host. >> > >> > you can see full log from failed tests here: >> > >> > https://jenkins.ovirt.org/job/ovirt-master_change-queue-test
>> er/10208/artifact/basic-suite.el7.x86_64/test_logs/basic- >> suite-master/post-004_basic_sanity.py/ >> > >> > Here are the errors that I can see in the vdsm which >> > seem to suggest that there is an issue getting stats >> > on the vm which was suspended and caused the failure >> > of the test: >> > >> > >> > >> > 2018-09-11 00:40:05,896-0400 INFO (monitor/c1fe6e6) >> > [storage.StorageDomain] Removing remnants of deleted >> > images [] (fileSD:734) 2018-09-11 00:40:07,957-0400 >> > DEBUG (qgapoller/1) [vds] Not sending QEMU-GA command >> > 'guest-get-users' to >> > vm_id='8214433a-f233-4aaa-aeda-2ce1d31c78dc', command >> > is not supported (qemuguestagent:192) 2018-09-11 >> > 00:40:08,068-0400 DEBUG (periodic/3) >> > [virt.sampling.VMBulkstatsMonitor] sampled timestamp >> > 4296118.49 elapsed 0.010 acquired True domains all >> > (sampling:443) 2018-09-11 00:40:08,271-0400 DEBUG >> > (jsonrpc/1) [jsonrpc.JsonRpcServer] Calling >> > 'Image.prepare' in bridge with {u'allowIllegal': True, >> > u'storagepoolID': >> > u'e80a56d9-74da-498a-b010-4a9df287f11d', u'imageID': >> > u'd4c831e6-02d2-4d89-b516-0ec4597 5e024', u'volumeID': >> > u'15b07af1-625b-42e3-b62a-8e7c7a120a56', >> > u'storagedomainID': >> > u'f1744940-41b6-4d35-b7bf-870c4e07d995'} (__init__:329) >> > >> > >> > 2018-09-11 00:40:10,846-0400 DEBUG (vmchannels) >> > [virt.vm] >> > (vmId='8214433a-f233-4aaa-aeda-2ce1d31c78dc') Guest >> > connection timed out (guestagent:556) 2018-09-11 >> > 00:40:11,637-0400 DEBUG (jsonrpc/5) >> > [jsonrpc.JsonRpcServer] Calling 'Host.getStats' in >> > bridge with {} (__init__:329) 2018-09-11 >> > 00:40:11,637-0400 INFO (jsonrpc/5) [api.host] START >> > getStats() from=::ffff:192.168.201.4,49184 (api:47) >> > 2018-09-11 00:40:11,643-0400 DEBUG (jsonrpc/5) [root] >> > cannot read eth0 speed (nic:42) 2018-09-11 >> > 00:40:11,645-0400 DEBUG (jsonrpc/5) [root] cannot read >> > eth1 speed (nic:42) 2018-09-11 00:40:11,647-0400 DEBUG >> > (jsonrpc/5) [root] cannot read eth2 speed (nic:42) >> > 2018-09-11 00:40:11,649-0400 DEBUG (jsonrpc/5) [root] >> > cannot read eth3 speed (nic:42) 2018-09-11 >> > 00:40:11,667-0400 INFO (jsonrpc/5) [api.host] FINISH >> > getStats error=[Errno 22] Invalid argument >> > from=::ffff:192.168.201.4,49184 (api:51) 2018-09-11 >> > 00:40:11,667-0400 ERROR (jsonrpc/5) >> > [jsonrpc.JsonRpcServer] Internal server error >> > (__init__:350) Traceback (most recent call last): File >> > "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", >> > line 345, in _handle_request res = method(**params) >> > File >> > "/usr/lib/python2.7/site-packages/vdsm/rpc/Bridge.py", >> > line 202, in _dynamicMethod result = fn(*methodArgs) >> > File "<string>", line 2, in getStats File >> > "/usr/lib/python2.7/site-packages/vdsm/common/api.py", >> > line 49, in method ret = func(*args, **kwargs) >> > File "/usr/lib/python2.7/site-packages/vdsm/API.py", >> > line 1407, in getStats >> > multipath=True)} >> > File >> > "/usr/lib/python2.7/site-packages/vdsm/host/api.py", >> > line 49, in get_stats decStats = >> > stats.produce(first_sample, last_sample) File >> > "/usr/lib/python2.7/site-packages/vdsm/host/stats.py", >> > line 71, in produce >> > stats.update(get_interfaces_stats()) File >> > "/usr/lib/python2.7/site-packages/vdsm/host/stats.py", >> > line 153, in get_interfaces_stats return >> > net_api.network_stats() File >> > "/usr/lib/python2.7/site-packages/vdsm/network/api.py", >> > line 63, in network_stats return netstats.report() >> > File >> > "/usr/lib/python2.7/site-packages/vdsm/network/netstats.py", >> > line 31, in report stats = link_stats.report() >> > File >> > "/usr/lib/python2.7/site-packages/vdsm/network/link/ stats.py", >> > line 41, in report speed = vlan.speed(i.device) >> > File >> > "/usr/lib/python2.7/site-packages/vdsm/network/link/vlan.py", >> > line 36, in speed dev_speed = >> > nic.read_speed_using_sysfs(dev_name) File >> > "/usr/lib/python2.7/site-packages/vdsm/network/link/nic.py", >> > line 48, in read_speed_using_sysfs s = int(f.read()) >> > IOError: [Errno 22] Invalid argument >> > 2018-09-11 00:40:11,669-0400 INFO (jsonrpc/5) >> > [jsonrpc.JsonRpcServer] RPC call Host.getStats failed >> > (error -32603) in 0.03 seconds (__init__:312) >> > 2018-09-11 00:40:14,656-0400 DEBUG (jsonrpc/2) >> > [jsonrpc.JsonRpcServer] Calling 'Host.getAllVmStats' >> > in bridge with {} (__init__:329) 2018-09-11 >> > 00:40:14,657-0400 INFO (jsonrpc/2) [api.host] START >> > getAllVmStats() from=::1,46772 (api:47) 2018-09-11 >> > 00:40:14,659-0400 INFO (jsonrpc/2) [api.host] FINISH >> > getAllVmStats return={'status': {'message': 'Done', >> > 'code': 0}, 'statsList': (suppressed)} from=::1,46772 >> > (api:53) 2018-09-11 00:40:14,660-0400 DEBUG (jsonrpc/2) >> > [jsonrpc.JsonRpcServer] Return 'Host.getAllVmStats' in >> > bridge with (suppressed) (__init__:356) 2018-09-11 >> > 00:40:14,661-0400 INFO (jsonrpc/2) >> > [jsonrpc.JsonRpcServer] RPC call Host.getAllVmStats >> > succeeded in 0.00 seconds (__init__:312) 2018-09-11 >> > 00:40:14,674-0400 DEBUG (jsonrpc/1) >> > [jsonrpc.JsonRpcServer] Calling >> > 'Host.getAllVmIoTunePolicies' in bridge with {} >> > (__init__:329) 2018-09-11 00:40:14,675-0400 INFO >> > (jsonrpc/1) [api.host] START getAllVmIoTunePolicies() >> > from=::1,46772 (api:47) 2018-09-11 00:40:14,675-0400 >> > INFO (jsonrpc/1) [api.host] FINISH >> > getAllVmIoTunePolicies return={'status': {'message': >> > 'Done', 'code': 0}, 'io_tune_policies_dict': >> > {'8214433a-f233-4aaa-aeda-2ce1d31c78dc': {'policy': [], >> > 'current_values': [{'ioTune': {'write_bytes_sec': 0L, >> > 'total_iops_sec': 0L, 'read_iops_sec': 0L, >> > 'read_bytes_sec': 0L, 'write_iops_sec': 0L, >> > 'total_bytes_sec': 0L}, 'path': >> > '/rhev/data-center/mnt/blockSD/f1744940-41b6-4d35-b7bf- >> 870c4e07d995/images/9a3659dc-339e-4b3f-a1d1-6ee3debb6168/ >> e2570635-a798-4e9e-a0ea-88436d5b296f', >> > 'name': 'vda'}]}}} from=::1,46772 (api:53) 2018-09-11 >> > 00:40:14,675-0400 DEBUG (jsonrpc/1) >> > [jsonrpc.JsonRpcServer] Return >> > 'Host.getAllVmIoTunePolicies' in bridge with >> > {'8214433a-f233-4aaa-aeda-2ce1d31c78dc': {'policy': >> > [], 'current_values': [{'ioTune': {'write_bytes_sec': >> > 0L, 'total_iops_sec': 0L, 'read_iops_sec': 0L, >> > 'read_bytes_sec': 0L, 'write_iops_sec': 0L, >> > 'total_bytes_sec': 0L}, 'path': >> > '/rhev/data-center/mnt/blockSD/f1744940-41b6-4d35-b7bf- >> 870c4e07d995/images/9a3659dc-339e-4b3f-a1d1-6ee3debb6168/ >> e2570635-a798-4e9e-a0ea-88436d5b296f', >> > 'name': 'vda'}]}} (__init__:356) 2018-09-11 >> > 00:40:14,675-0400 INFO (jsonrpc/1) >> > [jsonrpc.JsonRpcServer] RPC call >> > Host.getAllVmIoTunePolicies succeeded in 0.00 seconds >> > (__init__:312) 2018-09-11 00:40:16,118-0400 DEBUG >> > (jsonrpc/7) [jsonrpc.JsonRpcServer] Calling >> > 'Host.getAllVmStats' in bridge with {} (__init__:329) >> > 2018-09-11 00:40:16,119-0400 INFO (jsonrpc/7) >> > [api.host] START getAllVmStats() >> > from=::ffff:192.168.201.4,49184 (api:47) 2018-09-11 >> > 00:40:16,121-0400 INFO (jsonrpc/7) [api.host] FINISH >> > getAllVmStats return={'status': {'message': 'Done', >> > 'code': 0}, 'statsList': (suppressed)} >> > from=::ffff:192.168.201.4,49184 (api:53) 2018-09-11 >> > 00:40:16,122-0400 DEBUG (jsonrpc/7) >> > [jsonrpc.JsonRpcServer] Return 'Host.getAllVmStats' in >> > bridge with (suppressed) (__init__:356) 2018-09-11 >> > 00:40:16,122-0400 INFO (jsonrpc/7) >> > [jsonrpc.JsonRpcServer] RPC call Host.getAllVmStats >> > succeeded in 0.01 seconds (__init__:312) 2018-09-11 >> > 00:40:17,956-0400 DEBUG (qgapoller/2) [vds] Not >> > sending QEMU-GA command 'guest-get-users' to >> > vm_id='8214433a-f233-4aaa-aeda-2ce1d31c78dc', command >> > is not supported (qemuguestagent:192) 2018-09-11 >> > 00:40:18,025-0400 DEBUG (periodic/3) [virt.periodic] >> > Looking for stale paused VMs (periodic:388) >> > >> > I can also see this in Super vdsm which may be related >> > but I an not 100% sure its related to current issue: >> > >> > MainProcess|jsonrpc/2::DEBUG::2018-09-11 >> > 00:33:20,012::cmdutils::159::root::(exec_cmd) SUCCESS: >> > <err> = ''; <rc> = 0 >> > ifup/on039d2d88d8ee4::DEBUG::2018-09-11 >> > 00:33:20,238::cmdutils::159::root::(exec_cmd) FAILED: >> > <err> = 'Running scope as unit >> > 2f1f417d-85b9-46c0-b7d7-6082836daaae.scope.\n/etc/sysconfig/
>> network-scripts/ifup-eth: >> > line 321: 20715 Terminated >> > /sbin/dhclient ${DHCLIENTARGS} >> > ${DEVICE}\nDevice "on039d2d88d8ee4" does not >> > exist.\nCannot find device "on039d2d88d8ee4"\nDevice >> > "on039d2d88d8ee4" does not exist.\nDevice >> > "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d >> > 8ee4" does not exist.\nDevice "on039d2d88d8ee4" does >> > not exist.\nDevice "on039d2d88d8ee4" does not >> > exist.\nDevice "on039d2d88d8ee4" does not >> > exist.\nDevice "on039d2d88d8ee4" does not >> > exist.\nDevice "on039d2d88d8ee4" does not >> > exist.\nDevic e "on039d2d88d8ee4" does not >> > exist.\nDevice "on039d2d88d8ee4" does not >> > exist.\nDevice "on039d2d88d8ee4" does not >> > exist.\nDevice "on039d2d88d8ee4" does not >> > exist.\nDevice "on039d2d88d8ee4" does not >> > exist.\nDevice "on039d2d88d8ee4" does not >> > exist.\nDevice "on039d2d88d8ee4" does not >> > exist.\nDevice "on039d2d88d8ee4" does not >> > exist.\nDevice "on039d2d88d8ee4" does not >> > exist.\nDevice "on039d2d88d8ee4" does not >> > exist.\nDevice "on039d2d88d8ee4" does not >> > exist.\nDevice "on039d2d88d 8ee4" does not >> > exist.\nDevice "on039d2d88d8ee4" does not >> > exist.\nDevice "on039d2d88d8ee4" does not >> > exist.\nDevice "on039d2d88d8ee4" does not >> > exist.\nDevice "on039d2d88d8ee4" does not >> > exist.\nDevice "on039d2d88d8ee4" does not >> > exist.\nDevic e "on039d2d88d8ee4" does not >> > exist.\nDevice "on039d2d88d8ee4" does not >> > exist.\nDevice "on039d2d88d8ee4" does not >> > exist.\nDevice "on039d2d88d8ee4" does not >> > exist.\nDevice "on039d2d88d8ee4" does not >> > exist.\nDevice "on039d2d88d8ee4" does not >> > exist.\nDevice "on039d2d88d8ee4" does not >> > exist.\nDevice "on039d2d88d8ee4" does not >> > exist.\nDevice "on039d2d88d8ee4" does not >> > exist.\nDevice "on039d2d88d8ee4" does not >> > exist.\nDevice "on039d2d88d8ee4" does not >> > exist.\nDevice "on039d2d88d 8ee4" does not >> > exist.\nDevice "on039d2d88d8ee4" does not >> > exist.\nDevice "on039d2d88d8ee4" does not >> > exist.\nDevice "on039d2d88d8ee4" does not >> > exist.\nDevice "on039d2d88d8ee4" does not >> > exist.\nDevice "on039d2d88d8ee4" does not >> > exist.\nDevic e "on039d2d88d8ee4" does not >> > exist.\nDevice "on039d2d88d8ee4" does not >> > exist.\nDevice "on039d2d88d8ee4" does not >> > exist.\nDevice "on039d2d88d8ee4" does not >> > exist.\nDevice "on039d2d88d8ee4" does not >> > exist.\nDevice "on039d2d88d8ee4" does not >> > exist.\nDevice "on039d2d88d8ee4" does not >> > exist.\nDevice "on039d2d88d8ee4" does not >> > exist.\nDevice "on039d2d88d8ee4" does not exist.\n'; >> > <rc> = 1 ifup/on039d2d88d8ee4::ERROR::2018-09-11 >> > 00:33:20,238::concurrent::202::root::(run) FINISH >> > thread <Thread(ifup/on039d2d88d8ee4, stopped daemon >> > 140281550731008)> failed Traceback (most recent call >> > last): File >> > "/usr/lib/python2.7/site-packages/vdsm/common/concurrent.py", >> > line 195, in run ret = func(*args, **kwargs) File >> > "/usr/lib/python2.7/site-packages/vdsm/network/ configurators/ifcfg.py", >> > line 948, in _exec_ifup _exec_ifup_by_name(iface.name, >> > cgroup) File >> > "/usr/lib/python2.7/site-packages/vdsm/network/ configurators/ifcfg.py", >> > line 934, in _exec_ifup_by_name >> > raise ConfigNetworkError(ERR_FAILED_IFUP, out[-1] >> > if out else '') ConfigNetworkError: (29, '\n') >> > >> >
What happens in basic-suite-master/test-scenarios/003_basic_networking.py : modify_host_0_ip_to_dhcp set VM_NETWORK to use DHCP, so vdsm is calling ifup, which tries to configure via DHCP in the background. The next test detach_vm_network_from_host_0 removes the network, while ifup still hangs in dhcp configuration, because in this VLAN there is no DHCP server. So from my point of view this error in supervdsm.log is legitimate, because VDSM was advised to configure via DHCP, which fails because the interface is removed during dhcp configuration. I did not recognize that this triggeres any chained errors.
>> > thanks, >> > Dafna >> >> >

but this test is currently failing no? On Thu, Sep 13, 2018 at 10:49 AM, Dominik Holler <dholler@redhat.com> wrote:
On Thu, 13 Sep 2018 10:32:12 +0100 Dafna Ron <dron@redhat.com> wrote:
Hi Dominic,
should detach_vm_network_from_host_0 be running while modify_host_0_ip_to_dhcp set VM_NETWORK is running?
modify_host_0_ip_to_dhcp just initiates the async dhcp client in the background on the host and returns in the test environment.
On Thu, Sep 13, 2018 at 8:05 AM, Dominik Holler <dholler@redhat.com> wrote:
On Wed, 12 Sep 2018 08:17:46 +0100 Dafna Ron <dron@redhat.com> wrote:
Thanks Dominic. Can you please review the network tests to fix this race between the two tests?
Dafna, why do you think the current behavior should be changed, or how the current behavior should be improved?
On Tue, Sep 11, 2018 at 9:53 PM, Dominik Holler <dholler@redhat.com> wrote:
On Tue, 11 Sep 2018 17:18:15 +0200 Dominik Holler <dholler@redhat.com> wrote:
On Tue, 11 Sep 2018 12:22:21 +0100 Dafna Ron <dron@redhat.com> wrote:
> Can someone take ownership to fix it? >
I will take ownership and track the fix.
> On Tue, Sep 11, 2018 at 12:04 PM, Petr Horacek > <phoracek@redhat.com> wrote: > > > vdsm.log is the same issue, supervdsm.log seems unrelated. > > > > 2018-09-11 11:59 GMT+02:00 Dominik Holler > > <dholler@redhat.com>: > >> On Tue, 11 Sep 2018 10:26:02 +0100 > >> Dafna Ron <dron@redhat.com> wrote: > >> > >> > Hi, > >> > > >> > I have been seeing random failures of tests in > >> > different projects caused by vdsm monitoring. > >> > > >> > I need someone from vdsm to please help debug this > >> > issue. > >> > >> Petr, is this the same problem like yesterday in > >> > >> [ovirt-devel] failure in ost test - Invalid argument - > >> help in debugging issue > >> > >> > From what I can see, the test suspend/resume vm failed > >> > because we could not query the status of the vm on the > >> > host. > >> > > >> > you can see full log from failed tests here: > >> > > >> > https://jenkins.ovirt.org/job/
ovirt-master_change-queue-test
> >> er/10208/artifact/basic-suite.el7.x86_64/test_logs/basic- > >> suite-master/post-004_basic_sanity.py/ > >> > > >> > Here are the errors that I can see in the vdsm which > >> > seem to suggest that there is an issue getting stats > >> > on the vm which was suspended and caused the failure > >> > of the test: > >> > > >> > > >> > > >> > 2018-09-11 00:40:05,896-0400 INFO (monitor/c1fe6e6) > >> > [storage.StorageDomain] Removing remnants of deleted > >> > images [] (fileSD:734) 2018-09-11 00:40:07,957-0400 > >> > DEBUG (qgapoller/1) [vds] Not sending QEMU-GA command > >> > 'guest-get-users' to > >> > vm_id='8214433a-f233-4aaa-aeda-2ce1d31c78dc', command > >> > is not supported (qemuguestagent:192) 2018-09-11 > >> > 00:40:08,068-0400 DEBUG (periodic/3) > >> > [virt.sampling.VMBulkstatsMonitor] sampled timestamp > >> > 4296118.49 elapsed 0.010 acquired True domains all > >> > (sampling:443) 2018-09-11 00:40:08,271-0400 DEBUG > >> > (jsonrpc/1) [jsonrpc.JsonRpcServer] Calling > >> > 'Image.prepare' in bridge with {u'allowIllegal': True, > >> > u'storagepoolID': > >> > u'e80a56d9-74da-498a-b010-4a9df287f11d', u'imageID': > >> > u'd4c831e6-02d2-4d89-b516-0ec4597 5e024', u'volumeID': > >> > u'15b07af1-625b-42e3-b62a-8e7c7a120a56', > >> > u'storagedomainID': > >> > u'f1744940-41b6-4d35-b7bf-870c4e07d995'} (__init__:329) > >> > > >> > > >> > 2018-09-11 00:40:10,846-0400 DEBUG (vmchannels) > >> > [virt.vm] > >> > (vmId='8214433a-f233-4aaa-aeda-2ce1d31c78dc') Guest > >> > connection timed out (guestagent:556) 2018-09-11 > >> > 00:40:11,637-0400 DEBUG (jsonrpc/5) > >> > [jsonrpc.JsonRpcServer] Calling 'Host.getStats' in > >> > bridge with {} (__init__:329) 2018-09-11 > >> > 00:40:11,637-0400 INFO (jsonrpc/5) [api.host] START > >> > getStats() from=::ffff:192.168.201.4,49184 (api:47) > >> > 2018-09-11 00:40:11,643-0400 DEBUG (jsonrpc/5) [root] > >> > cannot read eth0 speed (nic:42) 2018-09-11 > >> > 00:40:11,645-0400 DEBUG (jsonrpc/5) [root] cannot read > >> > eth1 speed (nic:42) 2018-09-11 00:40:11,647-0400 DEBUG > >> > (jsonrpc/5) [root] cannot read eth2 speed (nic:42) > >> > 2018-09-11 00:40:11,649-0400 DEBUG (jsonrpc/5) [root] > >> > cannot read eth3 speed (nic:42) 2018-09-11 > >> > 00:40:11,667-0400 INFO (jsonrpc/5) [api.host] FINISH > >> > getStats error=[Errno 22] Invalid argument > >> > from=::ffff:192.168.201.4,49184 (api:51) 2018-09-11 > >> > 00:40:11,667-0400 ERROR (jsonrpc/5) > >> > [jsonrpc.JsonRpcServer] Internal server error > >> > (__init__:350) Traceback (most recent call last): File > >> > "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", > >> > line 345, in _handle_request res = method(**params) > >> > File > >> > "/usr/lib/python2.7/site-packages/vdsm/rpc/Bridge.py", > >> > line 202, in _dynamicMethod result = fn(*methodArgs) > >> > File "<string>", line 2, in getStats File > >> > "/usr/lib/python2.7/site-packages/vdsm/common/api.py", > >> > line 49, in method ret = func(*args, **kwargs) > >> > File "/usr/lib/python2.7/site-packages/vdsm/API.py", > >> > line 1407, in getStats > >> > multipath=True)} > >> > File > >> > "/usr/lib/python2.7/site-packages/vdsm/host/api.py", > >> > line 49, in get_stats decStats = > >> > stats.produce(first_sample, last_sample) File > >> > "/usr/lib/python2.7/site-packages/vdsm/host/stats.py", > >> > line 71, in produce > >> > stats.update(get_interfaces_stats()) File > >> > "/usr/lib/python2.7/site-packages/vdsm/host/stats.py", > >> > line 153, in get_interfaces_stats return > >> > net_api.network_stats() File > >> > "/usr/lib/python2.7/site-packages/vdsm/network/api.py", > >> > line 63, in network_stats return netstats.report() > >> > File > >> > "/usr/lib/python2.7/site-packages/vdsm/network/
netstats.py",
> >> > line 31, in report stats = link_stats.report() > >> > File > >> > "/usr/lib/python2.7/site-packages/vdsm/network/link/ stats.py", > >> > line 41, in report speed = vlan.speed(i.device) > >> > File > >> > "/usr/lib/python2.7/site-packages/vdsm/network/link/ vlan.py", > >> > line 36, in speed dev_speed = > >> > nic.read_speed_using_sysfs(dev_name) File > >> > "/usr/lib/python2.7/site-packages/vdsm/network/link/ nic.py", > >> > line 48, in read_speed_using_sysfs s = int(f.read()) > >> > IOError: [Errno 22] Invalid argument > >> > 2018-09-11 00:40:11,669-0400 INFO (jsonrpc/5) > >> > [jsonrpc.JsonRpcServer] RPC call Host.getStats failed > >> > (error -32603) in 0.03 seconds (__init__:312) > >> > 2018-09-11 00:40:14,656-0400 DEBUG (jsonrpc/2) > >> > [jsonrpc.JsonRpcServer] Calling 'Host.getAllVmStats' > >> > in bridge with {} (__init__:329) 2018-09-11 > >> > 00:40:14,657-0400 INFO (jsonrpc/2) [api.host] START > >> > getAllVmStats() from=::1,46772 (api:47) 2018-09-11 > >> > 00:40:14,659-0400 INFO (jsonrpc/2) [api.host] FINISH > >> > getAllVmStats return={'status': {'message': 'Done', > >> > 'code': 0}, 'statsList': (suppressed)} from=::1,46772 > >> > (api:53) 2018-09-11 00:40:14,660-0400 DEBUG (jsonrpc/2) > >> > [jsonrpc.JsonRpcServer] Return 'Host.getAllVmStats' in > >> > bridge with (suppressed) (__init__:356) 2018-09-11 > >> > 00:40:14,661-0400 INFO (jsonrpc/2) > >> > [jsonrpc.JsonRpcServer] RPC call Host.getAllVmStats > >> > succeeded in 0.00 seconds (__init__:312) 2018-09-11 > >> > 00:40:14,674-0400 DEBUG (jsonrpc/1) > >> > [jsonrpc.JsonRpcServer] Calling > >> > 'Host.getAllVmIoTunePolicies' in bridge with {} > >> > (__init__:329) 2018-09-11 00:40:14,675-0400 INFO > >> > (jsonrpc/1) [api.host] START getAllVmIoTunePolicies() > >> > from=::1,46772 (api:47) 2018-09-11 00:40:14,675-0400 > >> > INFO (jsonrpc/1) [api.host] FINISH > >> > getAllVmIoTunePolicies return={'status': {'message': > >> > 'Done', 'code': 0}, 'io_tune_policies_dict': > >> > {'8214433a-f233-4aaa-aeda-2ce1d31c78dc': {'policy': [], > >> > 'current_values': [{'ioTune': {'write_bytes_sec': 0L, > >> > 'total_iops_sec': 0L, 'read_iops_sec': 0L, > >> > 'read_bytes_sec': 0L, 'write_iops_sec': 0L, > >> > 'total_bytes_sec': 0L}, 'path': > >> > '/rhev/data-center/mnt/blockSD/f1744940-41b6-4d35-b7bf- > >> 870c4e07d995/images/9a3659dc-339e-4b3f-a1d1-6ee3debb6168/ > >> e2570635-a798-4e9e-a0ea-88436d5b296f', > >> > 'name': 'vda'}]}}} from=::1,46772 (api:53) 2018-09-11 > >> > 00:40:14,675-0400 DEBUG (jsonrpc/1) > >> > [jsonrpc.JsonRpcServer] Return > >> > 'Host.getAllVmIoTunePolicies' in bridge with > >> > {'8214433a-f233-4aaa-aeda-2ce1d31c78dc': {'policy': > >> > [], 'current_values': [{'ioTune': {'write_bytes_sec': > >> > 0L, 'total_iops_sec': 0L, 'read_iops_sec': 0L, > >> > 'read_bytes_sec': 0L, 'write_iops_sec': 0L, > >> > 'total_bytes_sec': 0L}, 'path': > >> > '/rhev/data-center/mnt/blockSD/f1744940-41b6-4d35-b7bf- > >> 870c4e07d995/images/9a3659dc-339e-4b3f-a1d1-6ee3debb6168/ > >> e2570635-a798-4e9e-a0ea-88436d5b296f', > >> > 'name': 'vda'}]}} (__init__:356) 2018-09-11 > >> > 00:40:14,675-0400 INFO (jsonrpc/1) > >> > [jsonrpc.JsonRpcServer] RPC call > >> > Host.getAllVmIoTunePolicies succeeded in 0.00 seconds > >> > (__init__:312) 2018-09-11 00:40:16,118-0400 DEBUG > >> > (jsonrpc/7) [jsonrpc.JsonRpcServer] Calling > >> > 'Host.getAllVmStats' in bridge with {} (__init__:329) > >> > 2018-09-11 00:40:16,119-0400 INFO (jsonrpc/7) > >> > [api.host] START getAllVmStats() > >> > from=::ffff:192.168.201.4,49184 (api:47) 2018-09-11 > >> > 00:40:16,121-0400 INFO (jsonrpc/7) [api.host] FINISH > >> > getAllVmStats return={'status': {'message': 'Done', > >> > 'code': 0}, 'statsList': (suppressed)} > >> > from=::ffff:192.168.201.4,49184 (api:53) 2018-09-11 > >> > 00:40:16,122-0400 DEBUG (jsonrpc/7) > >> > [jsonrpc.JsonRpcServer] Return 'Host.getAllVmStats' in > >> > bridge with (suppressed) (__init__:356) 2018-09-11 > >> > 00:40:16,122-0400 INFO (jsonrpc/7) > >> > [jsonrpc.JsonRpcServer] RPC call Host.getAllVmStats > >> > succeeded in 0.01 seconds (__init__:312) 2018-09-11 > >> > 00:40:17,956-0400 DEBUG (qgapoller/2) [vds] Not > >> > sending QEMU-GA command 'guest-get-users' to > >> > vm_id='8214433a-f233-4aaa-aeda-2ce1d31c78dc', command > >> > is not supported (qemuguestagent:192) 2018-09-11 > >> > 00:40:18,025-0400 DEBUG (periodic/3) [virt.periodic] > >> > Looking for stale paused VMs (periodic:388) > >> > > >> > I can also see this in Super vdsm which may be related > >> > but I an not 100% sure its related to current issue: > >> > > >> > MainProcess|jsonrpc/2::DEBUG::2018-09-11 > >> > 00:33:20,012::cmdutils::159::root::(exec_cmd) SUCCESS: > >> > <err> = ''; <rc> = 0 > >> > ifup/on039d2d88d8ee4::DEBUG::2018-09-11 > >> > 00:33:20,238::cmdutils::159::root::(exec_cmd) FAILED: > >> > <err> = 'Running scope as unit > >> > 2f1f417d-85b9-46c0-b7d7-6082836daaae.scope.\n/etc/sysconfig/
> >> network-scripts/ifup-eth: > >> > line 321: 20715 Terminated > >> > /sbin/dhclient ${DHCLIENTARGS} > >> > ${DEVICE}\nDevice "on039d2d88d8ee4" does not > >> > exist.\nCannot find device "on039d2d88d8ee4"\nDevice > >> > "on039d2d88d8ee4" does not exist.\nDevice > >> > "on039d2d88d8ee4" does not exist.\nDevice "on039d2d88d > >> > 8ee4" does not exist.\nDevice "on039d2d88d8ee4" does > >> > not exist.\nDevice "on039d2d88d8ee4" does not > >> > exist.\nDevice "on039d2d88d8ee4" does not > >> > exist.\nDevice "on039d2d88d8ee4" does not > >> > exist.\nDevice "on039d2d88d8ee4" does not > >> > exist.\nDevic e "on039d2d88d8ee4" does not > >> > exist.\nDevice "on039d2d88d8ee4" does not > >> > exist.\nDevice "on039d2d88d8ee4" does not > >> > exist.\nDevice "on039d2d88d8ee4" does not > >> > exist.\nDevice "on039d2d88d8ee4" does not > >> > exist.\nDevice "on039d2d88d8ee4" does not > >> > exist.\nDevice "on039d2d88d8ee4" does not > >> > exist.\nDevice "on039d2d88d8ee4" does not > >> > exist.\nDevice "on039d2d88d8ee4" does not > >> > exist.\nDevice "on039d2d88d8ee4" does not > >> > exist.\nDevice "on039d2d88d8ee4" does not > >> > exist.\nDevice "on039d2d88d 8ee4" does not > >> > exist.\nDevice "on039d2d88d8ee4" does not > >> > exist.\nDevice "on039d2d88d8ee4" does not > >> > exist.\nDevice "on039d2d88d8ee4" does not > >> > exist.\nDevice "on039d2d88d8ee4" does not > >> > exist.\nDevice "on039d2d88d8ee4" does not > >> > exist.\nDevic e "on039d2d88d8ee4" does not > >> > exist.\nDevice "on039d2d88d8ee4" does not > >> > exist.\nDevice "on039d2d88d8ee4" does not > >> > exist.\nDevice "on039d2d88d8ee4" does not > >> > exist.\nDevice "on039d2d88d8ee4" does not > >> > exist.\nDevice "on039d2d88d8ee4" does not > >> > exist.\nDevice "on039d2d88d8ee4" does not > >> > exist.\nDevice "on039d2d88d8ee4" does not > >> > exist.\nDevice "on039d2d88d8ee4" does not > >> > exist.\nDevice "on039d2d88d8ee4" does not > >> > exist.\nDevice "on039d2d88d8ee4" does not > >> > exist.\nDevice "on039d2d88d 8ee4" does not > >> > exist.\nDevice "on039d2d88d8ee4" does not > >> > exist.\nDevice "on039d2d88d8ee4" does not > >> > exist.\nDevice "on039d2d88d8ee4" does not > >> > exist.\nDevice "on039d2d88d8ee4" does not > >> > exist.\nDevice "on039d2d88d8ee4" does not > >> > exist.\nDevic e "on039d2d88d8ee4" does not > >> > exist.\nDevice "on039d2d88d8ee4" does not > >> > exist.\nDevice "on039d2d88d8ee4" does not > >> > exist.\nDevice "on039d2d88d8ee4" does not > >> > exist.\nDevice "on039d2d88d8ee4" does not > >> > exist.\nDevice "on039d2d88d8ee4" does not > >> > exist.\nDevice "on039d2d88d8ee4" does not > >> > exist.\nDevice "on039d2d88d8ee4" does not > >> > exist.\nDevice "on039d2d88d8ee4" does not exist.\n'; > >> > <rc> = 1 ifup/on039d2d88d8ee4::ERROR::2018-09-11 > >> > 00:33:20,238::concurrent::202::root::(run) FINISH > >> > thread <Thread(ifup/on039d2d88d8ee4, stopped daemon > >> > 140281550731008)> failed Traceback (most recent call > >> > last): File > >> > "/usr/lib/python2.7/site-packages/vdsm/common/
concurrent.py",
> >> > line 195, in run ret = func(*args, **kwargs) File > >> > "/usr/lib/python2.7/site-packages/vdsm/network/ configurators/ifcfg.py", > >> > line 948, in _exec_ifup _exec_ifup_by_name(iface.name, > >> > cgroup) File > >> > "/usr/lib/python2.7/site-packages/vdsm/network/ configurators/ifcfg.py", > >> > line 934, in _exec_ifup_by_name > >> > raise ConfigNetworkError(ERR_FAILED_IFUP, out[-1] > >> > if out else '') ConfigNetworkError: (29, '\n') > >> > > >> >
What happens in basic-suite-master/test-scenarios/003_basic_networking.py : modify_host_0_ip_to_dhcp set VM_NETWORK to use DHCP, so vdsm is calling ifup, which tries to configure via DHCP in the background. The next test detach_vm_network_from_host_0 removes the network, while ifup still hangs in dhcp configuration, because in this VLAN there is no DHCP server. So from my point of view this error in supervdsm.log is legitimate, because VDSM was advised to configure via DHCP, which fails because the interface is removed during dhcp configuration. I did not recognize that this triggeres any chained errors.
> >> > thanks, > >> > Dafna > >> > >> > >

On Thu, 13 Sep 2018 10:53:20 +0100 Dafna Ron <dron@redhat.com> wrote:
but this test is currently failing no?
No, only the issue related to the vlan.speed() failed. This is already addressed in "failure in ost test - Invalid argument - help in debugging issue" and Petr provided a fix for master and ovirt-4.2 in https://gerrit.ovirt.org/#/c/94301/ So I am not aware of any network related OST fail.
On Thu, Sep 13, 2018 at 10:49 AM, Dominik Holler <dholler@redhat.com> wrote:
On Thu, 13 Sep 2018 10:32:12 +0100 Dafna Ron <dron@redhat.com> wrote:
Hi Dominic,
should detach_vm_network_from_host_0 be running while modify_host_0_ip_to_dhcp set VM_NETWORK is running?
modify_host_0_ip_to_dhcp just initiates the async dhcp client in the background on the host and returns in the test environment.
On Thu, Sep 13, 2018 at 8:05 AM, Dominik Holler <dholler@redhat.com> wrote:
On Wed, 12 Sep 2018 08:17:46 +0100 Dafna Ron <dron@redhat.com> wrote:
Thanks Dominic. Can you please review the network tests to fix this race between the two tests?
Dafna, why do you think the current behavior should be changed, or how the current behavior should be improved?
On Tue, Sep 11, 2018 at 9:53 PM, Dominik Holler <dholler@redhat.com> wrote:
On Tue, 11 Sep 2018 17:18:15 +0200 Dominik Holler <dholler@redhat.com> wrote:
> On Tue, 11 Sep 2018 12:22:21 +0100 > Dafna Ron <dron@redhat.com> wrote: > > > Can someone take ownership to fix it? > > > > I will take ownership and track the fix. > > > > On Tue, Sep 11, 2018 at 12:04 PM, Petr Horacek > > <phoracek@redhat.com> wrote: > > > > > vdsm.log is the same issue, supervdsm.log seems > > > unrelated. > > > > > > 2018-09-11 11:59 GMT+02:00 Dominik Holler > > > <dholler@redhat.com>: > > >> On Tue, 11 Sep 2018 10:26:02 +0100 > > >> Dafna Ron <dron@redhat.com> wrote: > > >> > > >> > Hi, > > >> > > > >> > I have been seeing random failures of tests in > > >> > different projects caused by vdsm monitoring. > > >> > > > >> > I need someone from vdsm to please help debug this > > >> > issue. > > >> > > >> Petr, is this the same problem like yesterday in > > >> > > >> [ovirt-devel] failure in ost test - Invalid argument > > >> - help in debugging issue > > >> > > >> > From what I can see, the test suspend/resume vm > > >> > failed because we could not query the status of > > >> > the vm on the host. > > >> > > > >> > you can see full log from failed tests here: > > >> > > > >> > https://jenkins.ovirt.org/job/
ovirt-master_change-queue-test
> > >> er/10208/artifact/basic-suite.el7.x86_64/test_logs/basic- > > >> suite-master/post-004_basic_sanity.py/ > > >> > > > >> > Here are the errors that I can see in the vdsm > > >> > which seem to suggest that there is an issue > > >> > getting stats on the vm which was suspended and > > >> > caused the failure of the test: > > >> > > > >> > > > >> > > > >> > 2018-09-11 00:40:05,896-0400 INFO > > >> > (monitor/c1fe6e6) [storage.StorageDomain] Removing > > >> > remnants of deleted images [] (fileSD:734) > > >> > 2018-09-11 00:40:07,957-0400 DEBUG (qgapoller/1) > > >> > [vds] Not sending QEMU-GA command > > >> > 'guest-get-users' to > > >> > vm_id='8214433a-f233-4aaa-aeda-2ce1d31c78dc', > > >> > command is not supported (qemuguestagent:192) > > >> > 2018-09-11 00:40:08,068-0400 DEBUG (periodic/3) > > >> > [virt.sampling.VMBulkstatsMonitor] sampled > > >> > timestamp 4296118.49 elapsed 0.010 acquired True > > >> > domains all (sampling:443) 2018-09-11 > > >> > 00:40:08,271-0400 DEBUG (jsonrpc/1) > > >> > [jsonrpc.JsonRpcServer] Calling 'Image.prepare' in > > >> > bridge with {u'allowIllegal': True, > > >> > u'storagepoolID': > > >> > u'e80a56d9-74da-498a-b010-4a9df287f11d', > > >> > u'imageID': u'd4c831e6-02d2-4d89-b516-0ec4597 > > >> > 5e024', u'volumeID': > > >> > u'15b07af1-625b-42e3-b62a-8e7c7a120a56', > > >> > u'storagedomainID': > > >> > u'f1744940-41b6-4d35-b7bf-870c4e07d995'} > > >> > (__init__:329) > > >> > > > >> > > > >> > 2018-09-11 00:40:10,846-0400 DEBUG (vmchannels) > > >> > [virt.vm] > > >> > (vmId='8214433a-f233-4aaa-aeda-2ce1d31c78dc') Guest > > >> > connection timed out (guestagent:556) 2018-09-11 > > >> > 00:40:11,637-0400 DEBUG (jsonrpc/5) > > >> > [jsonrpc.JsonRpcServer] Calling 'Host.getStats' in > > >> > bridge with {} (__init__:329) 2018-09-11 > > >> > 00:40:11,637-0400 INFO (jsonrpc/5) [api.host] START > > >> > getStats() from=::ffff:192.168.201.4,49184 (api:47) > > >> > 2018-09-11 00:40:11,643-0400 DEBUG (jsonrpc/5) > > >> > [root] cannot read eth0 speed (nic:42) 2018-09-11 > > >> > 00:40:11,645-0400 DEBUG (jsonrpc/5) [root] cannot > > >> > read eth1 speed (nic:42) 2018-09-11 > > >> > 00:40:11,647-0400 DEBUG (jsonrpc/5) [root] cannot > > >> > read eth2 speed (nic:42) 2018-09-11 > > >> > 00:40:11,649-0400 DEBUG (jsonrpc/5) [root] cannot > > >> > read eth3 speed (nic:42) 2018-09-11 > > >> > 00:40:11,667-0400 INFO (jsonrpc/5) [api.host] > > >> > FINISH getStats error=[Errno 22] Invalid argument > > >> > from=::ffff:192.168.201.4,49184 (api:51) > > >> > 2018-09-11 00:40:11,667-0400 ERROR (jsonrpc/5) > > >> > [jsonrpc.JsonRpcServer] Internal server error > > >> > (__init__:350) Traceback (most recent call last): > > >> > File > > >> > "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", > > >> > line 345, in _handle_request res = > > >> > method(**params) File > > >> > "/usr/lib/python2.7/site-packages/vdsm/rpc/Bridge.py", > > >> > line 202, in _dynamicMethod result = > > >> > fn(*methodArgs) File "<string>", line 2, in > > >> > getStats File > > >> > "/usr/lib/python2.7/site-packages/vdsm/common/api.py", > > >> > line 49, in method ret = func(*args, **kwargs) > > >> > File > > >> > "/usr/lib/python2.7/site-packages/vdsm/API.py", > > >> > line 1407, in getStats multipath=True)} File > > >> > "/usr/lib/python2.7/site-packages/vdsm/host/api.py", > > >> > line 49, in get_stats decStats = > > >> > stats.produce(first_sample, last_sample) File > > >> > "/usr/lib/python2.7/site-packages/vdsm/host/stats.py", > > >> > line 71, in produce > > >> > stats.update(get_interfaces_stats()) File > > >> > "/usr/lib/python2.7/site-packages/vdsm/host/stats.py", > > >> > line 153, in get_interfaces_stats return > > >> > net_api.network_stats() File > > >> > "/usr/lib/python2.7/site-packages/vdsm/network/api.py", > > >> > line 63, in network_stats return netstats.report() > > >> > File > > >> > "/usr/lib/python2.7/site-packages/vdsm/network/
netstats.py",
> > >> > line 31, in report stats = link_stats.report() > > >> > File > > >> > "/usr/lib/python2.7/site-packages/vdsm/network/link/ stats.py", > > >> > line 41, in report speed = vlan.speed(i.device) > > >> > File > > >> > "/usr/lib/python2.7/site-packages/vdsm/network/link/ vlan.py", > > >> > line 36, in speed dev_speed = > > >> > nic.read_speed_using_sysfs(dev_name) File > > >> > "/usr/lib/python2.7/site-packages/vdsm/network/link/ nic.py", > > >> > line 48, in read_speed_using_sysfs s = > > >> > int(f.read()) IOError: [Errno 22] Invalid argument > > >> > 2018-09-11 00:40:11,669-0400 INFO (jsonrpc/5) > > >> > [jsonrpc.JsonRpcServer] RPC call Host.getStats > > >> > failed (error -32603) in 0.03 seconds > > >> > (__init__:312) 2018-09-11 00:40:14,656-0400 DEBUG > > >> > (jsonrpc/2) [jsonrpc.JsonRpcServer] Calling > > >> > 'Host.getAllVmStats' in bridge with {} > > >> > (__init__:329) 2018-09-11 00:40:14,657-0400 INFO > > >> > (jsonrpc/2) [api.host] START getAllVmStats() > > >> > from=::1,46772 (api:47) 2018-09-11 > > >> > 00:40:14,659-0400 INFO (jsonrpc/2) [api.host] > > >> > FINISH getAllVmStats return={'status': {'message': > > >> > 'Done', 'code': 0}, 'statsList': (suppressed)} > > >> > from=::1,46772 (api:53) 2018-09-11 > > >> > 00:40:14,660-0400 DEBUG (jsonrpc/2) > > >> > [jsonrpc.JsonRpcServer] Return > > >> > 'Host.getAllVmStats' in bridge with (suppressed) > > >> > (__init__:356) 2018-09-11 00:40:14,661-0400 INFO > > >> > (jsonrpc/2) [jsonrpc.JsonRpcServer] RPC call > > >> > Host.getAllVmStats succeeded in 0.00 seconds > > >> > (__init__:312) 2018-09-11 00:40:14,674-0400 DEBUG > > >> > (jsonrpc/1) [jsonrpc.JsonRpcServer] Calling > > >> > 'Host.getAllVmIoTunePolicies' in bridge with {} > > >> > (__init__:329) 2018-09-11 00:40:14,675-0400 INFO > > >> > (jsonrpc/1) [api.host] START > > >> > getAllVmIoTunePolicies() from=::1,46772 (api:47) > > >> > 2018-09-11 00:40:14,675-0400 INFO (jsonrpc/1) > > >> > [api.host] FINISH getAllVmIoTunePolicies > > >> > return={'status': {'message': 'Done', 'code': 0}, > > >> > 'io_tune_policies_dict': > > >> > {'8214433a-f233-4aaa-aeda-2ce1d31c78dc': > > >> > {'policy': [], 'current_values': [{'ioTune': > > >> > {'write_bytes_sec': 0L, 'total_iops_sec': 0L, > > >> > 'read_iops_sec': 0L, 'read_bytes_sec': 0L, > > >> > 'write_iops_sec': 0L, 'total_bytes_sec': 0L}, > > >> > 'path': > > >> > '/rhev/data-center/mnt/blockSD/f1744940-41b6-4d35-b7bf- > > >> 870c4e07d995/images/9a3659dc-339e-4b3f-a1d1-6ee3debb6168/ > > >> e2570635-a798-4e9e-a0ea-88436d5b296f', > > >> > 'name': 'vda'}]}}} from=::1,46772 (api:53) > > >> > 2018-09-11 00:40:14,675-0400 DEBUG (jsonrpc/1) > > >> > [jsonrpc.JsonRpcServer] Return > > >> > 'Host.getAllVmIoTunePolicies' in bridge with > > >> > {'8214433a-f233-4aaa-aeda-2ce1d31c78dc': {'policy': > > >> > [], 'current_values': [{'ioTune': > > >> > {'write_bytes_sec': 0L, 'total_iops_sec': 0L, > > >> > 'read_iops_sec': 0L, 'read_bytes_sec': 0L, > > >> > 'write_iops_sec': 0L, 'total_bytes_sec': 0L}, > > >> > 'path': > > >> > '/rhev/data-center/mnt/blockSD/f1744940-41b6-4d35-b7bf- > > >> 870c4e07d995/images/9a3659dc-339e-4b3f-a1d1-6ee3debb6168/ > > >> e2570635-a798-4e9e-a0ea-88436d5b296f', > > >> > 'name': 'vda'}]}} (__init__:356) 2018-09-11 > > >> > 00:40:14,675-0400 INFO (jsonrpc/1) > > >> > [jsonrpc.JsonRpcServer] RPC call > > >> > Host.getAllVmIoTunePolicies succeeded in 0.00 > > >> > seconds (__init__:312) 2018-09-11 > > >> > 00:40:16,118-0400 DEBUG (jsonrpc/7) > > >> > [jsonrpc.JsonRpcServer] Calling > > >> > 'Host.getAllVmStats' in bridge with {} > > >> > (__init__:329) 2018-09-11 00:40:16,119-0400 INFO > > >> > (jsonrpc/7) [api.host] START getAllVmStats() > > >> > from=::ffff:192.168.201.4,49184 (api:47) > > >> > 2018-09-11 00:40:16,121-0400 INFO (jsonrpc/7) > > >> > [api.host] FINISH getAllVmStats return={'status': > > >> > {'message': 'Done', 'code': 0}, 'statsList': > > >> > (suppressed)} from=::ffff:192.168.201.4,49184 > > >> > (api:53) 2018-09-11 00:40:16,122-0400 DEBUG > > >> > (jsonrpc/7) [jsonrpc.JsonRpcServer] Return > > >> > 'Host.getAllVmStats' in bridge with (suppressed) > > >> > (__init__:356) 2018-09-11 00:40:16,122-0400 INFO > > >> > (jsonrpc/7) [jsonrpc.JsonRpcServer] RPC call > > >> > Host.getAllVmStats succeeded in 0.01 seconds > > >> > (__init__:312) 2018-09-11 00:40:17,956-0400 DEBUG > > >> > (qgapoller/2) [vds] Not sending QEMU-GA command > > >> > 'guest-get-users' to > > >> > vm_id='8214433a-f233-4aaa-aeda-2ce1d31c78dc', > > >> > command is not supported (qemuguestagent:192) > > >> > 2018-09-11 00:40:18,025-0400 DEBUG (periodic/3) > > >> > [virt.periodic] Looking for stale paused VMs > > >> > (periodic:388) > > >> > > > >> > I can also see this in Super vdsm which may be > > >> > related but I an not 100% sure its related to > > >> > current issue: > > >> > > > >> > MainProcess|jsonrpc/2::DEBUG::2018-09-11 > > >> > 00:33:20,012::cmdutils::159::root::(exec_cmd) > > >> > SUCCESS: <err> = ''; <rc> = 0 > > >> > ifup/on039d2d88d8ee4::DEBUG::2018-09-11 > > >> > 00:33:20,238::cmdutils::159::root::(exec_cmd) > > >> > FAILED: <err> = 'Running scope as unit > > >> > 2f1f417d-85b9-46c0-b7d7-6082836daaae.scope.\n/etc/sysconfig/
> > >> network-scripts/ifup-eth: > > >> > line 321: 20715 Terminated > > >> > /sbin/dhclient ${DHCLIENTARGS} > > >> > ${DEVICE}\nDevice "on039d2d88d8ee4" does not > > >> > exist.\nCannot find device > > >> > "on039d2d88d8ee4"\nDevice "on039d2d88d8ee4" does > > >> > not exist.\nDevice "on039d2d88d8ee4" does not > > >> > exist.\nDevice "on039d2d88d 8ee4" does not > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > >> > exist.\nDevic e "on039d2d88d8ee4" does not > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > >> > exist.\nDevice "on039d2d88d 8ee4" does not > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > >> > exist.\nDevic e "on039d2d88d8ee4" does not > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > >> > exist.\nDevice "on039d2d88d 8ee4" does not > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > >> > exist.\nDevic e "on039d2d88d8ee4" does not > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > >> > exist.\n'; <rc> = 1 > > >> > ifup/on039d2d88d8ee4::ERROR::2018-09-11 > > >> > 00:33:20,238::concurrent::202::root::(run) FINISH > > >> > thread <Thread(ifup/on039d2d88d8ee4, stopped > > >> > daemon 140281550731008)> failed Traceback (most > > >> > recent call last): File > > >> > "/usr/lib/python2.7/site-packages/vdsm/common/
concurrent.py",
> > >> > line 195, in run ret = func(*args, **kwargs) File > > >> > "/usr/lib/python2.7/site-packages/vdsm/network/ configurators/ifcfg.py", > > >> > line 948, in _exec_ifup > > >> > _exec_ifup_by_name(iface.name, cgroup) File > > >> > "/usr/lib/python2.7/site-packages/vdsm/network/ configurators/ifcfg.py", > > >> > line 934, in _exec_ifup_by_name > > >> > raise ConfigNetworkError(ERR_FAILED_IFUP, > > >> > out[-1] if out else '') ConfigNetworkError: (29, > > >> > '\n') > > >> > > > >> >
What happens in basic-suite-master/test-scenarios/003_basic_networking.py : modify_host_0_ip_to_dhcp set VM_NETWORK to use DHCP, so vdsm is calling ifup, which tries to configure via DHCP in the background. The next test detach_vm_network_from_host_0 removes the network, while ifup still hangs in dhcp configuration, because in this VLAN there is no DHCP server. So from my point of view this error in supervdsm.log is legitimate, because VDSM was advised to configure via DHCP, which fails because the interface is removed during dhcp configuration. I did not recognize that this triggeres any chained errors. > > >> > thanks, > > >> > Dafna > > >> > > >> > > > >

ok. I am re-triggering the patch to provide a fix for master since its still failing and I am not seeing a passed vdsm project. I will update on status of vdsm after the patch passes On Thu, Sep 13, 2018 at 11:01 AM, Dominik Holler <dholler@redhat.com> wrote:
On Thu, 13 Sep 2018 10:53:20 +0100 Dafna Ron <dron@redhat.com> wrote:
but this test is currently failing no?
No, only the issue related to the vlan.speed() failed. This is already addressed in "failure in ost test - Invalid argument - help in debugging issue" and Petr provided a fix for master and ovirt-4.2 in https://gerrit.ovirt.org/#/c/94301/
So I am not aware of any network related OST fail.
On Thu, Sep 13, 2018 at 10:49 AM, Dominik Holler <dholler@redhat.com> wrote:
On Thu, 13 Sep 2018 10:32:12 +0100 Dafna Ron <dron@redhat.com> wrote:
Hi Dominic,
should detach_vm_network_from_host_0 be running while modify_host_0_ip_to_dhcp set VM_NETWORK is running?
modify_host_0_ip_to_dhcp just initiates the async dhcp client in the background on the host and returns in the test environment.
On Thu, Sep 13, 2018 at 8:05 AM, Dominik Holler <dholler@redhat.com> wrote:
On Wed, 12 Sep 2018 08:17:46 +0100 Dafna Ron <dron@redhat.com> wrote:
Thanks Dominic. Can you please review the network tests to fix this race between the two tests?
Dafna, why do you think the current behavior should be changed, or how the current behavior should be improved?
On Tue, Sep 11, 2018 at 9:53 PM, Dominik Holler <dholler@redhat.com> wrote:
> On Tue, 11 Sep 2018 17:18:15 +0200 > Dominik Holler <dholler@redhat.com> wrote: > > > On Tue, 11 Sep 2018 12:22:21 +0100 > > Dafna Ron <dron@redhat.com> wrote: > > > > > Can someone take ownership to fix it? > > > > > > > I will take ownership and track the fix. > > > > > > > On Tue, Sep 11, 2018 at 12:04 PM, Petr Horacek > > > <phoracek@redhat.com> wrote: > > > > > > > vdsm.log is the same issue, supervdsm.log seems > > > > unrelated. > > > > > > > > 2018-09-11 11:59 GMT+02:00 Dominik Holler > > > > <dholler@redhat.com>: > > > >> On Tue, 11 Sep 2018 10:26:02 +0100 > > > >> Dafna Ron <dron@redhat.com> wrote: > > > >> > > > >> > Hi, > > > >> > > > > >> > I have been seeing random failures of tests in > > > >> > different projects caused by vdsm monitoring. > > > >> > > > > >> > I need someone from vdsm to please help debug this > > > >> > issue. > > > >> > > > >> Petr, is this the same problem like yesterday in > > > >> > > > >> [ovirt-devel] failure in ost test - Invalid argument > > > >> - help in debugging issue > > > >> > > > >> > From what I can see, the test suspend/resume vm > > > >> > failed because we could not query the status of > > > >> > the vm on the host. > > > >> > > > > >> > you can see full log from failed tests here: > > > >> > > > > >> > https://jenkins.ovirt.org/job/
> > > >> er/10208/artifact/basic-suite.
el7.x86_64/test_logs/basic-
> > > >> suite-master/post-004_basic_sanity.py/ > > > >> > > > > >> > Here are the errors that I can see in the vdsm > > > >> > which seem to suggest that there is an issue > > > >> > getting stats on the vm which was suspended and > > > >> > caused the failure of the test: > > > >> > > > > >> > > > > >> > > > > >> > 2018-09-11 00:40:05,896-0400 INFO > > > >> > (monitor/c1fe6e6) [storage.StorageDomain] Removing > > > >> > remnants of deleted images [] (fileSD:734) > > > >> > 2018-09-11 00:40:07,957-0400 DEBUG (qgapoller/1) > > > >> > [vds] Not sending QEMU-GA command > > > >> > 'guest-get-users' to > > > >> > vm_id='8214433a-f233-4aaa-aeda-2ce1d31c78dc', > > > >> > command is not supported (qemuguestagent:192) > > > >> > 2018-09-11 00:40:08,068-0400 DEBUG (periodic/3) > > > >> > [virt.sampling.VMBulkstatsMonitor] sampled > > > >> > timestamp 4296118.49 elapsed 0.010 acquired True > > > >> > domains all (sampling:443) 2018-09-11 > > > >> > 00:40:08,271-0400 DEBUG (jsonrpc/1) > > > >> > [jsonrpc.JsonRpcServer] Calling 'Image.prepare' in > > > >> > bridge with {u'allowIllegal': True, > > > >> > u'storagepoolID': > > > >> > u'e80a56d9-74da-498a-b010-4a9df287f11d', > > > >> > u'imageID': u'd4c831e6-02d2-4d89-b516-0ec4597 > > > >> > 5e024', u'volumeID': > > > >> > u'15b07af1-625b-42e3-b62a-8e7c7a120a56', > > > >> > u'storagedomainID': > > > >> > u'f1744940-41b6-4d35-b7bf-870c4e07d995'} > > > >> > (__init__:329) > > > >> > > > > >> > > > > >> > 2018-09-11 00:40:10,846-0400 DEBUG (vmchannels) > > > >> > [virt.vm] > > > >> > (vmId='8214433a-f233-4aaa-aeda-2ce1d31c78dc') Guest > > > >> > connection timed out (guestagent:556) 2018-09-11 > > > >> > 00:40:11,637-0400 DEBUG (jsonrpc/5) > > > >> > [jsonrpc.JsonRpcServer] Calling 'Host.getStats' in > > > >> > bridge with {} (__init__:329) 2018-09-11 > > > >> > 00:40:11,637-0400 INFO (jsonrpc/5) [api.host] START > > > >> > getStats() from=::ffff:192.168.201.4,49184 (api:47) > > > >> > 2018-09-11 00:40:11,643-0400 DEBUG (jsonrpc/5) > > > >> > [root] cannot read eth0 speed (nic:42) 2018-09-11 > > > >> > 00:40:11,645-0400 DEBUG (jsonrpc/5) [root] cannot > > > >> > read eth1 speed (nic:42) 2018-09-11 > > > >> > 00:40:11,647-0400 DEBUG (jsonrpc/5) [root] cannot > > > >> > read eth2 speed (nic:42) 2018-09-11 > > > >> > 00:40:11,649-0400 DEBUG (jsonrpc/5) [root] cannot > > > >> > read eth3 speed (nic:42) 2018-09-11 > > > >> > 00:40:11,667-0400 INFO (jsonrpc/5) [api.host] > > > >> > FINISH getStats error=[Errno 22] Invalid argument > > > >> > from=::ffff:192.168.201.4,49184 (api:51) > > > >> > 2018-09-11 00:40:11,667-0400 ERROR (jsonrpc/5) > > > >> > [jsonrpc.JsonRpcServer] Internal server error > > > >> > (__init__:350) Traceback (most recent call last): > > > >> > File > > > >> > "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.
> > > >> > line 345, in _handle_request res = > > > >> > method(**params) File > > > >> > "/usr/lib/python2.7/site-
> > > >> > line 202, in _dynamicMethod result = > > > >> > fn(*methodArgs) File "<string>", line 2, in > > > >> > getStats File > > > >> > "/usr/lib/python2.7/site-
> > > >> > line 49, in method ret = func(*args, **kwargs) > > > >> > File > > > >> > "/usr/lib/python2.7/site-packages/vdsm/API.py", > > > >> > line 1407, in getStats multipath=True)} File > > > >> > "/usr/lib/python2.7/site-packages/vdsm/host/api.py", > > > >> > line 49, in get_stats decStats = > > > >> > stats.produce(first_sample, last_sample) File > > > >> > "/usr/lib/python2.7/site-
> > > >> > line 71, in produce > > > >> > stats.update(get_interfaces_stats()) File > > > >> > "/usr/lib/python2.7/site-
> > > >> > line 153, in get_interfaces_stats return > > > >> > net_api.network_stats() File > > > >> > "/usr/lib/python2.7/site-
ovirt-master_change-queue-test py", packages/vdsm/rpc/Bridge.py", packages/vdsm/common/api.py", packages/vdsm/host/stats.py", packages/vdsm/host/stats.py", packages/vdsm/network/api.py",
> > > >> > line 63, in network_stats return netstats.report() > > > >> > File > > > >> > "/usr/lib/python2.7/site-packages/vdsm/network/ netstats.py", > > > >> > line 31, in report stats = link_stats.report() > > > >> > File > > > >> > "/usr/lib/python2.7/site-packages/vdsm/network/link/
stats.py",
> > > >> > line 41, in report speed = vlan.speed(i.device) > > > >> > File > > > >> > "/usr/lib/python2.7/site-packages/vdsm/network/link/
vlan.py",
> > > >> > line 36, in speed dev_speed = > > > >> > nic.read_speed_using_sysfs(dev_name) File > > > >> > "/usr/lib/python2.7/site-packages/vdsm/network/link/
nic.py",
> > > >> > line 48, in read_speed_using_sysfs s = > > > >> > int(f.read()) IOError: [Errno 22] Invalid argument > > > >> > 2018-09-11 00:40:11,669-0400 INFO (jsonrpc/5) > > > >> > [jsonrpc.JsonRpcServer] RPC call Host.getStats > > > >> > failed (error -32603) in 0.03 seconds > > > >> > (__init__:312) 2018-09-11 00:40:14,656-0400 DEBUG > > > >> > (jsonrpc/2) [jsonrpc.JsonRpcServer] Calling > > > >> > 'Host.getAllVmStats' in bridge with {} > > > >> > (__init__:329) 2018-09-11 00:40:14,657-0400 INFO > > > >> > (jsonrpc/2) [api.host] START getAllVmStats() > > > >> > from=::1,46772 (api:47) 2018-09-11 > > > >> > 00:40:14,659-0400 INFO (jsonrpc/2) [api.host] > > > >> > FINISH getAllVmStats return={'status': {'message': > > > >> > 'Done', 'code': 0}, 'statsList': (suppressed)} > > > >> > from=::1,46772 (api:53) 2018-09-11 > > > >> > 00:40:14,660-0400 DEBUG (jsonrpc/2) > > > >> > [jsonrpc.JsonRpcServer] Return > > > >> > 'Host.getAllVmStats' in bridge with (suppressed) > > > >> > (__init__:356) 2018-09-11 00:40:14,661-0400 INFO > > > >> > (jsonrpc/2) [jsonrpc.JsonRpcServer] RPC call > > > >> > Host.getAllVmStats succeeded in 0.00 seconds > > > >> > (__init__:312) 2018-09-11 00:40:14,674-0400 DEBUG > > > >> > (jsonrpc/1) [jsonrpc.JsonRpcServer] Calling > > > >> > 'Host.getAllVmIoTunePolicies' in bridge with {} > > > >> > (__init__:329) 2018-09-11 00:40:14,675-0400 INFO > > > >> > (jsonrpc/1) [api.host] START > > > >> > getAllVmIoTunePolicies() from=::1,46772 (api:47) > > > >> > 2018-09-11 00:40:14,675-0400 INFO (jsonrpc/1) > > > >> > [api.host] FINISH getAllVmIoTunePolicies > > > >> > return={'status': {'message': 'Done', 'code': 0}, > > > >> > 'io_tune_policies_dict': > > > >> > {'8214433a-f233-4aaa-aeda-2ce1d31c78dc': > > > >> > {'policy': [], 'current_values': [{'ioTune': > > > >> > {'write_bytes_sec': 0L, 'total_iops_sec': 0L, > > > >> > 'read_iops_sec': 0L, 'read_bytes_sec': 0L, > > > >> > 'write_iops_sec': 0L, 'total_bytes_sec': 0L}, > > > >> > 'path': > > > >> > '/rhev/data-center/mnt/blockSD/f1744940-41b6-4d35-b7bf-
> > > >> 870c4e07d995/images/9a3659dc- 339e-4b3f-a1d1-6ee3debb6168/ > > > >> e2570635-a798-4e9e-a0ea-88436d5b296f', > > > >> > 'name': 'vda'}]}}} from=::1,46772 (api:53) > > > >> > 2018-09-11 00:40:14,675-0400 DEBUG (jsonrpc/1) > > > >> > [jsonrpc.JsonRpcServer] Return > > > >> > 'Host.getAllVmIoTunePolicies' in bridge with > > > >> > {'8214433a-f233-4aaa-aeda-2ce1d31c78dc': {'policy': > > > >> > [], 'current_values': [{'ioTune': > > > >> > {'write_bytes_sec': 0L, 'total_iops_sec': 0L, > > > >> > 'read_iops_sec': 0L, 'read_bytes_sec': 0L, > > > >> > 'write_iops_sec': 0L, 'total_bytes_sec': 0L}, > > > >> > 'path': > > > >> > '/rhev/data-center/mnt/blockSD/f1744940-41b6-4d35-b7bf-
> > > >> 870c4e07d995/images/9a3659dc- 339e-4b3f-a1d1-6ee3debb6168/ > > > >> e2570635-a798-4e9e-a0ea-88436d5b296f', > > > >> > 'name': 'vda'}]}} (__init__:356) 2018-09-11 > > > >> > 00:40:14,675-0400 INFO (jsonrpc/1) > > > >> > [jsonrpc.JsonRpcServer] RPC call > > > >> > Host.getAllVmIoTunePolicies succeeded in 0.00 > > > >> > seconds (__init__:312) 2018-09-11 > > > >> > 00:40:16,118-0400 DEBUG (jsonrpc/7) > > > >> > [jsonrpc.JsonRpcServer] Calling > > > >> > 'Host.getAllVmStats' in bridge with {} > > > >> > (__init__:329) 2018-09-11 00:40:16,119-0400 INFO > > > >> > (jsonrpc/7) [api.host] START getAllVmStats() > > > >> > from=::ffff:192.168.201.4,49184 (api:47) > > > >> > 2018-09-11 00:40:16,121-0400 INFO (jsonrpc/7) > > > >> > [api.host] FINISH getAllVmStats return={'status': > > > >> > {'message': 'Done', 'code': 0}, 'statsList': > > > >> > (suppressed)} from=::ffff:192.168.201.4,49184 > > > >> > (api:53) 2018-09-11 00:40:16,122-0400 DEBUG > > > >> > (jsonrpc/7) [jsonrpc.JsonRpcServer] Return > > > >> > 'Host.getAllVmStats' in bridge with (suppressed) > > > >> > (__init__:356) 2018-09-11 00:40:16,122-0400 INFO > > > >> > (jsonrpc/7) [jsonrpc.JsonRpcServer] RPC call > > > >> > Host.getAllVmStats succeeded in 0.01 seconds > > > >> > (__init__:312) 2018-09-11 00:40:17,956-0400 DEBUG > > > >> > (qgapoller/2) [vds] Not sending QEMU-GA command > > > >> > 'guest-get-users' to > > > >> > vm_id='8214433a-f233-4aaa-aeda-2ce1d31c78dc', > > > >> > command is not supported (qemuguestagent:192) > > > >> > 2018-09-11 00:40:18,025-0400 DEBUG (periodic/3) > > > >> > [virt.periodic] Looking for stale paused VMs > > > >> > (periodic:388) > > > >> > > > > >> > I can also see this in Super vdsm which may be > > > >> > related but I an not 100% sure its related to > > > >> > current issue: > > > >> > > > > >> > MainProcess|jsonrpc/2::DEBUG::2018-09-11 > > > >> > 00:33:20,012::cmdutils::159::root::(exec_cmd) > > > >> > SUCCESS: <err> = ''; <rc> = 0 > > > >> > ifup/on039d2d88d8ee4::DEBUG::2018-09-11 > > > >> > 00:33:20,238::cmdutils::159::root::(exec_cmd) > > > >> > FAILED: <err> = 'Running scope as unit > > > >> > 2f1f417d-85b9-46c0-b7d7-6082836daaae.scope.\n/etc/sysconfig/
> > > >> network-scripts/ifup-eth: > > > >> > line 321: 20715 Terminated > > > >> > /sbin/dhclient ${DHCLIENTARGS} > > > >> > ${DEVICE}\nDevice "on039d2d88d8ee4" does not > > > >> > exist.\nCannot find device > > > >> > "on039d2d88d8ee4"\nDevice "on039d2d88d8ee4" does > > > >> > not exist.\nDevice "on039d2d88d8ee4" does not > > > >> > exist.\nDevice "on039d2d88d 8ee4" does not > > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > > >> > exist.\nDevic e "on039d2d88d8ee4" does not > > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > > >> > exist.\nDevice "on039d2d88d 8ee4" does not > > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > > >> > exist.\nDevic e "on039d2d88d8ee4" does not > > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > > >> > exist.\nDevice "on039d2d88d 8ee4" does not > > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > > >> > exist.\nDevic e "on039d2d88d8ee4" does not > > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > > >> > exist.\nDevice "on039d2d88d8ee4" does not > > > >> > exist.\n'; <rc> = 1 > > > >> > ifup/on039d2d88d8ee4::ERROR::2018-09-11 > > > >> > 00:33:20,238::concurrent::202::root::(run) FINISH > > > >> > thread <Thread(ifup/on039d2d88d8ee4, stopped > > > >> > daemon 140281550731008)> failed Traceback (most > > > >> > recent call last): File > > > >> > "/usr/lib/python2.7/site-packages/vdsm/common/
concurrent.py",
> > > >> > line 195, in run ret = func(*args, **kwargs) File > > > >> > "/usr/lib/python2.7/site-packages/vdsm/network/ > configurators/ifcfg.py", > > > >> > line 948, in _exec_ifup > > > >> > _exec_ifup_by_name(iface.name, cgroup) File > > > >> > "/usr/lib/python2.7/site-packages/vdsm/network/ > configurators/ifcfg.py", > > > >> > line 934, in _exec_ifup_by_name > > > >> > raise ConfigNetworkError(ERR_FAILED_IFUP, > > > >> > out[-1] if out else '') ConfigNetworkError: (29, > > > >> > '\n') > > > >> > > > > >> > > > > What happens in > basic-suite-master/test-scenarios/003_basic_networking.py : > modify_host_0_ip_to_dhcp set VM_NETWORK to use DHCP, > so vdsm is calling ifup, which tries to configure via DHCP > in the background. > The next test detach_vm_network_from_host_0 removes the > network, while ifup still hangs in dhcp configuration, > because in this VLAN there is no DHCP server. > So from my point of view this error in supervdsm.log is > legitimate, because VDSM was advised to configure via DHCP, > which fails because the interface is removed during dhcp > configuration. I did not recognize that this triggeres any > chained errors. > > > >> > thanks, > > > >> > Dafna > > > >> > > > >> > > > > > > > >
participants (3)
-
Dafna Ron
-
Dominik Holler
-
Petr Horacek