but this test is currently failing no?
On Thu, Sep 13, 2018 at 10:49 AM, Dominik Holler <dholler(a)redhat.com> wrote:
On Thu, 13 Sep 2018 10:32:12 +0100
Dafna Ron <dron(a)redhat.com> wrote:
> Hi Dominic,
>
> should detach_vm_network_from_host_0 be running while
> modify_host_0_ip_to_dhcp set VM_NETWORK is running?
>
modify_host_0_ip_to_dhcp just initiates the async dhcp client in the
background on the host and returns in the test environment.
>
>
> On Thu, Sep 13, 2018 at 8:05 AM, Dominik Holler <dholler(a)redhat.com>
> wrote:
>
> > On Wed, 12 Sep 2018 08:17:46 +0100
> > Dafna Ron <dron(a)redhat.com> wrote:
> >
> > > Thanks Dominic.
> > > Can you please review the network tests to fix this race between
> > > the two tests?
> > >
> > >
> >
> >
> > Dafna, why do you think the current behavior should be changed, or
> > how the current behavior should be improved?
> >
> > > On Tue, Sep 11, 2018 at 9:53 PM, Dominik Holler
> > > <dholler(a)redhat.com> wrote:
> > >
> > > > On Tue, 11 Sep 2018 17:18:15 +0200
> > > > Dominik Holler <dholler(a)redhat.com> wrote:
> > > >
> > > > > On Tue, 11 Sep 2018 12:22:21 +0100
> > > > > Dafna Ron <dron(a)redhat.com> wrote:
> > > > >
> > > > > > Can someone take ownership to fix it?
> > > > > >
> > > > >
> > > > > I will take ownership and track the fix.
> > > > >
> > > > >
> > > > > > On Tue, Sep 11, 2018 at 12:04 PM, Petr Horacek
> > > > > > <phoracek(a)redhat.com> wrote:
> > > > > >
> > > > > > > vdsm.log is the same issue, supervdsm.log seems
unrelated.
> > > > > > >
> > > > > > > 2018-09-11 11:59 GMT+02:00 Dominik Holler
> > > > > > > <dholler(a)redhat.com>:
> > > > > > >> On Tue, 11 Sep 2018 10:26:02 +0100
> > > > > > >> Dafna Ron <dron(a)redhat.com> wrote:
> > > > > > >>
> > > > > > >> > Hi,
> > > > > > >> >
> > > > > > >> > I have been seeing random failures of tests
in
> > > > > > >> > different projects caused by vdsm
monitoring.
> > > > > > >> >
> > > > > > >> > I need someone from vdsm to please help debug
this
> > > > > > >> > issue.
> > > > > > >>
> > > > > > >> Petr, is this the same problem like yesterday in
> > > > > > >>
> > > > > > >> [ovirt-devel] failure in ost test - Invalid
argument -
> > > > > > >> help in debugging issue
> > > > > > >>
> > > > > > >> > From what I can see, the test suspend/resume
vm failed
> > > > > > >> > because we could not query the status of the
vm on the
> > > > > > >> > host.
> > > > > > >> >
> > > > > > >> > you can see full log from failed tests here:
> > > > > > >> >
> > > > > > >> >
https://jenkins.ovirt.org/job/
ovirt-master_change-queue-test
> >
> > > > > > >>
er/10208/artifact/basic-suite.el7.x86_64/test_logs/basic-
> > > > > > >> suite-master/post-004_basic_sanity.py/
> > > > > > >> >
> > > > > > >> > Here are the errors that I can see in the
vdsm which
> > > > > > >> > seem to suggest that there is an issue
getting stats
> > > > > > >> > on the vm which was suspended and caused the
failure
> > > > > > >> > of the test:
> > > > > > >> >
> > > > > > >> >
> > > > > > >> >
> > > > > > >> > 2018-09-11 00:40:05,896-0400 INFO
(monitor/c1fe6e6)
> > > > > > >> > [storage.StorageDomain] Removing remnants of
deleted
> > > > > > >> > images [] (fileSD:734) 2018-09-11
00:40:07,957-0400
> > > > > > >> > DEBUG (qgapoller/1) [vds] Not sending QEMU-GA
command
> > > > > > >> > 'guest-get-users' to
> > > > > > >> >
vm_id='8214433a-f233-4aaa-aeda-2ce1d31c78dc', command
> > > > > > >> > is not supported (qemuguestagent:192)
2018-09-11
> > > > > > >> > 00:40:08,068-0400 DEBUG (periodic/3)
> > > > > > >> > [virt.sampling.VMBulkstatsMonitor] sampled
timestamp
> > > > > > >> > 4296118.49 elapsed 0.010 acquired True
domains all
> > > > > > >> > (sampling:443) 2018-09-11 00:40:08,271-0400
DEBUG
> > > > > > >> > (jsonrpc/1) [jsonrpc.JsonRpcServer] Calling
> > > > > > >> > 'Image.prepare' in bridge with
{u'allowIllegal': True,
> > > > > > >> > u'storagepoolID':
> > > > > > >> >
u'e80a56d9-74da-498a-b010-4a9df287f11d', u'imageID':
> > > > > > >> > u'd4c831e6-02d2-4d89-b516-0ec4597
5e024', u'volumeID':
> > > > > > >> >
u'15b07af1-625b-42e3-b62a-8e7c7a120a56',
> > > > > > >> > u'storagedomainID':
> > > > > > >> >
u'f1744940-41b6-4d35-b7bf-870c4e07d995'} (__init__:329)
> > > > > > >> >
> > > > > > >> >
> > > > > > >> > 2018-09-11 00:40:10,846-0400 DEBUG
(vmchannels)
> > > > > > >> > [virt.vm]
> > > > > > >> >
(vmId='8214433a-f233-4aaa-aeda-2ce1d31c78dc') Guest
> > > > > > >> > connection timed out (guestagent:556)
2018-09-11
> > > > > > >> > 00:40:11,637-0400 DEBUG (jsonrpc/5)
> > > > > > >> > [jsonrpc.JsonRpcServer] Calling
'Host.getStats' in
> > > > > > >> > bridge with {} (__init__:329) 2018-09-11
> > > > > > >> > 00:40:11,637-0400 INFO (jsonrpc/5) [api.host]
START
> > > > > > >> > getStats() from=::ffff:192.168.201.4,49184
(api:47)
> > > > > > >> > 2018-09-11 00:40:11,643-0400 DEBUG
(jsonrpc/5) [root]
> > > > > > >> > cannot read eth0 speed (nic:42) 2018-09-11
> > > > > > >> > 00:40:11,645-0400 DEBUG (jsonrpc/5) [root]
cannot read
> > > > > > >> > eth1 speed (nic:42) 2018-09-11
00:40:11,647-0400 DEBUG
> > > > > > >> > (jsonrpc/5) [root] cannot read eth2 speed
(nic:42)
> > > > > > >> > 2018-09-11 00:40:11,649-0400 DEBUG
(jsonrpc/5) [root]
> > > > > > >> > cannot read eth3 speed (nic:42) 2018-09-11
> > > > > > >> > 00:40:11,667-0400 INFO (jsonrpc/5)
[api.host] FINISH
> > > > > > >> > getStats error=[Errno 22] Invalid argument
> > > > > > >> > from=::ffff:192.168.201.4,49184 (api:51)
2018-09-11
> > > > > > >> > 00:40:11,667-0400 ERROR (jsonrpc/5)
> > > > > > >> > [jsonrpc.JsonRpcServer] Internal server
error
> > > > > > >> > (__init__:350) Traceback (most recent call
last): File
> > > > > > >> >
"/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py",
> > > > > > >> > line 345, in _handle_request res =
method(**params)
> > > > > > >> > File
> > > > > > >> >
"/usr/lib/python2.7/site-packages/vdsm/rpc/Bridge.py",
> > > > > > >> > line 202, in _dynamicMethod result =
fn(*methodArgs)
> > > > > > >> > File "<string>", line 2, in
getStats File
> > > > > > >> >
"/usr/lib/python2.7/site-packages/vdsm/common/api.py",
> > > > > > >> > line 49, in method ret = func(*args,
**kwargs)
> > > > > > >> > File
"/usr/lib/python2.7/site-packages/vdsm/API.py",
> > > > > > >> > line 1407, in getStats
> > > > > > >> > multipath=True)}
> > > > > > >> > File
> > > > > > >> >
"/usr/lib/python2.7/site-packages/vdsm/host/api.py",
> > > > > > >> > line 49, in get_stats decStats =
> > > > > > >> > stats.produce(first_sample, last_sample)
File
> > > > > > >> >
"/usr/lib/python2.7/site-packages/vdsm/host/stats.py",
> > > > > > >> > line 71, in produce
> > > > > > >> > stats.update(get_interfaces_stats()) File
> > > > > > >> >
"/usr/lib/python2.7/site-packages/vdsm/host/stats.py",
> > > > > > >> > line 153, in get_interfaces_stats return
> > > > > > >> > net_api.network_stats() File
> > > > > > >> >
"/usr/lib/python2.7/site-packages/vdsm/network/api.py",
> > > > > > >> > line 63, in network_stats return
netstats.report()
> > > > > > >> > File
> > > > > > >> >
"/usr/lib/python2.7/site-packages/vdsm/network/
netstats.py",
> > > > > > >> > line 31, in report stats =
link_stats.report()
> > > > > > >> > File
> > > > > > >> >
"/usr/lib/python2.7/site-packages/vdsm/network/link/
> > stats.py",
> > > > > > >> > line 41, in report speed =
vlan.speed(i.device)
> > > > > > >> > File
> > > > > > >> >
"/usr/lib/python2.7/site-packages/vdsm/network/link/
vlan.py",
> > > > > > >> > line 36, in speed dev_speed =
> > > > > > >> > nic.read_speed_using_sysfs(dev_name) File
> > > > > > >> >
"/usr/lib/python2.7/site-packages/vdsm/network/link/
nic.py",
> > > > > > >> > line 48, in read_speed_using_sysfs s =
int(f.read())
> > > > > > >> > IOError: [Errno 22] Invalid argument
> > > > > > >> > 2018-09-11 00:40:11,669-0400 INFO
(jsonrpc/5)
> > > > > > >> > [jsonrpc.JsonRpcServer] RPC call
Host.getStats failed
> > > > > > >> > (error -32603) in 0.03 seconds
(__init__:312)
> > > > > > >> > 2018-09-11 00:40:14,656-0400 DEBUG
(jsonrpc/2)
> > > > > > >> > [jsonrpc.JsonRpcServer] Calling
'Host.getAllVmStats'
> > > > > > >> > in bridge with {} (__init__:329) 2018-09-11
> > > > > > >> > 00:40:14,657-0400 INFO (jsonrpc/2) [api.host]
START
> > > > > > >> > getAllVmStats() from=::1,46772 (api:47)
2018-09-11
> > > > > > >> > 00:40:14,659-0400 INFO (jsonrpc/2) [api.host]
FINISH
> > > > > > >> > getAllVmStats return={'status':
{'message': 'Done',
> > > > > > >> > 'code': 0}, 'statsList':
(suppressed)} from=::1,46772
> > > > > > >> > (api:53) 2018-09-11 00:40:14,660-0400 DEBUG
(jsonrpc/2)
> > > > > > >> > [jsonrpc.JsonRpcServer] Return
'Host.getAllVmStats' in
> > > > > > >> > bridge with (suppressed) (__init__:356)
2018-09-11
> > > > > > >> > 00:40:14,661-0400 INFO (jsonrpc/2)
> > > > > > >> > [jsonrpc.JsonRpcServer] RPC call
Host.getAllVmStats
> > > > > > >> > succeeded in 0.00 seconds (__init__:312)
2018-09-11
> > > > > > >> > 00:40:14,674-0400 DEBUG (jsonrpc/1)
> > > > > > >> > [jsonrpc.JsonRpcServer] Calling
> > > > > > >> > 'Host.getAllVmIoTunePolicies' in
bridge with {}
> > > > > > >> > (__init__:329) 2018-09-11 00:40:14,675-0400
INFO
> > > > > > >> > (jsonrpc/1) [api.host] START
getAllVmIoTunePolicies()
> > > > > > >> > from=::1,46772 (api:47) 2018-09-11
00:40:14,675-0400
> > > > > > >> > INFO (jsonrpc/1) [api.host] FINISH
> > > > > > >> > getAllVmIoTunePolicies
return={'status': {'message':
> > > > > > >> > 'Done', 'code': 0},
'io_tune_policies_dict':
> > > > > > >> >
{'8214433a-f233-4aaa-aeda-2ce1d31c78dc': {'policy': [],
> > > > > > >> > 'current_values': [{'ioTune':
{'write_bytes_sec': 0L,
> > > > > > >> > 'total_iops_sec': 0L,
'read_iops_sec': 0L,
> > > > > > >> > 'read_bytes_sec': 0L,
'write_iops_sec': 0L,
> > > > > > >> > 'total_bytes_sec': 0L},
'path':
> > > > > > >> >
'/rhev/data-center/mnt/blockSD/f1744940-41b6-4d35-b7bf-
> > > > > > >>
870c4e07d995/images/9a3659dc-339e-4b3f-a1d1-6ee3debb6168/
> > > > > > >> e2570635-a798-4e9e-a0ea-88436d5b296f',
> > > > > > >> > 'name': 'vda'}]}}}
from=::1,46772 (api:53) 2018-09-11
> > > > > > >> > 00:40:14,675-0400 DEBUG (jsonrpc/1)
> > > > > > >> > [jsonrpc.JsonRpcServer] Return
> > > > > > >> > 'Host.getAllVmIoTunePolicies' in
bridge with
> > > > > > >> >
{'8214433a-f233-4aaa-aeda-2ce1d31c78dc': {'policy':
> > > > > > >> > [], 'current_values':
[{'ioTune': {'write_bytes_sec':
> > > > > > >> > 0L, 'total_iops_sec': 0L,
'read_iops_sec': 0L,
> > > > > > >> > 'read_bytes_sec': 0L,
'write_iops_sec': 0L,
> > > > > > >> > 'total_bytes_sec': 0L},
'path':
> > > > > > >> >
'/rhev/data-center/mnt/blockSD/f1744940-41b6-4d35-b7bf-
> > > > > > >>
870c4e07d995/images/9a3659dc-339e-4b3f-a1d1-6ee3debb6168/
> > > > > > >> e2570635-a798-4e9e-a0ea-88436d5b296f',
> > > > > > >> > 'name': 'vda'}]}}
(__init__:356) 2018-09-11
> > > > > > >> > 00:40:14,675-0400 INFO (jsonrpc/1)
> > > > > > >> > [jsonrpc.JsonRpcServer] RPC call
> > > > > > >> > Host.getAllVmIoTunePolicies succeeded in 0.00
seconds
> > > > > > >> > (__init__:312) 2018-09-11 00:40:16,118-0400
DEBUG
> > > > > > >> > (jsonrpc/7) [jsonrpc.JsonRpcServer] Calling
> > > > > > >> > 'Host.getAllVmStats' in bridge with
{} (__init__:329)
> > > > > > >> > 2018-09-11 00:40:16,119-0400 INFO
(jsonrpc/7)
> > > > > > >> > [api.host] START getAllVmStats()
> > > > > > >> > from=::ffff:192.168.201.4,49184 (api:47)
2018-09-11
> > > > > > >> > 00:40:16,121-0400 INFO (jsonrpc/7)
[api.host] FINISH
> > > > > > >> > getAllVmStats return={'status':
{'message': 'Done',
> > > > > > >> > 'code': 0}, 'statsList':
(suppressed)}
> > > > > > >> > from=::ffff:192.168.201.4,49184 (api:53)
2018-09-11
> > > > > > >> > 00:40:16,122-0400 DEBUG (jsonrpc/7)
> > > > > > >> > [jsonrpc.JsonRpcServer] Return
'Host.getAllVmStats' in
> > > > > > >> > bridge with (suppressed) (__init__:356)
2018-09-11
> > > > > > >> > 00:40:16,122-0400 INFO (jsonrpc/7)
> > > > > > >> > [jsonrpc.JsonRpcServer] RPC call
Host.getAllVmStats
> > > > > > >> > succeeded in 0.01 seconds (__init__:312)
2018-09-11
> > > > > > >> > 00:40:17,956-0400 DEBUG (qgapoller/2) [vds]
Not
> > > > > > >> > sending QEMU-GA command
'guest-get-users' to
> > > > > > >> >
vm_id='8214433a-f233-4aaa-aeda-2ce1d31c78dc', command
> > > > > > >> > is not supported (qemuguestagent:192)
2018-09-11
> > > > > > >> > 00:40:18,025-0400 DEBUG (periodic/3)
[virt.periodic]
> > > > > > >> > Looking for stale paused VMs (periodic:388)
> > > > > > >> >
> > > > > > >> > I can also see this in Super vdsm which may
be related
> > > > > > >> > but I an not 100% sure its related to current
issue:
> > > > > > >> >
> > > > > > >> > MainProcess|jsonrpc/2::DEBUG::2018-09-11
> > > > > > >> > 00:33:20,012::cmdutils::159::root::(exec_cmd)
SUCCESS:
> > > > > > >> > <err> = ''; <rc> = 0
> > > > > > >> > ifup/on039d2d88d8ee4::DEBUG::2018-09-11
> > > > > > >> > 00:33:20,238::cmdutils::159::root::(exec_cmd)
FAILED:
> > > > > > >> > <err> = 'Running scope as unit
> > > > > > >> >
2f1f417d-85b9-46c0-b7d7-6082836daaae.scope.\n/etc/sysconfig/
> >
> > > > > > >> network-scripts/ifup-eth:
> > > > > > >> > line 321: 20715 Terminated
> > > > > > >> > /sbin/dhclient ${DHCLIENTARGS}
> > > > > > >> > ${DEVICE}\nDevice "on039d2d88d8ee4"
does not
> > > > > > >> > exist.\nCannot find device
"on039d2d88d8ee4"\nDevice
> > > > > > >> > "on039d2d88d8ee4" does not
exist.\nDevice
> > > > > > >> > "on039d2d88d8ee4" does not
exist.\nDevice "on039d2d88d
> > > > > > >> > 8ee4" does not exist.\nDevice
"on039d2d88d8ee4" does
> > > > > > >> > not exist.\nDevice
"on039d2d88d8ee4" does not
> > > > > > >> > exist.\nDevice "on039d2d88d8ee4"
does not
> > > > > > >> > exist.\nDevice "on039d2d88d8ee4"
does not
> > > > > > >> > exist.\nDevice "on039d2d88d8ee4"
does not
> > > > > > >> > exist.\nDevic e "on039d2d88d8ee4"
does not
> > > > > > >> > exist.\nDevice "on039d2d88d8ee4"
does not
> > > > > > >> > exist.\nDevice "on039d2d88d8ee4"
does not
> > > > > > >> > exist.\nDevice "on039d2d88d8ee4"
does not
> > > > > > >> > exist.\nDevice "on039d2d88d8ee4"
does not
> > > > > > >> > exist.\nDevice "on039d2d88d8ee4"
does not
> > > > > > >> > exist.\nDevice "on039d2d88d8ee4"
does not
> > > > > > >> > exist.\nDevice "on039d2d88d8ee4"
does not
> > > > > > >> > exist.\nDevice "on039d2d88d8ee4"
does not
> > > > > > >> > exist.\nDevice "on039d2d88d8ee4"
does not
> > > > > > >> > exist.\nDevice "on039d2d88d8ee4"
does not
> > > > > > >> > exist.\nDevice "on039d2d88d 8ee4"
does not
> > > > > > >> > exist.\nDevice "on039d2d88d8ee4"
does not
> > > > > > >> > exist.\nDevice "on039d2d88d8ee4"
does not
> > > > > > >> > exist.\nDevice "on039d2d88d8ee4"
does not
> > > > > > >> > exist.\nDevice "on039d2d88d8ee4"
does not
> > > > > > >> > exist.\nDevice "on039d2d88d8ee4"
does not
> > > > > > >> > exist.\nDevic e "on039d2d88d8ee4"
does not
> > > > > > >> > exist.\nDevice "on039d2d88d8ee4"
does not
> > > > > > >> > exist.\nDevice "on039d2d88d8ee4"
does not
> > > > > > >> > exist.\nDevice "on039d2d88d8ee4"
does not
> > > > > > >> > exist.\nDevice "on039d2d88d8ee4"
does not
> > > > > > >> > exist.\nDevice "on039d2d88d8ee4"
does not
> > > > > > >> > exist.\nDevice "on039d2d88d8ee4"
does not
> > > > > > >> > exist.\nDevice "on039d2d88d8ee4"
does not
> > > > > > >> > exist.\nDevice "on039d2d88d8ee4"
does not
> > > > > > >> > exist.\nDevice "on039d2d88d8ee4"
does not
> > > > > > >> > exist.\nDevice "on039d2d88d8ee4"
does not
> > > > > > >> > exist.\nDevice "on039d2d88d 8ee4"
does not
> > > > > > >> > exist.\nDevice "on039d2d88d8ee4"
does not
> > > > > > >> > exist.\nDevice "on039d2d88d8ee4"
does not
> > > > > > >> > exist.\nDevice "on039d2d88d8ee4"
does not
> > > > > > >> > exist.\nDevice "on039d2d88d8ee4"
does not
> > > > > > >> > exist.\nDevice "on039d2d88d8ee4"
does not
> > > > > > >> > exist.\nDevic e "on039d2d88d8ee4"
does not
> > > > > > >> > exist.\nDevice "on039d2d88d8ee4"
does not
> > > > > > >> > exist.\nDevice "on039d2d88d8ee4"
does not
> > > > > > >> > exist.\nDevice "on039d2d88d8ee4"
does not
> > > > > > >> > exist.\nDevice "on039d2d88d8ee4"
does not
> > > > > > >> > exist.\nDevice "on039d2d88d8ee4"
does not
> > > > > > >> > exist.\nDevice "on039d2d88d8ee4"
does not
> > > > > > >> > exist.\nDevice "on039d2d88d8ee4"
does not
> > > > > > >> > exist.\nDevice "on039d2d88d8ee4"
does not exist.\n';
> > > > > > >> > <rc> = 1
ifup/on039d2d88d8ee4::ERROR::2018-09-11
> > > > > > >> > 00:33:20,238::concurrent::202::root::(run)
FINISH
> > > > > > >> > thread <Thread(ifup/on039d2d88d8ee4,
stopped daemon
> > > > > > >> > 140281550731008)> failed Traceback (most
recent call
> > > > > > >> > last): File
> > > > > > >> >
"/usr/lib/python2.7/site-packages/vdsm/common/
concurrent.py",
> > > > > > >> > line 195, in run ret = func(*args, **kwargs)
File
> > > > > > >> >
"/usr/lib/python2.7/site-packages/vdsm/network/
> > > > configurators/ifcfg.py",
> > > > > > >> > line 948, in _exec_ifup
_exec_ifup_by_name(iface.name,
> > > > > > >> > cgroup) File
> > > > > > >> >
"/usr/lib/python2.7/site-packages/vdsm/network/
> > > > configurators/ifcfg.py",
> > > > > > >> > line 934, in _exec_ifup_by_name
> > > > > > >> > raise ConfigNetworkError(ERR_FAILED_IFUP,
out[-1]
> > > > > > >> > if out else '') ConfigNetworkError:
(29, '\n')
> > > > > > >> >
> > > > > > >> >
> > > >
> > > >
> > > > What happens in
> > > > basic-suite-master/test-scenarios/003_basic_networking.py :
> > > > modify_host_0_ip_to_dhcp set VM_NETWORK to use DHCP,
> > > > so vdsm is calling ifup, which tries to configure via DHCP in
> > > > the background.
> > > > The next test detach_vm_network_from_host_0 removes the network,
> > > > while ifup still hangs in dhcp configuration, because in this
> > > > VLAN there is no DHCP server.
> > > > So from my point of view this error in supervdsm.log is
> > > > legitimate, because VDSM was advised to configure via DHCP,
> > > > which fails because the interface is removed during dhcp
> > > > configuration. I did not recognize that this triggeres any
> > > > chained errors.
> > > > > > >> > thanks,
> > > > > > >> > Dafna
> > > > > > >>
> > > > > > >>
> > > > > > >
> > > > >
> > > >
> > > >
> >
> >