Martin, do you know what could be the reason for that?
I can see in the logs for both successful and unsuccessful
basic-suite-4.3 runs that there is no 'ntpdate' on host-1:
2019-03-25 10:14:46,350::ssh.py::ssh::58::lago.ssh::DEBUG::Running d0c49b54 on lago-basic-suite-4-3-host-1: ntpdate -4 lago-basic-suite-4-3-engine
2019-03-25 10:14:46,383::ssh.py::ssh::81::lago.ssh::DEBUG::Command d0c49b54 on lago-basic-suite-4-3-host-1 returned with 127
2019-03-25 10:14:46,384::ssh.py::ssh::96::lago.ssh::DEBUG::Command d0c49b54 on lago-basic-suite-4-3-host-1 errors:
bash: ntpdate: command not found
On host-0 everything is ok:
2019-03-25 10:14:46,917::ssh.py::ssh::58::lago.ssh::DEBUG::Running d11b2a64 on lago-basic-suite-4-3-host-0: ntpdate -4 lago-basic-suite-4-3-engine
2019-03-25 10:14:53,088::ssh.py::ssh::81::lago.ssh::DEBUG::Command d11b2a64 on lago-basic-suite-4-3-host-0 returned with 0
2019-03-25 10:14:53,088::ssh.py::ssh::89::lago.ssh::DEBUG::Command d11b2a64 on lago-basic-suite-4-3-host-0 output:
25 Mar 06:14:53 ntpdate[6646]: adjust time server 192.168.202.2 offset 0.017150 sec
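To compare hosts across runs without reading the log by eye, something like the following could pull out every command that exited non-zero (127 is what bash returns for "command not found"). This is only a rough sketch: the function name and the regular expressions are my own, derived from the lago.ssh records quoted above, and it assumes each record sits on a single line.

```python
import re

# Patterns derived from the lago.ssh DEBUG records quoted above (assumption:
# one record per line, i.e. the log has not been re-wrapped by a mail client).
RUNNING = re.compile(r'DEBUG::Running\s+(\w+)\s+on\s+([\w.-]+):\s+(.*)')
RETURNED = re.compile(
    r'DEBUG::Command\s+(\w+)\s+on\s+([\w.-]+)\s+returned\s+with\s+(\d+)')


def failed_commands(log_lines):
    """Return (host, command, exit_code) for every command that exited non-zero."""
    commands = {}  # command id -> command line, filled from "Running" records
    failures = []
    for line in log_lines:
        started = RUNNING.search(line)
        if started:
            cmd_id, _host, command = started.groups()
            commands[cmd_id] = command.strip()
            continue
        finished = RETURNED.search(line)
        if finished:
            cmd_id, host, code = finished.groups()
            if int(code) != 0:
                failures.append(
                    (host, commands.get(cmd_id, '<unknown>'), int(code)))
    return failures
```

Feeding it the lago log of a passing and a failing run would show whether the missing 'ntpdate' on host-1 is the only difference between them.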
On 3/25/19 10:13 AM, Eyal Edri wrote:
Still fails, now on a different component (ovirt-web-ui-extentions).
https://jenkins.ovirt.org/job/ovirt-4.3_change-queue-tester/339/
On Fri, Mar 22, 2019 at 3:59 PM Dan Kenigsberg <danken@redhat.com> wrote:
On Fri, Mar 22, 2019 at 3:21 PM Marcin Sobczyk <msobczyk@redhat.com> wrote:
Dafna,
in 'verify_add_hosts' we specifically wait for a single host to
be up with a timeout:
144     up_hosts = hosts_service.list(search='datacenter={} AND status=up'.format(DC_NAME))
145     if len(up_hosts):
146         return True
The log files say that it took ~50 secs for one of the hosts
to be up (seems reasonable) and no timeout is being reported.
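For readers not familiar with the pattern, "wait for a host to be up with a timeout" boils down to a polling loop like the one below. This is a generic sketch and not the helper OST actually uses; the name 'wait_for' and its parameters are made up for illustration.

```python
import time


def wait_for(predicate, timeout=180.0, interval=3.0,
             clock=time.time, sleep=time.sleep):
    """Poll predicate() until it is truthy or `timeout` seconds have passed.

    Hypothetical helper; `clock` and `sleep` are injectable so that the loop
    can be exercised without really waiting.
    """
    deadline = clock() + timeout
    while True:
        if predicate():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

Note that a successful return only says the condition held at the moment of the last poll. It says nothing about the state a few seconds later, when the next test queries Engine again.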
Just after running 'verify_add_hosts', we run
'add_master_storage_domain', which calls '_hosts_in_dc'
function.
That function does the exact same check, but it fails:
113     hosts = hosts_service.list(search='datacenter={} AND status=up'.format(dc_name))
114     if hosts:
115         if random_host:
116             return random.choice(hosts)
I don't think it is relevant to our current failure, but I
consider random_host=True a bad practice. As if we did not have
enough moving parts, we are adding intentional randomness.
Reproducibility is far more important than coverage - particularly
for a shared system test like OST.
117         else:
118             return sorted(hosts, key=lambda host: host.name)
119     raise RuntimeError('Could not find hosts that are up in DC %s' % dc_name)
I'm also not able to reproduce this issue locally on my
server. The investigation continues...
I think that it would be fair to take the filtering by host state
out of Engine and into the test, where we can easily log the
current status of each host. Then we'd have a better understanding
of the next failure.
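A rough sketch of what that could look like is below. The function names are hypothetical, and it assumes that 'hosts_service.list' accepts a search on 'datacenter=' alone and that each returned host exposes 'name' and 'status', with 'up' being the value reported for a running host. Sorting by name instead of picking a random host keeps the choice reproducible.

```python
import logging

LOGGER = logging.getLogger(__name__)


def _status_name(host):
    # Assumption: host.status is either an enum with a string `.value`
    # (as in the Python SDK) or already a plain string.
    status = host.status
    return str(getattr(status, 'value', status))


def hosts_in_dc_up(hosts_service, dc_name):
    """List all hosts of the DC, log every status, return the 'up' ones."""
    # Ask Engine only for DC membership; the status filter lives in the test.
    hosts = hosts_service.list(search='datacenter={}'.format(dc_name))
    for host in hosts:
        LOGGER.info('host %s is in status %s', host.name, _status_name(host))
    up_hosts = [host for host in hosts if _status_name(host) == 'up']
    if not up_hosts:
        raise RuntimeError(
            'Could not find hosts that are up in DC %s' % dc_name)
    # Deterministic order: the first element is always the same host.
    return sorted(up_hosts, key=lambda host: host.name)
```

With this in place, a failure would leave a line per host in the test log showing the state each host was in at the time of the check.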
On 3/22/19 1:17 PM, Marcin Sobczyk wrote:
>
> Hi,
>
> sure, I'm on it - it's weird though, I did run the 4.3 basic
> suite for this patch manually and everything was ok.
>
> On 3/22/19 1:05 PM, Dafna Ron wrote:
>> Hi,
>>
>> We are failing branch 4.3 for test:
>> 002_bootstrap.add_master_storage_domain
>>
>> It seems that on one of the hosts, vdsm is not starting;
>> there is nothing in vdsm.log or in supervdsm.log.
>>
>> CQ identified this patch as the suspected root cause:
>>
>> https://gerrit.ovirt.org/#/c/98748/ - vdsm: client: Add support for flow id
>>
>> Milan, Marcin, can you please have a look?
>>
>> full logs:
>>
>> http://jenkins.ovirt.org/job/ovirt-4.3_change-queue-tester/326/artifact/b...
>>
>> the only error I can see is about host not being up (makes
>> sense as vdsm is not running)
>>
>>
>> Stacktrace
>>
>>   File "/usr/lib64/python2.7/unittest/case.py", line 369, in run
>>     testMethod()
>>   File "/usr/lib/python2.7/site-packages/nose/case.py", line 197, in runTest
>>     self.test(*self.arg)
>>   File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line 142, in wrapped_test
>>     test()
>>   File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line 60, in wrapper
>>     return func(get_test_prefix(), *args, **kwargs)
>>   File "/home/jenkins/workspace/ovirt-4.3_change-queue-tester/ovirt-system-tests/basic-suite-4.3/test-scenarios/002_bootstrap.py", line 417, in add_master_storage_domain
>>     add_iscsi_storage_domain(prefix)
>>   File "/home/jenkins/workspace/ovirt-4.3_change-queue-tester/ovirt-system-tests/basic-suite-4.3/test-scenarios/002_bootstrap.py", line 561, in add_iscsi_storage_domain
>>     host=_random_host_from_dc(api, DC_NAME),
>>   File "/home/jenkins/workspace/ovirt-4.3_change-queue-tester/ovirt-system-tests/basic-suite-4.3/test-scenarios/002_bootstrap.py", line 122, in _random_host_from_dc
>>     return _hosts_in_dc(api, dc_name, True)
>>   File "/home/jenkins/workspace/ovirt-4.3_change-queue-tester/ovirt-system-tests/basic-suite-4.3/test-scenarios/002_bootstrap.py", line 119, in _hosts_in_dc
>>     raise RuntimeError('Could not find hosts that are up in DC %s' % dc_name)
>> 'Could not find hosts that are up in DC test-dc
>> -------------------- >> begin captured logging << --------------------
>> lago.ssh: DEBUG: start task:937bdea7-a2a3-47ad-9383-36647ea37ddf:Get ssh client for lago-basic-suite-4-3-engine:
>> lago.ssh: DEBUG: end task:937bdea7-a2a3-47ad-9383-36647ea37ddf:Get ssh client for lago-basic-suite-4-3-engine:
>> lago.ssh: DEBUG: Running c07b5ee2 on lago-basic-suite-4-3-engine: cat /root/multipath.txt
>> lago.ssh: DEBUG: Command c07b5ee2 on lago-basic-suite-4-3-engine returned with 0
>> lago.ssh: DEBUG: Command c07b5ee2 on lago-basic-suite-4-3-engine output:
>> 3600140516f88cafa71243648ea218995
>> 360014053e28f60001764fed9978ec4b3
>> 360014059edc777770114a6484891dcf1
>> 36001405d93d8585a50d43a4ad0bd8d19
>> 36001405e31361631de14bcf87d43e55a
>>
>> -----------
_______________________________________________
Devel mailing list -- devel(a)ovirt.org <mailto:devel@ovirt.org>
To unsubscribe send an email to devel-leave(a)ovirt.org
<mailto:devel-leave@ovirt.org>
Privacy Statement:
https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct:
https://www.ovirt.org/community/about/community-guidelines/
List Archives:
https://lists.ovirt.org/archives/list/devel@ovirt.org/message/J4NCHXTK5ZY...
--
Eyal Edri
MANAGER
RHV/CNV DevOps
EMEA VIRTUALIZATION R&D
Red Hat EMEA <https://www.redhat.com/>
TRIED. TESTED. TRUSTED. <https://redhat.com/trusted>
phone: +972-9-7692018
irc: eedri (on #tlv #rhev-dev #rhev-integ)