Weird CI failure - lago ssh client output cut in the middle (was: Re:
Change in ovirt-system-tests[master]: Rewrite local maintenance test)
by Yedidyah Bar David
On Tue, Jun 11, 2019 at 2:45 PM Code Review <gerrit(a)ovirt.org> wrote:
>
> From Jenkins CI <jenkins(a)ovirt.org>:
>
> Jenkins CI has posted comments on this change.
>
> Change subject: Rewrite local maintenance test
> ......................................................................
>
>
> Patch Set 11: Continuous-Integration-1
>
> Build Failed
>
> http://jenkins.ovirt.org/job/ovirt-system-tests_standard-check-patch/4758/ : FAILURE
Only one suite failed, he-basic-suite-master. It failed with:
https://jenkins.ovirt.org/job/ovirt-system-tests_standard-check-patch/475...
2019-06-11 10:57:55,509::log_utils.py::__enter__::600::lago.ssh::DEBUG::start
task:b2f51023-f985-4421-b096-930a187f09e7:Get ssh client for
lago-he-basic-suite-master-host-0:
2019-06-11 10:57:59,391::log_utils.py::__exit__::611::lago.ssh::DEBUG::end
task:b2f51023-f985-4421-b096-930a187f09e7:Get ssh client for
lago-he-basic-suite-master-host-0:
2019-06-11 10:58:01,123::ssh.py::ssh::58::lago.ssh::DEBUG::Running
c79824b8 on lago-he-basic-suite-master-host-0: hosted-engine
--vm-status --json
2019-06-11 10:58:02,321::ssh.py::ssh::81::lago.ssh::DEBUG::Command
c79824b8 on lago-he-basic-suite-master-host-0 returned with 0
2019-06-11 10:58:02,322::ssh.py::ssh::89::lago.ssh::DEBUG::Command
c79824b8 on lago-he-basic-suite-master-host-0 output:
{"1": {"conf_on_shared_storage": true, "live-data": true, "extra":
"metadata_parse_version=1\nmetadata_feature_version=1\ntimestamp=5333
(Tue Jun 11 06:57:58
2019)\nhost-id=1\nscore=2671\nvm_conf_refresh_time=5334 (Tue Jun 11
06:57:59 2019)\nconf_on_shared_storage=True\nmaintenance=False\nstate=GlobalMaintenance\nstopped=False\n",
"hostname": "lago-he-basic-suite-master-host-0.lago.local", "host-id":
1, "engine-status": {"reason": "failed liveliness check", "health":
"bad", "vm": "up", "detail": "Up"}, "score": 2671, "stopped": false,
"maintenance": false, "crc32": "0cc93af9", "local_conf_timestamp":
5334, "host-ts": 5333}, "2": {"conf_on_shared_storage": true,
"live-data": true, "extra":
"metadata_parse_version=1\nmetadata_feature_version=1\ntimestamp=5332
(Tue Jun 11 06:57:56
2019)\nhost-id=2\nscore=3400\nvm_conf_refresh_time=5332 (Tue Jun 11
06:57:56 2019)\nconf_on_shared_storage=True\nmaintenance=False\nstate=GlobalMaintenance\nstopped=False\n",
"hostname": "lago-he-basic-suite-master-host-1", "host-id":
2019-06-11 10:58:02,322::testlib.py::assert_equals_within::242::ovirtlago.testlib::ERROR::
* Unhandled exception in <function <lambda> at 0x7f149ea0dd70>
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line 234, in assert_equals_within
    res = func()
  File "/home/jenkins/workspace/ovirt-system-tests_standard-check-patch/ovirt-system-tests/he-basic-suite-master/test-scenarios/008_restart_he_vm.py", line 193, in <lambda>
    for k, v in _get_he_status(host).items()
  File "/home/jenkins/workspace/ovirt-system-tests_standard-check-patch/ovirt-system-tests/he-basic-suite-master/test-scenarios/008_restart_he_vm.py", line 134, in _get_he_status
    raise RuntimeError('could not parse JSON: %s' % ret.out)
RuntimeError: could not parse JSON: {"1": {"conf_on_shared_storage":
true, "live-data": true, "extra":
"metadata_parse_version=1\nmetadata_feature_version=1\ntimestamp=5333
(Tue Jun 11 06:57:58
2019)\nhost-id=1\nscore=2671\nvm_conf_refresh_time=5334 (Tue Jun 11
06:57:59 2019)\nconf_on_shared_storage=True\nmaintenance=False\nstate=GlobalMaintenance\nstopped=False\n",
"hostname": "lago-he-basic-suite-master-host-0.lago.local", "host-id":
1, "engine-status": {"reason": "failed liveliness check", "health":
"bad", "vm": "up", "detail": "Up"}, "score": 2671, "stopped": false,
"maintenance": false, "crc32": "0cc93af9", "local_conf_timestamp":
5334, "host-ts": 5333}, "2": {"conf_on_shared_storage": true,
"live-data": true, "extra":
"metadata_parse_version=1\nmetadata_feature_version=1\ntimestamp=5332
(Tue Jun 11 06:57:56
2019)\nhost-id=2\nscore=3400\nvm_conf_refresh_time=5332 (Tue Jun 11
06:57:56 2019)\nconf_on_shared_storage=True\nmaintenance=False\nstate=GlobalMaintenance\nstopped=False\n",
"hostname": "lago-he-basic-suite-master-host-1", "host-id":
Indeed, the JSON output of 'hosted-engine --vm-status --json' is cut off
in the middle, ending with: '"host-id": '.
This can happen either because the command wrote only partial output
(unlikely, IMO, especially since lago logged it as returning 0), or
because of some infra issue (lago, network, etc.).
I am going to retrigger anyway, but I do not feel very good about this.
Perhaps someone from infra can check this too, in case there was some
problem on the host that ran the job.
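If the truncation turns out to be transient, one option would be for the
test to retry the status call a few times before giving up, while still
logging enough to show that the output was chopped. The following is only
a rough sketch of that idea, not the actual _get_he_status code:
get_json_with_retry and run_command are hypothetical names, and
run_command stands in for whatever runs the command over ssh and returns
its stdout as a string.

```python
import json


def get_json_with_retry(run_command, attempts=3):
    """Call run_command() until its output parses as JSON.

    run_command is a hypothetical zero-argument callable that returns
    the command's stdout as a string.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        out = run_command()
        try:
            return json.loads(out)
        except ValueError as err:  # json.JSONDecodeError subclasses ValueError
            last_error = err
            # Log the length and the tail, so a chopped output stays visible
            # in the logs even if a later attempt succeeds.
            print('attempt %d: unparsable output, %d chars, tail=%r'
                  % (attempt, len(out), out[-40:]))
    raise RuntimeError('could not parse JSON after %d attempts: %s'
                       % (attempts, last_error))


if __name__ == '__main__':
    # Simulated outputs: the first one is cut off like the one in the log.
    outputs = iter(['{"1": {"host-id": ', '{"1": {"host-id": 1}}'])
    print(get_json_with_retry(lambda: next(outputs)))
```

A retry like this would only hide the symptom, so the logged length and
tail of each bad output would still need to be looked at to find out
where the data is lost.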
Thanks,
>
> --
> To view, visit https://gerrit.ovirt.org/100570
> To unsubscribe, visit https://gerrit.ovirt.org/settings
>
> Gerrit-Project: ovirt-system-tests
> Gerrit-Branch: master
> Gerrit-MessageType: comment
> Gerrit-Change-Id: I1c00426f65d92a5531415fee7c7b19ab0f4177e8
> Gerrit-Change-Number: 100570
> Gerrit-PatchSet: 11
> Gerrit-Owner: Yedidyah Bar David <didi(a)redhat.com>
> Gerrit-Reviewer: Anton Marchukov <amarchuk(a)redhat.com>
> Gerrit-Reviewer: Dafna Ron <dron(a)redhat.com>
> Gerrit-Reviewer: Eyal Edri <eedri(a)redhat.com>
> Gerrit-Reviewer: Gal Ben Haim <galbh2(a)gmail.com>
> Gerrit-Reviewer: Galit Rosenthal <grosenth(a)redhat.com>
> Gerrit-Reviewer: Jenkins CI <jenkins(a)ovirt.org>
> Gerrit-Reviewer: Sandro Bonazzola <sbonazzo(a)redhat.com>
> Gerrit-Reviewer: Simone Tiraboschi <stirabos(a)redhat.com>
> Gerrit-Reviewer: Yedidyah Bar David <didi(a)redhat.com>
> Gerrit-Comment-Date: Tue, 11 Jun 2019 11:45:03 +0000
> Gerrit-HasComments: No
--
Didi