Weird CI failure - lago ssh client output cut in the middle (was: Re: Change in ovirt-system-tests[master]: Rewrite local maintenance test)

On Tue, Jun 11, 2019 at 2:45 PM Code Review <gerrit@ovirt.org> wrote:
From Jenkins CI <jenkins@ovirt.org>:
Jenkins CI has posted comments on this change.
Change subject: Rewrite local maintenance test ......................................................................
Patch Set 11: Continuous-Integration-1
Build Failed
http://jenkins.ovirt.org/job/ovirt-system-tests_standard-check-patch/4758/ : FAILURE
Only one suite failed, basic-suite-master. It failed with:

https://jenkins.ovirt.org/job/ovirt-system-tests_standard-check-patch/4758//...

2019-06-11 10:57:55,509::log_utils.py::__enter__::600::lago.ssh::DEBUG::start task:b2f51023-f985-4421-b096-930a187f09e7:Get ssh client for lago-he-basic-suite-master-host-0:
2019-06-11 10:57:59,391::log_utils.py::__exit__::611::lago.ssh::DEBUG::end task:b2f51023-f985-4421-b096-930a187f09e7:Get ssh client for lago-he-basic-suite-master-host-0:
2019-06-11 10:58:01,123::ssh.py::ssh::58::lago.ssh::DEBUG::Running c79824b8 on lago-he-basic-suite-master-host-0: hosted-engine --vm-status --json
2019-06-11 10:58:02,321::ssh.py::ssh::81::lago.ssh::DEBUG::Command c79824b8 on lago-he-basic-suite-master-host-0 returned with 0
2019-06-11 10:58:02,322::ssh.py::ssh::89::lago.ssh::DEBUG::Command c79824b8 on lago-he-basic-suite-master-host-0 output:
 {"1": {"conf_on_shared_storage": true, "live-data": true, "extra": "metadata_parse_version=1\nmetadata_feature_version=1\ntimestamp=5333 (Tue Jun 11 06:57:58 2019)\nhost-id=1\nscore=2671\nvm_conf_refresh_time=5334 (Tue Jun 11 06:57:59 2019)\nconf_on_shared_storage=True\nmaintenance=False\nstate=GlobalMaintenance\nstopped=False\n", "hostname": "lago-he-basic-suite-master-host-0.lago.local", "host-id": 1, "engine-status": {"reason": "failed liveliness check", "health": "bad", "vm": "up", "detail": "Up"}, "score": 2671, "stopped": false, "maintenance": false, "crc32": "0cc93af9", "local_conf_timestamp": 5334, "host-ts": 5333}, "2": {"conf_on_shared_storage": true, "live-data": true, "extra": "metadata_parse_version=1\nmetadata_feature_version=1\ntimestamp=5332 (Tue Jun 11 06:57:56 2019)\nhost-id=2\nscore=3400\nvm_conf_refresh_time=5332 (Tue Jun 11 06:57:56 2019)\nconf_on_shared_storage=True\nmaintenance=False\nstate=GlobalMaintenance\nstopped=False\n", "hostname": "lago-he-basic-suite-master-host-1", "host-id":
2019-06-11 10:58:02,322::testlib.py::assert_equals_within::242::ovirtlago.testlib::ERROR:: * Unhandled exception in <function <lambda> at 0x7f149ea0dd70>
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line 234, in assert_equals_within
    res = func()
  File "/home/jenkins/workspace/ovirt-system-tests_standard-check-patch/ovirt-system-tests/he-basic-suite-master/test-scenarios/008_restart_he_vm.py", line 193, in <lambda>
    for k, v in _get_he_status(host).items()
  File "/home/jenkins/workspace/ovirt-system-tests_standard-check-patch/ovirt-system-tests/he-basic-suite-master/test-scenarios/008_restart_he_vm.py", line 134, in _get_he_status
    raise RuntimeError('could not parse JSON: %s' % ret.out)
RuntimeError: could not parse JSON: {"1": {"conf_on_shared_storage": true, "live-data": true, "extra": "metadata_parse_version=1\nmetadata_feature_version=1\ntimestamp=5333 (Tue Jun 11 06:57:58 2019)\nhost-id=1\nscore=2671\nvm_conf_refresh_time=5334 (Tue Jun 11 06:57:59 2019)\nconf_on_shared_storage=True\nmaintenance=False\nstate=GlobalMaintenance\nstopped=False\n", "hostname": "lago-he-basic-suite-master-host-0.lago.local", "host-id": 1, "engine-status": {"reason": "failed liveliness check", "health": "bad", "vm": "up", "detail": "Up"}, "score": 2671, "stopped": false, "maintenance": false, "crc32": "0cc93af9", "local_conf_timestamp": 5334, "host-ts": 5333}, "2": {"conf_on_shared_storage": true, "live-data": true, "extra": "metadata_parse_version=1\nmetadata_feature_version=1\ntimestamp=5332 (Tue Jun 11 06:57:56 2019)\nhost-id=2\nscore=3400\nvm_conf_refresh_time=5332 (Tue Jun 11 06:57:56 2019)\nconf_on_shared_storage=True\nmaintenance=False\nstate=GlobalMaintenance\nstopped=False\n", "hostname": "lago-he-basic-suite-master-host-1", "host-id":

Indeed, the json output from --vm-status is chopped in the middle, ending with: '"host-id": '. This can happen either due to the command writing only partial output (unlikely, imo), or some infra issue (lago, network, etc.).
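
If the truncation turns out to be transient, one possible test-side mitigation is to re-run the command when its output does not parse. Below is a minimal sketch only: it assumes a hypothetical zero-argument callable run_command that returns the command's stdout as a string. It is not the existing _get_he_status code, and the function and parameter names are made up for illustration.

```python
import json
import time


def parse_status_with_retry(run_command, attempts=3, delay=1.0):
    """Call run_command() until its output parses as JSON.

    run_command is a hypothetical callable (an assumption for this sketch)
    that returns the stdout of e.g. 'hosted-engine --vm-status --json'.
    """
    if attempts < 1:
        raise ValueError('attempts must be at least 1')
    last_output = ''
    for attempt in range(attempts):
        last_output = run_command()
        try:
            return json.loads(last_output)
        except ValueError:
            # Truncated or otherwise invalid JSON ends up here
            # (json.JSONDecodeError is a subclass of ValueError).
            if attempt < attempts - 1:
                time.sleep(delay)
    # Report only the tail of the output, which shows where it was cut.
    raise RuntimeError(
        'could not parse JSON after %d attempts; output ended with: %r'
        % (attempts, last_output[-40:])
    )
```

A retry like this would only hide the symptom; if the output is being cut by lago or by the network, that still needs to be looked at separately.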
I am going to retrigger anyway, but I do not feel very good about this. Perhaps someone from infra can check this too; there may have been some infra issue on the host.

Thanks,
--
To view, visit https://gerrit.ovirt.org/100570
To unsubscribe, visit https://gerrit.ovirt.org/settings

Gerrit-Project: ovirt-system-tests
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1c00426f65d92a5531415fee7c7b19ab0f4177e8
Gerrit-Change-Number: 100570
Gerrit-PatchSet: 11
Gerrit-Owner: Yedidyah Bar David <didi@redhat.com>
Gerrit-Reviewer: Anton Marchukov <amarchuk@redhat.com>
Gerrit-Reviewer: Dafna Ron <dron@redhat.com>
Gerrit-Reviewer: Eyal Edri <eedri@redhat.com>
Gerrit-Reviewer: Gal Ben Haim <galbh2@gmail.com>
Gerrit-Reviewer: Galit Rosenthal <grosenth@redhat.com>
Gerrit-Reviewer: Jenkins CI <jenkins@ovirt.org>
Gerrit-Reviewer: Sandro Bonazzola <sbonazzo@redhat.com>
Gerrit-Reviewer: Simone Tiraboschi <stirabos@redhat.com>
Gerrit-Reviewer: Yedidyah Bar David <didi@redhat.com>
Gerrit-Comment-Date: Tue, 11 Jun 2019 11:45:03 +0000
Gerrit-HasComments: No
--
Didi