[JIRA] (OVIRT-2794) OST is broken since this morning - looks like infra issue

[ https://ovirt-jira.atlassian.net/browse/OVIRT-2794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=39792#comment-39792 ]

Anton Marchukov commented on OVIRT-2794:
----------------------------------------

Just to clarify: this currently affects the manual OST and nightly OST jobs (the ones that were switched to containers).

I see that docker_cleanup.sh handles "ImageNotFoundException", but that is not what the docker library throws in this case; we get an HTTP 404 error when it goes through the docker API. So we suspect that we now have some incompatibility between the API of the docker library inside the container and the actual docker running on the host. It might, though, just be a bug in the docker client failing to properly re-raise the HTTP exception as the proper one.

Since it fails all manual jobs, I have applied a workaround right inside the Jenkins job by changing "cleanup_docker || failed=true" to "cleanup_docker || failed=false". It is not committed to git yet, as I assume it is just a temporary workaround.
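
As a rough illustration of the exception mismatch discussed in the comment above: the traceback quoted in the issue description ends in docker.errors.NotFound, whereas the cleanup script reportedly handles only an image-not-found exception. The sketch below is not the actual docker_cleanup code. remove_image_if_present is a hypothetical helper, and the sketch assumes docker-py's exception hierarchy, in which ImageNotFound derives from NotFound (stand-in classes with that shape are defined if docker-py is not installed). Under that assumption, catching the broader NotFound would cover both cases.

```python
try:
    from docker.errors import ImageNotFound, NotFound
except ImportError:
    # Stand-ins so the sketch runs without docker-py installed.
    # Assumption: they mirror docker-py, where ImageNotFound
    # is a subclass of NotFound.
    class NotFound(Exception):
        pass

    class ImageNotFound(NotFound):
        pass


def remove_image_if_present(client, image_id, force=False):
    """Remove an image, treating any "not found" reply as already gone.

    Hypothetical helper, not part of docker_cleanup.py.
    Returns True if the image was removed, False if it was not found.
    """
    try:
        # Same call that appears in the traceback from the issue.
        client.images.remove(image_id, force=force)
    except NotFound as err:
        # Also catches ImageNotFound, because it derives from NotFound.
        print("Image %s not found, skipping: %s" % (image_id, err))
        return False
    return True
```

Any other exception (for example a conflict or a server error) still propagates, so the cleanup would keep failing loudly on problems that are not simply a missing image.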
OST is broken since this morning - looks like infra issue
---------------------------------------------------------
       Key: OVIRT-2794
       URL: https://ovirt-jira.atlassian.net/browse/OVIRT-2794
   Project: oVirt - virtualization made easy
Issue Type: By-EMAIL
  Reporter: Nir Soffer
  Assignee: infra
The last successful build was today at 08:10. Since then all builds fail very early with the error below - which is not related to oVirt.

Removing image: sha256:f8e5aa8e979155e074411bfef9adade6cdcdf3a5a2eb1d5ad2dbf0288d585ffa, force=True
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/docker/api/client.py", line 222, in _raise_for_status
    response.raise_for_status()
  File "/usr/lib/python3.6/site-packages/requests/models.py", line 893, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: http+docker://localunixsocket/v1.30/images/sha256:f8e5aa8e979155e074411bfef9adade6cdcdf3a5a2eb1d5ad2dbf0288d585ffa?force=True&noprune=False

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/jenkins/workspace/ovirt-system-tests_manual/jenkins/scripts/docker_cleanup.py", line 349, in <module>
    main()
  File "/home/jenkins/workspace/ovirt-system-tests_manual/jenkins/scripts/docker_cleanup.py", line 37, in main
    safe_image_cleanup(client, whitelisted_repos)
  File "/home/jenkins/workspace/ovirt-system-tests_manual/jenkins/scripts/docker_cleanup.py", line 107, in safe_image_cleanup
    _safe_rm(client, parent)
  File "/home/jenkins/workspace/ovirt-system-tests_manual/jenkins/scripts/docker_cleanup.py", line 329, in _safe_rm
    client.images.remove(image_id, force=force)
  File "/usr/lib/python3.6/site-packages/docker/models/images.py", line 288, in remove
    self.client.api.remove_image(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/docker/utils/decorators.py", line 19, in wrapped
    return f(self, resource_id, *args, **kwargs)
  File "/usr/lib/python3.6/site-packages/docker/api/image.py", line 481, in remove_image
    return self._result(res, True)
  File "/usr/lib/python3.6/site-packages/docker/api/client.py", line 228, in _result
    self._raise_for_status(response)
  File "/usr/lib/python3.6/site-packages/docker/api/client.py", line 224, in _raise_for_status
    raise create_api_error_from_http_exception(e)
  File "/usr/lib/python3.6/site-packages/docker/errors.py", line 31, in create_api_error_from_http_exception
    raise cls(e, response=response, explanation=explanation)
docker.errors.NotFound: 404 Client Error: Not Found ("reference does not exist")
Aborting.
Build step 'Execute shell' marked build as failure

Build history (console output for each build is at <build URL>console):

#5542  Failed   Sep 5, 2019 3:02 PM   https://jenkins.ovirt.org/job/ovirt-system-tests_manual/5542/
#5541  Failed   Sep 5, 2019 3:02 PM   https://jenkins.ovirt.org/job/ovirt-system-tests_manual/5541/
#5540  Failed   Sep 5, 2019 3:01 PM   https://jenkins.ovirt.org/job/ovirt-system-tests_manual/5540/
#5539  Failed   Sep 5, 2019 2:13 PM   https://jenkins.ovirt.org/job/ovirt-system-tests_manual/5539/
#5538  Failed   Sep 5, 2019 1:58 PM   https://jenkins.ovirt.org/job/ovirt-system-tests_manual/5538/
#5537  Failed   Sep 5, 2019 1:50 PM   https://jenkins.ovirt.org/job/ovirt-system-tests_manual/5537/
#5536  Failed   Sep 5, 2019 10:21 AM  https://jenkins.ovirt.org/job/ovirt-system-tests_manual/5536/
       Job config history diff: http://jenkins.ovirt.org/job/ovirt-system-tests_manual/jobConfigHistory/showDiffFiles?timestamp1=2019-08-27_12-38-35&timestamp2=2019-09-05_08-22-23
#5535  Success  Sep 5, 2019 8:10 AM   https://jenkins.ovirt.org/job/ovirt-system-tests_manual/5535/

-- This message was sent by Atlassian Jira (v1001.0.0-SNAPSHOT#100109)