[JIRA] (OVIRT-2590) Cache Docker images in the datacenter
by Eyal Edri (oVirt JIRA)
[ https://ovirt-jira.atlassian.net/browse/OVIRT-2590?page=com.atlassian.jir... ]
Eyal Edri commented on OVIRT-2590:
----------------------------------
Just to make sure I understand: is this about adding a local Docker Hub mirror on the OpenShift instance and configuring all builds to pull from it instead of from Docker Hub directly?
How much job runtime do we estimate this will save?
cc [~gbenhaim@redhat.com][~ederevea][~dbelenky@redhat.com][~bkorren(a)redhat.com]
> Cache Docker images in the datacenter
> -------------------------------------
>
> Key: OVIRT-2590
> URL: https://ovirt-jira.atlassian.net/browse/OVIRT-2590
> Project: oVirt - virtualization made easy
> Issue Type: Improvement
> Reporter: Roman Mohr
> Assignee: infra
>
> What?
> As a user, I expect that I don't have to care about caching to speed up
> builds for the good of the CI system itself.
> Right now there exists a whitelist for docker images, which will not be
> removed from the build slot after the build. Instead of that, I expect a
> clean build environment and that in general all images which I regularly
> use are cached in the cluster via e.g. a pull-through cache [1].
> Why?
> 1) Caching in a build slot is not very effective. CI runs do a great many
> almost identical things within a small time window (e.g. days). If caching
> happens in the build slot and many slots are present, then the cache
> utilization will be very low.
> 2) Whitelisting docker images separately for the slot in which the registry
> runs is very error-prone, and since nothing is cached across the cluster it
> is also very opaque what the clear benefit for the user is. Especially
> when thinking about scaling a CI system, that seems to leak internal
> optimizations to the user. Fast builds are twice as important for the CI
> system as they are for the users (faster builds and lower utilization by
> default are always better than asking people to optimize on their side).
> [1] https://docs.docker.com/registry/recipes/mirror/
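
For reference, the pull-through cache described in [1] has two halves: a registry started in proxy mode, and Docker daemons on the build slots that list that registry under `registry-mirrors` in `daemon.json`. The sketch below only renders the daemon-side file. The mirror URL is a hypothetical placeholder, not an existing oVirt host, and the helper name is made up for illustration.

```python
import json

# Hypothetical mirror address; replace with the real in-datacenter registry.
MIRROR_URL = "https://docker-mirror.example.org:5000"


def render_daemon_config(mirror_url, existing=None):
    """Return daemon.json content with mirror_url added to registry-mirrors."""
    # Copy any existing settings so unrelated daemon options are preserved.
    config = dict(existing or {})
    mirrors = list(config.get("registry-mirrors", []))
    if mirror_url not in mirrors:
        mirrors.append(mirror_url)
    config["registry-mirrors"] = mirrors
    return json.dumps(config, indent=2, sort_keys=True)


if __name__ == "__main__":
    print(render_daemon_config(MIRROR_URL))
```

Note that a registry mirror configured this way only serves images that would otherwise come from Docker Hub; images from other registries would still be pulled directly.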
--
This message was sent by Atlassian Jira
(v1001.0.0-SNAPSHOT#100096)
5 years, 10 months
[JIRA] (OVIRT-2591) Add a distributed docker-cache
by Eyal Edri (oVirt JIRA)
[ https://ovirt-jira.atlassian.net/browse/OVIRT-2591?page=com.atlassian.jir... ]
Eyal Edri commented on OVIRT-2591:
----------------------------------
A few questions:
# Does it also apply to podman and buildah, assuming we'll move to them in a few months?
# Can you estimate or measure how much time this setup will save when running tests? It will have to be significant to justify this kind of improvement.
Also, we probably want to put on hold any major infrastructure improvement (one that requires a significant amount of work/code) until we know what the requirements are for running it inside the CentOS CI infra.
cc [~gbenhaim@redhat.com][~dbelenky@redhat.com][~bkorren(a)redhat.com]
> Add a distributed docker-cache
> ------------------------------
>
> Key: OVIRT-2591
> URL: https://ovirt-jira.atlassian.net/browse/OVIRT-2591
> Project: oVirt - virtualization made easy
> Issue Type: Improvement
> Reporter: Roman Mohr
> Assignee: infra
> Priority: High
>
> What?
> If CI builds get heavy and things are running inside containers, I expect
> that the CI system proactively tries to optimize when it can. Since the CI
> system provides the docker installation, I would expect that under some
> conditions, it automatically puts heavy docker builds in a distributed
> cache in the cluster. Examples of how this can be achieved are listed in [1]
> and [2].
> Why?
> Dockerfiles have the advantage that we can isolate our build steps in a
> Dockerfile. This gives reproducibility, but also means that e.g. curl
> downloads or RPM installs are not visible for the CI system. Therefore it
> is beneficial for the CI system and the user (more speed and less
> utilization), to put docker images with their build chain into a
> distributed cache and pre-fetch the cache into the docker cache of the
> build slot. Pre-fetching based on e.g. the GitHub project probably makes sense.
> [1] https://runnable.com/blog/distributing-docker-cache-across-hosts
> [2] https://blog.codeship.com/building-a-remote-caching-system/
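
One way to approximate the pre-fetch idea above is to pull the last published image before building and pass it to `docker build --cache-from`, then push the result for the next slot. The sketch below only assembles those commands. The image name and registry are hypothetical, and this is not necessarily the approach taken in [1] or [2].

```python
import shlex


def cached_build_commands(image, context_dir="."):
    """Return the docker commands for a build that reuses layers of `image`."""
    return [
        # Pre-fetch the previously pushed image; a failure only means a cold cache.
        ["docker", "pull", image],
        # Let the builder reuse layers from the pre-fetched image.
        ["docker", "build", "--cache-from", image, "-t", image, context_dir],
        # Publish the result so the next build slot can start warm.
        ["docker", "push", image],
    ]


if __name__ == "__main__":
    # Hypothetical image name in a cluster-local registry.
    for cmd in cached_build_commands("registry.example.org/ci/vdsm-build:latest"):
        print(" ".join(shlex.quote(part) for part in cmd))
```

Depending on the builder in use, the pushed image may need extra cache metadata before it can serve as a cache source, so this should be treated as a starting point rather than a finished setup.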
5 years, 10 months
[JIRA] (OVIRT-2636) Tests failed because global_setup.sh failed
by Eyal Edri (oVirt JIRA)
[ https://ovirt-jira.atlassian.net/browse/OVIRT-2636?page=com.atlassian.jir... ]
Eyal Edri commented on OVIRT-2636:
----------------------------------
I know [~gbenhaim(a)redhat.com] saw some errors when upgrading and rebooting CentOS 7.6 servers; not sure if it's the same issue.
[~ederevea] any ideas?
> Tests failed because global_setup.sh failed
> -------------------------------------------
>
> Key: OVIRT-2636
> URL: https://ovirt-jira.atlassian.net/browse/OVIRT-2636
> Project: oVirt - virtualization made easy
> Issue Type: By-EMAIL
> Reporter: Nir Soffer
> Assignee: infra
>
> Not sure why global setup failed:
> + [[ ! -O /home/jenkins/.ssh ]]
> + [[ ! -G /home/jenkins/.ssh ]]
> + verify_set_permissions 700 /home/jenkins/.ssh
> + local target_permissions=700
> + local path_to_set=/home/jenkins/.ssh
> ++ stat -c %a /home/jenkins/.ssh
> + local access=700
> + [[ 700 != \7\0\0 ]]
> + return 0
> + [[ -f /home/jenkins/.ssh/known_hosts ]]
> + verify_set_ownership /home/jenkins/.ssh/known_hosts
> + local path_to_set=/home/jenkins/.ssh/known_hosts
> ++ id -un
> + local owner=jenkins
> ++ id -gn
> + local group=jenkins
> + [[ ! -O /home/jenkins/.ssh/known_hosts ]]
> + [[ ! -G /home/jenkins/.ssh/known_hosts ]]
> + verify_set_permissions 644 /home/jenkins/.ssh/known_hosts
> + local target_permissions=644
> + local path_to_set=/home/jenkins/.ssh/known_hosts
> ++ stat -c %a /home/jenkins/.ssh/known_hosts
> + local access=644
> + [[ 644 != \6\4\4 ]]
> + return 0
> + return 0
> + true
> + log ERROR Aborting.
> Build:
> https://jenkins.ovirt.org/blue/rest/organizations/jenkins/pipelines/vdsm_...
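
Every check in the trace above ends in `return 0`, so the step that triggers the abort is not visible in this excerpt. For readers unfamiliar with the helper, the following is a rough Python equivalent of what `verify_set_permissions` appears to do, inferred only from the trace (read the mode, compare it to the target, fix it if it differs). The real function lives in global_setup.sh and may behave differently.

```python
import os
import stat
import tempfile


def verify_set_permissions(target_mode, path):
    """Chmod path to target_mode if needed; return True when a change was made."""
    # Equivalent of `stat -c %a`: keep only the permission bits.
    current = stat.S_IMODE(os.stat(path).st_mode)
    if current == target_mode:
        return False
    os.chmod(path, target_mode)
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        os.chmod(tmp, 0o755)
        changed = verify_set_permissions(0o700, tmp)
        print("changed:", changed)
```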
5 years, 10 months
[JIRA] (OVIRT-2637) [VDSM] Tests fail because of network timeout
by Eyal Edri (oVirt JIRA)
[ https://ovirt-jira.atlassian.net/browse/OVIRT-2637?page=com.atlassian.jir... ]
Eyal Edri commented on OVIRT-2637:
----------------------------------
Could this have been a network issue reaching the Fedora repos? Is it still happening?
> [VDSM] Tests fail because of network timeout
> --------------------------------------------
>
> Key: OVIRT-2637
> URL: https://ovirt-jira.atlassian.net/browse/OVIRT-2637
> Project: oVirt - virtualization made easy
> Issue Type: By-EMAIL
> Reporter: Nir Soffer
> Assignee: infra
>
> This is the first time I have seen this error. The issue seems to be a network
> read timeout when creating the virtualenv in the tox test.
> If you see this error in your builds, please report it.
> storage-py27 create:
> /home/jenkins/workspace/vdsm_standard-check-patch/vdsm/.tox/storage-py27
> ERROR: invocation failed (exit code 1), logfile:
> /home/jenkins/workspace/vdsm_standard-check-patch/vdsm/.tox/storage-py27/log/storage-py27-0.log
> ERROR: actionid: storage-py27
> msg: getenv
> cmdargs: ['/usr/bin/python2', '-m', 'virtualenv',
> '--system-site-packages', '--python', '/usr/bin/python2.7',
> 'storage-py27']
> New python executable in
> /home/jenkins/workspace/vdsm_standard-check-patch/vdsm/.tox/storage-py27/bin/python2.7
> Also creating executable in
> /home/jenkins/workspace/vdsm_standard-check-patch/vdsm/.tox/storage-py27/bin/python
> Installing setuptools, pip, wheel...
> Complete output from command
> /home/jenkins/worksp...e-py27/bin/python2.7 - setuptools pip wheel:
> The directory
> '/home/jenkins/workspace/vdsm_standard-check-patch/vdsm/.cache/pip/http'
> or its parent directory is not owned by the current user and the cache
> has been disabled. Please check the permissions and owner of that
> directory. If executing pip with sudo, you may want sudo's -H flag.
> The directory '/home/jenkins/workspace/vdsm_standard-check-patch/vdsm/.cache/pip'
> or its parent directory is not owned by the current user and caching
> wheels has been disabled. check the permissions and owner of that
> directory. If executing pip with sudo, you may want sudo's -H flag.
> Looking in links: /usr/lib/python2.7/site-packages,
> /usr/lib/python2.7/site-packages/virtualenv_support
> Collecting setuptools
> Downloading https://files.pythonhosted.org/packages/37/06/754589caf971b0d2d48f151c258...
> (573kB)
> Exception:
> Traceback (most recent call last):
> File "/usr/lib/python2.7/site-packages/virtualenv_support/pip-18.1-py2.py3-none-any.whl/pip/_internal/cli/base_command.py",
> line 143, in main
> status = self.run(options, args)
> File "/usr/lib/python2.7/site-packages/virtualenv_support/pip-18.1-py2.py3-none-any.whl/pip/_internal/commands/install.py",
> line 318, in run
> resolver.resolve(requirement_set)
> File "/usr/lib/python2.7/site-packages/virtualenv_support/pip-18.1-py2.py3-none-any.whl/pip/_internal/resolve.py",
> line 102, in resolve
> self._resolve_one(requirement_set, req)
> File "/usr/lib/python2.7/site-packages/virtualenv_support/pip-18.1-py2.py3-none-any.whl/pip/_internal/resolve.py",
> line 256, in _resolve_one
> abstract_dist = self._get_abstract_dist_for(req_to_install)
> File "/usr/lib/python2.7/site-packages/virtualenv_support/pip-18.1-py2.py3-none-any.whl/pip/_internal/resolve.py",
> line 209, in _get_abstract_dist_for
> self.require_hashes
> File "/usr/lib/python2.7/site-packages/virtualenv_support/pip-18.1-py2.py3-none-any.whl/pip/_internal/operations/prepare.py",
> line 283, in prepare_linked_requirement
> progress_bar=self.progress_bar
> File "/usr/lib/python2.7/site-packages/virtualenv_support/pip-18.1-py2.py3-none-any.whl/pip/_internal/download.py",
> line 836, in unpack_url
> progress_bar=progress_bar
> File "/usr/lib/python2.7/site-packages/virtualenv_support/pip-18.1-py2.py3-none-any.whl/pip/_internal/download.py",
> line 673, in unpack_http_url
> progress_bar)
> File "/usr/lib/python2.7/site-packages/virtualenv_support/pip-18.1-py2.py3-none-any.whl/pip/_internal/download.py",
> line 897, in _download_http_url
> _download_url(resp, link, content_file, hashes, progress_bar)
> File "/usr/lib/python2.7/site-packages/virtualenv_support/pip-18.1-py2.py3-none-any.whl/pip/_internal/download.py",
> line 617, in _download_url
> hashes.check_against_chunks(downloaded_chunks)
> File "/usr/lib/python2.7/site-packages/virtualenv_support/pip-18.1-py2.py3-none-any.whl/pip/_internal/utils/hashes.py",
> line 48, in check_against_chunks
> for chunk in chunks:
> File "/usr/lib/python2.7/site-packages/virtualenv_support/pip-18.1-py2.py3-none-any.whl/pip/_internal/download.py",
> line 585, in written_chunks
> for chunk in chunks:
> File "/usr/lib/python2.7/site-packages/virtualenv_support/pip-18.1-py2.py3-none-any.whl/pip/_internal/utils/ui.py",
> line 159, in iter
> for x in it:
> File "/usr/lib/python2.7/site-packages/virtualenv_support/pip-18.1-py2.py3-none-any.whl/pip/_internal/download.py",
> line 574, in resp_read
> decode_content=False):
> File "/usr/lib/python2.7/site-packages/virtualenv_support/pip-18.1-py2.py3-none-any.whl/pip/_vendor/urllib3/response.py",
> line 465, in stream
> data = self.read(amt=amt, decode_content=decode_content)
> File "/usr/lib/python2.7/site-packages/virtualenv_support/pip-18.1-py2.py3-none-any.whl/pip/_vendor/urllib3/response.py",
> line 430, in read
> raise IncompleteRead(self._fp_bytes_read, self.length_remaining)
> File "/usr/lib64/python2.7/contextlib.py", line 35, in __exit__
> self.gen.throw(type, value, traceback)
> File "/usr/lib/python2.7/site-packages/virtualenv_support/pip-18.1-py2.py3-none-any.whl/pip/_vendor/urllib3/response.py",
> line 345, in _error_catcher
> raise ReadTimeoutError(self._pool, None, 'Read timed out.')
> ReadTimeoutError: HTTPSConnectionPool(host='files.pythonhosted.org',
> port=443): Read timed out.
> ----------------------------------------
> ...Installing setuptools, pip, wheel...done.
> Traceback (most recent call last):
> File "/usr/lib/python2.7/site-packages/virtualenv.py", line 2462, in <module>
> main()
> File "/usr/lib/python2.7/site-packages/virtualenv.py", line 762, in main
> symlink=options.symlink,
> File "/usr/lib/python2.7/site-packages/virtualenv.py", line 1015, in
> create_environment
> install_wheel(to_install, py_executable, search_dirs, download=download)
> File "/usr/lib/python2.7/site-packages/virtualenv.py", line 968, in
> install_wheel
> call_subprocess(cmd, show_stdout=False, extra_env=env, stdin=SCRIPT)
> File "/usr/lib/python2.7/site-packages/virtualenv.py", line 854, in
> call_subprocess
> raise OSError("Command {} failed with error code
> {}".format(cmd_desc, proc.returncode))
> OSError: Command /home/jenkins/worksp...e-py27/bin/python2.7 -
> setuptools pip wheel failed with error code 2
> Running virtualenv with interpreter /usr/bin/python2.7
> ERROR: Error creating virtualenv. Note that some special characters
> (e.g. ':' and unicode symbols) in paths are not supported by
> virtualenv. Error details: InvocationError('/usr/bin/python2 -m
> virtualenv --system-site-packages --python /usr/bin/python2.7
> storage-py27 (see
> /home/jenkins/workspace/vdsm_standard-check-patch/vdsm/.tox/storage-py27/log/storage-py27-0.log)',
> 1)
> Build:
> https://jenkins.ovirt.org/blue/rest/organizations/jenkins/pipelines/vdsm_...
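
Transient read timeouts like the one in this log are commonly handled by retrying the download step with a growing delay. The sketch below is a generic retry helper, not something taken from the vdsm or Jenkins code; `flaky_download` is a hypothetical stand-in for the virtualenv/pip step.

```python
import time


def retry(operation, attempts=3, base_delay=1.0, retry_on=(IOError,), sleep=time.sleep):
    """Call operation(); on a listed exception wait and retry, doubling the delay."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            # Give up and re-raise once the last attempt has failed.
            if attempt == attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))


if __name__ == "__main__":
    calls = {"n": 0}

    def flaky_download():
        # Hypothetical stand-in: fails twice with a timeout, then succeeds.
        calls["n"] += 1
        if calls["n"] < 3:
            raise TimeoutError("Read timed out.")
        return "done"

    print(retry(flaky_download, base_delay=0.01, retry_on=(TimeoutError,)))
```

pip itself also exposes `--retries` and `--timeout` options, which may be a simpler lever if the tox environment allows passing them through.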
5 years, 10 months
Build failed in Jenkins: system-sync_mirrors-centos-kvm-common-el7-x86_64 #2053
by jenkins@jenkins.phx.ovirt.org
See <http://jenkins.ovirt.org/job/system-sync_mirrors-centos-kvm-common-el7-x8...>
------------------------------------------
Started by timer
[EnvInject] - Loading node environment variables.
Building remotely on mirrors.phx.ovirt.org (mirrors) in workspace <http://jenkins.ovirt.org/job/system-sync_mirrors-centos-kvm-common-el7-x8...>
> git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
> git config remote.origin.url http://gerrit.ovirt.org/jenkins.git # timeout=10
Cleaning workspace
> git rev-parse --verify HEAD # timeout=10
Resetting working tree
> git reset --hard # timeout=10
> git clean -fdx # timeout=10
Pruning obsolete local branches
Fetching upstream changes from http://gerrit.ovirt.org/jenkins.git
> git --version # timeout=10
> git fetch --tags --progress http://gerrit.ovirt.org/jenkins.git +refs/changes/59/95959/1:test --prune
> git rev-parse origin/test^{commit} # timeout=10
> git rev-parse test^{commit} # timeout=10
Checking out Revision 05b40dfb4ec43a82530e8c471395d1858e5c59e1 (test)
> git config core.sparsecheckout # timeout=10
> git checkout -f 05b40dfb4ec43a82530e8c471395d1858e5c59e1
Commit message: "mirror-reposync: remove gluster-3.10 mirror"
> git rev-list --no-walk 05b40dfb4ec43a82530e8c471395d1858e5c59e1 # timeout=10
[system-sync_mirrors-centos-kvm-common-el7-x86_64] $ /bin/bash -xe /tmp/jenkins1590719001793889012.sh
+ jenkins/scripts/mirror_mgr.sh resync_yum_mirror centos-kvm-common-el7 x86_64 jenkins/data/mirrors-reposync.conf
Checking if mirror needs a resync
Traceback (most recent call last):
File "/usr/bin/reposync", line 343, in <module>
main()
File "/usr/bin/reposync", line 175, in main
my.doRepoSetup()
File "/usr/lib/python2.7/site-packages/yum/__init__.py", line 681, in doRepoSetup
return self._getRepos(thisrepo, True)
File "/usr/lib/python2.7/site-packages/yum/__init__.py", line 721, in _getRepos
self._repos.doSetup(thisrepo)
File "/usr/lib/python2.7/site-packages/yum/repos.py", line 157, in doSetup
self.retrieveAllMD()
File "/usr/lib/python2.7/site-packages/yum/repos.py", line 88, in retrieveAllMD
dl = repo._async and repo._commonLoadRepoXML(repo)
File "/usr/lib/python2.7/site-packages/yum/yumRepo.py", line 1465, in _commonLoadRepoXML
local = self.cachedir + '/repomd.xml'
File "/usr/lib/python2.7/site-packages/yum/yumRepo.py", line 774, in <lambda>
cachedir = property(lambda self: self._dirGetAttr('cachedir'))
File "/usr/lib/python2.7/site-packages/yum/yumRepo.py", line 757, in _dirGetAttr
self.dirSetup()
File "/usr/lib/python2.7/site-packages/yum/yumRepo.py", line 735, in dirSetup
self._dirSetupMkdir_p(dir)
File "/usr/lib/python2.7/site-packages/yum/yumRepo.py", line 712, in _dirSetupMkdir_p
raise Errors.RepoError, msg
yum.Errors.RepoError: Error making cache directory: /home/jenkins/mirrors_cache/centos-kvm-common-el7/gen error was: [Errno 17] File exists: '/home/jenkins/mirrors_cache/centos-kvm-common-el7/gen'
Build step 'Execute shell' marked build as failure
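
Errno 17 in the traceback above is EEXIST: the directory creation failed because something already existed at that path. One possible cause, which this log does not confirm, is two sync jobs creating the same cache directory at the same time; another is a non-directory entry sitting at that path. The sketch below shows a directory-creation helper that tolerates the first case and still fails on the second. It illustrates the pattern only and is not the yum code.

```python
import errno
import os
import tempfile


def ensure_dir(path):
    """Create path (and parents) unless it already exists as a directory."""
    try:
        os.makedirs(path)
    except OSError as exc:
        # EEXIST is fine only when the existing entry really is a directory.
        if exc.errno != errno.EEXIST or not os.path.isdir(path):
            raise


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cache_dir = os.path.join(tmp, "mirrors_cache", "gen")
        ensure_dir(cache_dir)
        ensure_dir(cache_dir)  # second call is a no-op instead of an error
        print(os.path.isdir(cache_dir))
```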
5 years, 10 months
Build failed in Jenkins: system-sync_mirrors-fedora-base-fc29-x86_64 #154
by jenkins@jenkins.phx.ovirt.org
See <http://jenkins.ovirt.org/job/system-sync_mirrors-fedora-base-fc29-x86_64/...>
------------------------------------------
Started by timer
[EnvInject] - Loading node environment variables.
Building remotely on mirrors.phx.ovirt.org (mirrors) in workspace <http://jenkins.ovirt.org/job/system-sync_mirrors-fedora-base-fc29-x86_64/ws/>
> git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
> git config remote.origin.url http://gerrit.ovirt.org/jenkins.git # timeout=10
Cleaning workspace
> git rev-parse --verify HEAD # timeout=10
Resetting working tree
> git reset --hard # timeout=10
> git clean -fdx # timeout=10
Pruning obsolete local branches
Fetching upstream changes from http://gerrit.ovirt.org/jenkins.git
> git --version # timeout=10
> git fetch --tags --progress http://gerrit.ovirt.org/jenkins.git +refs/heads/*:refs/remotes/origin/* --prune
> git rev-parse origin/master^{commit} # timeout=10
Checking out Revision ebb5016d852a395b97686551537a621133628cd0 (origin/master)
> git config core.sparsecheckout # timeout=10
> git checkout -f ebb5016d852a395b97686551537a621133628cd0
Commit message: "OST: Remove he-basic-ansible master job"
> git rev-list --no-walk ebb5016d852a395b97686551537a621133628cd0 # timeout=10
[system-sync_mirrors-fedora-base-fc29-x86_64] $ /bin/bash -xe /tmp/jenkins1591948347159497588.sh
+ jenkins/scripts/mirror_mgr.sh resync_yum_mirror fedora-base-fc29 x86_64 jenkins/data/mirrors-reposync.conf
Checking if mirror needs a resync
Traceback (most recent call last):
File "/usr/bin/reposync", line 343, in <module>
main()
File "/usr/bin/reposync", line 175, in main
my.doRepoSetup()
File "/usr/lib/python2.7/site-packages/yum/__init__.py", line 681, in doRepoSetup
return self._getRepos(thisrepo, True)
File "/usr/lib/python2.7/site-packages/yum/__init__.py", line 721, in _getRepos
self._repos.doSetup(thisrepo)
File "/usr/lib/python2.7/site-packages/yum/repos.py", line 157, in doSetup
self.retrieveAllMD()
File "/usr/lib/python2.7/site-packages/yum/repos.py", line 88, in retrieveAllMD
dl = repo._async and repo._commonLoadRepoXML(repo)
File "/usr/lib/python2.7/site-packages/yum/yumRepo.py", line 1465, in _commonLoadRepoXML
local = self.cachedir + '/repomd.xml'
File "/usr/lib/python2.7/site-packages/yum/yumRepo.py", line 774, in <lambda>
cachedir = property(lambda self: self._dirGetAttr('cachedir'))
File "/usr/lib/python2.7/site-packages/yum/yumRepo.py", line 757, in _dirGetAttr
self.dirSetup()
File "/usr/lib/python2.7/site-packages/yum/yumRepo.py", line 735, in dirSetup
self._dirSetupMkdir_p(dir)
File "/usr/lib/python2.7/site-packages/yum/yumRepo.py", line 712, in _dirSetupMkdir_p
raise Errors.RepoError, msg
yum.Errors.RepoError: Error making cache directory: /home/jenkins/mirrors_cache/centos-kvm-common-el7/packages error was: [Errno 17] File exists: '/home/jenkins/mirrors_cache/centos-kvm-common-el7/packages'
Build step 'Execute shell' marked build as failure
5 years, 10 months