[JIRA] (OVIRT-2586) Jenkins terribly slow and unresponsive
by sbonazzo (oVirt JIRA)
[ https://ovirt-jira.atlassian.net/browse/OVIRT-2586?page=com.atlassian.jir... ]
sbonazzo commented on OVIRT-2586:
---------------------------------
{quote}Yuval Turgeman, can we change the OST job so that it does not read the artifact from Jenkins?
It might overload the server, which is already overloaded. Can you read it from another location instead, perhaps resources.ovirt.org?{quote}
Same reason as above: oVirt Node builds from Jenkins are not published automatically to resources.ovirt.org, and the ISO repo is completely missing for the 4.2 branch: https://resources.ovirt.org/pub/ovirt-4.2-snapshot/
Same tracker as above, #OVIRT-2355
> Jenkins terribly slow and unresponsive
> --------------------------------------
>
> Key: OVIRT-2586
> URL: https://ovirt-jira.atlassian.net/browse/OVIRT-2586
> Project: oVirt - virtualization made easy
> Issue Type: Outage
> Reporter: sbonazzo
> Assignee: Evgheni Dereveanchin
> Priority: Highest
> Attachments: usedMemory_year.png
>
>
> Hi,
> jenkins is terribly slow and becoming worse every day.
> I tried to gain some speed by adding 4 cores to the VM through engine-phx.
> It's a bit better but the real issue doesn't seem related to CPU power.
> Can anybody investigate?
> --
> SANDRO BONAZZOLA
> MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV
> Red Hat EMEA <https://www.redhat.com/>
> sbonazzo(a)redhat.com
> <https://red.ht/sig>
--
This message was sent by Atlassian Jira
(v1001.0.0-SNAPSHOT#100095)
5 years, 11 months
[JIRA] (OVIRT-2586) Jenkins terribly slow and unresponsive
by Evgheni Dereveanchin (oVirt JIRA)
[ https://ovirt-jira.atlassian.net/browse/OVIRT-2586?page=com.atlassian.jir... ]
Evgheni Dereveanchin commented on OVIRT-2586:
---------------------------------------------
I planned to reboot Jenkins tonight, but it was busy running pipelines that aren't easy to cancel. As a result, Jenkins locked up in the morning: the UI was completely unreachable and backend threads were timing out in the background. I had to restart it, and it is now coming back up.
The monitoring plugin remained partially responsive during the outage and showed the following info:
|Java memory used: |15,590 Mb / 16,384 Mb *Usage is near the maximum, you may need to optimize or to reconfigure (-Xmx)|
|Nb of http sessions: |8 |
|Nb of active threads (current http requests): |33 |
|System load |2.78 |
|% System CPU |17.17|
Almost all memory was exhausted, most likely because of a memory leak in the SSE-gateway plugin coinciding with a large number of CI jobs appearing in the queue. Adding memory to Java will likely only delay the symptoms, since the memory leak is still there (see JENKINS-51057).
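To put the monitoring figures in perspective, here is a quick calculation of the used fraction and the remaining headroom. This is only a sketch: the helper name `heap_usage` is made up for illustration, and the input string is simply the "Java memory used" value from the table above.

```python
import re


def heap_usage(report):
    """Parse a 'used Mb / max Mb' string.

    Returns (used, maximum, percent_used, headroom), sizes in Mb.
    """
    # Take the first two numbers, allowing thousands separators.
    used_s, max_s = re.findall(r"[\d,]+", report)[:2]
    used = int(used_s.replace(",", ""))
    maximum = int(max_s.replace(",", ""))
    percent = 100.0 * used / maximum
    return used, maximum, percent, maximum - used


used, maximum, percent, headroom = heap_usage("15,590 Mb / 16,384 Mb")
print("%.1f%% of heap used, %d Mb free" % (percent, headroom))
```

That is roughly 95% of the configured 16,384 Mb maximum, leaving under 800 Mb of headroom.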
!usedMemory_year.png|thumbnail!
According to the yearly memory graph, the leak started around May-June this year and intensified in November.
To confirm the exact root cause we may need some lower-level troubleshooting of the Java process, but I am not familiar with how that is done. [~mwperina], maybe you can advise on what info can be gathered to identify the root cause?
5 years, 11 months
jenkins dead or barely alive?
by Michal Skrivanek
Hey,
Jenkins wasn't working well yesterday evening (it took 10+ minutes to trigger CI), and this morning it's even worse. Logging in to trigger CI manually takes ~5 minutes, and every action there seems to take ages...
Thanks,
michal
5 years, 11 months
Build failed in Jenkins: system-sync_mirrors-fedora-base-fc29-x86_64 #76
by jenkins@jenkins.phx.ovirt.org
See <http://jenkins.ovirt.org/job/system-sync_mirrors-fedora-base-fc29-x86_64/...>
------------------------------------------
Started by timer
[EnvInject] - Loading node environment variables.
Building remotely on mirrors.phx.ovirt.org (mirrors) in workspace <http://jenkins.ovirt.org/job/system-sync_mirrors-fedora-base-fc29-x86_64/ws/>
> git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
> git config remote.origin.url http://gerrit.ovirt.org/jenkins.git # timeout=10
Cleaning workspace
> git rev-parse --verify HEAD # timeout=10
Resetting working tree
> git reset --hard # timeout=10
> git clean -fdx # timeout=10
Pruning obsolete local branches
Fetching upstream changes from http://gerrit.ovirt.org/jenkins.git
> git --version # timeout=10
> git fetch --tags --progress http://gerrit.ovirt.org/jenkins.git +refs/heads/*:refs/remotes/origin/* --prune
> git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 0a4f1895f20e18208a7a26d867c6c1f4969b0655 (origin/master)
> git config core.sparsecheckout # timeout=10
> git checkout -f 0a4f1895f20e18208a7a26d867c6c1f4969b0655
Commit message: "sync_mirros: remove gluster-3.10 mirror."
> git rev-list --no-walk 0a4f1895f20e18208a7a26d867c6c1f4969b0655 # timeout=10
[system-sync_mirrors-fedora-base-fc29-x86_64] $ /bin/bash -xe /tmp/jenkins4383354295454012668.sh
+ jenkins/scripts/mirror_mgr.sh resync_yum_mirror fedora-base-fc29 x86_64 jenkins/data/mirrors-reposync.conf
Checking if mirror needs a resync
Traceback (most recent call last):
  File "/usr/bin/reposync", line 343, in <module>
    main()
  File "/usr/bin/reposync", line 175, in main
    my.doRepoSetup()
  File "/usr/lib/python2.7/site-packages/yum/__init__.py", line 681, in doRepoSetup
    return self._getRepos(thisrepo, True)
  File "/usr/lib/python2.7/site-packages/yum/__init__.py", line 721, in _getRepos
    self._repos.doSetup(thisrepo)
  File "/usr/lib/python2.7/site-packages/yum/repos.py", line 157, in doSetup
    self.retrieveAllMD()
  File "/usr/lib/python2.7/site-packages/yum/repos.py", line 88, in retrieveAllMD
    dl = repo._async and repo._commonLoadRepoXML(repo)
  File "/usr/lib/python2.7/site-packages/yum/yumRepo.py", line 1465, in _commonLoadRepoXML
    local = self.cachedir + '/repomd.xml'
  File "/usr/lib/python2.7/site-packages/yum/yumRepo.py", line 774, in <lambda>
    cachedir = property(lambda self: self._dirGetAttr('cachedir'))
  File "/usr/lib/python2.7/site-packages/yum/yumRepo.py", line 757, in _dirGetAttr
    self.dirSetup()
  File "/usr/lib/python2.7/site-packages/yum/yumRepo.py", line 735, in dirSetup
    self._dirSetupMkdir_p(dir)
  File "/usr/lib/python2.7/site-packages/yum/yumRepo.py", line 712, in _dirSetupMkdir_p
    raise Errors.RepoError, msg
yum.Errors.RepoError: Error making cache directory: /home/jenkins/mirrors_cache/centos-kvm-common-el7 error was: [Errno 17] File exists: '/home/jenkins/mirrors_cache/centos-kvm-common-el7'
Build step 'Execute shell' marked build as failure
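One plausible reading of the failure above (an assumption, not something the log confirms) is that several sync jobs share /home/jenkins/mirrors_cache and raced to create the same directory. A check-then-create sequence can fail with `[Errno 17] File exists` if another process creates the directory in between. The sketch below is not yum's code; `ensure_dir` is a hypothetical helper that only illustrates a creation pattern tolerant of such a race.

```python
import errno
import os
import tempfile


def ensure_dir(path):
    """Create path if it is missing; tolerate it being created concurrently."""
    try:
        os.makedirs(path)
    except OSError as exc:
        # EEXIST is harmless only when what already exists is a directory.
        if exc.errno != errno.EEXIST or not os.path.isdir(path):
            raise
    return path


# Demonstration in a throwaway location (hypothetical directory name).
base = tempfile.mkdtemp()
cache = os.path.join(base, "mirrors_cache", "example-repo")
ensure_dir(cache)  # first call creates the directory
ensure_dir(cache)  # second call finds it and does not raise
```

If the path exists but is a regular file, the helper still raises, so a genuinely broken cache location would not be masked.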
5 years, 11 months