gerrit issues/status
Itamar Heim
iheim at redhat.com
Wed Jun 6 22:48:47 UTC 2012
summary:
1. daily hangs since upgrade to gerrit 2.3.
2. asked eedri to change the version of the jenkins gerrit plugin, per [1]
3. instance seems busy all the time, CPU-wise.
4. changed the instance to a bigger size to rule out load issues.
   this is costly, so will track load to see if we can go back to the
   previous instance size.
5. git clone/pull stopped working post reboot, complaining about repo
   corruption
6. fixed by changing ulimit (the file descriptor limit, see details)
7. need to continue tracking for hangs
8. need to consider weekly git repack cron job
9. on a happier note - Gal's patches to improve gerrit emails were
   merged yesterday to gerrit upstream!
more details below.
Itamar
details:
looking at gerrit behavior/hangs since the upgrade to gerrit 2.3 on
Sunday, it seems the host is busy CPU-wise.
just to rule out a heavier load from users/jenkins coinciding with the
upgrade, i started by changing the server config to a bigger machine.
however, post the reboot, git pull/clone stopped working for
git://gerrit.ovirt.org/ovirt-engine and vdsm.
checking /var/log/messages, I saw a git corruption message:
git-daemon[7241]: fatal: object e1866a3a1aaefdc5a0537b83b193ac1065d938e6 is corrupted
other repos worked.
however, git pull/clone for http://gerrit.ovirt.org/ovirt-engine did work!
(small sigh of relief)
more importantly, git fsck did not complain about corruption
(bigger sigh of relief)
after a bit of research, this turned out to be related to the limit on
the number of open file descriptors, where git-daemon is more limited by
default than jgit is (which presumably is why http kept working while
git:// failed).
better explanation here:
http://osdir.com/ml/repo-discuss/2010-10/msg00193.html
(and if this hadn't happened right after i rebooted/upgraded, i would
probably have remembered earlier that we already fixed this on another
gerrit a few months back).
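
for anyone wanting to check a host for the same condition, here is a
rough python sketch (not what was actually run on the server). it prints
the open-file limit of the current process and counts the pack files in
a repository. the link between pack count and descriptor usage is my
assumption, not something stated above, and the helper names
(count_pack_files, nofile_limits) are made up for illustration. the repo
path is taken from the command line.

```python
import os
import resource
import sys


def count_pack_files(repo_path):
    """Count *.pack files in a bare or non-bare repository.

    Assumption: the number of pack files is a rough proxy for how many
    descriptors a git process may need to keep open for this repo.
    """
    candidates = [
        os.path.join(repo_path, "objects", "pack"),          # bare repo
        os.path.join(repo_path, ".git", "objects", "pack"),  # working copy
    ]
    for pack_dir in candidates:
        if os.path.isdir(pack_dir):
            return sum(1 for name in os.listdir(pack_dir)
                       if name.endswith(".pack"))
    return 0


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def describe(limit):
    """Render a limit value, handling the 'unlimited' sentinel."""
    return "unlimited" if limit == resource.RLIM_INFINITY else str(limit)


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print("open file limit: soft=%s hard=%s" % (describe(soft), describe(hard)))
    for repo in sys.argv[1:]:
        print("%s: %d pack files" % (repo, count_pack_files(repo)))
```

note the limits reported are those of the process running the script
(inherited from its shell), which may differ from the limits git-daemon
itself runs with.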
[1]
http://www.mailinglistarchive.com/html/repo-discuss@googlegroups.com/2012-04/msg00352.html