gerrit issues/status

Itamar Heim iheim at redhat.com
Wed Jun 6 22:48:47 UTC 2012


summary:
1. daily hangs since upgrade to gerrit 2.3.
2. asked eedri to change versions of the jenkins gerrit plugin per [1]
3. instance seems busy all the time, CPU-wise.
4. moved to a bigger instance size to rule out load issues.
    this is costly, so will track load to see if we can go back to the
    previous instance size.
5. git clone/pull stopped working after the reboot, complaining about
    repo corruption
6. fixed by raising the file descriptor limit (ulimit)
7. need to continue tracking for hangs
8. need to consider weekly git repack cron job
9. on a happier note - Gal's patches to improve gerrit emails were
    merged into gerrit upstream yesterday!

more details below.

Itamar

details:
looking at gerrit behavior/hangs since the upgrade to gerrit 2.3 on
Sunday, it seems the host is busy CPU-wise.
to rule out a heavier user/jenkins load that merely coincided with the
upgrade, i started by changing the server config to a bigger machine.

however, after the reboot git pull/clone stopped working for
git://gerrit.ovirt.org/ovirt-engine and vdsm

checking /var/log/messages I saw a git corruption message:
git-daemon[7241]: fatal: object e1866a3a1aaefdc5a0537b83b193ac1065d938e6 is corrupted
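
for tracking this going forward, a minimal python sketch that scans a
syslog file for messages like the one quoted above. the regex is an
assumption based on that single example (not on any documented
git-daemon log format), and the function name is made up for
illustration.

```python
import re

# Assumed pattern, derived only from the one log line quoted above.
CORRUPT_RE = re.compile(
    r"git-daemon\[(\d+)\]: fatal: object ([0-9a-f]{40}) is corrupted"
)


def find_corrupt_objects(lines):
    """Return (pid, object_id) pairs for every matching log line."""
    hits = []
    for line in lines:
        match = CORRUPT_RE.search(line)
        if match:
            hits.append((int(match.group(1)), match.group(2)))
    return hits


if __name__ == "__main__":
    # /var/log/messages is the file checked above; reading it usually needs root.
    with open("/var/log/messages", errors="replace") as log:
        for pid, object_id in find_corrupt_objects(log):
            print(pid, object_id)
```

a hit here only says git-daemon reported corruption; as described below,
it does not prove the repository itself is damaged.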

other repos worked.

however, git pull/clone for http://gerrit.ovirt.org/ovirt-engine did work!
(small sigh of relief)

more importantly, git fsck did not complain about corruption
(bigger sigh of relief)

after a bit of research, this turned out to be related to the limit on
the number of open file descriptors, where git-daemon is more limited by
default than jgit is.

better explanation here:
http://osdir.com/ml/repo-discuss/2010-10/msg00193.html
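
to illustrate the mechanism, a small python sketch that inspects the
open-file limit of the current process and raises the soft limit toward
a wanted value. this is not the actual fix (that was done with ulimit);
the function names and the 8192 figure are hypothetical, picked only for
the example. a limit raised this way applies to the process and to any
children it starts afterwards.

```python
import resource


def plan_soft_limit(soft, hard, wanted):
    """Pick a new soft limit: never below `soft`, never above `hard`."""
    unlimited = resource.RLIM_INFINITY
    if soft == unlimited:
        return soft  # already unlimited, nothing to raise
    if hard != unlimited:
        wanted = min(wanted, hard)  # an unprivileged process cannot exceed hard
    return max(soft, wanted)


def raise_nofile_limit(wanted):
    """Raise RLIMIT_NOFILE for this process and return the resulting limits."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    new_soft = plan_soft_limit(soft, hard, wanted)
    if new_soft != soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    print("before:", resource.getrlimit(resource.RLIMIT_NOFILE))
    print("after: ", raise_nofile_limit(8192))  # 8192 is a made-up value
```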

(and if this hadn't happened right after i rebooted/upgraded, i would
probably have remembered sooner that we already fixed the same thing on
another gerrit a few months back).
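
regarding item 8 in the summary (weekly git repack), a rough python
sketch of what such a job could run. assumptions: the repositories are
bare repos under one root directory passed on the command line, a
directory counts as a repo if it has a HEAD file and an objects
directory, and "git repack -a -d" is the wanted operation. the function
names are made up for illustration.

```python
import os
import subprocess
import sys


def find_bare_repos(root):
    """Return sorted paths under root that look like bare git repos."""
    repos = []
    for path, dirs, files in os.walk(root):
        if "HEAD" in files and "objects" in dirs:
            repos.append(path)
            dirs[:] = []  # do not descend into the repository itself
    return sorted(repos)


def repack_command(repo):
    """Build the git command that repacks one repository."""
    return ["git", "--git-dir=" + repo, "repack", "-a", "-d"]


if __name__ == "__main__":
    # usage: python3 repack_all.py <root directory holding the repos>
    for repo in find_bare_repos(sys.argv[1]):
        subprocess.run(repack_command(repo), check=True)
```

a weekly crontab entry for it could look like this (script name, paths
and schedule are placeholders):

    0 3 * * 0  python3 /path/to/repack_all.py /path/to/git/repos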


[1] http://www.mailinglistarchive.com/html/repo-discuss@googlegroups.com/2012-04/msg00352.html


