-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On 03/20/2012 07:18 AM, Ofer Schreiber wrote:
www.ovirt.org and gerrit.ovirt.org are now up and running.
We experienced two issues:
1. DB corruption on www.ovirt.org, caused by a full file system.
2. Faulty gerrit service, probably caused by #1.
Both issues were handled by oVirt infra team (mburns, quaid and myself)
I'm in a meeting all day today so I won't pause for a single
root-cause analysis, but instead just dump pieces as I go.
As a first step, I'm fixing the easy mistakes, including the fact that we
didn't have a straightforward backup of the MediaWiki and WordPress
databases. (We do have a daily Linode backup, but that's the painful way.)
As a stop-gap, I set up a few bash scripts to run every day to grab the
databases.
crontab -e
# Give root word about the backup
MAILTO=root
#
# Run five minutes after Midnight Eastern at quietest time, every day
5 0 * * * /root/bin/wordpress-backup.sh
# Run ten minutes after Midnight Eastern at quietest time, every day
10 0 * * * /root/bin/mediawiki-backup.sh
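For illustration only, a daily dump script of this kind might look roughly
like the sketch below. It is not the actual content of the scripts named in
the crontab. It assumes the databases are MySQL and that mysqldump can read
its credentials from root's ~/.my.cnf; the function name dump_db, the
DUMP_CMD override, the database name wikidb and the destination directory
/root/backups are all hypothetical.

```shell
#!/bin/bash
# Hypothetical sketch of a nightly database dump (assumes MySQL).

# dump_db DB_NAME DEST_DIR
# Writes a dated, gzip-compressed dump into DEST_DIR and prints its path.
# The body runs in a subshell so that pipefail does not leak to the caller.
dump_db() (
    set -o pipefail
    db="$1"
    dest="$2"
    out="${dest}/${db}-$(date +%F).sql.gz"
    mkdir -p "$dest" || exit 1
    # DUMP_CMD can be overridden (e.g. for a dry run); by default this
    # calls mysqldump, which is assumed to find credentials in ~/.my.cnf.
    ${DUMP_CMD:-mysqldump --single-transaction} "$db" | gzip > "$out" || exit 1
    echo "$out"
)

# Example call (database name and directory are placeholders):
#   dump_db wikidb /root/backups
```

Because cron mails anything the job prints to MAILTO, the printed path doubles
as the daily "word about the backup".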
The root cause today was that
/home/gerrit-backup/gerrit.ovirt.org-gerrit2-home-backup/ filled up; it holds a
daily snapshot of everything-that-is-gerrit.
The problem is, I didn't build a clean-up for those backups, so they
had piled up since January, when I did the last manual clean-up.
Gerrit probably fell over when trying to do the rsync of its backup to
linode01.ovirt.org. That's the only way these two servers interact
that I recall.
So we need a cleanup script to run in cron.weekly or cron.daily to
erase the old backups.
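A cleanup along these lines could be sketched as follows. The 14-day
retention, the function name prune_backups, and the assumption that each
daily snapshot is a direct child of the backup directory are all guesses
for illustration, not a description of the real layout.

```shell
#!/bin/bash
# Hypothetical sketch of a backup pruning job for cron.daily or cron.weekly.

# prune_backups DIR [DAYS]
# Removes direct children of DIR whose modification time is more than
# DAYS days in the past (default 14). find counts age in whole days and
# discards the fraction, so "+14" matches entries at least 15 days old.
prune_backups() {
    local dir="$1" days="${2:-14}"
    [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
    # -mindepth 1 protects DIR itself; -maxdepth 1 keeps find from
    # descending into snapshots it is about to delete.
    find "$dir" -mindepth 1 -maxdepth 1 -mtime +"$days" -exec rm -rf -- {} +
}

# Example call:
#   prune_backups /home/gerrit-backup/gerrit.ovirt.org-gerrit2-home-backup 14
```

Running it with a generous DAYS value first, and checking what is left, is a
cheap way to confirm that the snapshots' timestamps behave as assumed.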
We also need a script to rsync out the daily backup of the databases
(and maybe other useful bits such as /var/www/html/w and
/usr/share/wordpress.) We could copy this back over to
gerrit.ovirt.org.
Umm, hacky, but would work. And be better than the current situation.
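As a rough sketch of that copy step, under assumptions: the function name
push_backups, the RSYNC override, the remote user and the remote directory
are hypothetical, and it presumes root on the web host already has SSH key
access to the other machine.

```shell
#!/bin/bash
# Hypothetical sketch of pushing local backups to another host with rsync.

# push_backups DEST SRC...
# Copies every SRC to DEST in archive mode (-a) with compression (-z).
# Sources are given without a trailing slash, so rsync recreates each
# directory by name under DEST rather than merging their contents.
push_backups() {
    local dest="$1"
    shift
    [ "$#" -gt 0 ] || { echo "usage: push_backups DEST SRC..." >&2; return 1; }
    # RSYNC can be overridden, e.g. RSYNC=echo to print the arguments
    # instead of transferring anything.
    ${RSYNC:-rsync} -az "$@" "$dest"
}

# Example call (remote user and path are placeholders):
#   push_backups backup@gerrit.ovirt.org:/srv/www-backup \
#       /root/backups /var/www/html/w /usr/share/wordpress
```

Since the receiving host has already filled up once, anything pushed this way
would want its own retention policy on the far end.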
There is so little disk space on linode01.ovirt.org because I never
intended to use that host this long. I've been working to find a
better solution, preferably one running on KVM :) and ideally
provided by e.g. one of the sponsors.
- - Karsten
- --
name: Karsten 'quaid' Wade, Sr. Community Architect
team: Red Hat Community Architecture & Leadership
uri:
http://communityleadershipteam.org
http://TheOpenSourceWay.org
gpg: AD0E0C41
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
Comment: Using GnuPG with Mozilla -
http://enigmail.mozdev.org/
iD8DBQFPaKI72ZIOBq0ODEERAhnAAKCvNMDHxxG3IR2rDBBarqsn7V/UAACg4HIo
T9rM2fCZTCdpDGrQsz/Xq2o=
=vzYt
-----END PGP SIGNATURE-----