
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 03/20/2012 07:18 AM, Ofer Schreiber wrote:
> www.ovirt.org and gerrit.ovirt.org are now up and running.
>
> We experienced two issues:
> 1. DB corruption on www.ovirt.org, caused by a full file system.
> 2. Faulty gerrit service, probably caused by #1.
>
> Both issues were handled by the oVirt infra team (mburns, quaid and myself).

I'm in a meeting all day today so I won't pause for a single
root-cause analysis, but instead just dump pieces as I go.

As a first step, I'm fixing the easy mistakes, including that we
didn't have a straightforward backup of the MediaWiki and WordPress
databases. (We do have a daily Linode backup, but that's the painful
way.)

As a stop-gap, I set up a few bash scripts to run every day to grab
the databases.

  crontab -e

  # Give root word about the backup
  MAILTO=root
  #
  # Run five minutes after Midnight Eastern at quietest time, every day
  5 0 * * * /root/bin/wordpress-backup.sh
  # Run ten minutes after Midnight Eastern at quietest time, every day
  10 0 * * * /root/bin/mediawiki-backup.sh

The root cause today was a fill-up of
/home/gerrit-backup/gerrit.ovirt.org-gerrit2-home-backup/, which holds
a daily snapshot of everything-that-is-gerrit. The problem is, I
didn't build a clean-up for those backups, so they went back to
January, when I did the last manual clean-up.

Gerrit probably fell over when trying to do the rsync of its backup to
linode01.ovirt.org. That's the only way these two servers interact
that I recall.

So we need a cleanup script to run in cron.weekly or cron.daily to
erase the old backups.

We also need a script to rsync out the daily backup of the databases
(and maybe other useful bits such as /var/www/html/w and
/usr/share/wordpress). We could copy this back over to
gerrit.ovirt.org. Umm, hacky, but would work. And be better than the
current situation.

There is so little disk space on linode01.ovirt.org because I never
intended to use that host this long. I've been working to find a
better solution, preferably one running on KVM :) and ideally provided
by e.g. one of the sponsors.

- - Karsten
- -- 
name:  Karsten 'quaid' Wade, Sr. Community Architect
team:  Red Hat Community Architecture & Leadership
uri:   http://communityleadershipteam.org
       http://TheOpenSourceWay.org
gpg:   AD0E0C41
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iD8DBQFPaKI72ZIOBq0ODEERAhnAAKCvNMDHxxG3IR2rDBBarqsn7V/UAACg4HIo
T9rM2fCZTCdpDGrQsz/Xq2o=
=vzYt
-----END PGP SIGNATURE-----
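
The message above calls for a cleanup script, run from cron.daily or
cron.weekly, to erase old backups. The following is only a rough sketch of
what such a script might look like, not the script the infra team actually
deployed. It assumes that each snapshot is a top-level file or directory
inside the backup directory and that age can be judged by modification time.
The function name prune_old_backups and the 14-day default retention are
hypothetical choices for illustration.

```shell
#!/bin/bash
# Sketch of a backup cleanup job (assumptions: one top-level entry per
# snapshot, age judged by mtime, 14-day default retention).

prune_old_backups() {
    local backup_dir="$1"
    local keep_days="${2:-14}"   # hypothetical default, adjust to taste

    # Refuse to run against a missing directory rather than guess.
    if [ ! -d "$backup_dir" ]; then
        echo "no such directory: $backup_dir" >&2
        return 1
    fi

    # Look only at top-level entries, print each one that is older than
    # keep_days, then remove it. -maxdepth 1 keeps find from descending
    # into a snapshot directory that is about to be deleted.
    find "$backup_dir" -mindepth 1 -maxdepth 1 -mtime +"$keep_days" \
        -print -exec rm -rf -- {} +
}

# Run only when a directory is passed on the command line, so that the
# file can also be sourced without deleting anything.
if [ "$#" -ge 1 ]; then
    prune_old_backups "$@"
fi
```

Saved under a hypothetical name such as /root/bin/prune-backups.sh, it could
be called from a small wrapper in /etc/cron.daily with the backup directory
named in the message as its first argument and the retention in days as its
second. Because the removed paths are printed, any cron mail sent to root
would list what was deleted.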