[DRAFT] Outage :: No disk space :: 2012-08-30

31 Aug 2012


Hi:

I didn't really participate in this outage, so I thought others could
help us draft up notes about it. I put a barebones outline below.

One outcome we need to look at is, what do people do when they perceive
services are out?

Of course, if the service is the wiki, they can't check that for what to
do ...

How do we communicate when major communication services of ovirt.org are
down? IRC is great but not enough ... If we can arrange for a reliable
third-party mail relay to alias a page to the Infra team, great, but how
do we keep it from getting spam?

Another angle to resolve is service monitoring so we know when things go
down rather than waiting for service users to tell us. I got some direct
emails from people (since the infra@ list wasn't working), but I was
unavailable and unaware of the problem until Robert called me when he
was working on fixing it. I don't mind getting pager alerts, as long as
we can tune things so they are not crazy often. :)
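
As a starting point for that discussion, here is a rough sketch of a
disk-space check with a cooldown so alerts don't fire too often. It is
only an illustration: the 90% threshold, the one-hour cooldown, and the
function names are assumptions, not anything we run today, and the
print() stands in for whatever paging or mail mechanism we end up with.

```python
import shutil
import time

# Hypothetical values; tune for the real host.
WARN_FRACTION = 0.90      # alert when the disk is 90% full
COOLDOWN_SECONDS = 3600   # at most one alert per hour per path

_last_alert = {}  # path -> timestamp of the last alert sent


def disk_fraction_used(path="/"):
    """Return the fraction (0.0 to 1.0) of the filesystem at `path` in use."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def should_alert(path, fraction_used, now=None):
    """Decide whether to page, suppressing repeats inside the cooldown."""
    now = time.time() if now is None else now
    if fraction_used < WARN_FRACTION:
        return False
    last = _last_alert.get(path)
    if last is not None and now - last < COOLDOWN_SECONDS:
        return False
    _last_alert[path] = now
    return True


if __name__ == "__main__":
    used = disk_fraction_used("/")
    if should_alert("/", used):
        # Placeholder: a real setup would send mail or a page here.
        print(f"ALERT: / is {used:.0%} full")
```

Run from cron every few minutes, something like this would have warned
us before the disk filled, while the cooldown keeps repeat pages in
check.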

== What occurred ==

Even the doubled disk space on linode01.ovirt.org (to 25 GB) wasn't
enough to last long.

== When ==

XXXX?

date -d "2012-08-30 XXXX UTC"

== Affected services ==

lists.ovirt.org
wiki.ovirt.org
ovirt.org/.*
ovirtbot
Gerrit backup
Jenkins backup
[[What else?]]

== Responses to take ==

* Get new hosting solution in place.
* Double current disk space before new hosting move, to give us room to
breathe.
* Work up a response plan that is posted in the IRC topic or somewhere
good so people know how to contact all of the Infra team when something
is happening.
* New service need: monitoring server

-- 
Karsten 'quaid' Wade, Sr. Analyst - Community Growth
http://TheOpenSourceWay.org  .^\  http://community.redhat.com
@quaid (identi.ca/twitter/IRC)  \v'  gpg: AD0E0C41

