--=-FhjElmgt/jS0gQbeNfRJ
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
On Friday, 06 June 2014 at 14:36 +0200, Ewoud Kohl van Wijngaarden wrote:
On Fri, Jun 06, 2014 at 01:29:44PM +0200, Michael Scherer wrote:
> Due to CVEs in openssl and the kernel, I upgraded various pieces of the
> infrastructure (foreman, lists, stats, monitoring), which implied a
> few reboots (the kernel was lagging behind, which is not great with a
> local root exploit). As this is Friday and I assumed most of the Tel
> Aviv office was not working, I hope this kept the disruption to a
> minimum. However, if something is broken, please tell us so we can fix it.

Nice work.
> This also got me thinking. In order to bring a bit more order, what
> about having a fixed schedule for upgrades?
>
> In my previous position, we did that once per month (except
> during the end-of-quarter freeze), with a mandatory reboot (because if
> something does not boot, you want to know it during a planned
> outage, not when everyone is running around updating stuff). Fedora has
> a rather complex procedure to decide what to upgrade, highlighted at
> http://infrastructure.fedoraproject.org/infra/docs/massupgrade.txt

At my previous employer we had something similar. I also wrote a puppet
custom fact, reboot_needed, which checks whether the running kernel matches
the default kernel that would be booted.

We also want to restart services that are affected. Hence a reboot would
be simpler from an engineering point of view.
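
The reboot_needed fact itself is not shown in this thread. As a rough
illustration of the same idea, here is a minimal Python sketch (not the
actual Puppet fact) that compares the running kernel release with the
default boot entry. It assumes a RHEL/CentOS/Fedora host where
`grubby --default-kernel` prints a path such as /boot/vmlinuz-<release>;
the function names are made up for this example.

```python
import os
import subprocess


def kernel_version_from_path(kernel_path):
    """Strip the directory and the 'vmlinuz-' prefix from a kernel image path."""
    name = os.path.basename(kernel_path.strip())
    prefix = "vmlinuz-"
    return name[len(prefix):] if name.startswith(prefix) else name


def reboot_needed(running_release, default_kernel_path):
    """Return True when the running kernel differs from the default boot entry."""
    return running_release.strip() != kernel_version_from_path(default_kernel_path)


def check_local_host():
    """Query the local host. Assumption: grubby is installed and on PATH."""
    running = os.uname().release
    default = subprocess.check_output(["grubby", "--default-kernel"], text=True)
    return reboot_needed(running, default)
```

Keeping the comparison in reboot_needed() separate from the system calls
makes it possible to test the logic without touching a real host.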
> So we could adopt a schedule (once per month, unless there is something
> critical, in which case we do it ASAP, with a warning on the list and irc).

+1

> The schedule should of course take into account the "business need", which
> is the release schedule of ovirt.
>
> So what about "first Friday of the month, unless exception"?
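
For anyone who wants to script a reminder, the "first Friday of the month"
rule can be computed with the Python standard library. This is only a
sketch: the function names are invented here, and the exceptions mentioned
above (critical fixes, release freezes) are deliberately not modelled.

```python
import calendar
import datetime


def first_friday(year, month):
    """Return the date of the first Friday of the given month."""
    first_day = datetime.date(year, month, 1)
    # date.weekday(): Monday is 0, so Friday is 4 (calendar.FRIDAY).
    offset = (calendar.FRIDAY - first_day.weekday()) % 7
    return first_day + datetime.timedelta(days=offset)


def next_upgrade_date(today):
    """Return the next scheduled upgrade date on or after `today`."""
    candidate = first_friday(today.year, today.month)
    if candidate >= today:
        return candidate
    # This month's window has passed; roll over to the next month.
    if today.month == 12:
        return first_friday(today.year + 1, 1)
    return first_friday(today.year, today.month + 1)
```

For example, first_friday(2014, 6) gives 2014-06-06, the day this thread
was started.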
We should also make sure that we don't reboot ALL servers at once. So if
we have multiple CentOS 6 jenkins slaves, try to reboot just one at a
time. It would also be nice if the slave did proper scheduling in jenkins
so that no jobs are running.

Yep, proper orchestration would be nice. Now, if jenkins is resilient
enough (i.e., if it can survive a 5-minute downtime), it may not be
that business-critical to have it running all the time.
I am not a jenkins expert, not even a user, so I defer to David for
that.
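
To make the "one at a time" idea concrete, here is a small Python sketch
of a rolling reboot loop. It is tool-agnostic on purpose: drain, reboot
and is_healthy are hypothetical callables that whoever runs it would have
to supply (for instance wrappers around ssh or whatever orchestration
tool ends up being chosen). Nothing here talks to jenkins directly.

```python
def rolling_reboot(hosts, drain, reboot, is_healthy):
    """Reboot hosts strictly one at a time, stopping at the first failure.

    Returns (done, failed): the hosts that came back healthy, and the
    host that did not (or None if every host succeeded).
    """
    done = []
    for host in hosts:
        drain(host)   # e.g. stop handing new jobs to this slave and wait
        reboot(host)
        if not is_healthy(host):
            # Stop so a bad update does not take down the remaining hosts.
            return done, host
        done.append(host)
    return done, None
```

Stopping at the first unhealthy host keeps the other slaves available
while someone looks at what went wrong.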
> And by update, I mean "yum upgrade -y". Cleaning the list of repos on
> various servers is also IMHO another task to discuss, to make sure the
> task can be safely executed. (Having something like
> mcollective/ansible/func is also needed, but that's more a convenience
> than a requirement at this stage.)

We sometimes have pinned versions on jenkins build slaves. That means we
should either do a proper yum versionlock or find something else. Note
that I'm all in favor of being able to do blind yum upgrades.

-- 
Michael Scherer
Open Source and Standards, Sysadmin
--=-FhjElmgt/jS0gQbeNfRJ--