--xHFwDpU9dbj6ez1V
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable
On 02/08 20:28, David Caro wrote:

Hi everyone!

There was a storage outage today. It started around 17:30 CEST and lasted
until ~20:15. All the services are back up and running now, but a bunch of
Jenkins jobs failed due to the outage (all the slaves use that storage),
so you might see some false positives (failures unrelated to your changes)
in your CI runs. To retrigger them, you can use this job:

http://jenkins.ovirt.org/gerrit_manual_trigger/

And/or submit a new patchset (rebasing should work). In any case, if you have
any issues or doubts, please reply to this email or ping me (dcaro/dcaroest)
on IRC.

Sorry for the inconvenience. We are gathering logs to find out what happened
and to prevent it from happening in the future.

So the source of the issue has been sorted out. An uncoordinated change ended
up modifying the LACP settings on the switches for all the hosts, which caused
a global network outage (all the hosts were affected). That in turn caused the
clustering to freeze: as none of the nodes was able to contact the network,
both went down.

Then, once the network came back up, the master of the cluster tried to
remount the DRBD storage but was unable to, because some process was keeping
it busy, and it did not fully start up.

That is a scenario that we did not test (we tested a single node failure, not
both), so we will have to investigate that failure case and find a solution
for the clustering.

We are also talking with the hosting provider so that they properly sync with
us on that type of intervention, so this will not happen again.

Thanks for your patience

-- 
David Caro

Red Hat S.L.
Continuous Integration Engineer - EMEA ENG Virtualization R&D

Tel.: +420 532 294 605
Email: dcaro(a)redhat.com
IRC: dcaro|dcaroest@{freenode|oftc|redhat}
Web: www.redhat.com
RHT Global #: 82-62605
--xHFwDpU9dbj6ez1V
Content-Type: application/pgp-signature; name="signature.asc"
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAEBAgAGBQJWuhFEAAoJEEBxx+HSYmnDkFMH+gNCa1Rd8dSoYaOdjgQBy3FZ
op05ye3zt7UaQ9ZcBlTtkmJ89XxtadEWXEc8gmqvaiuQxF6tSO9BX/jHN4OPMKvb
RNTEDpUCyxJmPl/hI/mbkOIQjaceTXjgJv/AgzrEGrBVKi5nbwxi8IIa2y7D8p53
hM/dy4DtFFTU6VtkQUHQsi7zANSLVE5/dd8B/er6KxfYH+PdzHTOnZz0kGiUXtkD
sN5pCqEeMXGKf8BsHxj0a8RBVQDBTKKakl5WoqGcCfXErGpG1lcpvXfI+dMN2H5r
7o0fVLCOZvTYKRXOmzdOhn4ZqlWMVrDaYoQjESAYaoR6/Oon+mWsrqsyNGQy5kM=
=UpXh
-----END PGP SIGNATURE-----
--xHFwDpU9dbj6ez1V--