[
https://ovirt-jira.atlassian.net/browse/OVIRT-2972?page=com.atlassian.jir...
]
Shlomi Zidmi commented on OVIRT-2972:
-------------------------------------
Actually, we are monitoring [resources.ovirt.org|http://resources.ovirt.org] in Prometheus
and should have caught this in time.
!prom.png|width=83.33333333333334%!
Looking at the defined alert rules, I see the following:
{noformat}
node_filesystem_free_bytes / node_filesystem_size_bytes{device!="tmpfs"} < 0.1
{noformat}
The blue line in the picture (/dev/sdb, mounted on /home/jenkins) stays slightly above 0.1,
so the free-space ratio never dropped below the threshold and no alerts were fired. I’ll
readjust the threshold so that Prometheus catches similar issues sooner next time.
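As a rough sketch of what a readjusted rule could look like in a Prometheus rules file: the expression is the same one used by the current rule, while the 0.2 threshold, the group and alert names, the 15m hold time, and the severity label are all assumptions for illustration, not the values actually deployed.

{noformat}
groups:
  - name: disk-space                     # hypothetical group name
    rules:
      - alert: FilesystemLowFreeSpace    # hypothetical alert name
        # Same expression as the existing rule; only the threshold differs.
        # 0.2 (20% free) is an assumed value, not the one actually chosen.
        expr: node_filesystem_free_bytes / node_filesystem_size_bytes{device!="tmpfs"} < 0.2
        for: 15m                         # assumed hold time before the alert fires
        labels:
          severity: warning              # assumed label
        annotations:
          summary: "Less than 20% free on {{ $labels.mountpoint }} ({{ $labels.instance }})"
{noformat}

With a higher threshold, a filesystem hovering slightly above 10% free would already match the expression. A rules file like this can be checked with promtool check rules before Prometheus is reloaded.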
resources.ovirt.org ran out of space for artifact publishing
------------------------------------------------------------
Key: OVIRT-2972
URL: https://ovirt-jira.atlassian.net/browse/OVIRT-2972
Project: oVirt - virtualization made easy
Issue Type: Bug
Reporter: Evgheni Dereveanchin
Assignee: infra
Attachments: image-20200720-083159.png, prom.png
This weekend Nagios sent disk space alerts for /home/jenkins on
resources.ovirt.org.
AFAIR this is used as intermediate artifact storage for publishing and shouldn't fill
up that much. Logging a ticket to investigate the root cause.
The partition has free space again, so this is not blocking anything right now.
--
This message was sent by Atlassian Jira
(v1001.0.0-SNAPSHOT#100133)