Shlomi Zidmi updated OVIRT-2996:
--------------------------------
Resolution: Fixed
Status: Done (was: To Do)
This was caused by /var/lib/origin being 100% full.
We are now monitoring the disks and investigating the root cause, as we still don't
know why the origin-node service crashes.
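As an illustration of the kind of disk check mentioned above, here is a minimal sketch in Python. It is not the monitoring actually deployed for this host; the function names and the 90% threshold are assumptions made for the example, and only the path /var/lib/origin comes from this ticket.

```python
import shutil


def usage_percent(path: str) -> float:
    """Return the used share (0-100) of the filesystem that holds `path`."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        # Some pseudo-filesystems report a size of zero; treat them as empty.
        return 0.0
    return 100.0 * usage.used / usage.total


def is_nearly_full(path: str, threshold_percent: float = 90.0) -> bool:
    """True when usage is at or above the threshold (hypothetical default: 90%)."""
    return usage_percent(path) >= threshold_percent


if __name__ == "__main__":
    # Path taken from the comment above; it may not exist on other machines.
    path = "/var/lib/origin"
    try:
        state = "nearly full" if is_nearly_full(path) else "ok"
        print(f"{path}: {usage_percent(path):.1f}% used ({state})")
    except FileNotFoundError:
        print(f"{path}: not present on this host")
```

Run from cron or a similar scheduler, a check like this could warn before the filesystem reaches 100%. It would not explain why the origin-node service crashes.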
[FIRING:1] InstanceUnreachable ibm-srv02.ovirt.org kubernetes-nodes-exporter (amd64 linux ibm-srv02.ovirt.org true external bare-metal-external ci)
---------------------------------------------------------------------------------------------------------------------------------------------------
Key: OVIRT-2996
URL: https://ovirt-jira.atlassian.net/browse/OVIRT-2996
Project: oVirt - virtualization made easy
Issue Type: Bug
Reporter: Alertmanager_Bot
Assignee: infra
Labels: ALERT{alertname="InstanceUnreachable",instance="ibm-srv02.ovirt.org",job="kubernetes-nodes-exporter"}
Labels:
- alertname = InstanceUnreachable
- beta_kubernetes_io_arch = amd64
- beta_kubernetes_io_os = linux
- instance = ibm-srv02.ovirt.org
- job = kubernetes-nodes-exporter
- kubernetes_io_hostname = ibm-srv02.ovirt.org
- node_role_kubernetes_io_compute = true
- region = external
- type = bare-metal-external
- zone = ci
Annotations:
- description = ibm-srv02.ovirt.org of job kubernetes-nodes-exporter has been possibly
down for more than 10 minutes.
Source: http://prometheus-0:9090/graph?g0.expr=up%7Bjob%3D%22kubernetes-nodes-exp...