
Hi,
I have Prometheus-based monitoring of my oVirt hosts (node_exporter, smartctl_exporter, ipmi_exporter; https://prometheus-community.github.io/ansible/branch/main/) with alerts from https://samber.github.io/awesome-prometheus-alerts/
After I started this monitoring I found that one VM is overloading local storage (so I must check the IO limiting documentation as homework :) )
But my question is: how do you monitor IO traffic per VM? (IOPS, read/write traffic, ...)
Some qemu/libvirt exporter? Some custom text file + node_exporter?
Thanks for tips,
Marek
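
A minimal sketch of the "custom text file + node_exporter" idea from the question. It assumes the per-VM counters come from `virsh domstats --block` output and that node_exporter runs with `--collector.textfile.directory` pointing at the directory the file is written to. The metric names, the `SAMPLE` text, and the helper names are hypothetical, chosen only for illustration.

```python
import os
import re

# Matches lines such as "  block.0.rd.bytes=4096" in `virsh domstats --block` output.
_STAT = re.compile(r"^\s*block\.(\d+)\.(name|rd\.reqs|rd\.bytes|wr\.reqs|wr\.bytes)=(.*)$")
_DOMAIN = re.compile(r"^Domain: '(.*)'$")

# virsh field -> hypothetical Prometheus metric name (cumulative counters).
_METRICS = {
    "rd.reqs": "vm_block_read_requests_total",
    "rd.bytes": "vm_block_read_bytes_total",
    "wr.reqs": "vm_block_write_requests_total",
    "wr.bytes": "vm_block_write_bytes_total",
}

# Illustrative input in the shape virsh prints; the values are made up.
SAMPLE = """Domain: 'vm1'
  block.count=1
  block.0.name=vda
  block.0.rd.reqs=100
  block.0.rd.bytes=4096
  block.0.wr.reqs=50
  block.0.wr.bytes=2048
"""


def domstats_to_prom(text):
    """Convert `virsh domstats --block` text into Prometheus text-format lines."""
    lines = []
    domain = None
    names = {}  # block index -> device name for the current domain
    for raw in text.splitlines():
        m = _DOMAIN.match(raw.strip())
        if m:
            domain = m.group(1)
            names = {}
            continue
        m = _STAT.match(raw)
        if not m or domain is None:
            continue
        idx, field, value = m.groups()
        if field == "name":
            names[idx] = value
            continue
        device = names.get(idx, idx)  # fall back to the index if no name was seen
        # Note: label values are not escaped here; VM names with quotes would need it.
        lines.append('%s{domain="%s",device="%s"} %s'
                     % (_METRICS[field], domain, device, value))
    return lines


def write_textfile(path, lines):
    """Write atomically so node_exporter never reads a half-written file."""
    tmp = path + ".tmp"
    with open(tmp, "w") as fh:
        fh.write("\n".join(lines) + "\n")
    os.replace(tmp, path)
```

On a host, the input text would come from something like `virsh -r domstats --block` run from cron or a systemd timer (the read-only `-r` connection is an assumption about what the host allows), and IOPS or throughput would then be derived in Prometheus with `rate()` over these counters.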

For detailed monitoring I use Zabbix. This way I get detailed metrics on my hypervisors and VMs as well as my network storage.
If a machine starts generating heavy IO, I get alerts highlighting the responsible machine as well as the impacted services. For example, you might see high IO on a VM but also the correlated high latency on systems sharing the storage.
Sometimes users will report the high latency, masking the real problem, so it's nice to have a holistic view of the entire environment.
Patrick.Dubois
On 2024-02-13 11:19, marek wrote:
> how do you monitor IO traffic per VM? (IOPS, read/write traffic,..)
> some qemu/libvirt exporter? some custom text file + node_exporter?
_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-leave@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/6HVHFX464QJPJT...

Hi.
I also use Zabbix here. Its problem is collecting metrics in real time; that is not what it is designed for. There are alternatives such as Elasticsearch + Metricbeat, but from what I've tested that stack is very heavy and uses a lot of disk space, lol.
I have never used Prometheus, but I find it interesting. I'll do some tests.
@Patrick Dubois <pat@pdubois.com> How often do you collect information with Zabbix? Every minute? Because, for example, for the information to be usable for analysis, the IO load has to last at least one continuous minute so that Zabbix can collect the correct information.
Cheers!
On Tue, 13 Feb 2024 at 13:44, Patrick Dubois via Users <users@ovirt.org> wrote:
> For detailed monitoring I use Zabbix. This way I get detailed metrics on my hypervisors, VMs as well as my network storage.
--
Regards,
Jorge Visentini
+55 55 98432-9868

I have different collection schedules depending on the importance of the data I'm collecting. You can adjust them as needed.
For your IO issues, you can simply poll your machine's IO load statistics from the 5/10/15-minute averages. That will not give you precise intervals, but it will certainly tell you if something is going wrong.
To be honest, Zabbix is flexible enough to get you what you need even if you're not monitoring the metric directly. Anything you can do to raise system visibility is good stuff!
Enjoy!
On 2024-02-13 14:32, Jorge Visentini wrote:
> How often do you collect information with Zabbix? Every 1 minute?
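
On the polling-interval question, a small worked example may help. It assumes the polled item is a cumulative counter (such as the total number of write requests on a disk) and not an instantaneous gauge. In that case a short burst is not lost between polls; it shows up as a lower average over the interval. The numbers and the function name below are made up for illustration.

```python
def rate_per_second(prev_count, curr_count, interval_seconds):
    """Average rate between two samples of a monotonically increasing counter."""
    if interval_seconds <= 0:
        raise ValueError("interval must be positive")
    delta = curr_count - prev_count
    if delta < 0:
        # Counter went backwards (e.g. the VM was restarted):
        # treat the current value as the increase since the reset.
        delta = curr_count
    return delta / interval_seconds


# Hypothetical burst: 30,000 writes issued within 10 seconds,
# then the VM is idle for the remaining 50 seconds of a 60-second poll.
burst_iops = rate_per_second(0, 30_000, 10)   # 3000.0 while the burst lasts
polled_iops = rate_per_second(0, 30_000, 60)  # 500.0 as seen by a 60-second poll
```

So a 60-second poll still registers the burst, only flattened from 3000 to 500 IOPS. This is the same idea as Zabbix's "Change per second" preprocessing step or Prometheus's `rate()`; a shorter interval sharpens the picture at the cost of more stored samples.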

Hi Marek,
In fact, all the data you need is already collected by oVirt/VDSM itself and saved into the DWH database. I configured sql_exporter for Prometheus, which runs queries against the DWH database to gather the data I need. This is exported to Prometheus, where I can query all the data and alert on, for example, I/O usage.
Jean-Louis
On 13/02/2024 17:19, marek wrote:
> how do you monitor IO traffic per VM? (IOPS, read/write traffic,..)
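
A rough sketch of the kind of query such a setup might run, not Jean-Louis's actual configuration. The table `vm_disk_samples_history` and its columns are assumptions about the DWH (`ovirt_engine_history`) schema and should be verified against your installation. To keep the example runnable anywhere, an in-memory SQLite table with made-up rows stands in for the real PostgreSQL database; `busy_disks` and `demo_cursor` are hypothetical helper names.

```python
import sqlite3

# ASSUMPTION: table and column names as they may appear in the oVirt DWH;
# check them against your own ovirt_engine_history schema before use.
LATEST_DISK_RATES = """
SELECT s.vm_disk_id, s.read_rate_bytes_per_second, s.write_rate_bytes_per_second
FROM vm_disk_samples_history AS s
JOIN (
    SELECT vm_disk_id, MAX(history_datetime) AS latest
    FROM vm_disk_samples_history
    GROUP BY vm_disk_id
) AS m ON s.vm_disk_id = m.vm_disk_id AND s.history_datetime = m.latest
"""


def busy_disks(cursor, limit_bytes_per_second):
    """Return (disk_id, total_rate) for disks whose latest sample exceeds the limit."""
    cursor.execute(LATEST_DISK_RATES)
    busy = []
    for disk_id, read_rate, write_rate in cursor.fetchall():
        total = (read_rate or 0) + (write_rate or 0)  # NULL rates count as zero
        if total > limit_bytes_per_second:
            busy.append((disk_id, total))
    # Busiest disk first.
    return sorted(busy, key=lambda item: item[1], reverse=True)


def demo_cursor():
    """In-memory SQLite stand-in for the DWH, filled with made-up samples."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE vm_disk_samples_history ("
        "vm_disk_id TEXT, history_datetime TEXT, "
        "read_rate_bytes_per_second INTEGER, write_rate_bytes_per_second INTEGER)"
    )
    conn.executemany(
        "INSERT INTO vm_disk_samples_history VALUES (?, ?, ?, ?)",
        [
            ("disk-a", "2024-02-14 09:00:00", 1000, 2000),
            ("disk-a", "2024-02-14 09:01:00", 50_000_000, 70_000_000),
            ("disk-b", "2024-02-14 09:01:00", 1000, 1000),
        ],
    )
    return conn.cursor()
```

With the demo data, `busy_disks(demo_cursor(), 100_000_000)` flags only `disk-a`. In the setup described above, the SELECT would live in the sql_exporter configuration and the threshold in a Prometheus alert rule; the Python here only illustrates the data flow.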

Can you publish your sql_exporter configuration?
I found this exporter: https://github.com/czerwonk/ovirt_exporter. I will give it a try.
Marek
On 2024-02-14 at 9:04, Jean-Louis Dupond wrote:
> I configured sql_exporter for prometheus which does queries on the DWH database to gather the data I need.

https://bugzilla.redhat.com/show_bug.cgi?id=1679333
You can find the config I use in this bug report :)
Jean-Louis
On 14/02/2024 13:24, marek wrote:
> can you publish your sql_exporter configuration?

By the way, I found this interesting discussion about a libvirt exporter: https://github.com/prometheus-community/community/issues/50
Marek
On 2024-02-14 at 14:19, Jean-Louis Dupond wrote:
> You can find the config I use in this bugreport :)
participants (4)
- Jean-Louis Dupond
- Jorge Visentini
- marek
- Patrick Dubois