I tried to measure I/O using gluster volume top, but its results seem very cryptic to me
(they would need a deeper analysis and I don't have the time now).
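As a side note, gluster volume top can apparently also run a small throughput test on the
bricks instead of just counting calls; I haven't tried this form yet, so take the exact
invocation as an assumption from the docs rather than something verified here:

  # measure raw brick read/write throughput with 4 KiB blocks
  gluster volume top gv1 read-perf bs 4096 count 1024
  gluster volume top gv1 write-perf bs 4096 count 1024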
Thank you very much for your analysis. If I understood correctly, the problem is that the
consumer SSD cache is too weak to keep up even under a small number (~15) of not
particularly I/O intensive VMs, so I/O stalls because the performance is poor, and this
hangs the VMs. The VM kernel then thinks the CPU has hung and crashes.
This seems to be the case....
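That crash behavior matches the guest kernel's hung-task / soft-lockup watchdogs. As a
stopgap one could perhaps tune them inside the guests so that slow storage only stalls
the VMs instead of crashing them; this is an assumption on my part, I haven't tested it
on these VMs:

  # inside a guest: log watchdog hits instead of panicking
  sysctl -w kernel.hung_task_panic=0
  sysctl -w kernel.softlockup_panic=0
  # give slow storage more time before a task is flagged as hung
  sysctl -w kernel.hung_task_timeout_secs=300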
If possible, it would be very useful to have a sort of profiler in the Gluster environment
that surfaces evidence of issues related to the speed of the underlying storage
infrastructure, whether the problem lies in the disks or in the network. In any case, the
errors currently reported to the user are quite misleading, as they suggest a data
integrity issue ("cannot read..." or something like this).
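Something close to this may already exist in Gluster itself: as far as I can tell
(untested on my side, so consider this a pointer rather than a recipe), gluster volume
profile collects per-brick latency statistics for each file operation, which should make
a slow disk or a slow network stand out:

  gluster volume profile gv1 start
  # ...let the VMs run under load for a while...
  gluster volume profile gv1 info
  gluster volume profile gv1 stop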
Just for reference, these are the first lines of the "open" top command output
(currently I'm not experiencing problems):
[root@ovirt-node2 ~]# gluster volume top gv1 open
Brick: ovirt-node2.ovirt:/brickgv1/gv1
Current open fds: 15, Max open fds: 38, Max openfd time: 2022-09-19 07:27:20.033304 +0000
Count filename
=======================
331763 /45b4f14c-8323-482f-90ab-99d8fd610018/dom_md/inbox
66284 /45b4f14c-8323-482f-90ab-99d8fd610018/dom_md/leases
53939 /45b4f14c-8323-482f-90ab-99d8fd610018/dom_md/metadata.new
169 /45b4f14c-8323-482f-90ab-99d8fd610018/images/910fa026-d30b-4be2-9111-3c9f4f646fde/b7d6f39a-1481-4f5c-84fd-fc43f9e14d71
[...]