[ovirt-users] Storage latency message

Tue Apr 18 14:42:01 UTC 2017

Once upon a time, Nir Soffer <nsoffer at redhat.com> said:
> Ovirt is reading 4k from the metadata special volume every 10 seconds. If
> the read takes more than 5 seconds, you will see this warning in engine
> event log.
> 
> Maybe your storage or the host was overloaded at that time (e.g. vm backup)?

I don't see any evidence that the storage was having any problem.  The
message is not logged at high-load times either (neither during
scheduled backups nor during periods of high demand).

I wrote a Perl script to replicate the check, and I ran it on a node in
maintenance mode (so no other traffic on the node).  My script opens a
block device with O_DIRECT, reads the first 4K, and closes it, reporting
the time.  I do see some latency jumps with that check, but not on the
raw block device, just the LV.
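
For anyone who wants to try the same kind of check, here is a rough
sketch in Python (not the Perl script itself) of what is described
above: open a device with O_DIRECT, read the first 4K, close it, and
report the elapsed time.  The function name timed_read, the "direct"
switch, and the sampling loop are assumptions made for illustration;
device paths are taken from the command line rather than hard-coded.
It is Linux-only and needs read access to the devices.

```python
import mmap
import os
import sys
import time

BLOCK_SIZE = 4096  # the check reads the first 4K of the device


def timed_read(path, size=BLOCK_SIZE, direct=True):
    """Open path, read the first `size` bytes, close it.

    Returns (elapsed_seconds, bytes_read).  The timing covers
    open + read + close, as in the check described in the post.
    """
    flags = os.O_RDONLY
    if direct:
        # O_DIRECT bypasses the page cache, so the read hits storage.
        flags |= os.O_DIRECT
    # O_DIRECT needs an aligned buffer; an anonymous mmap is page-aligned.
    buf = mmap.mmap(-1, size)
    try:
        start = time.perf_counter()
        fd = os.open(path, flags)
        try:
            nread = os.readv(fd, [buf])
        finally:
            os.close(fd)
        elapsed = time.perf_counter() - start
    finally:
        buf.close()
    return elapsed, nread


if __name__ == "__main__":
    # Hypothetical usage: pass the multipath device and the metadata LV
    # as arguments and compare the two columns of output.
    devices = sys.argv[1:]
    samples = 10 if devices else 0
    for _ in range(samples):
        for dev in devices:
            elapsed, nread = timed_read(dev)
            print("%s: %d bytes in %.3f ms" % (dev, nread, elapsed * 1000))
        time.sleep(1)
```

Running it against both the PV and the LV in the same loop makes it
easy to see whether a spike on one lines up with a spike on the other.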

By that I mean I'm running it on two devices: the multipath device that
is the PV, and the metadata LV.  The multipath device latency is pretty
stable, at around 0.3 to 0.5ms.  The LV latency is normally just a
little higher, but it is more variable and spikes to 50-125ms (at the
same time that reading the multipath device took under 0.5ms).

Seems like this might be a problem somewhere in the Linux logical volume
layer, not the block or network layer (or with the network/storage
itself).
-- 
Chris Adams <cma at cmadams.net>