On Fri, Dec 8, 2017 at 4:18 PM Kevin Wolf <kwolf@redhat.com> wrote:
On 07.12.2017 at 23:45, Nir Soffer wrote:
> The qemu bug https://bugzilla.redhat.com/713743 explains the issue:
> qemu-img was writing disk images using writeback and filling up the
> cache buffers, which are then flushed by the kernel, preventing other
> processes from accessing the storage. This is particularly bad in
> cluster environments where time-based algorithms might be in place and
> accessing the storage within certain timeouts is critical.
>
> I'm not sure if this issue is still relevant. We now use sanlock
> instead of safelease (except for the export domain, which still uses
> safelease), and qemu or the kernel may have better options today to
> avoid trashing the host cache or to guarantee reliable access to
> storage.

Non-direct means that the data goes through the kernel page cache, and
the kernel doesn't know that it won't be needed again, so it will fill
up the cache with the image.
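
For what it's worth, the generic way to tell the kernel that data won't
be needed again is posix_fadvise(); a minimal sketch, with error
handling omitted - note that POSIX_FADV_DONTNEED only drops clean
pages, so dirty data has to be flushed first:

    #include <fcntl.h>
    #include <unistd.h>

    /* Drop a file's pages from the page cache after writing it. */
    static void drop_cache(int fd)
    {
        fdatasync(fd);                                /* flush dirty pages */
        posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED); /* len 0 = whole file */
    }

That only mitigates the cache pollution, though; it doesn't help with
coherency.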

I'm also not aware that cache coherency is now provided by all backends
for shared storage, so O_DIRECT still seems to be the only way to avoid
using stale caches. Since the problem is about stale caches, I don't see
how the locking mechanism could make a difference.
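
Note that qemu-img already exposes O_DIRECT through its cache mode
flags: convert takes -t for the target and -T for the source, and
"none" selects O_DIRECT. The main caveat with O_DIRECT is alignment;
a rough sketch of what it implies (path and sizes are illustrative,
error handling omitted):

    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <stdlib.h>
    #include <unistd.h>

    int main(void)
    {
        /* With O_DIRECT, the buffer address, length and file offset
         * must all be aligned to the logical block size (512 or 4096
         * bytes, depending on the device). */
        int fd = open("/path/to/image", O_RDONLY | O_DIRECT);
        void *buf;
        posix_memalign(&buf, 4096, 1 << 20);  /* 1 MiB aligned buffer */
        pread(fd, buf, 1 << 20, 0);           /* aligned length/offset */
        free(buf);
        close(fd);
        return 0;
    }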

The only thing I can suggest, given that there is a "glusterfs" in the
subject line of the email, is that the native gluster driver in QEMU
takes a completely different path and never uses the kernel page cache,
which should make both problems disappear. Maybe it would be worth
having a look at this.
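
For background, the native driver goes through libgfapi instead of a
FUSE mount, so client-side I/O never enters the kernel page cache.
Roughly along these lines (volume, host and path are made up):

    #include <fcntl.h>
    #include <glusterfs/api/glfs.h>

    int main(void)
    {
        /* Connect to a gluster volume directly over the network;
         * 24007 is the standard management port. */
        glfs_t *fs = glfs_new("myvolume");
        glfs_set_volfile_server(fs, "tcp", "gluster.example.com", 24007);
        glfs_init(fs);

        glfs_fd_t *fd = glfs_open(fs, "/images/disk.img", O_RDONLY);
        char buf[4096];
        glfs_read(fd, buf, sizeof(buf), 0);  /* no page cache involved */
        glfs_close(fd);
        glfs_fini(fs);
        return 0;
    }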

This is an interesting direction - currently oVirt does not use native
glusterfs for qemu-img operations, only for running VMs. We still use
the FUSE-based glusterfs mount for storage operations like copying and
converting images.
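
If we go that way, qemu-img can address the volume with a gluster URI
directly, so a convert over the native driver would look something
like this (host, volume and image names are placeholders):

    qemu-img convert -p -t none -T none src.img \
        gluster://gluster.example.com/myvolume/images/dst.img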
 

Kevin