Re: [ovirt-users] slow performance with export storage on glusterfs

7 Dec 2017

      On Thu, Nov 23, 2017 at 3:33 PM Yaniv Kaul <ykaul@redhat.com> wrote:
...
On Thu, Nov 23, 2017 at 1:43 PM, Jiří Sléžka <jiri.slezka@slu.cz> wrote:
...
well, another idea
when I did not use the direct flag, the performance was much better
15787360256 bytes (16 GB) copied, 422.955159 s, 37.3 MB/s
That means you were hitting the cache.
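A rough way to see how much of that difference is the page cache is to make
the buffered run pay for its final flush (conv=fsync) and compare it with
oflag=direct. The sketch below assumes GNU dd; compare_writes and its target
directory argument are hypothetical names, not something vdsm or oVirt
provides.

```shell
# Write the same amount of zeros twice: once buffered with a final fsync,
# once with O_DIRECT, so both timings include getting the data to storage.
compare_writes() {
    target_dir="$1"    # hypothetical: a directory on the storage under test
    count="${2:-64}"   # number of 1 MiB blocks to write
    out="$target_dir/__ddtest__.$$"
    for mode in conv=fsync oflag=direct; do
        printf '%s: ' "$mode"
        # dd prints its summary (or an error) on stderr; keep the last line.
        dd if=/dev/zero of="$out" bs=1M count="$count" "$mode" 2>&1 | tail -n 1
        rm -f "$out"
    done
}
```

Calling it with the mounted export domain directory as the first argument
prints one dd summary line per mode. If the file system does not support
O_DIRECT, the second line shows the dd error instead.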
...
probably qemu-img uses direct write too and I understand why. But in
the backup case it is not as critical, I think. Is there a chance to modify
this behavior for the backup case? Is it a good idea? Should I file an RFE?
Probably not. We really prefer direct IO to ensure data is consistent.
Y
I did some research in prehistoric oVirt source, and found why we started to
use direct I/O. We have two issues:

1. reading stale data from storage
2. trashing host cache

If you don't use direct I/O when accessing shared storage, you risk reading
stale data from the kernel buffer cache. This cache may be stale since the
kernel does not know anything about other hosts writing to the same storage
after the last read from this storage.

The -t none option in vdsm was introduced because of
https://bugzilla.redhat.com/699976.

The qemu bug https://bugzilla.redhat.com/713743 explains the issue:
qemu-img was writing disk images using writeback and filling up the cache
buffers, which are then flushed by the kernel, preventing other processes
from accessing the storage. This is particularly bad in cluster environments
where time-based algorithms might be in place and accessing the storage
within certain timeouts is critical.

I'm not sure if this issue is still relevant. We now use sanlock instead of
safelease (except for the export domain, which still uses safelease), and qemu
or the kernel may have better options to avoid trashing the host cache, or to
guarantee reliable access to storage.

David, do you know if sanlock is affected by trashing the host cache?

Adding also qemu-block mailing list.

Nir
...
...
Cheers,
Jiri
On 11/23/2017 12:26 PM, Jiří Sléžka wrote:
...
Hi,
On 11/22/2017 07:30 PM, Nir Soffer wrote:
...
On Mon, Nov 20, 2017 at 5:22 PM Jiří Sléžka <jiri.slezka@slu.cz> wrote:
Hi,
I am trying to understand why exporting a VM to export storage on
glusterfs is so slow.
I am using oVirt and RHV, both installations on version 4.1.7.
Hosts have dedicated NICs for the rhevm network (1 Gbps); the data storage
itself is on FC.
The GlusterFS cluster lives separately on 4 dedicated hosts. It has slow
disks, but I can achieve about 200-400 Mbit/s throughput in other
applications (we are using it for "cold" data, mostly backups).
I am using this GlusterFS cluster as the backend for export storage. When I
am exporting a VM I can see only about 60-80 Mbit/s throughput.
What could be the bottleneck here?
Could it be qemu-img utility?
vdsm      97739  0.3  0.0 354212 29148 ?        S<l  15:43   0:06
    /usr/bin/qemu-img convert -p -t none -T none -f raw
/rhev/data-center/2ff6d0ee-a10b-473d-b77c-be9149945f5f/ff3cd56a-1005-4426-8137-8f422c0b47c1/images/ba42cbcc-c068-4df8-af3d-00f2077b1e27/c57acd5f-d6cf-48cc-ad0c-4a7d979c0c1e
...
-O raw
    /rhev/data-center/mnt/glusterSD/10.20.30.41:_rhv__export/81094499-a392-4ea2-b081-7c6288fbb636/images/ba42cbcc-c068-4df8-af3d-00f2077b1e27/c57acd5f-d6cf-48cc-ad0c-4a7d979c0c1e
...
Any idea how to make it work faster, or what throughput I should expect?
gluster storage operations use a fuse mount - so every write has to:
- travel to the kernel
- travel back to the gluster fuse helper process
- travel to all 3 replicas - replication is done on the client side
- return to the kernel when all writes have succeeded
- return to the caller
So gluster will never set any speed record.
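A back-of-the-envelope ceiling for the write path above, assuming the client
pushes all three copies through a single 1 Gbps link and ignoring protocol
overhead. The thread does not say which NIC carries the gluster traffic, so
the link speed is an assumption, and replica_ceiling is a hypothetical helper:

```shell
# Upper bound on client write throughput when every write is sent to each
# replica over one shared link (protocol overhead ignored).
replica_ceiling() {
    awk -v link_mbit="$1" -v replicas="$2" 'BEGIN {
        printf "%.1f MB/s\n", link_mbit / 8 / replicas
    }'
}

replica_ceiling 1000 3
```

Under these assumptions the ceiling is about 41.7 MB/s, so a single-digit
MB/s result would point at something other than the link itself.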
Additionally, you are copying from a raw LV on FC - qemu-img cannot do
anything smart and avoid copying unused clusters. Instead it copies gigabytes
of zeros from FC.
ok, it does make sense
...
However 7.5-10 MiB/s sounds too slow.
I would try to test with dd - how much time it takes to copy
the same image from FC to your gluster storage?
dd
if=/rhev/data-center/2ff6d0ee-a10b-473d-b77c-be9149945f5f/ff3cd56a-1005-4426-8137-8f422c0b47c1/images/ba42cbcc-c068-4df8-af3d-00f2077b1e27/c57acd5f-d6cf-48cc-ad0c-4a7d979c0c1e
...
of=/rhev/data-center/mnt/glusterSD/10.20.30.41:_rhv__export/81094499-a392-4ea2-b081-7c6288fbb636/__test__
bs=8M oflag=direct status=progress
unfortunately dd performs the same
1778384896 bytes (1.8 GB) copied, 198.565265 s, 9.0 MB/s
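To put the dd results in this thread on the same scale as the 60-80 Mbit/s
reported for the export, the summary lines can be converted to Mbit/s. A
small sketch with awk, using decimal units as dd does; rate is a
hypothetical helper name:

```shell
# Convert a dd summary (bytes copied, seconds) to MB/s and Mbit/s.
rate() {
    awk -v bytes="$1" -v secs="$2" 'BEGIN {
        bps = bytes / secs
        printf "%.1f MB/s = %.1f Mbit/s\n", bps / 1e6, bps * 8 / 1e6
    }'
}

rate 1778384896 198.565265     # dd with oflag=direct
rate 15787360256 422.955159    # dd without the direct flag
```

The direct run works out to about 72 Mbit/s, in line with what was seen
during the export, and the run without the direct flag to about 299 Mbit/s.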
...
If dd can do this faster, please ask on qemu-discuss mailing list:
https://lists.nongnu.org/mailman/listinfo/qemu-discuss
If both give similar results, I think asking in gluster mailing list
about this can help. Maybe your gluster setup can be optimized.
ok, this is definitely on the gluster side. Thanks for your guidance.
I will investigate the gluster side and will also try Export on an NFS
share.
Cheers,
Jiri
...
Nir
Cheers,
Jiri
_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users