On March 7, 2020 1:09:37 AM GMT+02:00, Jayme <jaymef(a)gmail.com> wrote:
Strahil,
Thanks for your suggestions. The config is a pretty standard HCI setup
deployed with Cockpit, and the hosts are oVirt Node. XFS was handled by the
deployment automatically, and the gluster volumes were optimized for virt
store.
I tried noop on the SSDs; that made zero difference in the tests I was running
above. I took a look at the random-io profile and it looks like it really only
sets vm.dirty_background_ratio = 2 and vm.dirty_ratio = 5 -- my hosts already
appear to have those sysctl values, and by default they are using the
virtual-host tuned profile.
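For reference, this is roughly how I checked those values on the hosts (sda
below is just an example device; I ran it for each SSD):

# cat /sys/block/sda/queue/scheduler
# sysctl vm.dirty_background_ratio vm.dirty_ratio
# tuned-adm active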
I'm curious what a test like "dd if=/dev/zero of=test2.img bs=512 count=1000
oflag=dsync" on one of your VMs would show for results?
I haven't done much with gluster profiling but will take a look and see if I
can make sense of it. Otherwise, the setup is a pretty stock oVirt HCI
deployment with SSD-backed storage and a 10GbE storage network. I'm not coming
anywhere close to maxing out network throughput.
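From what I've read so far, the profiling commands would be something like
this (the volume name 'data' is just a placeholder):

# gluster volume profile data start
# <run the test workload inside the VM>
# gluster volume profile data info
# gluster volume profile data stop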
The NFS export I was testing was an export from a local server exporting a
single SSD (the same type as in the oVirt hosts).
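The export entry is roughly like this sketch (the path is made up, not copied
from the server); I realize that sync vs. async on the export makes a big
difference to oflag=dsync numbers:

/export/ssd  *(rw,sync,no_root_squash)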
I might end up switching storage to NFS and ditching gluster if performance is
really this much better...
On Fri, Mar 6, 2020 at 5:06 PM Strahil Nikolov <hunter86_bg(a)yahoo.com> wrote:
> On March 6, 2020 6:02:03 PM GMT+02:00, Jayme <jaymef(a)gmail.com> wrote:
> >I have a 3-server HCI setup with Gluster replica 3 storage (10GbE and SSD
> >disks).
> >Small file performance inside the VMs is pretty terrible compared to a
> >similarly spec'ed VM using an NFS mount (10GbE network, SSD disk).
> >
> >VM with gluster storage:
> >
> ># dd if=/dev/zero of=test2.img bs=512 count=1000 oflag=dsync
> >1000+0 records in
> >1000+0 records out
> >512000 bytes (512 kB) copied, 53.9616 s, 9.5 kB/s
> >
> >VM with NFS:
> >
> ># dd if=/dev/zero of=test2.img bs=512 count=1000 oflag=dsync
> >1000+0 records in
> >1000+0 records out
> >512000 bytes (512 kB) copied, 2.20059 s, 233 kB/s
> >
> >This is a very big difference: 2 seconds to write 1000 synced 512-byte
> >blocks on the NFS VM vs. 53 seconds on the gluster-backed VM.
> >
> >Aside from enabling libgfapi, is there anything I can tune on the gluster
> >or VM side to improve small file performance? I have seen some guides by
> >Red Hat about small file performance, but I'm not sure what, if any, of it
> >applies to oVirt's implementation of gluster in HCI.
>
> You can use the rhgs-random-io tuned profile from
> ftp://ftp.redhat.com/redhat/linux/enterprise/7Server/en/RHS/SRPMS/redhat-...
> and try with that on your hosts.
> In my case, I have modified it so it's a mixture between rhgs-random-io
> and the profile for Virtualization Host.
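> A rough sketch of how such a combined profile could look (the name and
> values are illustrative, not my exact profile), e.g. in
> /etc/tuned/virt-gluster-random-io/tuned.conf:
>
> [main]
> include=virtual-host
>
> [sysctl]
> vm.dirty_ratio=5
> vm.dirty_background_ratio=2
>
> and then activate it with 'tuned-adm profile virt-gluster-random-io'.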
>
> Also, ensure that your bricks are using XFS with the relatime/noatime mount
> option and that your scheduler for the SSDs is either 'noop' or 'none'. The
> default I/O scheduler for RHEL7 is deadline, which gives preference to
> reads, and your workload is definitely 'write'.
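> For example (sdb and the brick path are placeholders):
>
> # echo noop > /sys/block/sdb/queue/scheduler
>
> and a brick entry in /etc/fstab along the lines of:
>
> /dev/gluster_vg/gluster_lv /gluster_bricks/data xfs defaults,noatime 0 0
>
> To keep the scheduler setting across reboots, put it in a udev rule or in
> the tuned profile.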
>
> Ensure that the virt settings are enabled for your gluster volumes:
> 'gluster volume set <volname> group virt'
>
> Also, are you running fully allocated (preallocated) disks for the VMs, or
> did you start thin?
> I'm asking because creation of new shards at the gluster level is a slow
> task.
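> You can check with something like (the image path is just an example):
>
> # qemu-img info /rhev/data-center/mnt/glusterSD/<server>:_<volume>/<domain-uuid>/images/<image-uuid>/<volume-uuid>
>
> If 'disk size' is much smaller than 'virtual size', the disk is thin/sparse.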
>
> Have you tried profiling the gluster volume? It can clarify what is going
> on.
>
>
> Also, are you comparing apples to apples?
> For example, one SSD mounted and exported as NFS versus a replica 3 volume
> on the same type of SSD? If not, the NFS server can have more IOPS due to
> multiple disks behind it, while Gluster has to write the same data on all
> nodes.
>
> Best Regards,
> Strahil Nikolov
>
>
Hi Jayme,
My tests are not really comparable, as I have a different setup:
NVMe - VDO - 4 thin LVs - XFS - 4 Gluster volumes (replica 2 arbiter 1) - 4
storage domains - striped LV in each VM
RHEL7 VM (fully stock):
[root@node1 ~]# dd if=/dev/zero of=test2.img bs=512 count=1000 oflag=dsync
1000+0 records in
1000+0 records out
512000 bytes (512 kB) copied, 19.8195 s, 25.8 kB/s
[root@node1 ~]#
Brick:
[root@ovirt1 data_fast]# dd if=/dev/zero of=test2.img bs=512 count=1000 oflag=dsync
1000+0 records in
1000+0 records out
512000 bytes (512 kB) copied, 1.41192 s, 363 kB/s
As I use VDO with compression (on 1/4 of the NVMe), I cannot expect much
performance from it.
Is your app really using dsync? I have seen many times that performance
testing with the wrong tools/tests causes more trouble than it should.
I would recommend testing with a real workload before deciding to change the
architecture.
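For example, something like this fio run is usually closer to reality than dd
with dsync (the directory and sizes are just placeholders - tune them to match
the app):

# fio --name=smallfile --directory=/path/inside/vm --rw=randwrite --bs=4k \
      --size=256M --numjobs=4 --ioengine=libaio --direct=1 --fsync=1 \
      --runtime=60 --time_based --group_reporting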
I forgot to mention that you need to disable C-states on your systems if you
are chasing performance.
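One way to do that on RHEL7 (the kernel parameters below are an example -
check what fits your CPUs):

# grubby --update-kernel=ALL --args="processor.max_cstate=1 intel_idle.max_cstate=0"

and reboot, or use a latency-oriented tuned profile that limits C-states.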
Run a gluster profile while you run a real workload in your VMs and then
provide the output for analysis.
Which version of Gluster are you using ?
Best Regards,
Strahil Nikolov