One more interesting thing to note. As a test I just re-ran dd on the engine
VM and got around 3-5MB/sec average writes. I had not previously set this
volume to optimize for virt store, so I went ahead and set that option and
now I'm getting 50MB/sec writes.

However, if I compare my gluster engine volume info now vs. what I just
posted in the reply above (before I made the optimize change), all the gluster
options are identical; not one value has changed as far as I can see. What
is the "optimize for virt store" option in the admin GUI doing, exactly?
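
For the next volume I touch, one way to compare more completely might be to
dump every effective option before and after and diff them, since "gluster
volume info" only lists options that have been explicitly reconfigured. A
rough sketch (the file names are just examples; "engine" is the volume in
question here):

gluster volume get engine all > engine-options-before.txt
# click "optimize for virt store" in the admin GUI, then:
gluster volume get engine all > engine-options-after.txt
diff engine-options-before.txt engine-options-after.txt
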
On Sat, Aug 4, 2018 at 10:29 AM, Jayme <jaymef(a)gmail.com> wrote:
One more note on this. I only set optimize for virt on the data volumes. I
did not, and wasn't sure if I should, set it on the engine volume. My dd tests on
the engine VM are writing at ~8Mb/sec (like my test VM on the data volume was
before I made the change). Is it recommended to use optimize for virt
on the engine volume as well?
On Sat, Aug 4, 2018 at 10:26 AM, Jayme <jaymef(a)gmail.com> wrote:
> Interesting that it should have been set by cockpit but seemingly wasn't
> (at least it did not appear so in my case, as setting optimize for virt
> increased performance dramatically). I did indeed use the cockpit to
> deploy. I was using oVirt Node on all three hosts, a recent download/burn of
> 4.2.5. Here is my current gluster volume info, if it's helpful to anyone:
>
> Volume Name: data
> Type: Replicate
> Volume ID: 1428c3d3-8a51-4e45-a7bb-86b3bde8b6ea
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> Brick1: MASKED:/gluster_bricks/data/data
> Brick2: MASKED:/gluster_bricks/data/data
> Brick3: MASKED:/gluster_bricks/data/data
> Options Reconfigured:
> features.barrier: disable
> server.allow-insecure: on
> cluster.granular-entry-heal: enable
> performance.strict-o-direct: on
> network.ping-timeout: 30
> storage.owner-gid: 36
> storage.owner-uid: 36
> user.cifs: off
> features.shard: on
> cluster.shd-wait-qlength: 10000
> cluster.shd-max-threads: 8
> cluster.locking-scheme: granular
> cluster.data-self-heal-algorithm: full
> cluster.server-quorum-type: server
> cluster.quorum-type: auto
> cluster.eager-lock: enable
> network.remote-dio: enable
> performance.low-prio-threads: 32
> performance.io-cache: off
> performance.read-ahead: off
> performance.quick-read: off
> transport.address-family: inet
> nfs.disable: on
> performance.client-io-threads: off
>
> Volume Name: data2
> Type: Replicate
> Volume ID: e97a2e9c-cd47-4f18-b2c2-32d917a8c016
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> Brick1: MASKED:/gluster_bricks/data2/data2
> Brick2: MASKED:/gluster_bricks/data2/data2
> Brick3: MASKED:/gluster_bricks/data2/data2
> Options Reconfigured:
> server.allow-insecure: on
> cluster.granular-entry-heal: enable
> performance.strict-o-direct: on
> network.ping-timeout: 30
> storage.owner-gid: 36
> storage.owner-uid: 36
> user.cifs: off
> features.shard: on
> cluster.shd-wait-qlength: 10000
> cluster.shd-max-threads: 8
> cluster.locking-scheme: granular
> cluster.data-self-heal-algorithm: full
> cluster.server-quorum-type: server
> cluster.quorum-type: auto
> cluster.eager-lock: enable
> network.remote-dio: enable
> performance.low-prio-threads: 32
> performance.io-cache: off
> performance.read-ahead: off
> performance.quick-read: off
> transport.address-family: inet
> nfs.disable: on
> performance.client-io-threads: off
>
> Volume Name: engine
> Type: Replicate
> Volume ID: ae465791-618c-4075-b68c-d4972a36d0b9
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> Brick1: MASKED:/gluster_bricks/engine/engine
> Brick2: MASKED:/gluster_bricks/engine/engine
> Brick3: MASKED:/gluster_bricks/engine/engine
> Options Reconfigured:
> cluster.granular-entry-heal: enable
> performance.strict-o-direct: on
> network.ping-timeout: 30
> storage.owner-gid: 36
> storage.owner-uid: 36
> user.cifs: off
> features.shard: on
> cluster.shd-wait-qlength: 10000
> cluster.shd-max-threads: 8
> cluster.locking-scheme: granular
> cluster.data-self-heal-algorithm: full
> cluster.server-quorum-type: server
> cluster.quorum-type: auto
> cluster.eager-lock: enable
> network.remote-dio: off
> performance.low-prio-threads: 32
> performance.io-cache: off
> performance.read-ahead: off
> performance.quick-read: off
> transport.address-family: inet
> nfs.disable: on
> performance.client-io-threads: off
>
> Volume Name: vmstore
> Type: Replicate
> Volume ID: 7065742b-c09d-410b-9e89-174ade4fc3f5
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> Brick1: MASKED:/gluster_bricks/vmstore/vmstore
> Brick2: MASKED:/gluster_bricks/vmstore/vmstore
> Brick3: MASKED:/gluster_bricks/vmstore/vmstore
> Options Reconfigured:
> cluster.granular-entry-heal: enable
> performance.strict-o-direct: on
> network.ping-timeout: 30
> storage.owner-gid: 36
> storage.owner-uid: 36
> user.cifs: off
> features.shard: on
> cluster.shd-wait-qlength: 10000
> cluster.shd-max-threads: 8
> cluster.locking-scheme: granular
> cluster.data-self-heal-algorithm: full
> cluster.server-quorum-type: server
> cluster.quorum-type: auto
> cluster.eager-lock: enable
> network.remote-dio: off
> performance.low-prio-threads: 32
> performance.io-cache: off
> performance.read-ahead: off
> performance.quick-read: off
> transport.address-family: inet
> nfs.disable: on
> performance.client-io-threads: off
>
> Volume Name: vmstore2
> Type: Replicate
> Volume ID: 6f9a1c51-c0bc-46ad-b94a-fc2989a36e0c
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> Brick1: MASKED:/gluster_bricks/vmstore2/vmstore2
> Brick2: MASKED:/gluster_bricks/vmstore2/vmstore2
> Brick3: MASKED:/gluster_bricks/vmstore2/vmstore2
> Options Reconfigured:
> cluster.granular-entry-heal: enable
> performance.strict-o-direct: on
> network.ping-timeout: 30
> storage.owner-gid: 36
> storage.owner-uid: 36
> user.cifs: off
> features.shard: on
> cluster.shd-wait-qlength: 10000
> cluster.shd-max-threads: 8
> cluster.locking-scheme: granular
> cluster.data-self-heal-algorithm: full
> cluster.server-quorum-type: server
> cluster.quorum-type: auto
> cluster.eager-lock: enable
> network.remote-dio: off
> performance.low-prio-threads: 32
> performance.io-cache: off
> performance.read-ahead: off
> performance.quick-read: off
> transport.address-family: inet
> nfs.disable: on
> performance.client-io-threads: off
>
> On Fri, Aug 3, 2018 at 6:53 AM, Sahina Bose <sabose(a)redhat.com> wrote:
>
>>
>>
>> On Fri, 3 Aug 2018 at 3:07 PM, Jayme <jaymef(a)gmail.com> wrote:
>>
>>> Hello,
>>>
>>> The option to optimize for virt store is tough to find (in my opinion):
>>> you have to go to volumes > volume name and then click the two dots to
>>> expand further options in the top right to see it. No one would know to
>>> find it (or that it even exists) if they weren't specifically looking.
>>>
>>> I don't know enough about it, but my assumption is that there are
>>> reasons why it's not set by default (as it might not, or should not, need to
>>> apply to every volume created). However, my suggestion would be that it be
>>> included in the cockpit as a selectable option next to each volume you
>>> create, with a hint suggesting that for best performance it be selected for any
>>> volume that is going to be a data volume for VMs.
>>>
>>
>> If you have installed via Cockpit, the options are set.
>> Can you provide the "gluster volume info" output after you optimised
>> for virt?
>>
>>
>>
>>>
>>> I simply installed using the latest node ISO / default cockpit
>>> deployment.
>>>
>>> Hope this helps!
>>>
>>> - Jayme
>>>
>>> On Fri, Aug 3, 2018 at 5:15 AM, Sahina Bose <sabose(a)redhat.com> wrote:
>>>
>>>>
>>>>
>>>> On Fri, Aug 3, 2018 at 5:25 AM, Jayme <jaymef(a)gmail.com> wrote:
>>>>
>>>>> Bill,
>>>>>
>>>>> I thought I’d let you (and others) know this, as it might save you
>>>>> some headaches. I found that my performance problem was resolved by
>>>>> clicking the “optimize for virt store” option in the volume settings of the
>>>>> hosted engine (for the data volume). Doing this one change has increased
>>>>> my I/O performance by 10x alone. I don’t know why this would not be set or
>>>>> recommended by default, but I’m glad I found it!
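>>>>>
>>>>> (My understanding, though this is only an assumption on my part and not
>>>>> something I have confirmed, is that the GUI option applies roughly the same
>>>>> settings as the gluster "virt" option group, i.e. something along the lines
>>>>> of "gluster volume set data group virt" plus the vdsm owner uid/gid options,
>>>>> with "data" standing in for whichever volume is being optimized.)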
>>>>>
>>>>
>>>> Thanks for the feedback. Could you log a bug to make it the default,
>>>> providing the user flow that you used?
>>>>
>>>> Also, I would be interested to know how you prepared the gluster
>>>> volume for use - if it was using the Cockpit deployment UI, the volume
>>>> options would have been set by default.
>>>>
>>>>
>>>>> - James
>>>>>
>>>>> On Thu, Aug 2, 2018 at 2:32 PM, William Dossett <
>>>>> william.dossett(a)gmail.com> wrote:
>>>>>
>>>>>> Yeah, I am just ramping up here, but this project is mostly on my
>>>>>> own time and money, hence no SSDs for Gluster… I’ve already blown close to
>>>>>> $500 of my own money on 10Gb ethernet cards and SFPs on ebay as my company
>>>>>> frowns on us getting good deals for equipment on ebay and would rather go
>>>>>> to their preferred supplier – where $500 wouldn’t even buy half a 10Gb CNA
>>>>>> ☹ but I believe in this project and it feels like it is getting
>>>>>> ready for showtime – if I can demo this in a few weeks and get some
>>>>>> interest I’ll be asking them to reimburse me, that’s for sure!
>>>>>>
>>>>>>
>>>>>>
>>>>>> Hopefully going to get some of the other work off my plate and work
>>>>>> on this later this afternoon, will let you know any findings.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Regards
>>>>>>
>>>>>> Bill
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> *From:* Jayme <jaymef(a)gmail.com>
>>>>>> *Sent:* Thursday, August 2, 2018 11:07 AM
>>>>>> *To:* William Dossett <william.dossett(a)gmail.com>
>>>>>> *Cc:* users <users(a)ovirt.org>
>>>>>> *Subject:* Re: [ovirt-users] Tuning and testing GlusterFS
>>>>>> performance
>>>>>>
>>>>>>
>>>>>>
>>>>>> Bill,
>>>>>>
>>>>>>
>>>>>>
>>>>>> Appreciate the feedback and would be interested to hear some of your
>>>>>> results. I'm a bit worried about what I'm seeing so far on a very stock
>>>>>> 3-node HCI setup: 8mb/sec on that dd test mentioned in the original post
>>>>>> from within a VM (which may be explained by bad testing methods or some
>>>>>> other configuration considerations). But what is more worrisome to me is
>>>>>> that I tried another dd test to time creating a 32GB file; it was taking a
>>>>>> long time, so I exited the process and the VM basically locked up on me. I
>>>>>> couldn't access it or the console and eventually had to do a hard shutdown
>>>>>> of the VM to recover.
>>>>>>
>>>>>>
>>>>>>
>>>>>> I don't plan to host many VMs, probably around 15. They aren't
>>>>>> super demanding servers but some do read/write big directories such as
>>>>>> working with github repos and large node_module folders, rsyncs of fairly
>>>>>> large dirs etc. I'm definitely going to have to do a lot more testing
>>>>>> before I can be assured enough to put any important VMs on this cluster.
>>>>>>
>>>>>>
>>>>>>
>>>>>> - James
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Thu, Aug 2, 2018 at 1:54 PM, William Dossett <
>>>>>> william.dossett(a)gmail.com> wrote:
>>>>>>
>>>>>> I usually look at IOPs using IOMeter… you usually want several
>>>>>> workers running reads and writes in different threads at the same time.
>>>>>> You can run Dynamo on a Linux instance and then connect it to a Windows GUI
>>>>>> running IOMeter to give you stats. I was getting around 250 IOPs on JBOD
>>>>>> sata 7200rpm drives, which isn't bad for cheap and cheerful sata drives.
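>>>>>>
>>>>>> For a roughly comparable multi-worker test natively on Linux, something like
>>>>>> fio could be used in place of IOMeter/Dynamo (just a sketch; fio would need
>>>>>> to be installed in the guest, and the size, job count and read/write mix
>>>>>> below are only placeholders):
>>>>>>
>>>>>> fio --name=vmtest --ioengine=libaio --direct=1 --rw=randrw --rwmixread=70 --bs=4k --size=1G --numjobs=4 --iodepth=16 --runtime=60 --time_based --group_reporting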
>>>>>>
>>>>>>
>>>>>>
>>>>>> As I said, I've worked with HCI in VMware now for a couple of years,
>>>>>> intensely this last year when we had some defective Dell hardware and were
>>>>>> trying to diagnose the problem. Since then the hardware has been
>>>>>> completely replaced with an all-flash solution. So when I got the all-flash
>>>>>> solution I used IOMeter on it and was only getting around 3000 IOPs on
>>>>>> enterprise flash disks… not exactly stellar, but OK for one VM. The trick
>>>>>> there was the scale out. There is a VMware Fling called HCI Bench. It's very
>>>>>> cool in that you spin up one VM and then it spawns 40 more VMs across the
>>>>>> cluster. I could then use VSAN Observer, and it showed my hosts were
>>>>>> actually doing 30K IOPs on average, which is absolutely stellar
>>>>>> performance.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Anyway, the moral of the story there was that your one VM may seem like
>>>>>> it's quick, but not what you would expect from flash… but as you add more
>>>>>> VMs in the cluster and they are all doing workloads, it scales out
>>>>>> beautifully, and the read/write speed does not slow down as you add more
>>>>>> loads. I'm hoping that's what we are going to see with Gluster.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Also, you are using mb nomenclature below, is that Mb, or MB? I am
>>>>>> sort of assuming MB, megabytes per second… it does not seem very fast. I'm
>>>>>> probably not going to get to work more on my cluster today as I've got
>>>>>> other projects that I need to get done on time, but I want to try and get
>>>>>> some templates up and running and do some more testing either tomorrow or
>>>>>> this weekend, and see what I get in just basic MB/s writes, and let you know.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Regards
>>>>>>
>>>>>> Bill
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> *From:* Jayme <jaymef(a)gmail.com>
>>>>>> *Sent:* Thursday, August 2, 2018 8:12 AM
>>>>>> *To:* users <users(a)ovirt.org>
>>>>>> *Subject:* [ovirt-users] Tuning and testing GlusterFS
performance
>>>>>>
>>>>>>
>>>>>>
>>>>>> So I've finally completed my first HCI build using the below
>>>>>> configuration:
>>>>>>
>>>>>>
>>>>>>
>>>>>> 3x
>>>>>>
>>>>>> Dell PowerEdge R720
>>>>>>
>>>>>> 2x 2.9 GHz 8 Core E5-2690
>>>>>>
>>>>>> 256GB RAM
>>>>>>
>>>>>> 2x 250GB SSD RAID 1 (boot/OS)
>>>>>>
>>>>>> 2x 2TB SSD JBOD passthrough (used for Gluster bricks)
>>>>>>
>>>>>> 1GbE NIC for management, 10GbE NIC for Gluster
>>>>>>
>>>>>>
>>>>>>
>>>>>> Using Replica 3 with no arbiter.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Installed the latest version of oVirt available at the time, 4.2.5.
>>>>>> Created the recommended volumes (with an additional data volume on the
>>>>>> second SSD). Not using VDO.
>>>>>>
>>>>>>
>>>>>>
>>>>>> First thing I did was set up the GlusterFS network on 10GbE and set it to
>>>>>> be used for GlusterFS and migration traffic.
>>>>>>
>>>>>>
>>>>>>
>>>>>> I've set up a single test VM using CentOS 7 minimal on the default
>>>>>> "x-large instance" profile.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Within this VM, if I do a very basic write test using something like:
>>>>>>
>>>>>>
>>>>>>
>>>>>> dd bs=1M count=256 if=/dev/zero of=test conv=fdatasync
>>>>>>
>>>>>>
>>>>>>
>>>>>> I'm seeing quite slow speeds, only 8mb/sec.
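>>>>>>
>>>>>> (A variation that bypasses the guest page cache, in case caching is
>>>>>> skewing things either way, would be something like
>>>>>> "dd bs=1M count=1024 if=/dev/zero of=test oflag=direct"; just another
>>>>>> rough test, not a proper benchmark.)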
>>>>>>
>>>>>>
>>>>>>
>>>>>> If I do the same from one of the hosts' gluster mounts, i.e.
>>>>>>
>>>>>>
>>>>>>
>>>>>> host1: /rhev/data-center/mnt/glusterSD/HOST:data
>>>>>>
>>>>>>
>>>>>>
>>>>>> I get about 30mb/sec (which still seems fairly low?)
>>>>>>
>>>>>>
>>>>>>
>>>>>> Am I testing incorrectly here? Is there anything I should be tuning
>>>>>> on the Gluster volumes to increase performance with SSDs? Where can I find
>>>>>> out where the bottleneck is here, or is this the expected performance of
>>>>>> Gluster?
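>>>>>>
>>>>>> (One way to at least see where time is going on the Gluster side, if I
>>>>>> understand the tooling correctly, would be the built-in profiler, run from
>>>>>> one of the hosts while the dd test is running:
>>>>>>
>>>>>> gluster volume profile data start
>>>>>> gluster volume profile data info
>>>>>> gluster volume profile data stop
>>>>>>
>>>>>> with "data" being whichever volume the VM disk lives on.)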
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>