[ovirt-users] oVirt self-hosted with NFS on top of gluster

Thu Sep 7 10:47:25 UTC 2017

On Thu, Sep 7, 2017 at 12:52 PM, Abi Askushi <rightkicktech at gmail.com>
wrote:

>
>
> On Thu, Sep 7, 2017 at 10:30 AM, Yaniv Kaul <ykaul at redhat.com> wrote:
>
>>
>>
>> On Thu, Sep 7, 2017 at 10:06 AM, Yaniv Kaul <ykaul at redhat.com> wrote:
>>
>>>
>>>
>>> On Wed, Sep 6, 2017 at 6:08 PM, Abi Askushi <rightkicktech at gmail.com>
>>> wrote:
>>>
>>>> For a first idea I use:
>>>>
>>>> dd if=/dev/zero of=testfile bs=1GB count=1
>>>>
>>>
>>> This is an incorrect way to test performance, for various reasons:
>>> 1. You are not using oflag=direct , thus not using DirectIO, but using
>>> cache.
>>> 2. It's unrealistic - it is very uncommon to write large blocks of zeros
>>> (sometimes during FS creation or wiping). Certainly not 1GB
>>> 3. It is a single thread of IO - again, unrealistic for VM's IO.
>>>
>>> I forgot to mention that I include oflag=direct in my tests. I agree
> though that dd is not the correct way to test, hence I mentioned I just use
> it to get a first feel. More tests are done within the VM benchmarking its
> disk IO (with tools like IOmeter).
>
> I suggest using fio and such. See https://github.com/pcuzner/fio-tools
>>> for example.
>>>
>> Do you have any recommended config file to use for VM workload?
>

Desktops and Servers VMs behave quite differently, so not really. But the
70/30 job is typically a good baseline.
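
A 70/30 mix can also be driven straight from the fio command line. What follows is only a minimal sketch, assuming 70/30 means 70% reads and 30% writes. The helper name run_vm_mix, the 4k block size, queue depth, job count, file size and runtime are illustrative assumptions, not values taken from the fio-tools repository. It uses direct I/O, small random blocks and several parallel jobs, which covers the three objections raised against the dd test.

```shell
#!/bin/sh
# Hypothetical helper (not part of fio-tools): random I/O with a
# 70% read / 30% write mix, bypassing the page cache.
# $1 = directory on the storage under test (e.g. the gluster mount point).
run_vm_mix() {
    fio --name=vm-70-30 \
        --directory="$1" \
        --rw=randrw --rwmixread=70 \
        --bs=4k --size=1g \
        --ioengine=libaio --direct=1 \
        --iodepth=16 --numjobs=4 \
        --runtime=60 --time_based \
        --group_reporting
}
# Example: run_vm_mix /path/to/mounted/datastore
```

Running the same helper against the native gluster mount and the NFS mount should give numbers that are easier to compare than single-threaded dd runs.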

>
>
>>>
>>>>
>>>> When testing on the gluster mount point using the above command I barely
>>>> get 10 MB/s. (At the same time the network traffic barely reaches 100 Mbit/s.)
>>>>
>>>> When testing outside of gluster (for example at /root) I get 600 -
>>>> 700 MB/s.
>>>>
>>>
>>> That's very fast - from 4 disks doing RAID5? Impressive (unless you use
>>> caching!). Are those HDDs or SSDs/NVMe?
>>>
>>>
>> These are SAS disks. But there is also a RAID controller with 1GB cache.
>
>
>>>> When I mount the gluster volume with NFS and test on it I get 90 - 100
>>>> MB/s (almost 10x the gluster results), which is the max I can get
>>>> considering I have only a 1 Gbit network for the storage.
>>>>
>>>> Also, when using glusterfs the general VM performance is very poor and
>>>> disk write benchmarks show that it is at least 4 times slower than when the
>>>> VM is hosted on the same data store mounted over NFS.
>>>>
>>>> I don't know why I am hitting such a significant performance penalty, and
>>>> every possible tweak that I was able to find out there did not make any
>>>> difference to the performance.
>>>>
>>>> The hardware I am using is pretty decent for the purposes intended:
>>>> 3 nodes, each node having 32 GB of RAM, 16 physical CPU cores, 2
>>>> TB of storage in RAID5 (4 disks), of which 1.5 TB are sliced for the data
>>>> store of ovirt where VMs are stored.
>>>>
>>>
>> I forgot to ask why are you using RAID 5 with 4 disks and not RAID 10?
>> Same usable capacity, higher performance, same protection and faster
>> recovery, I believe.
>>
> Correction: there are 5 disks of 600GB each. The main reason for going with
> RAID 5 was the capacity. With RAID 10 I can use only 4 of them and get only
> 1.1 TB usable; with RAID 5 I get 2.2 TB usable. I agree going with RAID 10
> (+ one additional drive to go with 6 drives) would be better but this is
> what I have now.
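
The capacity trade-off described above can be checked with simple arithmetic. This is a sketch that assumes equal-size disks and ignores controller and filesystem overhead; the function names are hypothetical. Results are in decimal GB: 2400 GB is roughly 2.2 TiB and 1200 GB roughly 1.1 TiB, which would match the quoted figures if those are reported in binary units.

```shell
#!/bin/sh
# $1 = number of disks, $2 = size of each disk in GB.
# RAID 5: one disk's worth of capacity goes to parity.
raid5_usable_gb()  { echo $(( ($1 - 1) * $2 )); }
# RAID 10: mirrored pairs, so half the disks hold usable data
# (needs an even disk count).
raid10_usable_gb() { echo $(( ($1 / 2) * $2 )); }

raid5_usable_gb 5 600    # -> 2400
raid10_usable_gb 4 600   # -> 1200
raid10_usable_gb 6 600   # -> 1800
```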
>
> Y.
>>
>>
>>> You have not mentioned your NIC speeds. Please ensure all work well,
>>> with 10g.
>>> Is the network dedicated for Gluster traffic? How are they connected?
>>>
>>>
>> I have mentioned that I have 1 Gbit dedicated for the storage. A
> different network is used for this and a dedicated 1Gbit switch. The
> throughput has been confirmed between all nodes with iperf.
>

Oh... With 1 Gbit/s, you can't get much more than 100 MB/s.
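
The arithmetic behind that ceiling, as a quick sketch (the function name is hypothetical; it uses integer division and decimal units, and ignores protocol overhead):

```shell
#!/bin/sh
# Convert a link rate in Mbit/s to MB/s (8 bits per byte, integer division).
mbit_to_mbyte() { echo $(( $1 / 8 )); }

mbit_to_mbyte 1000   # 1 Gbit/s link, theoretical ceiling -> 125
mbit_to_mbyte 800    # traffic seen with NFS on gluster   -> 100
mbit_to_mbyte 100    # traffic seen with native gluster   -> 12
```

These line up with the 90 - 100 MB/s and roughly 10 MB/s dd results reported earlier in the thread.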

> I know 10Gbit would be better, but when using native gluster in ovirt the
> network pipe was barely reaching 100Mbps, thus the bottleneck was gluster
> and not the network. If I can saturate 1Gbit and still have performance
> issues then I may consider going to 10Gbit. With NFS on top of gluster I see
> traffic reaching 800Mbit when testing with dd, which is much better.
>

Agreed. Do you see the bottleneck elsewhere? CPU?

>
>
>>>> The gluster configuration is the following:
>>>>
>>>
>>> Which version of Gluster are you using?
>>>
>>>
>> The version is  3.8.12
>

I think it's a very old release (near end of life?). I warmly suggest
3.10.x or 3.12.
There are performance improvements (AFAIR) in both.
Y.

>
>>>> Volume Name: vms
>>>> Type: Replicate
>>>> Volume ID: 4513340d-7919-498b-bfe0-d836b5cea40b
>>>> Status: Started
>>>> Snapshot Count: 0
>>>> Number of Bricks: 1 x (2 + 1) = 3
>>>> Transport-type: tcp
>>>> Bricks:
>>>> Brick1: gluster0:/gluster/vms/brick
>>>> Brick2: gluster1:/gluster/vms/brick
>>>> Brick3: gluster2:/gluster/vms/brick (arbiter)
>>>> Options Reconfigured:
>>>> nfs.export-volumes: on
>>>> nfs.disable: off
>>>> performance.readdir-ahead: on
>>>> transport.address-family: inet
>>>> performance.quick-read: off
>>>> performance.read-ahead: off
>>>> performance.io-cache: off
>>>> performance.stat-prefetch: on
>>>>
>>>
>>> I think this should be off.
>>>
>>>
>>>> performance.low-prio-threads: 32
>>>> network.remote-dio: off
>>>>
>>>
>>> I think this should be enabled.
>>>
>>>
>>>> cluster.eager-lock: off
>>>> cluster.quorum-type: auto
>>>> cluster.server-quorum-type: server
>>>> cluster.data-self-heal-algorithm: full
>>>> cluster.locking-scheme: granular
>>>> cluster.shd-max-threads: 8
>>>> cluster.shd-wait-qlength: 10000
>>>> features.shard: on
>>>> user.cifs: off
>>>> storage.owner-uid: 36
>>>> storage.owner-gid: 36
>>>> network.ping-timeout: 30
>>>> performance.strict-o-direct: on
>>>> cluster.granular-entry-heal: enable
>>>> features.shard-block-size: 64MB
>>>>
>>>
>>> I'm not sure if this should not be 512MB.  I don't remember the last
>>> resolution on this.
>>> Y.
>>>
>>>
>>>> performance.client-io-threads: on
>>>> client.event-threads: 4
>>>> server.event-threads: 4
>>>> performance.write-behind-window-size: 4MB
>>>> performance.cache-size: 1GB
>>>>
>>>> I have been playing with all of the above, with very little difference in
> the performance I was getting.
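
The two option changes suggested inline in the listing above (performance.stat-prefetch off, network.remote-dio enabled) could be applied with the gluster CLI roughly as follows. This is a sketch only: the helper name is hypothetical, "vms" is the volume name taken from the listing, and both values are tentative suggestions from this thread rather than confirmed fixes, so trying them on a non-production volume first seems prudent.

```shell
#!/bin/sh
# Hypothetical helper: apply the two option changes suggested in this thread.
# $1 = gluster volume name (the volume above is called "vms").
apply_suggested_options() {
    gluster volume set "$1" performance.stat-prefetch off &&
    gluster volume set "$1" network.remote-dio enable
}
# Example: apply_suggested_options vms
# Check the result afterwards with: gluster volume info vms
```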
>
> In case I can provide any other details let me know.
>>>>
>>>
>>> What is your tuned profile?
>>>
>>>
>> the tuned profile is virtual-host
>
> At the moment I already switched to gluster based NFS but I have a similar
>>>> setup with 2 nodes  where the data store is mounted through gluster (and
>>>> again relatively good hardware) where I might check any tweaks or
>>>> improvements on this setup.
>>>>
>>>> Thanx
>>>>
>>>>
>>>> On Wed, Sep 6, 2017 at 5:32 PM, Yaniv Kaul <ykaul at redhat.com> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Wed, Sep 6, 2017 at 3:32 PM, Abi Askushi <rightkicktech at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi All,
>>>>>>
>>>>>> I've been playing with an ovirt self-hosted engine setup and I even use it
>>>>>> in production for several VMs. The setup I have is 3 servers with gluster
>>>>>> storage in replica 2+1 (1 arbiter).
>>>>>> The data storage domain where VMs are stored is mounted with gluster
>>>>>> through ovirt. The performance I get for the VMs is very low and I was
>>>>>> thinking of switching to mount the same storage through NFS instead of
>>>>>> glusterfs.
>>>>>>
>>>>>
>>>>> I don't see how it'll improve performance.
>>>>> I suggest you share the gluster configuration (as well as the storage
>>>>> HW) so we can understand why the performance is low.
>>>>> Y.
>>>>>
>>>>>
>>>>>>
>>>>>> The only thing I am hesitant about is how I can ensure high availability of
>>>>>> the storage when I lose one server. I was thinking of having something
>>>>>> like the following in /etc/hosts:
>>>>>>
>>>>>> 10.100.100.1 nfsmount
>>>>>> 10.100.100.2 nfsmount
>>>>>> 10.100.100.3 nfsmount
>>>>>>
>>>>>> then use nfsmount as the server name when adding this domain through
>>>>>> ovirt GUI.
>>>>>> Are there any other more elegant solutions? What do you do for such
>>>>>> cases?
>>>>>> Note: gluster has the backup-volfile-servers mount option, which provides
>>>>>> a lean way to have redundancy on the mount point, and I am using this when
>>>>>> mounting with glusterfs.
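
For readers unfamiliar with it, the glusterfs mount helper takes this as the backup-volfile-servers mount option. A minimal sketch, using the host names and volume name from the volume listing in this thread; the helper name and mount point are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: mount a gluster volume with fallback volfile servers.
# $1 = mount point. If gluster0 is unreachable at mount time, the client
# fetches the volume file from gluster1 or gluster2 instead.
mount_vms_volume() {
    mount -t glusterfs \
        -o backup-volfile-servers=gluster1:gluster2 \
        gluster0:/vms "$1"
}
# Example: mount_vms_volume /mnt/vms
```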
>>>>>>
>>>>>> Thanx
>>>>>>
>>>>>> _______________________________________________
>>>>>> Users mailing list
>>>>>> Users at ovirt.org
>>>>>> http://lists.ovirt.org/mailman/listinfo/users
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>