[ovirt-users] oVirt self-hosted with NFS on top of gluster

Thu Sep 7 09:52:46 UTC 2017

On Thu, Sep 7, 2017 at 10:30 AM, Yaniv Kaul <ykaul at redhat.com> wrote:

>
>
> On Thu, Sep 7, 2017 at 10:06 AM, Yaniv Kaul <ykaul at redhat.com> wrote:
>
>>
>>
>> On Wed, Sep 6, 2017 at 6:08 PM, Abi Askushi <rightkicktech at gmail.com>
>> wrote:
>>
>>> For a first idea I use:
>>>
>>> dd if=/dev/zero of=testfile bs=1GB count=1
>>>
>>
>> This is not a valid way to test performance, for several reasons:
>> 1. You are not using oflag=direct, so you are not using direct I/O and the
>> writes go through the cache.
>> 2. It's unrealistic: it is very uncommon to write large blocks of zeros
>> (it sometimes happens during FS creation or wiping), and certainly not 1GB.
>> 3. It is a single thread of I/O, which again is unrealistic for VM I/O.
>>
I forgot to mention that I include oflag=direct in my tests. I agree,
though, that dd is not the right way to test, which is why I said I only use
it to get a first feel. Further tests are run inside the VM, benchmarking its
disk I/O with tools like IOmeter.

>> I suggest using fio and such. See https://github.com/pcuzner/fio-tools for
>> example.
>>
Do you have a recommended config file to use for a VM workload?
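
For reference, the three objections to dd above (no direct I/O, large blocks
of zeros, a single I/O thread) could map onto fio options roughly as in the
sketch below. This is not a recommended VM profile from this thread: the job
name, target path, block size, read/write mix, queue depth, and job count are
all assumptions chosen for illustration. The script only prints the command
unless RUN=1 is set in the environment.

```shell
#!/bin/sh
# Dry-run helper: print the command unless RUN=1 is set in the environment.
run() {
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

# vm_fio FILE: hypothetical helper, mixed random read/write test against FILE.
#   --direct=1            O_DIRECT, the fio counterpart of dd's oflag=direct
#   --rw=randrw --bs=4k   small random I/O instead of one large sequential write
#   --numjobs/--iodepth   several concurrent I/O streams instead of one thread
# All values below are assumptions, not tuned for any particular workload.
vm_fio() {
    run fio --name=vm-sketch --filename="$1" --size=1g \
        --direct=1 --ioengine=libaio \
        --rw=randrw --rwmixread=70 --bs=4k \
        --iodepth=16 --numjobs=4 \
        --runtime=60 --time_based --group_reporting
}

# /mnt/vms/fio-test.dat is a placeholder for a file on the mount under test.
vm_fio /mnt/vms/fio-test.dat
```

Running the same job once against the glusterfs mount and once against the
NFS mount would give numbers that are easier to compare than a single dd run.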

>>
>>>
>>> When testing on the gluster mount point with the above command I barely
>>> get 10MB/s. (At the same time the network traffic barely reaches 100Mbit.)
>>>
>>> When testing outside of gluster (for example at /root) I get 600 -
>>> 700MB/s.
>>>
>>
>> That's very fast - from 4 disks doing RAID5? Impressive (unless you use
>> caching!). Are those HDDs or SSDs/NVMe?
>>
>>
These are SAS disks, but there is also a RAID controller with a 1GB cache.

>>> When I mount the gluster volume with NFS and test on it I get 90 - 100
>>> MB/s (almost 10x the gluster result), which is the maximum I can get
>>> considering I have only a 1 Gbit network for the storage.
>>>
>>> Also, when using glusterfs the general VM performance is very poor, and
>>> disk write benchmarks show that it is at least 4 times slower than when the
>>> VM is hosted on the same data store mounted over NFS.
>>>
>>> I don't know why I am hitting such a significant performance penalty, and
>>> none of the tweaks I was able to find made any difference to the
>>> performance.
>>>
>>> The hardware I am using is pretty decent for the intended purpose:
>>> 3 nodes, each with 32 MB of RAM, 16 physical CPU cores, and 2 TB of
>>> storage in RAID5 (4 disks), of which 1.5 TB is allocated to the ovirt
>>> data store where the VMs are stored.
>>>
>>
> I forgot to ask: why are you using RAID 5 with 4 disks and not RAID 10?
> Same usable capacity, higher performance, same protection and faster
> recovery, I believe.
>
Correction: there are 5 disks of 600GB each. The main reason for going with
RAID 5 was capacity. With RAID 10 I can use only 4 of them and get only
1.1 TB usable; with RAID 5 I get 2.2 TB usable. I agree that RAID 10 (with
one additional drive, for 6 drives in total) would be better, but this is
what I have now.
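
As a rough check of those capacity figures, here is a minimal sketch of the
usual usable-capacity arithmetic. The function names are made up for
illustration, and reading the 2.2 TB and 1.1 TB figures as binary units (TiB)
is an assumption; controller overhead is ignored.

```shell
#!/bin/sh
# Usable capacity in decimal GB for N disks of SIZE GB each.
raid5_gb()  { echo $(( ($1 - 1) * $2 )); }   # one disk's worth of parity
raid10_gb() { echo $(( ($1 / 2) * $2 )); }   # mirrored pairs, even disk count

# Convert decimal GB to binary TiB, rounded to one decimal place.
gb_to_tib() { awk -v gb="$1" 'BEGIN { printf "%.1f\n", gb * 1e9 / (1024 ^ 4) }'; }

echo "RAID 5,  5 x 600GB: $(raid5_gb 5 600) GB = $(gb_to_tib "$(raid5_gb 5 600)") TiB"
echo "RAID 10, 4 x 600GB: $(raid10_gb 4 600) GB = $(gb_to_tib "$(raid10_gb 4 600)") TiB"
echo "RAID 10, 6 x 600GB: $(raid10_gb 6 600) GB = $(gb_to_tib "$(raid10_gb 6 600)") TiB"
```

This gives 2400 GB (2.2 TiB) for RAID 5 on 5 disks, 1200 GB (1.1 TiB) for
RAID 10 on 4 disks, and 1800 GB (1.6 TiB) for RAID 10 on 6 disks.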

> Y.
>
>
>> You have not mentioned your NIC speeds. Please ensure all work well, with
>> 10g.
>> Is the network dedicated for Gluster traffic? How are they connected?
>>
>>
I did mention that I have 1 Gbit dedicated to the storage. It runs on a
separate network with a dedicated 1Gbit switch, and the throughput between
all nodes has been confirmed with iperf.
I know 10Gbit would be better, but with native gluster in ovirt the network
traffic barely reached 100Mbps, so the bottleneck was gluster and not the
network. If I can saturate 1Gbit and still have performance issues, then I
may consider going to 10Gbit. With NFS on top of gluster I see traffic
reaching 800Mbit when testing with dd, which is much better.
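
To put the network and dd figures in the same units, here is a small
conversion sketch. It assumes decimal units (1 MB/s = 8 Mbit/s) and ignores
protocol overhead; the helper name is made up for illustration.

```shell
#!/bin/sh
# Convert a link rate in Mbit/s to MB/s (decimal units, no protocol overhead).
mbit_to_mbyte() { awk -v m="$1" 'BEGIN { printf "%.1f\n", m / 8 }'; }

for rate in 100 800 1000; do
    echo "${rate} Mbit/s = $(mbit_to_mbyte "$rate") MB/s"
done
```

100 Mbit/s works out to 12.5 MB/s, close to the roughly 10MB/s dd result on
the native gluster mount, and 800 Mbit/s works out to 100 MB/s, in line with
the 90 - 100 MB/s seen over NFS. A 1 Gbit link tops out at 125 MB/s.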

>>> The gluster configuration is the following:
>>>
>>
>> Which version of Gluster are you using?
>>
>>
The version is 3.8.12.

>
>>> Volume Name: vms
>>> Type: Replicate
>>> Volume ID: 4513340d-7919-498b-bfe0-d836b5cea40b
>>> Status: Started
>>> Snapshot Count: 0
>>> Number of Bricks: 1 x (2 + 1) = 3
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: gluster0:/gluster/vms/brick
>>> Brick2: gluster1:/gluster/vms/brick
>>> Brick3: gluster2:/gluster/vms/brick (arbiter)
>>> Options Reconfigured:
>>> nfs.export-volumes: on
>>> nfs.disable: off
>>> performance.readdir-ahead: on
>>> transport.address-family: inet
>>> performance.quick-read: off
>>> performance.read-ahead: off
>>> performance.io-cache: off
>>> performance.stat-prefetch: on
>>>
>>
>> I think this should be off.
>>
>>
>>> performance.low-prio-threads: 32
>>> network.remote-dio: off
>>>
>>
>> I think this should be enabled.
>>
>>
>>> cluster.eager-lock: off
>>> cluster.quorum-type: auto
>>> cluster.server-quorum-type: server
>>> cluster.data-self-heal-algorithm: full
>>> cluster.locking-scheme: granular
>>> cluster.shd-max-threads: 8
>>> cluster.shd-wait-qlength: 10000
>>> features.shard: on
>>> user.cifs: off
>>> storage.owner-uid: 36
>>> storage.owner-gid: 36
>>> network.ping-timeout: 30
>>> performance.strict-o-direct: on
>>> cluster.granular-entry-heal: enable
>>> features.shard-block-size: 64MB
>>>
>>
>> I'm not sure whether this should be 512MB instead. I don't remember the
>> last resolution on this.
>> Y.
>>
>>
>>> performance.client-io-threads: on
>>> client.event-threads: 4
>>> server.event-threads: 4
>>> performance.write-behind-window-size: 4MB
>>> performance.cache-size: 1GB
>>>
I have been playing with all of the above, with very little difference in
the performance I was getting.
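
For completeness, the two option changes suggested inline above
(performance.stat-prefetch off, network.remote-dio enabled) would be applied
with gluster volume set, roughly as sketched here. The wrapper function and
the dry-run switch are illustrative assumptions; the volume name vms comes
from the listing above. The script only prints the commands unless RUN=1 is
set, and whether these changes help here is untested.

```shell
#!/bin/sh
# Dry-run helper: print the command unless RUN=1 is set in the environment.
run() {
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

# apply_suggestions VOLUME: hypothetical wrapper for the two suggested changes.
apply_suggestions() {
    run gluster volume set "$1" performance.stat-prefetch off
    run gluster volume set "$1" network.remote-dio enable
    # Show the volume afterwards to confirm the reconfigured options.
    run gluster volume info "$1"
}

apply_suggestions vms
```

Changing one option at a time and re-running the same benchmark after each
change makes it easier to tell which setting, if any, has an effect.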

>>> In case I can provide any other details let me know.
>>>
>>
>> What is your tuned profile?
>>
>>
The tuned profile is virtual-host.

>>> At the moment I have already switched to gluster-based NFS, but I have a
>>> similar setup with 2 nodes where the data store is mounted through gluster
>>> (again on relatively good hardware), where I can test any tweaks or
>>> improvements.
>>>
>>> Thanx
>>>
>>>
>>> On Wed, Sep 6, 2017 at 5:32 PM, Yaniv Kaul <ykaul at redhat.com> wrote:
>>>
>>>>
>>>>
>>>> On Wed, Sep 6, 2017 at 3:32 PM, Abi Askushi <rightkicktech at gmail.com>
>>>> wrote:
>>>>
>>>>> Hi All,
>>>>>
>>>>> I've been playing with an ovirt self-hosted engine setup, and I even use
>>>>> it in production for several VMs. The setup I have is 3 servers with
>>>>> gluster storage in replica 2+1 (1 arbiter).
>>>>> The data storage domain where the VMs are stored is mounted with gluster
>>>>> through ovirt. The performance I get for the VMs is very low, and I was
>>>>> thinking of switching to mounting the same storage through NFS instead of
>>>>> glusterfs.
>>>>>
>>>>
>>>> I don't see how it'll improve performance.
>>>> I suggest you share the gluster configuration (as well as the storage
>>>> HW) so we can understand why the performance is low.
>>>> Y.
>>>>
>>>>
>>>>>
>>>>> The only thing I am hesitant about is how to ensure high availability
>>>>> of the storage when I lose one server. I was thinking of having
>>>>> something like the following in /etc/hosts:
>>>>>
>>>>> 10.100.100.1 nfsmount
>>>>> 10.100.100.2 nfsmount
>>>>> 10.100.100.3 nfsmount
>>>>>
>>>>> and then using nfsmount as the server name when adding this domain
>>>>> through the ovirt GUI.
>>>>> Are there any more elegant solutions? What do you do in such cases?
>>>>> Note: gluster has the back-vol-file option, which provides a lean way
>>>>> to have redundancy on the mount point, and I am using this when mounting
>>>>> with glusterfs.
>>>>>
>>>>> Thanx
>>>>>
>>>>> _______________________________________________
>>>>> Users mailing list
>>>>> Users at ovirt.org
>>>>> http://lists.ovirt.org/mailman/listinfo/users
>>>>>
>>>>>
>>>>
>>>
>>
>