[ovirt-users] Very poor GlusterFS performance

Darrell Budic budic at onholyground.com
Mon Jun 19 15:23:32 UTC 2017


Chris-

You probably need to head over to gluster-users at gluster.org for help with performance issues.

That said, what kind of performance are you getting, via some form of testing like bonnie++ or even dd runs? Comparing the raw bricks against the gluster volume is useful to determine what kind of performance you’re actually getting out of each layer.
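Something quick and dirty with dd would do for a first pass (paths here are just examples: one run goes to the SSD filesystem outside the brick directory, the other through a FUSE mount of the volume; /mnt/vmssd is only a placeholder mount point):

    # sequential write, direct I/O so the page cache doesn't hide the result;
    # /gluster/ssd0_vmssd is the brick's filesystem, writing outside the brick dir itself
    dd if=/dev/zero of=/gluster/ssd0_vmssd/ddtest bs=1M count=4096 oflag=direct
    # same test through a gluster FUSE mount of vmssd
    dd if=/dev/zero of=/mnt/vmssd/ddtest bs=1M count=4096 oflag=direct
    rm /gluster/ssd0_vmssd/ddtest /mnt/vmssd/ddtest

If the raw numbers are fine and the FUSE numbers fall off a cliff, that points at the gluster/network layer rather than the SSDs.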

Beyond that, I’d recommend dropping the arbiter bricks and re-adding them as full replicas; they can’t serve distributed data in this configuration and may be slowing things down on you. If you’ve got a storage network set up, make sure it’s using the largest MTU it can, and consider adding/testing these settings that I use on my main storage volume (rough example commands after the list):

performance.io-thread-count: 32
client.event-threads: 8
server.event-threads: 3
performance.stat-prefetch: on
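Roughly, that would look something like the following. This is only a sketch, so double-check the remove-brick/add-brick steps against the Gluster docs for your version; the volume name and brick paths come from your volume info below, and the interface name is just an example:

    # drop the two arbiter bricks (one per replica set)
    gluster volume remove-brick vmssd replica 2 \
        ovirt2:/gluster/ssd0_vmssd/brick ovirt2:/gluster/ssd1_vmssd/brick force

    # wipe or recreate the old brick directories first (stale xattrs/.glusterfs),
    # then add them back as full data replicas
    gluster volume add-brick vmssd replica 3 \
        ovirt2:/gluster/ssd0_vmssd/brick ovirt2:/gluster/ssd1_vmssd/brick

    # the tuning options above; run on any node, they apply cluster-wide
    gluster volume set vmssd performance.io-thread-count 32
    gluster volume set vmssd client.event-threads 8
    gluster volume set vmssd server.event-threads 3
    gluster volume set vmssd performance.stat-prefetch on

    # check the storage NIC's MTU, and bump it if your switches do jumbo frames end to end
    ip link show dev eth1 | grep -o 'mtu [0-9]*'
    # ip link set dev eth1 mtu 9000

Give the self-heal time to finish after the add-brick before judging performance again.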

Good luck,

  -Darrell


> On Jun 19, 2017, at 9:46 AM, Chris Boot <bootc at bootc.net> wrote:
> 
> Hi folks,
> 
> I have 3x servers in a "hyper-converged" oVirt 4.1.2 + GlusterFS 3.10
> configuration. My VMs run off a replica 3 arbiter 1 volume comprised of
> 6 bricks, which themselves live on two SSDs in each of the servers (one
> brick per SSD). The bricks are XFS on LVM thin volumes straight onto the
> SSDs. Connectivity is 10G Ethernet.
> 
> Performance within the VMs is pretty terrible. I experience very low
> throughput and random IO is really bad: it feels like a latency issue.
> On my oVirt nodes the SSDs are not generally very busy. The 10G network
> seems to run without errors (iperf3 gives bandwidth measurements of >=
> 9.20 Gbits/sec between the three servers).
> 
> To put this into perspective: I was getting better behaviour from NFS4
> on a gigabit connection than I am with GlusterFS on 10G: that doesn't
> feel right at all.
> 
> My volume configuration looks like this:
> 
> Volume Name: vmssd
> Type: Distributed-Replicate
> Volume ID: d5a5ddd1-a140-4e0d-b514-701cfe464853
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 2 x (2 + 1) = 6
> Transport-type: tcp
> Bricks:
> Brick1: ovirt3:/gluster/ssd0_vmssd/brick
> Brick2: ovirt1:/gluster/ssd0_vmssd/brick
> Brick3: ovirt2:/gluster/ssd0_vmssd/brick (arbiter)
> Brick4: ovirt3:/gluster/ssd1_vmssd/brick
> Brick5: ovirt1:/gluster/ssd1_vmssd/brick
> Brick6: ovirt2:/gluster/ssd1_vmssd/brick (arbiter)
> Options Reconfigured:
> nfs.disable: on
> transport.address-family: inet6
> performance.quick-read: off
> performance.read-ahead: off
> performance.io-cache: off
> performance.stat-prefetch: off
> performance.low-prio-threads: 32
> network.remote-dio: off
> cluster.eager-lock: enable
> cluster.quorum-type: auto
> cluster.server-quorum-type: server
> cluster.data-self-heal-algorithm: full
> cluster.locking-scheme: granular
> cluster.shd-max-threads: 8
> cluster.shd-wait-qlength: 10000
> features.shard: on
> user.cifs: off
> storage.owner-uid: 36
> storage.owner-gid: 36
> features.shard-block-size: 128MB
> performance.strict-o-direct: on
> network.ping-timeout: 30
> cluster.granular-entry-heal: enable
> 
> I would really appreciate some guidance on this to try to improve things
> because at this rate I will need to reconsider using GlusterFS altogether.
> 
> Cheers,
> Chris
> 
> -- 
> Chris Boot
> bootc at bootc.net
> _______________________________________________
> Users mailing list
> Users at ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
