Couple of things:

1. Like Darrell suggested, you should enable stat-prefetch and increase client and server event threads to 4.
# gluster volume set <VOL> performance.stat-prefetch on
# gluster volume set <VOL> client.event-threads 4
# gluster volume set <VOL> server.event-threads 4

2. Also, glusterfs-3.10.1 and above include a shard performance bug fix: https://review.gluster.org/#/c/16966/
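
One way to check item 2 on each node is to compare the installed version against 3.10.1. This is only a sketch: version_ge is a helper made up for this example, it relies on GNU sort -V, and it assumes the first line of `glusterfs --version` looks like "glusterfs 3.10.1". Distribution packages sometimes backport fixes, so treat the result as a hint rather than proof.

```shell
#!/bin/sh
# Sketch: is the installed GlusterFS at least 3.10.1?
# version_ge is a hypothetical helper, not part of the gluster CLI.

# Succeeds when version $1 >= version $2 (needs GNU sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Assumption: the first line of `glusterfs --version` is "glusterfs <version> ...".
if command -v glusterfs >/dev/null 2>&1; then
    installed=$(glusterfs --version | awk 'NR==1 {print $2}')
    if version_ge "$installed" 3.10.1; then
        echo "glusterfs $installed: 3.10.1 or newer"
    else
        echo "glusterfs $installed: older than 3.10.1, consider upgrading"
    fi
fi
```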

With these two changes, we saw a large improvement in performance in our internal testing.

Would you mind trying the two changes above?
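
After setting the options in item 1, it may help to read them back and confirm they took effect. A minimal sketch, assuming the volume is the vmssd one from this thread; VOL, APPLY and the run helper are names invented for this example. By default it only prints the commands, and it runs them when APPLY=1.

```shell
#!/bin/sh
# Sketch: read back the three options from item 1.
# VOL, APPLY and run are illustrative names, not gluster features.
VOL="${VOL:-vmssd}"

# Print the command by default; execute it only when APPLY=1.
run() {
    if [ "${APPLY:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

for opt in performance.stat-prefetch client.event-threads server.event-threads; do
    run gluster volume get "$VOL" "$opt"
done
```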

-Krutika

On Tue, Jun 20, 2017 at 1:00 PM, Lindsay Mathieson <lindsay.mathieson@gmail.com> wrote:
Have you tried with:

performance.strict-o-direct : off
performance.strict-write-ordering : off

They can be changed dynamically.
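
For reference, the two settings above could be applied roughly like this. It is a sketch, not a tested recipe: VOL, APPLY and the run helper are made up for the example, and vmssd is the volume name from the original post. Without APPLY=1 it only prints the commands. Setting either option back to "on" the same way reverts the change.

```shell
#!/bin/sh
# Sketch: turn both options off on a live volume.
# VOL, APPLY and run are illustrative names, not gluster features.
VOL="${VOL:-vmssd}"

# Print the command by default; execute it only when APPLY=1.
run() {
    if [ "${APPLY:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

run gluster volume set "$VOL" performance.strict-o-direct off
run gluster volume set "$VOL" performance.strict-write-ordering off
```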


On 20 June 2017 at 17:21, Sahina Bose <sabose@redhat.com> wrote:
[Adding gluster-users]

On Mon, Jun 19, 2017 at 8:16 PM, Chris Boot <bootc@bootc.net> wrote:
Hi folks,

I have 3x servers in a "hyper-converged" oVirt 4.1.2 + GlusterFS 3.10
configuration. My VMs run off a replica 3 arbiter 1 volume comprised of
6 bricks, which themselves live on two SSDs in each of the servers (one
brick per SSD). The bricks are XFS on LVM thin volumes straight onto the
SSDs. Connectivity is 10G Ethernet.

Performance within the VMs is pretty terrible. I experience very low
throughput and random IO is really bad: it feels like a latency issue.
On my oVirt nodes the SSDs are not generally very busy. The 10G network
seems to run without errors (iperf3 gives bandwidth measurements of >=
9.20 Gbits/sec between the three servers).

To put this into perspective: I was getting better behaviour from NFS4
on a gigabit connection than I am with GlusterFS on 10G: that doesn't
feel right at all.

My volume configuration looks like this:

Volume Name: vmssd
Type: Distributed-Replicate
Volume ID: d5a5ddd1-a140-4e0d-b514-701cfe464853
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x (2 + 1) = 6
Transport-type: tcp
Bricks:
Brick1: ovirt3:/gluster/ssd0_vmssd/brick
Brick2: ovirt1:/gluster/ssd0_vmssd/brick
Brick3: ovirt2:/gluster/ssd0_vmssd/brick (arbiter)
Brick4: ovirt3:/gluster/ssd1_vmssd/brick
Brick5: ovirt1:/gluster/ssd1_vmssd/brick
Brick6: ovirt2:/gluster/ssd1_vmssd/brick (arbiter)
Options Reconfigured:
nfs.disable: on
transport.address-family: inet6
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
performance.low-prio-threads: 32
network.remote-dio: off
cluster.eager-lock: enable
cluster.quorum-type: auto
cluster.server-quorum-type: server
cluster.data-self-heal-algorithm: full
cluster.locking-scheme: granular
cluster.shd-max-threads: 8
cluster.shd-wait-qlength: 10000
features.shard: on
user.cifs: off
storage.owner-uid: 36
storage.owner-gid: 36
features.shard-block-size: 128MB
performance.strict-o-direct: on
network.ping-timeout: 30
cluster.granular-entry-heal: enable

I would really appreciate some guidance on this to try to improve things
because at this rate I will need to reconsider using GlusterFS altogether.


Could you provide the gluster volume profile output while you're running your I/O tests?

To start profiling:
# gluster volume profile <volname> start

For the profile output:
# gluster volume profile <volname> info
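
Put together, a profiling run might look like the sketch below. VOL, DURATION, APPLY and the run helper are assumptions made for this example, and the sleep is just a placeholder for however long the I/O test inside the VM takes. Without APPLY=1 the script only prints the commands. The final stop step turns profiling off again once the output has been collected.

```shell
#!/bin/sh
# Sketch: profile the volume while an I/O test runs.
# VOL, DURATION, APPLY and run are illustrative names, not gluster features.
VOL="${VOL:-vmssd}"
DURATION="${DURATION:-60}"

# Print the command by default; execute it only when APPLY=1.
run() {
    if [ "${APPLY:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

run gluster volume profile "$VOL" start
# Placeholder: run the I/O test in a VM during this window.
run sleep "$DURATION"
run gluster volume profile "$VOL" info
run gluster volume profile "$VOL" stop
```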
 

Cheers,
Chris

--
Chris Boot
bootc@bootc.net
_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


_______________________________________________
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users



--
Lindsay
