Chris-

You probably need to head over to gluster-users@gluster.org for help
with performance issues.
That said, what kind of performance are you getting, via some form of
testing like bonnie++ or even dd runs? Comparing raw-brick performance
against performance through the gluster mount is useful to determine
what you're actually getting.
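For example, a rough dd comparison might look like this (a sketch only;
the brick-filesystem path is taken from your volume info below, and the
mount point is hypothetical):

  # write 4 GB to the brick filesystem, bypassing the page cache
  dd if=/dev/zero of=/gluster/ssd0_vmssd/ddtest bs=1M count=4096 oflag=direct
  # repeat through a native gluster mount of the volume
  # (if direct I/O is refused on the FUSE mount, use conv=fsync instead)
  mkdir -p /mnt/vmssd-test
  mount -t glusterfs ovirt1:/vmssd /mnt/vmssd-test
  dd if=/dev/zero of=/mnt/vmssd-test/ddtest bs=1M count=4096 oflag=direct
  # remember to remove the test files afterwards

If the brick numbers are healthy but the gluster numbers collapse, the
bottleneck is in the gluster/network layer rather than the SSDs.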
Beyond that, I'd recommend dropping the arbiter bricks and re-adding
them as full replicas; they can't serve distributed data in this
configuration and may be slowing things down on you. If you've got a
storage network set up, make sure it's using the largest MTU it can,
and consider adding/testing these settings that I use on my main
storage volume (example commands to apply them follow the list):
performance.io-thread-count: 32
client.event-threads: 8
server.event-threads: 3
performance.stat-prefetch: on
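Applying them is just "gluster volume set" per option; a sketch for your
volume (check "gluster volume set help" for what your 3.10 build
accepts, and note the NIC name here is only an example):

  gluster volume set vmssd performance.io-thread-count 32
  gluster volume set vmssd client.event-threads 8
  gluster volume set vmssd server.event-threads 3
  gluster volume set vmssd performance.stat-prefetch on

  # jumbo frames on the storage interface, if the switches support them
  ip link set dev eth1 mtu 9000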
Good luck,
-Darrell
On Jun 19, 2017, at 9:46 AM, Chris Boot <bootc(a)bootc.net> wrote:

Hi folks,

I have 3x servers in a "hyper-converged" oVirt 4.1.2 + GlusterFS 3.10
configuration. My VMs run off a replica 3 arbiter 1 volume comprised of
6 bricks, which themselves live on two SSDs in each of the servers (one
brick per SSD). The bricks are XFS on LVM thin volumes straight onto the
SSDs. Connectivity is 10G Ethernet.

Performance within the VMs is pretty terrible. I experience very low
throughput and random IO is really bad: it feels like a latency issue.
On my oVirt nodes the SSDs are not generally very busy. The 10G network
seems to run without errors (iperf3 gives bandwidth measurements of
>= 9.20 Gbits/sec between the three servers).

To put this into perspective: I was getting better behaviour from NFS4
on a gigabit connection than I am with GlusterFS on 10G: that doesn't
feel right at all.

My volume configuration looks like this:

Volume Name: vmssd
Type: Distributed-Replicate
Volume ID: d5a5ddd1-a140-4e0d-b514-701cfe464853
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x (2 + 1) = 6
Transport-type: tcp
Bricks:
Brick1: ovirt3:/gluster/ssd0_vmssd/brick
Brick2: ovirt1:/gluster/ssd0_vmssd/brick
Brick3: ovirt2:/gluster/ssd0_vmssd/brick (arbiter)
Brick4: ovirt3:/gluster/ssd1_vmssd/brick
Brick5: ovirt1:/gluster/ssd1_vmssd/brick
Brick6: ovirt2:/gluster/ssd1_vmssd/brick (arbiter)
Options Reconfigured:
nfs.disable: on
transport.address-family: inet6
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
performance.low-prio-threads: 32
network.remote-dio: off
cluster.eager-lock: enable
cluster.quorum-type: auto
cluster.server-quorum-type: server
cluster.data-self-heal-algorithm: full
cluster.locking-scheme: granular
cluster.shd-max-threads: 8
cluster.shd-wait-qlength: 10000
features.shard: on
user.cifs: off
storage.owner-uid: 36
storage.owner-gid: 36
features.shard-block-size: 128MB
performance.strict-o-direct: on
network.ping-timeout: 30
cluster.granular-entry-heal: enable

I would really appreciate some guidance on this to try to improve things
because at this rate I will need to reconsider using GlusterFS altogether.

Cheers,
Chris

-- 
Chris Boot
bootc(a)bootc.net
_______________________________________________
Users mailing list
Users(a)ovirt.org
http://lists.ovirt.org/mailman/listinfo/users