
Hi All,

I'm experiencing huge issues when working with big VMs on Gluster volumes. Doing a snapshot or removing a big disk leads to the SPM node becoming unresponsive. Fencing then kicks in and takes the node down with a hard reset/reboot.

My setup has three nodes with 10 Gbit/s NICs for the Gluster network. The bricks are on RAID-6 with 1 GB of cache on the RAID controller, and the volumes are set up as follows:

Volume Name: data
Type: Replicate
Volume ID: c734d678-91e3-449c-8a24-d26b73bef965
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: ovirt-node01-gfs.storage.lan:/gluster/brick2/data
Brick2: ovirt-node02-gfs.storage.lan:/gluster/brick2/data
Brick3: ovirt-node03-gfs.storage.lan:/gluster/brick2/data
Options Reconfigured:
features.barrier: disable
cluster.granular-entry-heal: enable
performance.readdir-ahead: on
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: on
cluster.eager-lock: enable
network.remote-dio: off
cluster.quorum-type: auto
cluster.server-quorum-type: server
storage.owner-uid: 36
storage.owner-gid: 36
features.shard: on
features.shard-block-size: 512MB
performance.low-prio-threads: 32
cluster.data-self-heal-algorithm: full
cluster.locking-scheme: granular
cluster.shd-wait-qlength: 10000
cluster.shd-max-threads: 6
network.ping-timeout: 30
user.cifs: off
nfs.disable: on
performance.strict-o-direct: on
server.event-threads: 4
client.event-threads: 4

It feels like the system locks up during the snapshot or the removal of a big disk, and this delay triggers things to go wrong. Is there anything that is not set up right on my Gluster, or is this behavior normal with bigger disks (50 GB+)? Is there a reliable option for caching with SSDs?
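In case it is useful for reproducing this, here is roughly how the listing above was produced and how the options were applied; the exact invocations below are illustrative, not a verbatim transcript of what I ran:

    # prints the layout and the "Options Reconfigured" list shown above
    gluster volume info data
    # also shows options that are still at their defaults
    gluster volume get data all
    # each option was set individually in this form, for example:
    gluster volume set data features.shard on
    gluster volume set data features.shard-block-size 512MB
    gluster volume set data network.ping-timeout 30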
Thank you,
Sven