
Hi,
I use 3 nodes with ZFS and GlusterFS. Are there any suggestions to optimize it?

Host ZFS config, 4TB HDD + 250GB SSD:

[root@clei22 ~]# zpool status
  pool: zclei22
 state: ONLINE
  scan: scrub repaired 0 in 0h0m with 0 errors on Tue Feb 28 14:16:07 2017
config:

        NAME                                    STATE     READ WRITE CKSUM
        zclei22                                 ONLINE       0     0     0
          HGST_HUS724040ALA640_PN2334PBJ4SV6T1  ONLINE       0     0     0
        logs
          lv_slog                               ONLINE       0     0     0
        cache
          lv_cache                              ONLINE       0     0     0

errors: No known data errors

Name:                     GluReplica
Volume ID:                ee686dfe-203a-4caa-a691-26353460cc48
Volume Type:              Replicate (Arbiter)
Replica Count:            2 + 1
Number of Bricks:         3
Transport Types:          TCP, RDMA
Maximum no of snapshots:  256
Capacity:                 3.51 TiB total, 190.56 GiB used, 3.33 TiB free

Am I understanding correctly that you have Gluster on top of ZFS, which is in turn on top of LVM? If so, why was LVM necessary? I have ZFS without any need of LVM.

Fernando

On 02/03/2017 06:19, Arman Khalatyan wrote:

No, ZFS itself is not on top of LVM. Only the SSD was split by LVM into a slog (10G) and a cache (the rest). But in any case the SSD does not help much under the GlusterFS/oVirt load; it has almost 100% cache misses :( (terrible performance compared with NFS).

On Thu, Mar 2, 2017 at 1:47 PM, FERNANDO FREDIANI <fernando.frediani@upx.com> wrote:

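For reference, a minimal sketch of how an SSD can be split with LVM and attached to an existing pool in the way described above. The commands are a reconstruction, not taken from the thread; the device, VG and LV names follow the ones shown later in the thread and should be treated as placeholders:

pvcreate /dev/sda                            # the SSD
vgcreate vg_cache /dev/sda
lvcreate -L 10G -n lv_slog vg_cache          # 10G for the separate intent log (SLOG)
lvcreate -l 100%FREE -n lv_cache vg_cache    # the rest for L2ARC
zpool add zclei22 log /dev/vg_cache/lv_slog
zpool add zclei22 cache /dev/vg_cache/lv_cache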
Hey, what are you using for ZFS? Get an ARC status and post it, please.

2017-03-02 9:57 GMT-03:00 Arman Khalatyan <arm2arm@gmail.com>:

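Two ways to pull the ARC statistics requested above, assuming ZFS on Linux and its /proc kstat interface (a sketch, not taken from the thread):

# raw hit/miss counters for the ARC and the L2ARC cache device
awk '$1 ~ /^(hits|misses|l2_hits|l2_misses)$/ {print $1, $3}' /proc/spl/kstat/zfs/arcstats

# or sample once per second with the bundled tool, as done in the next message
arcstat.py 1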
GlusterFS is now in healing mode.

Receiver:
[root@clei21 ~]# arcstat.py 1
    time  read  miss  miss%  dmis  dm%  pmis  pm%  mmis  mm%  arcsz     c
13:24:49     0     0      0     0    0     0    0     0    0   4.6G   31G
13:24:50   154    80     51    80   51     0    0    80   51   4.6G   31G
13:24:51   179    62     34    62   34     0    0    62   42   4.6G   31G
13:24:52   148    68     45    68   45     0    0    68   45   4.6G   31G
13:24:53   140    64     45    64   45     0    0    64   45   4.6G   31G
13:24:54   124    48     38    48   38     0    0    48   38   4.6G   31G
13:24:55   157    80     50    80   50     0    0    80   50   4.7G   31G
13:24:56   202    68     33    68   33     0    0    68   41   4.7G   31G
13:24:57   127    54     42    54   42     0    0    54   42   4.7G   31G
13:24:58   126    50     39    50   39     0    0    50   39   4.7G   31G
13:24:59   116    40     34    40   34     0    0    40   34   4.7G   31G

Sender:
[root@clei22 ~]# arcstat.py 1
    time  read  miss  miss%  dmis  dm%  pmis  pm%  mmis  mm%  arcsz     c
13:28:37     8     2     25     2   25     0    0     2   25   468M   31G
13:28:38  1.2K   727     62   727   62     0    0   525   54   469M   31G
13:28:39   815   508     62   508   62     0    0   376   55   469M   31G
13:28:40   994   624     62   624   62     0    0   450   54   469M   31G
13:28:41   783   456     58   456   58     0    0   338   50   470M   31G
13:28:42   916   541     59   541   59     0    0   390   50   470M   31G
13:28:43   768   437     56   437   57     0    0   313   48   471M   31G
13:28:44   877   534     60   534   60     0    0   393   53   470M   31G
13:28:45   957   630     65   630   65     0    0   450   57   470M   31G
13:28:46   819   479     58   479   58     0    0   357   51   471M   31G

On Thu, Mar 2, 2017 at 7:18 PM, Juan Pablo <pablo.localhost@gmail.com> wrote:

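Since the open question is whether the SSD cache device is being hit at all, arcstat.py can also be asked for the L2ARC columns. A sketch, assuming the installed arcstat.py supports field selection with -f and these field names:

arcstat.py -f time,read,miss%,l2read,l2hit%,arcsz,c 1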
Pool load:

[root@clei21 ~]# zpool iostat -v 1
                                          capacity     operations    bandwidth
pool                                    alloc   free   read  write   read  write
--------------------------------------  -----  -----  -----  -----  -----  -----
zclei21                                 10.1G  3.62T      0    112    823  8.82M
  HGST_HUS724040ALA640_PN2334PBJ52XWT1  10.1G  3.62T      0     46    626  4.40M
logs                                        -      -      -      -      -      -
  lv_slog                                225M  9.72G      0     66    198  4.45M
cache                                       -      -      -      -      -      -
  lv_cache                              9.81G   204G      0     46     56  4.13M
--------------------------------------  -----  -----  -----  -----  -----  -----

                                          capacity     operations    bandwidth
pool                                    alloc   free   read  write   read  write
--------------------------------------  -----  -----  -----  -----  -----  -----
zclei21                                 10.1G  3.62T      0    191      0  12.8M
  HGST_HUS724040ALA640_PN2334PBJ52XWT1  10.1G  3.62T      0      0      0      0
logs                                        -      -      -      -      -      -
  lv_slog                                225M  9.72G      0    191      0  12.8M
cache                                       -      -      -      -      -      -
  lv_cache                              9.83G   204G      0    218      0  20.0M
--------------------------------------  -----  -----  -----  -----  -----  -----

                                          capacity     operations    bandwidth
pool                                    alloc   free   read  write   read  write
--------------------------------------  -----  -----  -----  -----  -----  -----
zclei21                                 10.1G  3.62T      0    191      0  12.7M
  HGST_HUS724040ALA640_PN2334PBJ52XWT1  10.1G  3.62T      0      0      0      0
logs                                        -      -      -      -      -      -
  lv_slog                                225M  9.72G      0    191      0  12.7M
cache                                       -      -      -      -      -      -
  lv_cache                              9.83G   204G      0     72      0  7.68M
--------------------------------------  -----  -----  -----  -----  -----  -----

On Fri, Mar 3, 2017 at 2:32 PM, Arman Khalatyan <arm2arm@gmail.com> wrote:

Which operating system version are you using for your ZFS storage?

Do:
zfs get all your-pool-name

Use arc_summary.py from the FreeNAS git repo if you wish.

2017-03-03 10:33 GMT-03:00 Arman Khalatyan <arm2arm@gmail.com>:

This is CentOS 7.3, ZoL version 0.6.5.9-1.

[root@clei22 ~]# lsscsi
[2:0:0:0]    disk    ATA    INTEL SSDSC2CW24  400i  /dev/sda
[3:0:0:0]    disk    ATA    HGST HUS724040AL  AA70  /dev/sdb
[4:0:0:0]    disk    ATA    WDC WD2002FYPS-0  1G01  /dev/sdc

[root@clei22 ~]# pvs ;vgs;lvs
  PV                                                  VG            Fmt  Attr PSize   PFree
  /dev/mapper/INTEL_SSDSC2CW240A3_CVCV306302RP240CGN  vg_cache      lvm2 a--  223.57g      0
  /dev/sdc2                                           centos_clei22 lvm2 a--    1.82t 64.00m
  VG            #PV #LV #SN Attr   VSize   VFree
  centos_clei22   1   3   0 wz--n-   1.82t 64.00m
  vg_cache        1   2   0 wz--n- 223.57g      0
  LV       VG            Attr       LSize
  home     centos_clei22 -wi-ao----   1.74t
  root     centos_clei22 -wi-ao----  50.00g
  swap     centos_clei22 -wi-ao----  31.44g
  lv_cache vg_cache      -wi-ao---- 213.57g
  lv_slog  vg_cache      -wi-ao----  10.00g

[root@clei22 ~]# zpool status -v
  pool: zclei22
 state: ONLINE
  scan: scrub repaired 0 in 0h0m with 0 errors on Tue Feb 28 14:16:07 2017
config:

        NAME                                    STATE     READ WRITE CKSUM
        zclei22                                 ONLINE       0     0     0
          HGST_HUS724040ALA640_PN2334PBJ4SV6T1  ONLINE       0     0     0
        logs
          lv_slog                               ONLINE       0     0     0
        cache
          lv_cache                              ONLINE       0     0     0

errors: No known data errors

ZFS config:
[root@clei22 ~]# zfs get all zclei22/01
NAME        PROPERTY              VALUE                  SOURCE
zclei22/01  type                  filesystem             -
zclei22/01  creation              Tue Feb 28 14:06 2017  -
zclei22/01  used                  389G                   -
zclei22/01  available             3.13T                  -
zclei22/01  referenced            389G                   -
zclei22/01  compressratio         1.01x                  -
zclei22/01  mounted               yes                    -
zclei22/01  quota                 none                   default
zclei22/01  reservation           none                   default
zclei22/01  recordsize            128K                   local
zclei22/01  mountpoint            /zclei22/01            default
zclei22/01  sharenfs              off                    default
zclei22/01  checksum              on                     default
zclei22/01  compression           off                    local
zclei22/01  atime                 on                     default
zclei22/01  devices               on                     default
zclei22/01  exec                  on                     default
zclei22/01  setuid                on                     default
zclei22/01  readonly              off                    default
zclei22/01  zoned                 off                    default
zclei22/01  snapdir               hidden                 default
zclei22/01  aclinherit            restricted             default
zclei22/01  canmount              on                     default
zclei22/01  xattr                 sa                     local
zclei22/01  copies                1                      default
zclei22/01  version               5                      -
zclei22/01  utf8only              off                    -
zclei22/01  normalization         none                   -
zclei22/01  casesensitivity       sensitive              -
zclei22/01  vscan                 off                    default
zclei22/01  nbmand                off                    default
zclei22/01  sharesmb              off                    default
zclei22/01  refquota              none                   default
zclei22/01  refreservation        none                   default
zclei22/01  primarycache          metadata               local
zclei22/01  secondarycache        metadata               local
zclei22/01  usedbysnapshots       0                      -
zclei22/01  usedbydataset         389G                   -
zclei22/01  usedbychildren        0                      -
zclei22/01  usedbyrefreservation  0                      -
zclei22/01  logbias               latency                default
zclei22/01  dedup                 off                    default
zclei22/01  mlslabel              none                   default
zclei22/01  sync                  disabled               local
zclei22/01  refcompressratio      1.01x                  -
zclei22/01  written               389G                   -
zclei22/01  logicalused           396G                   -
zclei22/01  logicalreferenced     396G                   -
zclei22/01  filesystem_limit      none                   default
zclei22/01  snapshot_limit        none                   default
zclei22/01  filesystem_count      none                   default
zclei22/01  snapshot_count        none                   default
zclei22/01  snapdev               hidden                 default
zclei22/01  acltype               off                    default
zclei22/01  context               none                   default
zclei22/01  fscontext             none                   default
zclei22/01  defcontext            none                   default
zclei22/01  rootcontext           none                   default
zclei22/01  relatime              off                    default
zclei22/01  redundant_metadata    all                    default
zclei22/01  overlay               off                    default

On Fri, Mar 3, 2017 at 2:52 PM, Juan Pablo <pablo.localhost@gmail.com> wrote:

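The properties marked "local" in the output above are the ones that were set explicitly on the dataset. A reconstruction of how they would have been applied (not taken from the thread):

zfs set recordsize=128K zclei22/01
zfs set compression=off zclei22/01
zfs set xattr=sa zclei22/01
zfs set primarycache=metadata zclei22/01
zfs set secondarycache=metadata zclei22/01
zfs set sync=disabled zclei22/01

Note that primarycache=metadata and secondarycache=metadata restrict both the ARC and the SSD-backed L2ARC to metadata only, which is consistent with the high data miss rates reported earlier in the thread.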
OK, you have 3 pools: zclei22, logs and cache. That's wrong; you should have 1 pool, with slog + cache, if you are looking for performance. Also, don't mix drives. What is the performance issue you are facing?

Regards,

2017-03-03 11:00 GMT-03:00 Arman Khalatyan <arm2arm@gmail.com>:

No, I have one pool made of the single disk, with the SSD as cache and log device. I have 3 GlusterFS bricks on 3 separate hosts; the volume type is Replicate (Arbiter), i.e. replica 2 + 1. That is as much as you can push into the compute nodes (they have only 3 disk slots).

On Fri, Mar 3, 2017 at 3:19 PM, Juan Pablo <pablo.localhost@gmail.com> wrote:

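For context, a replica 2 + arbiter volume of the kind described above is typically created along these lines (a sketch; the hostnames and brick paths are placeholders, not the ones from this cluster):

gluster volume create GluReplica replica 3 arbiter 1 transport tcp,rdma \
    host1:/zclei/brick host2:/zclei/brick host3:/zclei/brick

The third (arbiter) brick stores only file names and metadata, not data, which is why the usable capacity corresponds to replica 2.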
cd to inside the pool path, then run: dd if=/dev/zero of=test.tt bs=1M
Leave it running for 5/10 minutes, hit Ctrl+C, and paste the result here.

2017-03-03 11:30 GMT-03:00 Arman Khalatyan <arm2arm@gmail.com>:

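A bounded variant of the test suggested above (a sketch; the mount point is the dataset shown earlier in the thread, adjust as needed). As the next message points out, /dev/zero is a poor test source on datasets with compression enabled, since the zeros compress away almost entirely:

cd /zclei22/01
dd if=/dev/zero of=test.tt bs=1M count=4096 conv=fdatasync   # fdatasync flushes before throughput is reported
rm -f test.tt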
The problem itself is not the streaming data performance, and dd from /dev/zero also does not help much on a production ZFS running with compression. The main problem comes when Gluster starts to do something with the data: it uses xattrs, and accessing extended attributes inside ZFS is probably slower than on XFS. Even a primitive find or ls -l in the .glusterfs folders takes ages.

Now I can see that the arbiter host has almost 100% cache misses during the rebuild, which is actually natural while it is always reading new datasets:

[root@clei26 ~]# arcstat.py 1
    time  read  miss  miss%  dmis  dm%  pmis  pm%  mmis  mm%  arcsz     c
15:57:31    29    29    100    29  100     0    0    29  100   685M   31G
15:57:32   530   476     89   476   89     0    0   457   89   685M   31G
15:57:33   480   467     97   467   97     0    0   463   97   685M   31G
15:57:34   452   443     98   443   98     0    0   435   97   685M   31G
15:57:35   582   547     93   547   93     0    0   536   94   685M   31G
15:57:36   439   417     94   417   94     0    0   393   94   685M   31G
15:57:38   435   392     90   392   90     0    0   374   89   685M   31G
15:57:39   364   352     96   352   96     0    0   352   96   685M   31G
15:57:40   408   375     91   375   91     0    0   360   91   685M   31G
15:57:41   552   539     97   539   97     0    0   539   97   685M   31G

It looks like we cannot have both performance and reliability in the same system :( The simple final conclusion is that with a single disk + SSD, even ZFS does not help to speed up the GlusterFS healing. I will stop here :)

On Fri, Mar 3, 2017 at 3:35 PM, Juan Pablo <pablo.localhost@gmail.com> wrote:
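
Regarding the xattr point above: the attributes GlusterFS maintains on each brick file can be inspected directly with getfattr, e.g. (a sketch; the brick path and file name are hypothetical):

getfattr -d -m . -e hex /zclei22/01/brick/some-file   # dump all extended attributes in hex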
cd to inside the pool path then dd if=/dev/zero of=test.tt bs=1M leave it runing 5/10 minutes. do ctrl+c paste result here. etc.
2017-03-03 11:30 GMT-03:00 Arman Khalatyan <arm2arm@gmail.com>:
No, I have one pool made of the one disk and ssd as a cache and log device. I have 3 Glusterfs bricks- separate 3 hosts:Volume type Replicate (Arbiter)= replica 2+1! That how much you can push into compute nodes(they have only 3 disk slots).
On Fri, Mar 3, 2017 at 3:19 PM, Juan Pablo <pablo.localhost@gmail.com> wrote:
ok, you have 3 pools, zclei22, logs and cache, thats wrong. you should have 1 pool, with zlog+cache if you are looking for performance. also, dont mix drives. whats the performance issue you are facing?
regards,
2017-03-03 11:00 GMT-03:00 Arman Khalatyan <arm2arm@gmail.com>:
This is CentOS 7.3 ZoL version 0.6.5.9-1
[root@clei22 ~]# lsscsi
[2:0:0:0] disk ATA INTEL SSDSC2CW24 400i /dev/sda
[3:0:0:0] disk ATA HGST HUS724040AL AA70 /dev/sdb
[4:0:0:0] disk ATA WDC WD2002FYPS-0 1G01 /dev/sdc
[root@clei22 ~]# pvs ;vgs;lvs
PV VG Fmt Attr PSize PFree
/dev/mapper/INTEL_SSDSC2CW240A3_CVCV306302RP240CGN vg_cache lvm2 a-- 223.57g 0
/dev/sdc2 centos_clei22 lvm2 a-- 1.82t 64.00m
VG #PV #LV #SN Attr VSize VFree
centos_clei22 1 3 0 wz--n- 1.82t 64.00m
vg_cache 1 2 0 wz--n- 223.57g 0
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
home centos_clei22 -wi-ao---- 1.74t
root centos_clei22 -wi-ao---- 50.00g
swap centos_clei22 -wi-ao---- 31.44g
lv_cache vg_cache -wi-ao---- 213.57g
lv_slog vg_cache -wi-ao---- 10.00g
[root@clei22 ~]# zpool status -v
pool: zclei22
state: ONLINE
scan: scrub repaired 0 in 0h0m with 0 errors on Tue Feb 28 14:16:07 2017
config:
NAME STATE READ WRITE CKSUM
zclei22 ONLINE 0 0 0
HGST_HUS724040ALA640_PN2334PBJ4SV6T1 ONLINE 0 0 0
logs
lv_slog ONLINE 0 0 0
cache
lv_cache ONLINE 0 0 0
errors: No known data errors
*ZFS config:*
[root@clei22 ~]# zfs get all zclei22/01
NAME PROPERTY VALUE SOURCE
zclei22/01 type filesystem -
zclei22/01 creation Tue Feb 28 14:06 2017 -
zclei22/01 used 389G -
zclei22/01 available 3.13T -
zclei22/01 referenced 389G -
zclei22/01 compressratio 1.01x -
zclei22/01 mounted yes -
zclei22/01 quota none default
zclei22/01 reservation none default
zclei22/01 recordsize 128K local
zclei22/01 mountpoint /zclei22/01 default
zclei22/01 sharenfs off default
zclei22/01 checksum on default
zclei22/01 compression off local
zclei22/01 atime on default
zclei22/01 devices on default
zclei22/01 exec on default
zclei22/01 setuid on default
zclei22/01 readonly off default
zclei22/01 zoned off default
zclei22/01 snapdir hidden default
zclei22/01 aclinherit restricted default
zclei22/01 canmount on default
zclei22/01 xattr sa local
zclei22/01 copies 1 default
zclei22/01 version 5 -
zclei22/01 utf8only off -
zclei22/01 normalization none -
zclei22/01 casesensitivity sensitive -
zclei22/01 vscan off default
zclei22/01 nbmand off default
zclei22/01 sharesmb off default
zclei22/01 refquota none default
zclei22/01 refreservation none default
zclei22/01 primarycache metadata local
zclei22/01 secondarycache metadata local
zclei22/01 usedbysnapshots 0 -
zclei22/01 usedbydataset 389G -
zclei22/01 usedbychildren 0 -
zclei22/01 usedbyrefreservation 0 -
zclei22/01 logbias latency default
zclei22/01 dedup off default
zclei22/01 mlslabel none default
zclei22/01 sync disabled local
zclei22/01 refcompressratio 1.01x -
zclei22/01 written 389G -
zclei22/01 logicalused 396G -
zclei22/01 logicalreferenced 396G -
zclei22/01 filesystem_limit none default
zclei22/01 snapshot_limit none default
zclei22/01 filesystem_count none default
zclei22/01 snapshot_count none default
zclei22/01 snapdev hidden default
zclei22/01 acltype off default
zclei22/01 context none default
zclei22/01 fscontext none default
zclei22/01 defcontext none default
zclei22/01 rootcontext none default
zclei22/01 relatime off default
zclei22/01 redundant_metadata all default
zclei22/01 overlay off default
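(One detail worth flagging for the cache-miss discussion in this thread: with primarycache and secondarycache set to metadata, as shown above, file data is never cached in ARC or L2ARC, so near-100% data misses during a heal are expected. If the intent is to let the SSD cache actual data, the defaults can be restored as below; dataset name as above, and whether data caching is desirable for large VM images is a separate trade-off:)

# cache both data and metadata again for this dataset
zfs set primarycache=all zclei22/01
zfs set secondarycache=all zclei22/01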
On Fri, Mar 3, 2017 at 2:52 PM, Juan Pablo <pablo.localhost@gmail.com> wrote:
Which operating system version are you using for your zfs storage? Do:
zfs get all your-pool-name
Use arc_summary.py from the freenas git repo if you wish.
2017-03-03 10:33 GMT-03:00 Arman Khalatyan <arm2arm@gmail.com>:
Pool load:
[root@clei21 ~]# zpool iostat -v 1
                                          capacity     operations    bandwidth
pool                                    alloc   free   read  write   read  write
--------------------------------------  -----  -----  -----  -----  -----  -----
zclei21                                 10.1G  3.62T      0    112    823  8.82M
  HGST_HUS724040ALA640_PN2334PBJ52XWT1  10.1G  3.62T      0     46    626  4.40M
logs                                        -      -      -      -      -      -
  lv_slog                                225M  9.72G      0     66    198  4.45M
cache                                       -      -      -      -      -      -
  lv_cache                              9.81G   204G      0     46     56  4.13M
--------------------------------------  -----  -----  -----  -----  -----  -----

                                          capacity     operations    bandwidth
pool                                    alloc   free   read  write   read  write
--------------------------------------  -----  -----  -----  -----  -----  -----
zclei21                                 10.1G  3.62T      0    191      0  12.8M
  HGST_HUS724040ALA640_PN2334PBJ52XWT1  10.1G  3.62T      0      0      0      0
logs                                        -      -      -      -      -      -
  lv_slog                                225M  9.72G      0    191      0  12.8M
cache                                       -      -      -      -      -      -
  lv_cache                              9.83G   204G      0    218      0  20.0M
--------------------------------------  -----  -----  -----  -----  -----  -----

                                          capacity     operations    bandwidth
pool                                    alloc   free   read  write   read  write
--------------------------------------  -----  -----  -----  -----  -----  -----
zclei21                                 10.1G  3.62T      0    191      0  12.7M
  HGST_HUS724040ALA640_PN2334PBJ52XWT1  10.1G  3.62T      0      0      0      0
logs                                        -      -      -      -      -      -
  lv_slog                                225M  9.72G      0    191      0  12.7M
cache                                       -      -      -      -      -      -
  lv_cache                              9.83G   204G      0     72      0  7.68M
--------------------------------------  -----  -----  -----  -----  -----  -----
On Fri, Mar 3, 2017 at 2:32 PM, Arman Khalatyan <arm2arm@gmail.com> wrote:
> Glusterfs now in healing mode:
> Receiver:
> [root@clei21 ~]# arcstat.py 1
>     time  read  miss  miss%  dmis  dm%  pmis  pm%  mmis  mm%  arcsz     c
> 13:24:49     0     0      0     0    0     0    0     0    0   4.6G   31G
> 13:24:50   154    80     51    80   51     0    0    80   51   4.6G   31G
> 13:24:51   179    62     34    62   34     0    0    62   42   4.6G   31G
> 13:24:52   148    68     45    68   45     0    0    68   45   4.6G   31G
> 13:24:53   140    64     45    64   45     0    0    64   45   4.6G   31G
> 13:24:54   124    48     38    48   38     0    0    48   38   4.6G   31G
> 13:24:55   157    80     50    80   50     0    0    80   50   4.7G   31G
> 13:24:56   202    68     33    68   33     0    0    68   41   4.7G   31G
> 13:24:57   127    54     42    54   42     0    0    54   42   4.7G   31G
> 13:24:58   126    50     39    50   39     0    0    50   39   4.7G   31G
> 13:24:59   116    40     34    40   34     0    0    40   34   4.7G   31G
>
> Sender
> [root@clei22 ~]# arcstat.py 1
>     time  read  miss  miss%  dmis  dm%  pmis  pm%  mmis  mm%  arcsz     c
> 13:28:37     8     2     25     2   25     0    0     2   25   468M   31G
> 13:28:38  1.2K   727     62   727   62     0    0   525   54   469M   31G
> 13:28:39   815   508     62   508   62     0    0   376   55   469M   31G
> 13:28:40   994   624     62   624   62     0    0   450   54   469M   31G
> 13:28:41   783   456     58   456   58     0    0   338   50   470M   31G
> 13:28:42   916   541     59   541   59     0    0   390   50   470M   31G
> 13:28:43   768   437     56   437   57     0    0   313   48   471M   31G
> 13:28:44   877   534     60   534   60     0    0   393   53   470M   31G
> 13:28:45   957   630     65   630   65     0    0   450   57   470M   31G
> 13:28:46   819   479     58   479   58     0    0   357   51   471M   31G
>
> On Thu, Mar 2, 2017 at 7:18 PM, Juan Pablo <pablo.localhost@gmail.com> wrote:
>> hey,
>> what are you using for zfs? get an arc status and show please
>>
>> 2017-03-02 9:57 GMT-03:00 Arman Khalatyan <arm2arm@gmail.com>:
>>> no,
>>> ZFS itself is not on top of lvm. Only the ssd was split by lvm for slog (10G) and cache (the rest),
>>> but in any case the ssd does not help much on glusterfs/ovirt load, it has almost 100% cache misses....:(
>>> (terrible performance compared with nfs)
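(As an aside on the xattr point at the top of this message: a crude way to compare extended-attribute read cost between the ZFS brick and an XFS path is to time a loop of getfattr calls. The file paths and the loop count below are placeholders; this is only a rough sanity check, not a benchmark:)

# time 1000 xattr dumps on a file that lives on the ZFS dataset...
time for i in $(seq 1 1000); do
    getfattr -d -m . -e hex /zclei22/01/somefile > /dev/null 2>&1
done
# ...and the same on a file sitting on an XFS filesystem, for comparison
time for i in $(seq 1 1000); do
    getfattr -d -m . -e hex /var/tmp/xfs-testfile > /dev/null 2>&1
done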

Why are you using an arbitrator if all your HW configs are identical? I'd use a true replica 3 in this case.

Also, in my experience with gluster and vm hosting, the ZIL/slog degrades write performance unless it's a truly dedicated disk. But I have 8 spinners backing my ZFS volumes, so trying to share a sata disk wasn't a good zil. If yours is dedicated SAS, keep it; if it's SATA, try testing without it.

You don't have compression enabled on your zfs volume, and I'd recommend enabling relatime on it. Depending on the amount of RAM in these boxes, you probably want to limit your zfs arc size to 8G or so (1/4 total ram or less). Gluster just works volumes hard during a rebuild, so what's the problem you're seeing? If it's affecting your VMs, using sharding and tuning client & server threads can help avoid interruptions to your VMs while repairs are running. If you really need to limit it, you can use cgroups to keep it from hogging all the CPU, but it takes longer to heal, of course. There are a couple of older posts and blogs about it, if you go back a while.

> On Mar 3, 2017, at 9:02 AM, Arman Khalatyan <arm2arm@gmail.com> wrote:
>
> The problem itself is not the streaming data performance, and also dd zero does not help much in the production zfs running with compression.

+gluster-users

Regards,
Ramesh

----- Original Message -----
From: "Arman Khalatyan" <arm2arm@gmail.com> To: "Juan Pablo" <pablo.localhost@gmail.com> Cc: "users" <users@ovirt.org>, "FERNANDO FREDIANI" <fernando.frediani@upx.com> Sent: Friday, March 3, 2017 8:32:31 PM Subject: Re: [ovirt-users] Replicated Glusterfs on top of ZFS
The problem itself is not the streaming data performance., and also dd zero does not help much in the production zfs running with compression. the main problem comes when the gluster is starting to do something with that, it is using xattrs, probably accessing extended attributes inside the zfs is slower than XFS. Also primitive find file or ls -l in the (dot)gluster folders takes ages:
now I can see that arbiter host has almost 100% cache miss during the rebuild, which is actually natural while he is reading always the new datasets: [root@clei26 ~]# arcstat.py 1 time read miss miss% dmis dm% pmis pm% mmis mm% arcsz c 15:57:31 29 29 100 29 100 0 0 29 100 685M 31G 15:57:32 530 476 89 476 89 0 0 457 89 685M 31G 15:57:33 480 467 97 467 97 0 0 463 97 685M 31G 15:57:34 452 443 98 443 98 0 0 435 97 685M 31G 15:57:35 582 547 93 547 93 0 0 536 94 685M 31G 15:57:36 439 417 94 417 94 0 0 393 94 685M 31G 15:57:38 435 392 90 392 90 0 0 374 89 685M 31G 15:57:39 364 352 96 352 96 0 0 352 96 685M 31G 15:57:40 408 375 91 375 91 0 0 360 91 685M 31G 15:57:41 552 539 97 539 97 0 0 539 97 685M 31G
It looks like we cannot have in the same system performance and reliability :( Simply final conclusion is with the single disk+ssd even zfs doesnot help to speedup the glusterfs healing. I will stop here:)
On Fri, Mar 3, 2017 at 3:35 PM, Juan Pablo < pablo.localhost@gmail.com > wrote:
cd to inside the pool path then dd if=/dev/zero of= test.tt bs=1M leave it runing 5/10 minutes. do ctrl+c paste result here. etc.
2017-03-03 11:30 GMT-03:00 Arman Khalatyan < arm2arm@gmail.com > :
No, I have one pool made of the one disk and ssd as a cache and log device. I have 3 Glusterfs bricks- separate 3 hosts:Volume type Replicate (Arbiter)= replica 2+1! That how much you can push into compute nodes(they have only 3 disk slots).
On Fri, Mar 3, 2017 at 3:19 PM, Juan Pablo < pablo.localhost@gmail.com > wrote:
ok, you have 3 pools, zclei22, logs and cache, thats wrong. you should have 1 pool, with zlog+cache if you are looking for performance. also, dont mix drives. whats the performance issue you are facing?
regards,
2017-03-03 11:00 GMT-03:00 Arman Khalatyan < arm2arm@gmail.com > :
This is CentOS 7.3 ZoL version 0.6.5.9-1
[root@clei22 ~]# lsscsi
[2:0:0:0] disk ATA INTEL SSDSC2CW24 400i /dev/sda
[3:0:0:0] disk ATA HGST HUS724040AL AA70 /dev/sdb
[4:0:0:0] disk ATA WDC WD2002FYPS-0 1G01 /dev/sdc
[root@clei22 ~]# pvs ;vgs;lvs
PV VG Fmt Attr PSize PFree
/dev/mapper/INTEL_SSDSC2CW240A3_CVCV306302RP240CGN vg_cache lvm2 a-- 223.57g 0
/dev/sdc2 centos_clei22 lvm2 a-- 1.82t 64.00m
VG #PV #LV #SN Attr VSize VFree
centos_clei22 1 3 0 wz--n- 1.82t 64.00m
vg_cache 1 2 0 wz--n- 223.57g 0
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
home centos_clei22 -wi-ao---- 1.74t
root centos_clei22 -wi-ao---- 50.00g
swap centos_clei22 -wi-ao---- 31.44g
lv_cache vg_cache -wi-ao---- 213.57g
lv_slog vg_cache -wi-ao---- 10.00g
[root@clei22 ~]# zpool status -v
pool: zclei22
state: ONLINE
scan: scrub repaired 0 in 0h0m with 0 errors on Tue Feb 28 14:16:07 2017
config:
NAME STATE READ WRITE CKSUM
zclei22 ONLINE 0 0 0
HGST_HUS724040ALA640_PN2334PBJ4SV6T1 ONLINE 0 0 0
logs
lv_slog ONLINE 0 0 0
cache
lv_cache ONLINE 0 0 0
errors: No known data errors
ZFS config:
[root@clei22 ~]# zfs get all zclei22/01
NAME PROPERTY VALUE SOURCE
zclei22/01 type filesystem -
zclei22/01 creation Tue Feb 28 14:06 2017 -
zclei22/01 used 389G -
zclei22/01 available 3.13T -
zclei22/01 referenced 389G -
zclei22/01 compressratio 1.01x -
zclei22/01 mounted yes -
zclei22/01 quota none default
zclei22/01 reservation none default
zclei22/01 recordsize 128K local
zclei22/01 mountpoint /zclei22/01 default
zclei22/01 sharenfs off default
zclei22/01 checksum on default
zclei22/01 compression off local
zclei22/01 atime on default
zclei22/01 devices on default
zclei22/01 exec on default
zclei22/01 setuid on default
zclei22/01 readonly off default
zclei22/01 zoned off default
zclei22/01 snapdir hidden default
zclei22/01 aclinherit restricted default
zclei22/01 canmount on default
zclei22/01 xattr sa local
zclei22/01 copies 1 default
zclei22/01 version 5 -
zclei22/01 utf8only off -
zclei22/01 normalization none -
zclei22/01 casesensitivity sensitive -
zclei22/01 vscan off default
zclei22/01 nbmand off default
zclei22/01 sharesmb off default
zclei22/01 refquota none default
zclei22/01 refreservation none default
zclei22/01 primarycache metadata local
zclei22/01 secondarycache metadata local
zclei22/01 usedbysnapshots 0 -
zclei22/01 usedbydataset 389G -
zclei22/01 usedbychildren 0 -
zclei22/01 usedbyrefreservation 0 -
zclei22/01 logbias latency default
zclei22/01 dedup off default
zclei22/01 mlslabel none default
zclei22/01 sync disabled local
zclei22/01 refcompressratio 1.01x -
zclei22/01 written 389G -
zclei22/01 logicalused 396G -
zclei22/01 logicalreferenced 396G -
zclei22/01 filesystem_limit none default
zclei22/01 snapshot_limit none default
zclei22/01 filesystem_count none default
zclei22/01 snapshot_count none default
zclei22/01 snapdev hidden default
zclei22/01 acltype off default
zclei22/01 context none default
zclei22/01 fscontext none default
zclei22/01 defcontext none default
zclei22/01 rootcontext none default
zclei22/01 relatime off default
zclei22/01 redundant_metadata all default
zclei22/01 overlay off default
On Fri, Mar 3, 2017 at 2:52 PM, Juan Pablo < pablo.localhost@gmail.com > wrote:
Which operating system version are you using for your zfs storage? do: zfs get all your-pool-name use arc_summary.py from freenas git repo if you wish.
2017-03-03 10:33 GMT-03:00 Arman Khalatyan < arm2arm@gmail.com > :
Pool load: [root@clei21 ~]# zpool iostat -v 1 capacity operations bandwidth pool alloc free read write read write -------------------------------------- ----- ----- ----- ----- ----- ----- zclei21 10.1G 3.62T 0 112 823 8.82M HGST_HUS724040ALA640_PN2334PBJ52XWT1 10.1G 3.62T 0 46 626 4.40M logs - - - - - - lv_slog 225M 9.72G 0 66 198 4.45M cache - - - - - - lv_cache 9.81G 204G 0 46 56 4.13M -------------------------------------- ----- ----- ----- ----- ----- -----
capacity operations bandwidth pool alloc free read write read write -------------------------------------- ----- ----- ----- ----- ----- ----- zclei21 10.1G 3.62T 0 191 0 12.8M HGST_HUS724040ALA640_PN2334PBJ52XWT1 10.1G 3.62T 0 0 0 0 logs - - - - - - lv_slog 225M 9.72G 0 191 0 12.8M cache - - - - - - lv_cache 9.83G 204G 0 218 0 20.0M -------------------------------------- ----- ----- ----- ----- ----- -----
capacity operations bandwidth pool alloc free read write read write -------------------------------------- ----- ----- ----- ----- ----- ----- zclei21 10.1G 3.62T 0 191 0 12.7M HGST_HUS724040ALA640_PN2334PBJ52XWT1 10.1G 3.62T 0 0 0 0 logs - - - - - - lv_slog 225M 9.72G 0 191 0 12.7M cache - - - - - - lv_cache 9.83G 204G 0 72 0 7.68M -------------------------------------- ----- ----- ----- ----- ----- -----
On Fri, Mar 3, 2017 at 2:32 PM, Arman Khalatyan <arm2arm@gmail.com> wrote:
GlusterFS is now in healing mode:

Receiver:
[root@clei21 ~]# arcstat.py 1
    time  read  miss  miss%  dmis  dm%  pmis  pm%  mmis  mm%  arcsz     c
13:24:49     0     0      0     0    0     0    0     0    0   4.6G   31G
13:24:50   154    80     51    80   51     0    0    80   51   4.6G   31G
13:24:51   179    62     34    62   34     0    0    62   42   4.6G   31G
13:24:52   148    68     45    68   45     0    0    68   45   4.6G   31G
13:24:53   140    64     45    64   45     0    0    64   45   4.6G   31G
13:24:54   124    48     38    48   38     0    0    48   38   4.6G   31G
13:24:55   157    80     50    80   50     0    0    80   50   4.7G   31G
13:24:56   202    68     33    68   33     0    0    68   41   4.7G   31G
13:24:57   127    54     42    54   42     0    0    54   42   4.7G   31G
13:24:58   126    50     39    50   39     0    0    50   39   4.7G   31G
13:24:59   116    40     34    40   34     0    0    40   34   4.7G   31G

Sender:
[root@clei22 ~]# arcstat.py 1
    time  read  miss  miss%  dmis  dm%  pmis  pm%  mmis  mm%  arcsz     c
13:28:37     8     2     25     2   25     0    0     2   25   468M   31G
13:28:38  1.2K   727     62   727   62     0    0   525   54   469M   31G
13:28:39   815   508     62   508   62     0    0   376   55   469M   31G
13:28:40   994   624     62   624   62     0    0   450   54   469M   31G
13:28:41   783   456     58   456   58     0    0   338   50   470M   31G
13:28:42   916   541     59   541   59     0    0   390   50   470M   31G
13:28:43   768   437     56   437   57     0    0   313   48   471M   31G
13:28:44   877   534     60   534   60     0    0   393   53   470M   31G
13:28:45   957   630     65   630   65     0    0   450   57   470M   31G
13:28:46   819   479     58   479   58     0    0   357   51   471M   31G
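Note: on the sender the ARC holds only ~470M against a 31G target (the c column) and misses roughly 60% of reads, which is consistent with a cold cache during the heal plus the metadata-only caching noted above. A rough sketch of inspecting and capping the ARC on ZFS-on-Linux (zfs_arc_max is the standard module parameter; the 8 GiB value is only an example):

[root@clei22 ~]# arc_summary.py | less                               # full ARC hit/miss breakdown
[root@clei22 ~]# cat /sys/module/zfs/parameters/zfs_arc_max          # current cap (0 = auto)
[root@clei22 ~]# echo 8589934592 > /sys/module/zfs/parameters/zfs_arc_max               # 8 GiB, applies now
[root@clei22 ~]# echo "options zfs zfs_arc_max=8589934592" >> /etc/modprobe.d/zfs.conf  # persist across reboots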
On Thu, Mar 2, 2017 at 7:18 PM, Juan Pablo <pablo.localhost@gmail.com> wrote:
Hey, what are you using for ZFS? Please get an ARC status and share it.
2017-03-02 9:57 GMT-03:00 Arman Khalatyan <arm2arm@gmail.com>:
No, ZFS itself is not on top of LVM. Only the SSD was split by LVM into a slog (10G) and a cache (the rest), but in any case the SSD does not help much under the GlusterFS/oVirt load: it sees almost 100% cache misses... :( (terrible performance compared with NFS).
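Note: the LVM split being described (one SSD carved into a small slog LV plus a larger cache LV) would look roughly like the sketch below; the volume group name vg_ssd and the device /dev/sdb are placeholders, not values taken from this thread:

[root@clei22 ~]# pvcreate /dev/sdb                          # the 250GB SSD (placeholder device name)
[root@clei22 ~]# vgcreate vg_ssd /dev/sdb
[root@clei22 ~]# lvcreate -L 10G -n lv_slog vg_ssd          # separate intent log (SLOG)
[root@clei22 ~]# lvcreate -l 100%FREE -n lv_cache vg_ssd    # the rest of the SSD as L2ARC
[root@clei22 ~]# zpool add zclei22 log /dev/vg_ssd/lv_slog
[root@clei22 ~]# zpool add zclei22 cache /dev/vg_ssd/lv_cache

The same SSD could also be split with plain GPT partitions instead of LVM; ZFS itself does not need LVM for this.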
On Thu, Mar 2, 2017 at 1:47 PM, FERNANDO FREDIANI <fernando.frediani@upx.com> wrote:
Am I understanding correctly, but you have Gluster on the top of ZFS which is on the top of LVM ? If so, why the usage of LVM was necessary ? I have ZFS with any need of LVM.
Fernando
On 02/03/2017 06:19, Arman Khalatyan wrote:
Hi, I use 3 nodes with zfs and glusterfs. Are there any suggestions to optimize it?
host zfs config 4TB-HDD+250GB-SSD: [root@clei22 ~]# zpool status pool: zclei22 state: ONLINE scan: scrub repaired 0 in 0h0m with 0 errors on Tue Feb 28 14:16:07 2017 config:
NAME STATE READ WRITE CKSUM zclei22 ONLINE 0 0 0 HGST_HUS724040ALA640_PN2334PBJ4SV6T1 ONLINE 0 0 0 logs lv_slog ONLINE 0 0 0 cache lv_cache ONLINE 0 0 0
errors: No known data errors
Name: GluReplica Volume ID: ee686dfe-203a-4caa-a691-26353460cc48 Volume Type: Replicate (Arbiter) Replica Count: 2 + 1 Number of Bricks: 3 Transport Types: TCP, RDMA Maximum no of snapshots: 256 Capacity: 3.51 TiB total, 190.56 GiB used, 3.33 TiB free
_______________________________________________ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users

On Fri, Mar 3, 2017 at 7:00 PM, Darrell Budic <budic@onholyground.com> wrote:
Why are you using an arbitrator if all your HW configs are identical? I’d use a true replica 3 in this case.
This was just the GUI's suggestion: when I was creating the cluster it asked for 3 hosts. I did not even know that an arbiter does not keep the data. I am not sure whether I can change the GlusterFS volume type to a full replica 3 on the running system; probably I would need to destroy the whole cluster.
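Note: in principle an arbiter volume can be converted to a full replica 3 without rebuilding the cluster, by dropping the arbiter brick and then adding a normal data brick back at replica count 3. The sketch below uses placeholder host/brick paths and assumes a GlusterFS release that supports changing the replica count this way; with oVirt managing the volume it is safer to drive this from the engine, so verify against the release notes before trying it on a live volume:

[root@clei21 ~]# gluster volume remove-brick GluReplica replica 2 clei23:/path/to/arbiter-brick force
[root@clei21 ~]# gluster volume add-brick GluReplica replica 3 clei23:/path/to/new-data-brick
[root@clei21 ~]# gluster volume heal GluReplica full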
Also, in my experience with Gluster and VM hosting, the ZIL/slog degrades write performance unless it's a truly dedicated disk. But I have 8 spinners backing my ZFS volumes, so trying to share a SATA disk wasn't a good ZIL. If yours is dedicated SAS, keep it; if it's SATA, try testing without it.
We also have several huge systems running with ZFS quite successfully over the years. The idea here was to use ZFS + GlusterFS for an HA solution.
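Note: testing the slog suggestion is cheap, because log and cache devices can be detached from a live pool and re-attached later, using the vdev names exactly as zpool status prints them:

[root@clei22 ~]# zpool remove zclei22 lv_slog      # drop the SLOG; sync writes fall back to the pool disk
[root@clei22 ~]# zpool remove zclei22 lv_cache     # optionally drop the L2ARC as well
[root@clei22 ~]# zpool add zclei22 log lv_slog     # re-add later if it turns out to help (full /dev path may be needed)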
You don't have compression enabled on your ZFS volume, and I'd recommend enabling relatime on it. Depending on the amount of RAM in these boxes, you probably want to limit your ZFS ARC size to 8G or so (1/4 of total RAM or less). Gluster just works volumes hard during a rebuild; what's the problem you're seeing? If it's affecting your VMs, using sharding and tuning client & server threads can help avoid interruptions to your VMs while repairs are running. If you really need to limit it, you can use cgroups to keep it from hogging all the CPU, but then it takes longer to heal, of course. There are a couple of older posts and blogs about it, if you go back a while.
Yes, I saw that GlusterFS is CPU/RAM hungry!!! 99% of all 16 cores were used just for healing 500GB of VM disks. It took almost forever compared with NFS storage (a single disk + ZFS SSD cache; for sure one pays a penalty for the HA :) ).
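Note: the property changes suggested above, plus the usual self-heal knobs for a shard-enabled VM volume, would look roughly like this. The dataset zclei22/01 and volume GluReplica come from earlier in the thread; the thread counts are illustrative values only, and the ARC cap itself was sketched further up:

[root@clei22 ~]# zfs set compression=lz4 zclei22/01        # cheap on CPU, only affects newly written data
[root@clei22 ~]# zfs set relatime=on zclei22/01            # avoid an atime update on every read
[root@clei21 ~]# gluster volume set GluReplica cluster.shd-max-threads 2      # 1 = gentlest; higher heals faster but costs CPU
[root@clei21 ~]# gluster volume set GluReplica cluster.shd-wait-qlength 1024
[root@clei21 ~]# gluster volume set GluReplica server.event-threads 4
[root@clei21 ~]# gluster volume set GluReplica client.event-threads 4

The zfs settings are per host, so they would need to be applied on each of the three nodes.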

On Mon, Mar 6, 2017 at 3:21 PM, Arman Khalatyan <arm2arm@gmail.com> wrote:
Yes, I saw that GlusterFS is CPU/RAM hungry!!! 99% of all 16 cores were used just for healing 500GB of VM disks. It took almost forever compared with NFS storage (a single disk + ZFS SSD cache; for sure one pays a penalty for the HA :) ).
Is your gluster volume configured to use the sharding feature? Could you provide the output of gluster vol info?
_______________________________________________ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users

Hi Sahina, yes, shard is enabled. The Gluster setup was actually generated through the oVirt GUI; I put all the configs here: http://arm2armcos.blogspot.de/2017/03/glusterfs-zfs-ovirt-rdma.html

On Tue, Mar 7, 2017 at 8:08 AM, Sahina Bose <sabose@redhat.com> wrote:
Is your gluster volume configured to use the sharding feature? Could you provide the output of gluster vol info?
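Note: sharding and the shard size can also be confirmed directly from the CLI (volume name taken from the thread):

[root@clei21 ~]# gluster volume info GluReplica
[root@clei21 ~]# gluster volume get GluReplica features.shard
[root@clei21 ~]# gluster volume get GluReplica features.shard-block-size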
participants (6)
- Arman Khalatyan
- Darrell Budic
- FERNANDO FREDIANI
- Juan Pablo
- Ramesh Nachimuthu
- Sahina Bose