
I have been building out an HCI stack with KVM/RHEV + oVirt using the HCI deployment process. This is very nice for small / remote site use cases, but with Gluster being announced as EOL in 18 months, what is the replacement plan?

Are there working projects and plans to replace Gluster with Ceph? Are there deployment plans to get an HCI stack onto a supported file system?

I liked Gluster for the control plane for the oVirt engine and smaller utility VMs: since each system has a full copy, I can retrieve / extract a copy of the VM without having all bricks back... it was just "easy" to use. Ceph just means more complexity, and though it scales better and has better features, repair means having a critical mass of nodes up before you can extract data (vs. any disk can be pulled out of a Gluster node, plugged into my laptop, and I can at least extract the data).

I guess I am not trying to debate shifting to Ceph.. it does not matter.. that ship sailed... What I am asking is when / what are the plans for replacement of Gluster for HCI. Because right now, for small HCI sites, once Gluster is no longer supported and Ceph does not make it in, the remaining option is to go to VMware and vSAN or some other totally different stack.
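For what it's worth, that recovery path on a plain (unsharded) replica brick looks roughly like the sketch below, since the brick is just an XFS filesystem and each VM disk is a regular file on it. Device name, mount point and the storage-domain/image UUIDs are placeholders, not values from this cluster, and sharded or dispersed volumes would not work this way:

# attach the brick disk to any Linux box and mount it read-only
mount -o ro /dev/sdb1 /mnt/brick
# oVirt keeps disk images under <storage-domain-UUID>/images/<image-UUID>/
find /mnt/brick -path '*/images/*' -type f -size +1G
# inspect and copy the volume file out with qemu-img
qemu-img info /mnt/brick/<sd-uuid>/images/<img-uuid>/<vol-uuid>
qemu-img convert -O qcow2 /mnt/brick/<sd-uuid>/images/<img-uuid>/<vol-uuid> /tmp/recovered.qcow2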

What do you mean Gluster being announced as EOL? Where did you find this information?

Are you referring to this: https://access.redhat.com/support/policy/updates/rhhiv ? If so, perhaps a new release will come out in the meantime? Or is it something else?

Gianluca

It was on a support ticket / call I was having. I googled around and the only article I found was the one about features being removed, but I am not sure whether this affects oVirt / HCI.

My ticket was about trying to deploy OCP on an all-SSD cluster of three nodes; disk performance over 10Gb was too slow, and RH support said "We don't support use of Gluster for OCP, and need you to move off Gluster to Ceph."

So I opened another ticket about Ceph on HCI and was told "not supported, Ceph nodes must be external." So my small three-server work office and demo stack is now rethinking having to go to another stack / vendor such as VMware and vSAN, just because I can't get a stack that meets the needs of a small HCI stack with Linux.

Staying with enterprise products/solutions supported by Red Hat, there are two different use cases for Red Hat Hyperconverged Infrastructure, see: https://access.redhat.com/products/red-hat-hyperconverged-infrastructure

1) Red Hat Hyperconverged Infrastructure for Cloud, which is for OpenStack; in that case Ceph (RHCS) is the only supported storage solution.
2) Red Hat Hyperconverged Infrastructure for Virtualization, which is for RHV; in that case Gluster (RHGS) is the only supported storage solution.

Then there is the use case of OCP where you want persistent storage, and there again the only supported solution is Ceph (RHCS). See
https://docs.openshift.com/container-platform/4.7/storage/persistent_storage...
https://access.redhat.com/articles/4731161

HIH clarifying,
Gianluca

I haven't seen your email on the Gluster users' mailing list. What was your problem with the performance?

Best Regards,
Strahil Nikolov

The problem was that when I select the oVirt HCI storage volumes to deploy to (with VDO enabled), which are a single 512GB SSD with only one small IDM VM running, the IPI OCP 4.7 deployment fails. RH closed the ticket because "gluster volume is too slow".

I then tried to create a Gluster volume without VDO on the other 1TB SSD in each server and see if that worked, but though I matched all the settings / gluster options oVirt set, IPI OCP would not show the disk as a deployable option. I then figured I would use the GUI to create the bricks instead of Ansible (I am trying to be good and stop doing direct shell work and build everything as Ansible playbooks), but that test is on hold because the servers are being re-tasked for another POC for the next two weeks. So I figured I would do some rethinking about whether oVirt HCI on RHEV 4.5 with Gluster is a rat hole that will never work.

Below is the output from the fio test the OCP team asked me to run to show Gluster was too slow.

##################
ansible@LT-A0070501:/mnt/c/GitHub/penguinpages_cluster_devops/cluster_devops$ ssh core@172.16.100.184
[core@localhost ~]$ journalctl -b -f -u release-image.service -u bootkube.service
-- Logs begin at Sun 2021-04-11 19:18:07 UTC. --
Apr 13 11:50:23 localhost bootkube.sh[1276476]: [#404] failed to fetch discovery: Get "https://localhost:6443/api?timeout=32s": x509: certificate has expired or is not yet valid: current time 2021-04-13T11:50:23Z is after 2021-04-12T19:12:30Z
Apr 13 11:50:23 localhost bootkube.sh[1276476]: [#405] failed to fetch discovery: Get "https://localhost:6443/api?timeout=32s": x509: certificate has expired or is not yet valid: current time 2021-04-13T11:50:23Z is after 2021-04-12T19:12:30Z
<snip>
[core@localhost ~]$ su -
Password:
su: Authentication failure
[core@localhost ~]$ sudo podman run --volume /var/lib/etcd:/var/lib/etcd:Z quay.io/openshift-scale/etcd-perf
Trying to pull quay.io/openshift-scale/etcd-perf
Trying to pull quay.io/openshift-scale/etcd-perf...
Getting image source signatures Copying blob a3ed95caeb02 done Copying blob a3ed95caeb02 done Copying blob a3ed95caeb02 done Copying blob a3ed95caeb02 skipped: already exists Copying blob fcc022b71ae4 done Copying blob a93d706457b7 done Copying blob 763b3f36c462 done Writing manifest to image destination Storing signatures ---------------------------------------------------------------- Running fio --------------------------------------------------------------------------- { "fio version" : "fio-3.7", "timestamp" : 1618315279, "timestamp_ms" : 1618315279798, "time" : "Tue Apr 13 12:01:19 2021", "global options" : { "rw" : "write", "ioengine" : "sync", "fdatasync" : "1", "directory" : "/var/lib/etcd", "size" : "22m", "bs" : "2300" }, "jobs" : [ { "jobname" : "etcd_perf", "groupid" : 0, "error" : 0, "eta" : 0, "elapsed" : 507, "job options" : { "name" : "etcd_perf" }, "read" : { "io_bytes" : 0, "io_kbytes" : 0, "bw_bytes" : 0, "bw" : 0, "iops" : 0.000000, "runtime" : 0, "total_ios" : 0, "short_ios" : 10029, "drop_ios" : 0, "slat_ns" : { "min" : 0, "max" : 0, "mean" : 0.000000, "stddev" : 0.000000 }, "clat_ns" : { "min" : 0, "max" : 0, "mean" : 0.000000, "stddev" : 0.000000, "percentile" : { "1.000000" : 0, "5.000000" : 0, "10.000000" : 0, "20.000000" : 0, "30.000000" : 0, "40.000000" : 0, "50.000000" : 0, "60.000000" : 0, "70.000000" : 0, "80.000000" : 0, "90.000000" : 0, "95.000000" : 0, "99.000000" : 0, "99.500000" : 0, "99.900000" : 0, "99.950000" : 0, "99.990000" : 0 } }, "lat_ns" : { "min" : 0, "max" : 0, "mean" : 0.000000, "stddev" : 0.000000 }, "bw_min" : 0, "bw_max" : 0, "bw_agg" : 0.000000, "bw_mean" : 0.000000, "bw_dev" : 0.000000, "bw_samples" : 0, "iops_min" : 0, "iops_max" : 0, "iops_mean" : 0.000000, "iops_stddev" : 0.000000, "iops_samples" : 0 }, "write" : { "io_bytes" : 23066700, "io_kbytes" : 22526, "bw_bytes" : 45589, "bw" : 44, "iops" : 19.821372, "runtime" : 505969, "total_ios" : 10029, "short_ios" : 0, "drop_ios" : 0, "slat_ns" : { "min" : 0, "max" : 0, "mean" : 0.000000, "stddev" : 0.000000 }, "clat_ns" : { "min" : 12011, "max" : 680340, "mean" : 26360.617210, "stddev" : 15390.749240, "percentile" : { "1.000000" : 13632, "5.000000" : 14528, "10.000000" : 15808, "20.000000" : 18304, "30.000000" : 19584, "40.000000" : 20864, "50.000000" : 22656, "60.000000" : 25472, "70.000000" : 28800, "80.000000" : 33536, "90.000000" : 40192, "95.000000" : 46848, "99.000000" : 64768, "99.500000" : 73216, "99.900000" : 96768, "99.950000" : 103936, "99.990000" : 651264 } }, "lat_ns" : { "min" : 13570, "max" : 682541, "mean" : 28604.818825, "stddev" : 15726.819671 }, "bw_min" : 4, "bw_max" : 89, "bw_agg" : 100.000000, "bw_mean" : 46.770285, "bw_dev" : 15.478852, "bw_samples" : 949, "iops_min" : 1, "iops_max" : 40, "iops_mean" : 21.048472, "iops_stddev" : 6.908059, "iops_samples" : 949 }, "trim" : { "io_bytes" : 0, "io_kbytes" : 0, "bw_bytes" : 0, "bw" : 0, "iops" : 0.000000, "runtime" : 0, "total_ios" : 0, "short_ios" : 0, "drop_ios" : 0, "slat_ns" : { "min" : 0, "max" : 0, "mean" : 0.000000, "stddev" : 0.000000 }, "clat_ns" : { "min" : 0, "max" : 0, "mean" : 0.000000, "stddev" : 0.000000, "percentile" : { "1.000000" : 0, "5.000000" : 0, "10.000000" : 0, "20.000000" : 0, "30.000000" : 0, "40.000000" : 0, "50.000000" : 0, "60.000000" : 0, "70.000000" : 0, "80.000000" : 0, "90.000000" : 0, "95.000000" : 0, "99.000000" : 0, "99.500000" : 0, "99.900000" : 0, "99.950000" : 0, "99.990000" : 0 } }, "lat_ns" : { "min" : 0, "max" : 0, "mean" : 0.000000, "stddev" : 0.000000 }, "bw_min" : 0, "bw_max" 
: 0, "bw_agg" : 0.000000, "bw_mean" : 0.000000, "bw_dev" : 0.000000, "bw_samples" : 0, "iops_min" : 0, "iops_max" : 0, "iops_mean" : 0.000000, "iops_stddev" : 0.000000, "iops_samples" : 0 }, "sync" : { "lat_ns" : { "min" : 10082235, "max" : 6139419269, "mean" : 50405768.015057, "stddev" : 121884279.530146, "percentile" : { "1.000000" : 11075584, "5.000000" : 11993088, "10.000000" : 12779520, "20.000000" : 15925248, "30.000000" : 19267584, "40.000000" : 30539776, "50.000000" : 42729472, "60.000000" : 49545216, "70.000000" : 55836672, "80.000000" : 64225280, "90.000000" : 81264640, "95.000000" : 113770496, "99.000000" : 212860928, "99.500000" : 258998272, "99.900000" : 1535115264, "99.950000" : 2868903936, "99.990000" : 5335154688 } }, "total_ios" : 0 }, "usr_cpu" : 0.045457, "sys_cpu" : 0.145266, "ctx" : 40541, "majf" : 0, "minf" : 13, "iodepth_level" : { "1" : 200.000000, "2" : 0.000000, "4" : 0.000000, "8" : 0.000000, "16" : 0.000000, "32" : 0.000000, ">=64" : 0.000000 }, "latency_ns" : { "2" : 0.000000, "4" : 0.000000, "10" : 0.000000, "20" : 0.000000, "50" : 0.000000, "100" : 0.000000, "250" : 0.000000, "500" : 0.000000, "750" : 0.000000, "1000" : 0.000000 }, "latency_us" : { "2" : 0.000000, "4" : 0.000000, "10" : 0.000000, "20" : 33.702263, "50" : 62.498754, "100" : 3.709243, "250" : 0.059827, "500" : 0.000000, "750" : 0.029913, "1000" : 0.000000 }, "latency_ms" : { "2" : 0.000000, "4" : 0.000000, "10" : 0.000000, "20" : 0.000000, "50" : 0.000000, "100" : 0.000000, "250" : 0.000000, "500" : 0.000000, "750" : 0.000000, "1000" : 0.000000, "2000" : 0.000000, ">=2000" : 0.000000 }, "latency_depth" : 1, "latency_target" : 0, "latency_percentile" : 100.000000, "latency_window" : 0 } ], "disk_util" : [ { "name" : "sda", "read_ios" : 86, "write_ios" : 34333, "read_merges" : 0, "write_merges" : 4065, "read_ticks" : 1207, "write_ticks" : 1389885, "in_queue" : 1372762, "util" : 95.134115 } ] } -------------------------------------------------------------------------------------------------------------------------------------------------------- 99th percentile of fsync is 212860928 ns 99th percentile of the fsync is greater than the recommended value which is 10 ms, faster disks are recommended to host etcd for better performance [core@localhost ~]$
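For reference, the 99th percentile fsync latency reported above is 212860928 ns, i.e. roughly 213 ms, about 21x the 10 ms the check recommends for etcd. If the etcd-perf container is not handy, a roughly equivalent standalone run can be reconstructed from the "global options" block in the JSON above; whether this matches the container's job file exactly is an assumption, and the job name is just a placeholder:

# same workload the container reports: 2300-byte sequential writes with an fdatasync after every write
fio --name=etcd_perf --rw=write --ioengine=sync --fdatasync=1 \
    --directory=/var/lib/etcd --size=22m --bs=2300 --output-format=json
# the number to watch is sync.lat_ns.percentile["99.000000"], which should stay under ~10,000,000 ns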

Tuning Gluster with VDO below it is quite difficult, and the overhead of using VDO could reduce performance. I would try with VDO compression and dedup disabled. If your SSD has a 512-byte physical & logical sector size, you can skip VDO altogether to check performance. Also, FS mount options are very important for XFS.

Best Regards,
Strahil Nikolov
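With the classic vdo manager used on EL8 hosts, compression and dedup can be toggled on an existing volume without recreating it. A minimal sketch, assuming a volume name of vdo_sdb (the oVirt HCI wizard will have created its own names):

# check what is currently enabled for the VDO volume behind the brick
vdo status --name=vdo_sdb | grep -iE 'compression|deduplication'
# turn both features off on the running volume
vdo disableCompression --name=vdo_sdb
vdo disableDeduplication --name=vdo_sdb
# compare space/overhead figures before and after a test run
vdostats --human-readable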

"...Tuning Gluster with VDO bellow is quite difficult and the overhead of using VDO could reduce performance ...." Yup. hense creation of a dedicated data00 volume from the 1TB SSD each server had. Matched options listed in oVirt.. but still OCP would not address the drive as target for deployment. That is when I opened ticket with RH and they noted Gluster is not a supported target for OCP. Hense then off to check if we could do CEPH HCI.. nope. "..I would try with VDO compression and dedup disabled.If your SSD has 512 byte physical..& logical size, you can skip VDO at all to check performance....." Yes.. VDO removed was/ is next test. But your note about 512 is yes.. Are their tuning parameters for Gluster with this? "...Also FS mount options are very important for XFS...." - What options do you use / recommend? Do you have a link to said tuning manual page where I could review and knowing the base HCI volume is VDO + XFS + Gluster. But second volume for OCP will be just XFS + Gluster I would assume this may change recommendations. Thanks,

Due to POSIX compliance, oVirt needs a 512-byte physical sector size. If your SSD/NVMe has the new standard (4096), you will need to use VDO with the '--emulate512' flag (or whatever it was named). Yet, if you already have a 512-byte physical sector size, you can skip VDO totally.

About the mount options of the bricks: you can use noatime & inode64. Also, if you use SELinux, use the 'context=' mount option to tell the kernel "skip looking for the SELinux label, it's always...".

Also, consider setting the SSD I/O scheduler to none (multiqueue should be enabled on EL8 hypervisors by default), which will reduce reordering of your I/O requests and speed things up on fast storage. NVMe devices use that by default.

Best Regards,
Strahil Nikolov
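A minimal sketch of how those suggestions translate to configuration, assuming /dev/sdb is the brick SSD, vdo_sdb is the VDO volume name and glusterd_brick_t is the correct SELinux label for a Gluster brick; verify the names and paths against your own setup, since oVirt's HCI wizard normally generates its own brick layout and mount entries:

# 512-byte emulation when creating VDO on a 4096-native drive
vdo create --name=vdo_sdb --device=/dev/sdb --emulate512=enabled

# example /etc/fstab entry for the XFS brick with the mount options mentioned above
# /dev/mapper/vdo_sdb  /gluster_bricks/data00  xfs  noatime,inode64,context="system_u:object_r:glusterd_brick_t:s0"  0 0

# switch the SSD scheduler to none (blk-mq); persist via a udev rule or tuned profile
echo none > /sys/block/sdb/queue/scheduler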
participants (4)
- Gianluca Cecchi
- Jayme
- penguin pages
- Strahil Nikolov