
Hello everyone, I'm new to oVirt. I want to set up a cost-efficient and fault-tolerant virtualization environment. Our VMs have low to medium IOPS: about 70% of them have low IOPS and the other 30% need around 5K IOPS. All the disks are NVMe. I found that the most cost-efficient storage layout in current oVirt is a two-way replica with an arbiter, and even that has more than 100% storage overhead. As far as I know, erasure coding could achieve the same fault tolerance with less overhead. Is there any plan to support erasure coding in oVirt? And would you even recommend erasure coding for VM workloads? Best regards, Ali
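For a rough sense of the numbers being weighed here, a back-of-envelope comparison of raw-capacity overhead per usable byte; the replica 3 and disperse 4+2 geometries below are only illustrative examples, not layouts the question or oVirt prescribes:

```python
# Back-of-envelope storage overhead for a few Gluster volume layouts.
# "Overhead" = extra raw capacity consumed per unit of usable data.
# The geometries are illustrative assumptions, not a recommendation.

def overhead_pct(raw_per_usable):
    """Extra raw bytes stored per usable byte, as a percentage."""
    return (raw_per_usable - 1) * 100

layouts = {
    # replica 3: every byte written to 3 bricks
    "replica 3": 3.0,
    # replica 2 + arbiter: 2 full copies, arbiter holds metadata only
    # (so slightly above 2x in practice, as the question notes)
    "replica 2 + arbiter": 2.0,
    # disperse 4+2 (erasure coded): 4 data + 2 redundancy fragments
    "disperse 4+2": 6 / 4,
}

for name, factor in layouts.items():
    print(f"{name:22s} overhead ~{overhead_pct(factor):.0f}%")
# replica 3              overhead ~200%
# replica 2 + arbiter    overhead ~100%
# disperse 4+2           overhead ~50%
```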

Erasure coding is highly CPU intensive and usually it's not recommended for VMs. You can create your own gluster volume [1] and then test your workload.

[1] https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.5/ht...

Best Regards,
Strahil Nikolov
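A minimal sketch of what "create your own gluster volume and test" could look like for a dispersed (erasure-coded) test volume. The hostnames, brick paths and the 4+2 geometry are assumptions for illustration; adjust everything to your own environment:

```python
# Create a dispersed Gluster test volume via the gluster CLI (run as root
# on a node that is already part of the trusted storage pool).
import subprocess

HOSTS = [f"gluster{i}.example.com" for i in range(1, 7)]   # hypothetical 6 nodes
BRICKS = [f"{h}:/gluster/bricks/disptest/brick" for h in HOSTS]

def run(cmd):
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# disperse 6 = 6 bricks per subvolume, redundancy 2 = survives 2 brick failures
# (i.e. 4 data + 2 redundancy fragments, roughly 50% overhead)
run(["gluster", "volume", "create", "disptest",
     "disperse", "6", "redundancy", "2", *BRICKS])
run(["gluster", "volume", "start", "disptest"])
run(["gluster", "volume", "info", "disptest"])
```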

Thanks a lot. Another reason, alongside the storage overhead, that led me to erasure coding is the idle CPU cycles on our servers: we currently have plenty of spare CPU capacity. Do you think erasure coding is still a bad choice even if we have enough idle CPUs?

Sadly, it's always "it depends". You definitely have to test with your workload.

One setup I can see: create a small replica volume for the OS disks of the VMs and a second one (a disperse volume) for the data disks. Then test and consider whether it's enough for you.

Best Regards,
Strahil Nikolov
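A rough way to run that "then test" step against the two volumes, assuming fio is installed and the replica and disperse volumes are already mounted; the mount points are hypothetical and the ~5K IOPS target comes from the original question:

```python
# Run the same fio profile against a file on each mounted volume and compare.
import json
import subprocess

def fio_randwrite_iops(testfile, runtime=60):
    """4k random writes, queue depth 32 - a rough stand-in for a busy VM disk."""
    out = subprocess.run(
        ["fio", "--name=vmtest", f"--filename={testfile}",
         "--rw=randwrite", "--bs=4k", "--iodepth=32", "--ioengine=libaio",
         "--direct=1", "--size=2G", f"--runtime={runtime}", "--time_based",
         "--output-format=json"],
        check=True, capture_output=True, text=True)
    job = json.loads(out.stdout)["jobs"][0]
    return job["write"]["iops"]

tests = [("replica (OS) volume", "/mnt/replica_test/fio.bin"),
         ("disperse (data) volume", "/mnt/disperse_test/fio.bin")]
for label, path in tests:
    iops = fio_randwrite_iops(path)
    print(f"{label}: ~{iops:.0f} write IOPS (target for the busy 30% of VMs: ~5000)")
```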

It was this near endless range of possibilities via permutation of the parts that originally attracted me to oVirt. Being clearly a member of the original Lego generation, I imagined how you could simply add blocks of this and that to rebuild into something new and fantastic... limitless Gluster scaling and HCI, VDO dedup/compression, SSD caching, nested virtualization, geo-replication, OVA support: it just had everything I might want! The truth is rather ugly, from what I have gathered around here during almost three years.

You don't mention it explicitly, but you are evidently talking about an HCI setup, preferably installed, expanded and managed by the nice Cockpit/Engine GUIs. What I learned is that Gluster-based HCI receives very little love from the oVirt developers, even if it seems to you and me (Lego people?) the most attractive option. My impression is that they tend to work more with the original non-HCI approach based on SAN or NFS storage, even if the engine may now be a VM by default, where it was originally on separate servers. Gluster, VDO, Ansible and the Java engine are all acquisitions, and HCI was more of a management decision to counter Nutanix; I find that following the evolution from Moshe Bar's Mosix to Qumranet and the Permabit, Ansible and Gluster acquisitions, as well as the competitor products, helps to understand why things are as they are.

The current oVirt code base supports single-node HCI, which delivers no benefits beyond testing. It also supports 3-node HCI, which kind of works with the Cockpit wizard (I have never succeeded with an installation where I didn't have to do some fiddling). Beyond that you already branch out into the jungle of not-tested, not-supported, even if I remember reading that 6-node and 9-node HCI seem possible. But quorums with 6 nodes don't seem natural, and certainly nobody would want to use replicas in a 9-node HCI setup, right?

I've seen actual "not supported" comments in the oVirt Ansible code that stop any installation using dispersed volumes, so there is your immediate answer. I've tricked it past that point by editing the Ansible scripts and actually got oVirt running with dispersed (erasure-coded) volumes on 5 nodes, but it felt too wobbly for real use. Instead I've added the CPU/RAM parts of those 5 nodes to a 3-node HCI setup and then used the disks to create an erasure-coded Gluster volume, which can then be used by VMs via Gluster or NFS, pretty much like a filer. I can't recommend that unless you're willing to pay the price of being on your own.

One major issue here is that you can't just create glusters and then merge them, as any system can only ever be a member of one gluster, and you have to destroy volumes before moving to another gluster: I'm not sure how that rhymes with bottleneck-free scalability. And then there are various parts in oVirt which look for quorums and find them missing, even when nodes aren't actually contributing bricks to a volume (compute-only hosts or non-HCI volumes).

I guess it's safe to say that everybody here would like to see you trying and feeding the changes required to make it work back into the project... but it's not "supported out of the box".

P.S. When I'm offered erasure coding, VDO dedup/compression and thin allocation, I naturally tend to tick all the boxes (originally I also chose SSD cache, but quickly changed to SSD-only storage). It's only later that they mention somewhere that you aren't supposed to use them in combination, without a full explanation or analysis to follow.
Because even at today's SSD storage pricing I need all the tricks in the book, I stayed with them all, but Gluster doesn't seem to be a speed devil, no matter what you put on top. And to think that there used to be an RDMA/InfiniBand option that later disappeared with little comment... VMs in a 9-node HCI replica gluster could only ever get ~1 Gbit/s throughput on a 10 Gbit/s network, so erasure coding should be the smart choice...
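For completeness, a small sketch of the "use the erasure-coded gluster like a filer" consumption pattern described above, mounting the dispersed volume from inside a VM; hostnames, volume name, mount points and options are all assumptions:

```python
# Mount a dispersed Gluster volume inside a VM, either via the Gluster FUSE
# client or plain NFS (run as root; package names and exports not covered here).
import subprocess

def mount(src, dst, fstype, opts=None):
    cmd = ["mount", "-t", fstype]
    if opts:
        cmd += ["-o", opts]
    cmd += [src, dst]
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# Gluster FUSE client (needs glusterfs-fuse installed in the VM); the
# backup-volfile-servers option lets the mount survive one server being down.
mount("storage1.example.com:/dispersed", "/mnt/data", "glusterfs",
      "backup-volfile-servers=storage2.example.com:storage3.example.com")

# Or over NFS, if the volume is exported (e.g. via NFS-Ganesha):
# mount("storage1.example.com:/dispersed", "/mnt/data", "nfs", "vers=4.1")
```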
participants (3):
- alishirv@pm.me
- Strahil Nikolov
- Thomas Hoberg