[ovirt-users] Re: How do I share a disk across multiple VMs?

Friday, 23 April 2021

Hi Strahil,

I've tried to measure the cost of erasure coding and, more importantly, VDO with
de-duplication and compression a bit.

Erasure coding should be negligible in terms of CPU power, while the vastly more complex LZ4
compression (used inside VDO) really is rather impressive at 1GByte/s single-threaded for
compression (6GByte/s decompression, on a 25GByte/s memory bus) on the 15Watt NUCs I am
using for one cluster.

The storage I/O overhead of erasure coding shouldn't really matter with NVMe becoming
cheaper than SATA SSD. Perhaps the write amplification needs to be watched with SSDs, but
a lot of that is writeback tuning and with a Gluster in the back, you can commit to RAM as
long as you have a quorum (and a UPS).

With Gluster I guess most of the erasure coding would actually be done by the
client, and the network amplification would also be there, but not really different between
erasure coding and replicas: if you write to nine nodes, you write to nine nodes from the
client, independent of the encoding.
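
A rough back-of-the-envelope sketch of the traffic leaving the client, under an
idealized model: a replica volume sends one full copy to each replica brick, while a
dispersed volume sends one fragment of size/k to each of its k+m bricks. The function
names are made up for illustration, and metadata and protocol overhead are ignored.
The client fans out to every brick in the set either way; what changes with the
encoding is the number of bytes per brick.

```python
def replica_bytes_sent(payload_bytes: int, replicas: int) -> int:
    """Bytes leaving the client when each replica brick gets a full copy."""
    return payload_bytes * replicas


def dispersed_bytes_sent(payload_bytes: int, data: int, redundancy: int) -> float:
    """Bytes leaving the client when the payload is split into `data` fragments
    and `redundancy` extra fragments of the same size are added."""
    fragment = payload_bytes / data
    return fragment * (data + redundancy)


if __name__ == "__main__":
    payload = 1_000_000  # hypothetical 1 MB write
    print("replica 3 :", replica_bytes_sent(payload, 3))
    print("4+2 disp. :", dispersed_bytes_sent(payload, 4, 2))
```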

There the ability to say "please continue to use the 4:2 dispersion as I expand from
6 to 9 nodes and roll that across on a shard-by-shard basis without me having to set up
bricks like that" would certainly help.

With all of VDO enabled I get 200MByte/s for a random data workload on FIO via Gluster,
which becomes 600MByte/s for reads with 3 replicas on the 10Gbit network I use, 60% of the
theoretical maximum with random I/O.

That's completely adequate, because we're not running HPC or SAP batches here and
I'd be rather sure that using erasure coding with 6 and 9 nodes won't introduce a
performance bottleneck, unless I go to 40 or 100GBit on the network.

I'd just really want to be able to choose between, say, 1, 2 or 3 out of 9 bricks being
used for redundancy, depending on whether it's an HCI block next door, going onto a ship
with months at sea or into a space station.
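
To put numbers on that trade-off, here is a minimal sketch of what each choice would
cost in usable capacity, assuming a plain k+m dispersed layout across 9 equally sized
bricks. The helper name `disperse_layout` is hypothetical, and the check that the brick
count must exceed twice the redundancy mirrors what I understand Gluster's rule for
dispersed volumes to be, so treat it as an assumption.

```python
def disperse_layout(bricks: int, redundancy: int) -> dict:
    """Describe a k+m layout: how many bricks hold data, how many may fail,
    and which fraction of the raw capacity stays usable."""
    if redundancy <= 0 or bricks <= 2 * redundancy:
        raise ValueError("need 0 < redundancy and bricks > 2 * redundancy")
    data = bricks - redundancy
    return {
        "data": data,
        "redundancy": redundancy,
        "tolerated_failures": redundancy,
        "usable_fraction": data / bricks,
    }


if __name__ == "__main__":
    for m in (1, 2, 3):
        layout = disperse_layout(9, m)
        print(f"{layout['data']}+{m}: survives {layout['tolerated_failures']} "
              f"brick failure(s), {layout['usable_fraction']:.0%} usable")
```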

I'd also probably add an extra node or two to act as warm (even cold) standby in
critical or hard-to-reach locations, that act as compute-only nodes initially (to avoid
split quotas), but can be promoted to replace a storage node that failed without hands-on
intervention.

oVirt HCI is as close as it gets to LEGO computers, but right now it's doing LEGO with
your hands tied behind your back.

Kind regards, Thomas
