[ovirt-users] Multi-node cluster with local storage
Pavel Gashev
Pax at acronis.com
Fri Mar 4 10:43:42 UTC 2016
On 04/03/16 12:22, "Sahina Bose" <sabose at redhat.com> wrote:
>
>On 03/04/2016 02:14 AM, Pavel Gashev wrote:
>>
>> Unfortunately, oVirt doesn't support multi-node local storage clusters.
>> And Gluster/CEPH doesn't work well over 1G network. It looks like that
>> the only way to use oVirt in a three-node cluster is to share local
>> storages over NFS. At least it makes possible to migrate VMs and move
>> disks among hardware nodes.
>
>
>Do you know of reported problems with Gluster over 1Gb network? I think
>10Gb is recommended, but 1Gb can also be used for gluster.
>(We use it in our lab setup, and haven't encountered any issues so far
>but of course, the workload may be different - hence the question)
Let's calculate. If I have a three-node replicated gluster volume, each block written on a node is copied to the other two nodes, so both copies share that node's 1Gb link. Thus, maximum write performance can't be above 50MB/s.
Even if that's acceptable for my workload, things get worse in a failure recovery scenario. Gluster works with files. When a node fails and then recovers (even if it's just a plain reboot), gluster copies the whole file over the network if the file was changed during the node outage. So if I have a 100GB VM disk, and the guest system has written a single 512-byte block to the disk, the whole 100GB will be copied during recovery. It might take 20 minutes for 100GB, and 3 hours for 1TB. And the network will be 100% busy during recovery, so VMs on other nodes will wait for I/O most of the time. In other words, a plain reboot of a node would put the datacenter out of service for several hours.
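The arithmetic behind these figures can be sketched in a few lines of Python. This is only a rough estimate, not a measurement: the 100 MB/s usable throughput per 1Gb link is my assumption, and the function names are made up for illustration.

```python
# Back-of-the-envelope figures for a three-node replicated volume on 1Gb links.
# ASSUMPTION: about 100 MB/s of usable throughput per 1Gb link
# (the raw line rate is 125 MB/s; the rest is taken as protocol overhead).
USABLE_LINK_MB_S = 100.0


def max_replicated_write_mb_s(link_mb_s, remote_copies):
    """Upper bound on write speed when every block must also be sent
    to `remote_copies` other nodes over the same link."""
    return link_mb_s / remote_copies


def full_copy_minutes(size_gb, throughput_mb_s):
    """Minutes needed to copy a whole file of `size_gb` (decimal GB)
    at a sustained `throughput_mb_s`."""
    return size_gb * 1000.0 / throughput_mb_s / 60.0


if __name__ == "__main__":
    # Three nodes: each write goes to the two other nodes.
    print("max write MB/s:", max_replicated_write_mb_s(USABLE_LINK_MB_S, 2))
    # Recovery copies the whole file, however small the change was.
    print("100GB, minutes:", round(full_copy_minutes(100, USABLE_LINK_MB_S), 1))
    print("1TB, hours:", round(full_copy_minutes(1000, USABLE_LINK_MB_S) / 60, 1))
```

Under that assumption the write ceiling comes out at 50 MB/s, a 100GB disk takes roughly 17 minutes to copy and a 1TB disk just under 3 hours, and that is with the link fully saturated by recovery traffic.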
Things might be better if you have a distributed+replicated gluster volume. It requires at least six nodes. But things are still bad when you try to rebalance the volume after adding new bricks, or when a node has really failed and been replaced.
Thus, a 1Gb network is OK for a lab, but it's not OK for production. IMHO.