I would still recommend sharding. Imagine that you have a 2 TB disk for a VM and one of the oVirt hosts needs maintenance.
When Gluster has to heal that 2 TB file, your VM won't be able to access it for a very long time and will fail. Sharding is important for maintenance without downtime.
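
To put rough numbers on this, here is a minimal Python sketch. It assumes a 64 MiB shard size (check your volume's features.shard-block-size) and a simplified model in which only the shards written to while the host was down need healing. The write offsets are hypothetical, and the real self-heal algorithm may transfer less data than this.

```python
TIB = 1024 ** 4
MIB = 1024 ** 2


def shard_count(image_bytes: int, shard_bytes: int) -> int:
    """Number of fixed-size pieces the image is split into (ceiling division)."""
    return -(-image_bytes // shard_bytes)


def heal_bytes_sharded(dirty_offsets, shard_bytes: int) -> int:
    """Upper bound on bytes to heal when only shards touched by writes are dirty."""
    touched = {offset // shard_bytes for offset in dirty_offsets}
    return len(touched) * shard_bytes


image = 2 * TIB          # the 2 TB VM disk from the example above
shard = 64 * MIB         # ASSUMPTION: shard size, not taken from this thread
# HYPOTHETICAL byte offsets written by the VM while one host was in maintenance
writes = [0, 10 * MIB, 70 * MIB, 500 * 1024 * MIB]

print("shards in image:", shard_count(image, shard))
print("MiB to heal with sharding:", heal_bytes_sharded(writes, shard) // MIB)
print("MiB in the single unsharded file:", image // MIB)
```

In this model the heal works on a handful of 64 MiB shards instead of one 2 TB file, which is why the maintenance window is so much less risky for the VM.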


I do have one question, though.
Did you use preallocated VM disks, or thin provisioning for the qcow2?

Best Regards,
Strahil Nikolov

Description of problem:

Intermittent VM pauses and qcow image corruption after adding new bricks.

I suffered image corruption and intermittent VM pauses on oVirt 4.3, caused by the default Gluster oVirt profile. The problem is similar to glusterfs issues #2246 and #2254 and to the VM pause issue reported in the oVirt users group. The Gluster volume had no pending heal entries and appeared to be in good shape, XFS was healthy, and there were no hardware issues. Sadly, a few VMs suffered mysterious corruption after the new bricks were added.

Afterwards, I tried to simulate the problem a few times, with and without "cluster.lookup-optimize off". The problem is not 100% reproducible with lookup-optimize on: I was able to reproduce it in 1 of 3 attempts. It really depends on the workload and cache state at that moment, and also on the number of objects after rebalance.

I also tried disabling all sharding features. The volume then ran very solidly: write performance increased considerably, and there was no corruption and no VM pause while Gluster was under stress.

So the question is whether to shard or not.

The recommendation document says sharding breaks a large file into smaller chunks, which lets healing complete faster and lets a large file spread over multiple bricks. In my opinion, though, sharding in this case has issues that a single large file does not. I'd like to dig deeper into why sharding is recommended as the default for oVirt. From the reliability and performance perspective in particular, sharding seems to lose out for oVirt/KVM workloads. Would it be more appropriate to simply tell oVirt users to ensure that each underlying brick is large enough to hold the largest file? Also, is there anything I have overlooked in the shard settings? After this disaster, I am very hesitant to enable sharding on the volume.
_______________________________________________
Users mailing list -- users@ovirt.org
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/LFG6KMP7SQV6W3DEQ4AEFD5K2VX7L7AA/