Gluster 5.x does have two important performance-related fixes that are not part of 3.12.x -
i.  in shard-replicate interaction - https://bugzilla.redhat.com/show_bug.cgi?id=1635972
ii. in qemu-gluster-fuse interaction - https://bugzilla.redhat.com/show_bug.cgi?id=1635980

The two fixes do improve write performance in VM-storage workloads. Do let us know about your experience if you happen to move to gluster-5.x.

-Krutika



On Thu, Mar 28, 2019 at 1:13 PM Strahil <hunter86_bg@yahoo.com> wrote:

Hi Krutika,

I have noticed some performance penalties (10%-15%) when using sharding in v3.12.
What is the situation now with 5.5?
Best Regards,
Strahil Nikolov

On Mar 28, 2019 08:56, Krutika Dhananjay <kdhananj@redhat.com> wrote:
Right. So Gluster stores what are called "indices" for each modified file (or shard)
under a special hidden directory of the "good" bricks at $BRICK_PATH/.glusterfs/indices/xattrop.
When the offline brick comes back up, the file corresponding to each index is healed, and then the index is deleted
to mark the fact that the file has been healed.

You can try this and see it for yourself. Just create a 1x3 plain replicate volume, and enable shard on it.
Create a big file (big enough to have multiple shards). Check that the shards are created under $BRICK_PATH/.shard.
Now kill a brick. Modify a small portion of the file. Run `ls` on $BRICK_PATH/.glusterfs/indices/xattrop of the online bricks.
You'll notice entries named after the gfid (the unique identifier in gluster for each file) of the shards,
and only for those shards that the write modified, not for ALL shards of this really big file.
Then, when you bring the brick back up using `gluster volume start $VOL force`, the
shards get healed and the directory eventually becomes empty.
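
If you would rather script the check than eyeball the `ls` output, here is a minimal Python sketch that lists the pending index entries on one brick. The helper name `pending_heal_gfids` is hypothetical, not part of gluster. It also assumes that the directory contains a base file whose name starts with `xattrop-` (the gfid-named entries being hard links to it), so that name is skipped. Treat that as an assumption and verify it on your own bricks.

```python
import os
import sys


def pending_heal_gfids(brick_path):
    """Return the sorted entry names under <brick>/.glusterfs/indices/xattrop.

    Assumption: a base file named "xattrop-<something>" may be present and
    is not itself a pending entry, so it is filtered out.
    """
    index_dir = os.path.join(brick_path, ".glusterfs", "indices", "xattrop")
    if not os.path.isdir(index_dir):
        # No index directory at this path: nothing to report.
        return []
    return sorted(
        name for name in os.listdir(index_dir)
        if not name.startswith("xattrop-")
    )


if __name__ == "__main__":
    # Hypothetical usage: python3 pending_heal.py /path/to/brick
    brick = sys.argv[1] if len(sys.argv) > 1 else "."
    for gfid in pending_heal_gfids(brick):
        print(gfid)
```

Run it on an online brick (reading the brick directory usually needs root) while the other brick is down, and again after the heal. The second run should print nothing once the directory has emptied out.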

-Krutika


On Thu, Mar 28, 2019 at 12:14 PM Indivar Nair <indivar.nair@techterra.in> wrote:
Hi Krutika,

So how does the Gluster node know which shards were modified after it went down?
Do the other Gluster nodes keep track of it?

Regards,


Indivar Nair


On Thu, Mar 28, 2019 at 9:45 AM Krutika Dhananjay <kdhananj@redhat.com> wrote:
Each shard is a separate file of size equal to the value of "features.shard-block-size".
So if a brick/node goes down, only those shards belonging to the VM that were modified in the meantime will be sync'd later when the brick's back up.
Does that answer your question?
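
To put rough numbers on this, here is a small Python sketch of the arithmetic. The function names are hypothetical, and the 64 MiB block size is only an assumed value for illustration; check the real one with `gluster volume get $VOL features.shard-block-size`.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB


def total_shards(file_size, shard_block_size=64 * MIB):
    """Number of shard-sized blocks needed to hold file_size bytes."""
    return -(-file_size // shard_block_size)  # ceiling division


def shards_touched(offset, length, shard_block_size=64 * MIB):
    """Indices of the blocks that a write of `length` bytes at `offset` overlaps."""
    if length <= 0:
        return []
    first = offset // shard_block_size
    last = (offset + length - 1) // shard_block_size
    return list(range(first, last + 1))


if __name__ == "__main__":
    # A 500 GiB disk image split into 64 MiB blocks (assumed size).
    print(total_shards(500 * GIB))        # 8000
    # A 4 KiB write at the 1 GiB mark lands in a single block.
    print(shards_touched(1 * GIB, 4096))  # [16]
```

So with these assumed numbers, a small write made while a brick is down marks one block out of 8000 for heal, rather than the whole 500 GiB image.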

-Krutika

On Wed, Mar 27, 2019 at 7:48 PM Sahina Bose <sabose@redhat.com> wrote:
On Wed, Mar 27, 2019 at 7:40 PM Indivar Nair <indivar.nair@techterra.in> wrote:
>
> Hi Strahil,
>
> Ok. Looks like sharding should make the resyncs faster.
>
> I searched for more info on it, but couldn't find much.
> I believe it will still have to compare each shard to determine whether there are any changes that need to be replicated.
> Am I right?

+Krutika Dhananjay
>
> Regards,
>
> Indivar Nair
>
>
>
> On Wed, Mar 27, 2019 at 4:34 PM Strahil <hunter86_bg@yahoo.com> wrote:
>>
>> By default oVirt uses 'sharding', which splits the files into logical chunks. This greatly reduces healing time, as a VM's disk is not always completely overwritten and only the shards that are different will be healed.
>>
>> Maybe you should change the default shard size.
>>
>> Best Regards,
>> Strahil Nikolov
>>
>> On Mar 27, 2019 08:24, Indivar Nair <indivar.nair@techterra.in> wrote:
>>
>> Hi All,
>>
>> We are planning a 2 + 1 arbitrated mirrored Gluster setup.
>> We would have around 50 - 60 VMs, with an average 500GB disk size.
>>
>> Now in case one of the Gluster nodes goes completely out of sync, roughly how long would it take to resync (in your experience)?
>> Will it impact the working of VMs in any way?
>> Is there anything to be taken care of, in advance, to prepare for such a situation?
>>
>> Regards,
>>
>>
>> Indivar Nair
>>