VMs paused - unknown storage error - Stale file handle - distribute 2 - replica 3 volume with sharding

Hi,

I opened a bug on Gluster because I am getting read errors on files on a Gluster volume: https://bugzilla.redhat.com/show_bug.cgi?id=1652548

The affected files are many of the VM images in the oVirt DATA storage domain. oVirt pauses the VMs because of unknown storage errors. It is impossible to copy or clone these VMs, or to manage some of their snapshots. At the low level, the errors are "stale file handle". The volume is distribute 2, replicate 3, with sharding.

Should I also open a bug on oVirt?

Gluster 3.12.15-1.el7
oVirt 4.2.6.4-1.el7

Regards,
-- Marco Crociani

On Thu, Nov 22, 2018 at 5:51 PM Marco Lorenzo Crociani <marcoc@prismatelecomtesting.com> wrote:
> Should I also open a bug on oVirt?

Thanks, this bug should do. I've requested more information on the bug; could you update it?

Hi,

is there a way to recover files from "Stale file handle" errors?

Here are some of the tests we have done:
- Compared the extended attributes of all three replicas of the affected shard. The attributes were identical.
- Compared the SHA-512 message digests of all three replicas of the affected shard. The digests were identical.
- Tried deleting the shard from a replica set, one copy at a time, along with its hard link. The shard is always rebuilt correctly, but the error on the client persists.

Regards,
-- Marco Crociani
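For anyone who wants to repeat the first two checks (extended attributes and SHA-512 digests of each copy of a shard), here is a minimal sketch, not the procedure actually used in this thread. It assumes Linux, Python 3, and that it is run as root on each brick host, since attributes in the `trusted.*` namespace are only visible to privileged processes. The function names `fingerprint` and `all_identical` are made up for illustration, and the shard path on each brick has to be supplied by you.

```python
import hashlib
import os
from typing import Dict, List


def sha512_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-512 digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fingerprint(path: str, with_xattrs: bool = True) -> Dict[str, str]:
    """Summarise one copy of a shard as a flat dict of strings.

    The dict holds the SHA-512 digest and, optionally, every extended
    attribute (hex-encoded) that the calling process is allowed to see.
    """
    result = {"sha512": sha512_of(path)}
    if with_xattrs:
        # os.listxattr / os.getxattr are available on Linux only.
        for name in sorted(os.listxattr(path)):
            result["xattr:" + name] = os.getxattr(path, name).hex()
    return result


def all_identical(fingerprints: List[Dict[str, str]]) -> bool:
    """True if every fingerprint in the list equals the first one."""
    return all(fp == fingerprints[0] for fp in fingerprints[1:])
```

Since the three copies live on different hosts, one way to use this is to call `fingerprint()` on the shard's brick path on each host, collect the three dicts (for example as JSON), and pass them to `all_identical()` on one machine. As reported above, identical fingerprints do not rule out the client-side "stale file handle" error.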

Hi Marco,

It looks like I'm suffering from the same issue, see: https://lists.gluster.org/pipermail/gluster-users/2019-January/035602.html

I've included a simple GitHub gist there, which you can run on the machines with the stale shards. However, I haven't tested the full purge; it works well on individual files/shards.

Best,
Olaf
participants (3)
- Marco Lorenzo Crociani
- olaf.buitelaar@gmail.com
- Sahina Bose