
Hey folks,

in our production setup with 3 nodes (HCI) we took one host down for maintenance (stopped gluster, powered it off via ssh/oVirt engine). Once it was back up, gluster had about 2k healing entries, which went down to 2 in a matter of 10 minutes. Those two give me a headache:

[root@node03:~] # gluster vol heal ssd_storage info
Brick node01:/gluster_bricks/ssd_storage/ssd_storage
<gfid:a121e4fb-0984-4e41-94d7-8f0c4f87f4b6>
<gfid:6f8817dc-3d92-46bf-aa65-a5d23f97490e>
Status: Connected
Number of entries: 2

Brick node02:/gluster_bricks/ssd_storage/ssd_storage
Status: Connected
Number of entries: 0

Brick node03:/gluster_bricks/ssd_storage/ssd_storage
<gfid:a121e4fb-0984-4e41-94d7-8f0c4f87f4b6>
<gfid:6f8817dc-3d92-46bf-aa65-a5d23f97490e>
Status: Connected
Number of entries: 2

No paths, only gfids. We took down node02, so it does not have the file:

[root@node01:~] # md5sum /gluster_bricks/ssd_storage/ssd_storage/.glusterfs/a1/21/a121e4fb-0984-4e41-94d7-8f0c4f87f4b6
75c4941683b7eabc223fc9d5f022a77c  /gluster_bricks/ssd_storage/ssd_storage/.glusterfs/a1/21/a121e4fb-0984-4e41-94d7-8f0c4f87f4b6

[root@node02:~] # md5sum /gluster_bricks/ssd_storage/ssd_storage/.glusterfs/a1/21/a121e4fb-0984-4e41-94d7-8f0c4f87f4b6
md5sum: /gluster_bricks/ssd_storage/ssd_storage/.glusterfs/a1/21/a121e4fb-0984-4e41-94d7-8f0c4f87f4b6: No such file or directory

[root@node03:~] # md5sum /gluster_bricks/ssd_storage/ssd_storage/.glusterfs/a1/21/a121e4fb-0984-4e41-94d7-8f0c4f87f4b6
75c4941683b7eabc223fc9d5f022a77c  /gluster_bricks/ssd_storage/ssd_storage/.glusterfs/a1/21/a121e4fb-0984-4e41-94d7-8f0c4f87f4b6

The other two copies (node01 and node03) are md5-identical. Their xattrs are identical, too (node01 first, then node03):

[root@node01:~] # getfattr -d -m . -e hex /gluster_bricks/ssd_storage/ssd_storage/.glusterfs/a1/21/a121e4fb-0984-4e41-94d7-8f0c4f87f4b6
getfattr: Removing leading '/' from absolute path names
# file: gluster_bricks/ssd_storage/ssd_storage/.glusterfs/a1/21/a121e4fb-0984-4e41-94d7-8f0c4f87f4b6
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.ssd_storage-client-1=0x0000004f0000000100000000
trusted.gfid=0xa121e4fb09844e4194d78f0c4f87f4b6
trusted.gfid2path.d4cf876a215b173f=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f38366461303238392d663734662d343230302d393238342d3637386537626437363139352e31323030
trusted.glusterfs.mdata=0x010000000000000000000000005e349b1e000000001139aa2a000000005e349b1e000000001139aa2a000000005e34994900000000304a5eb2

getfattr: Removing leading '/' from absolute path names
# file: gluster_bricks/ssd_storage/ssd_storage/.glusterfs/a1/21/a121e4fb-0984-4e41-94d7-8f0c4f87f4b6
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.ssd_storage-client-1=0x0000004f0000000100000000
trusted.gfid=0xa121e4fb09844e4194d78f0c4f87f4b6
trusted.gfid2path.d4cf876a215b173f=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f38366461303238392d663734662d343230302d393238342d3637386537626437363139352e31323030
trusted.glusterfs.mdata=0x010000000000000000000000005e349b1e000000001139aa2a000000005e349b1e000000001139aa2a000000005e34994900000000304a5eb2

Now, I don't dare simply proceed without some advice. Anyone got a clue on how to resolve this issue? File #2 is identical to this one, from a problem point of view.

Have a great weekend!
-Chris.

--
with kind regards,
mit freundlichen Gruessen,

Christian Reiss
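
PS: In case it helps with mapping these gfids back to real paths, here is a sketch I would try (untested on this volume; it assumes the gfid entry under .glusterfs is a hardlink to the actual file, which holds for regular files but not for directories):

  # Find the real brick path that shares an inode with the gfid file.
  BRICK=/gluster_bricks/ssd_storage/ssd_storage
  GFID=a121e4fb-0984-4e41-94d7-8f0c4f87f4b6
  # .glusterfs nests the entry under the first two byte-pairs of the gfid.
  find "$BRICK" -samefile "$BRICK/.glusterfs/${GFID:0:2}/${GFID:2:2}/$GFID" \
       -not -path "*/.glusterfs/*"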
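
The trusted.gfid2path xattr above should also already contain "<parent-dir-gfid>/<basename>" as plain ASCII, so dumping the value as text instead of hex shows the name directly:

  getfattr --only-values -n trusted.gfid2path.d4cf876a215b173f \
      /gluster_bricks/ssd_storage/ssd_storage/.glusterfs/a1/21/a121e4fb-0984-4e41-94d7-8f0c4f87f4b6

If I decoded the hex above correctly, that yields:

  be318638-e8a0-4c6d-977d-7a937aa84806/86da0289-f74f-4200-9284-678e7bd76195.1200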
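
And if I read the trusted.afr changelog xattr right (three big-endian 32-bit counters: data, metadata, entry), both surviving copies blame ssd_storage-client-1, i.e. node02's brick:

  # Split the afr value into its three counters (my reading of the layout).
  V=0000004f0000000100000000
  echo "data=$((16#${V:0:8})) metadata=$((16#${V:8:8})) entry=$((16#${V:16:8}))"
  # -> data=79 metadata=1 entry=0

So 79 pending data ops and 1 metadata op against node02 and no entry ops, which would at least match the file simply being absent there. Corrections welcome if I am misreading that.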