Could you check whether the self-heal daemon on all nodes is connected to the 3 bricks? You will need to check glustershd.log for that. If it is not connected, try restarting the shd using `gluster volume start engine force`, then launch the heal command like you did earlier and see if heals happen.
[2017-07-20 09:58:46.573079] I [MSGID: 108026] [afr-self-heal-common.c:1254:afr_log_selfheal] 0-engine-replicate-0: Completed data selfheal on e6dfd556-340b-4b76-b47b-7b6f5bd74327. sources=[0] 1 sinks=2
[2017-07-20 09:59:22.995003] I [MSGID: 108026] [afr-self-heal-metadata.c:51:__afr_selfheal_metadata_do] 0-engine-replicate-0: performing metadata selfheal on f05b9742-2771-484a-85fc-5b6974bcef81
[2017-07-20 09:59:22.999372] I [MSGID: 108026] [afr-self-heal-common.c:1254:afr_log_selfheal] 0-engine-replicate-0: Completed metadata selfheal on f05b9742-2771-484a-85fc-5b6974bcef81. sources=[0] 1 sinks=2
If they don't, please provide the getfattr output of the 12 files from all 3 nodes using `getfattr -d -m . -e hex /gluster/engine/brick/path-to-file`.
NODE01:
getfattr: Removing leading '/' from absolute path names
# file: gluster/engine/brick/.shard/8aa74564-6740-403e-ad51-f56d9ca5d7a7.68
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.engine-client-1=0x000000000000000000000000
trusted.afr.engine-client-2=0x000000120000000000000000
trusted.bit-rot.version=0x090000000000000059647d5b000447e9
trusted.gfid=0xe3565b5014954e5bae883bceca47b7d9

getfattr: Removing leading '/' from absolute path names
# file: gluster/engine/brick/.shard/8aa74564-6740-403e-ad51-f56d9ca5d7a7.48
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.engine-client-1=0x000000000000000000000000
trusted.afr.engine-client-2=0x0000000e0000000000000000
trusted.bit-rot.version=0x090000000000000059647d5b000447e9
trusted.gfid=0x676067891f344c1586b8c0d05b07f187

getfattr: Removing leading '/' from absolute path names
# file: gluster/engine/brick/8f215dd2-8531-4a4f-b6ed-ea789dd8821b/images/19d71267-52a4-42a3-bb1e-e3145361c0c2/7a215635-02f3-47db-80db-8b689c6a8f01
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.engine-client-1=0x000000000000000000000000
trusted.afr.engine-client-2=0x000000550000000000000000
trusted.bit-rot.version=0x090000000000000059647d5b000447e9
trusted.gfid=0x8aa745646740403ead51f56d9ca5d7a7
trusted.glusterfs.shard.block-size=0x0000000020000000
trusted.glusterfs.shard.file-size=0x0000000c8000000000000000000000000000000000d4f2290000000000000000

getfattr: Removing leading '/' from absolute path names
# file: gluster/engine/brick/.shard/8aa74564-6740-403e-ad51-f56d9ca5d7a7.60
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.engine-client-1=0x000000000000000000000000
trusted.afr.engine-client-2=0x000000070000000000000000
trusted.bit-rot.version=0x090000000000000059647d5b000447e9
trusted.gfid=0x4e33ac33dddb4e29b4a351770b81166a

getfattr: Removing leading '/' from absolute path names
# file: gluster/engine/brick/8f215dd2-8531-4a4f-b6ed-ea789dd8821b/dom_md/ids
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.engine-client-1=0x000000000000000000000000
trusted.afr.engine-client-2=0x000000000000000000000000
trusted.bit-rot.version=0x0f0000000000000059647d5b000447e9
trusted.gfid=0x2581cb9ac2b74bd9ac17a09bd2f001b3
trusted.glusterfs.shard.block-size=0x0000000020000000
trusted.glusterfs.shard.file-size=0x0000000000100000000000000000000000000000000008000000000000000000

getfattr: Removing leading '/' from absolute path names
# file: gluster/engine/brick/__DIRECT_IO_TEST__
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.engine-client-1=0x000000000000000000000000
trusted.afr.engine-client-2=0x000000000000000000000000
trusted.gfid=0xf05b97422771484a85fc5b6974bcef81
trusted.glusterfs.shard.block-size=0x0000000020000000
trusted.glusterfs.shard.file-size=0x0000000000000000000000000000000000000000000000000000000000000000

getfattr: Removing leading '/' from absolute path names
# file: gluster/engine/brick/8f215dd2-8531-4a4f-b6ed-ea789dd8821b/images/88d41053-a257-4272-9e2e-2f3de0743b81/6573ed08-d3ed-4d12-9227-2c95941e1ad6
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.engine-client-1=0x000000000000000000000000
trusted.afr.engine-client-2=0x000000010000000000000000
trusted.bit-rot.version=0x0f0000000000000059647d5b000447e9
trusted.gfid=0xe6dfd556340b4b76b47b7b6f5bd74327
trusted.glusterfs.shard.block-size=0x0000000020000000
trusted.glusterfs.shard.file-size=0x0000000000100000000000000000000000000000000008000000000000000000

getfattr: Removing leading '/' from absolute path names
# file: gluster/engine/brick/.shard/8aa74564-6740-403e-ad51-f56d9ca5d7a7.64
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.engine-client-1=0x000000000000000000000000
trusted.afr.engine-client-2=0x0000000a0000000000000000
trusted.bit-rot.version=0x090000000000000059647d5b000447e9
trusted.gfid=0x9ef88647cfe64a35a38ca5173c9e8fc0

NODE02:
getfattr: Removing leading '/' from absolute path names
# file: gluster/engine/brick/.shard/8aa74564-6740-403e-ad51-f56d9ca5d7a7.68
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.engine-client-0=0x000000000000000000000000
trusted.afr.engine-client-2=0x0000001a0000000000000000
trusted.bit-rot.version=0x08000000000000005965ede0000c352d
trusted.gfid=0xe3565b5014954e5bae883bceca47b7d9

getfattr: Removing leading '/' from absolute path names
# file: gluster/engine/brick/.shard/8aa74564-6740-403e-ad51-f56d9ca5d7a7.48
trusted.afr.dirty=0x000000010000000000000000
trusted.afr.engine-client-0=0x000000000000000000000000
trusted.afr.engine-client-2=0x0000000c0000000000000000
trusted.bit-rot.version=0x08000000000000005965ede0000c352d
trusted.gfid=0x676067891f344c1586b8c0d05b07f187

getfattr: Removing leading '/' from absolute path names
# file: gluster/engine/brick/8f215dd2-8531-4a4f-b6ed-ea789dd8821b/images/19d71267-52a4-42a3-bb1e-e3145361c0c2/7a215635-02f3-47db-80db-8b689c6a8f01
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.engine-client-0=0x000000000000000000000000
trusted.afr.engine-client-1=0x000000000000000000000000
trusted.afr.engine-client-2=0x0000008e0000000000000000
trusted.bit-rot.version=0x08000000000000005965ede0000c352d
trusted.gfid=0x8aa745646740403ead51f56d9ca5d7a7
trusted.glusterfs.shard.block-size=0x0000000020000000
trusted.glusterfs.shard.file-size=0x0000000c8000000000000000000000000000000000d4f2290000000000000000

getfattr: Removing leading '/' from absolute path names
# file: gluster/engine/brick/.shard/8aa74564-6740-403e-ad51-f56d9ca5d7a7.60
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.engine-client-0=0x000000000000000000000000
trusted.afr.engine-client-2=0x000000090000000000000000
trusted.bit-rot.version=0x08000000000000005965ede0000c352d
trusted.gfid=0x4e33ac33dddb4e29b4a351770b81166a

getfattr: Removing leading '/' from absolute path names
# file: gluster/engine/brick/8f215dd2-8531-4a4f-b6ed-ea789dd8821b/dom_md/ids
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.engine-client-0=0x000000000000000000000000
trusted.afr.engine-client-2=0x000000010000000000000000
trusted.bit-rot.version=0x08000000000000005965ede0000c352d
trusted.gfid=0x2581cb9ac2b74bd9ac17a09bd2f001b3
trusted.glusterfs.shard.block-size=0x0000000020000000
trusted.glusterfs.shard.file-size=0x0000000000100000000000000000000000000000000008000000000000000000

getfattr: Removing leading '/' from absolute path names
# file: gluster/engine/brick/__DIRECT_IO_TEST__
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.engine-client-0=0x000000000000000000000000
trusted.afr.engine-client-2=0x000000000000000000000000
trusted.gfid=0xf05b97422771484a85fc5b6974bcef81
trusted.glusterfs.shard.block-size=0x0000000020000000
trusted.glusterfs.shard.file-size=0x0000000000000000000000000000000000000000000000000000000000000000

getfattr: Removing leading '/' from absolute path names
# file: gluster/engine/brick/8f215dd2-8531-4a4f-b6ed-ea789dd8821b/images/88d41053-a257-4272-9e2e-2f3de0743b81/6573ed08-d3ed-4d12-9227-2c95941e1ad6
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.engine-client-0=0x000000000000000000000000
trusted.afr.engine-client-2=0x000000020000000000000000
trusted.bit-rot.version=0x08000000000000005965ede0000c352d
trusted.gfid=0xe6dfd556340b4b76b47b7b6f5bd74327
trusted.glusterfs.shard.block-size=0x0000000020000000
trusted.glusterfs.shard.file-size=0x0000000000100000000000000000000000000000000008000000000000000000

getfattr: Removing leading '/' from absolute path names
# file: gluster/engine/brick/.shard/8aa74564-6740-403e-ad51-f56d9ca5d7a7.64
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.engine-client-0=0x000000000000000000000000
trusted.afr.engine-client-2=0x000000120000000000000000
trusted.bit-rot.version=0x08000000000000005965ede0000c352d
trusted.gfid=0x9ef88647cfe64a35a38ca5173c9e8fc0

NODE04:
getfattr: Removing leading '/' from absolute path names
# file: gluster/engine/brick/.shard/8aa74564-6740-403e-ad51-f56d9ca5d7a7.68
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.bit-rot.version=0x050000000000000059662c390006b836
trusted.gfid=0xe3565b5014954e5bae883bceca47b7d9

getfattr: Removing leading '/' from absolute path names
# file: gluster/engine/brick/.shard/8aa74564-6740-403e-ad51-f56d9ca5d7a7.48
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.bit-rot.version=0x050000000000000059662c390006b836
trusted.gfid=0x676067891f344c1586b8c0d05b07f187

getfattr: Removing leading '/' from absolute path names
# file: gluster/engine/brick/8f215dd2-8531-4a4f-b6ed-ea789dd8821b/images/19d71267-52a4-42a3-bb1e-e3145361c0c2/7a215635-02f3-47db-80db-8b689c6a8f01
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.bit-rot.version=0x050000000000000059662c390006b836
trusted.gfid=0x8aa745646740403ead51f56d9ca5d7a7
trusted.glusterfs.shard.block-size=0x0000000020000000
trusted.glusterfs.shard.file-size=0x0000000c8000000000000000000000000000000000d4f2290000000000000000

getfattr: Removing leading '/' from absolute path names
# file: gluster/engine/brick/.shard/8aa74564-6740-403e-ad51-f56d9ca5d7a7.60
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.bit-rot.version=0x050000000000000059662c390006b836
trusted.gfid=0x4e33ac33dddb4e29b4a351770b81166a

getfattr: Removing leading '/' from absolute path names
# file: gluster/engine/brick/8f215dd2-8531-4a4f-b6ed-ea789dd8821b/dom_md/ids
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.bit-rot.version=0x050000000000000059662c390006b836
trusted.gfid=0x2581cb9ac2b74bd9ac17a09bd2f001b3
trusted.glusterfs.shard.block-size=0x0000000020000000
trusted.glusterfs.shard.file-size=0x0000000000100000000000000000000000000000000008000000000000000000

getfattr: Removing leading '/' from absolute path names
# file: gluster/engine/brick/__DIRECT_IO_TEST__
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.bit-rot.version=0x0200000000000000596484e20006237b
trusted.gfid=0xf05b97422771484a85fc5b6974bcef81
trusted.glusterfs.shard.block-size=0x0000000020000000
trusted.glusterfs.shard.file-size=0x0000000000000000000000000000000000000000000000000000000000000000

getfattr: Removing leading '/' from absolute path names
# file: gluster/engine/brick/8f215dd2-8531-4a4f-b6ed-ea789dd8821b/images/88d41053-a257-4272-9e2e-2f3de0743b81/6573ed08-d3ed-4d12-9227-2c95941e1ad6
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.bit-rot.version=0x050000000000000059662c390006b836
trusted.gfid=0xe6dfd556340b4b76b47b7b6f5bd74327
trusted.glusterfs.shard.block-size=0x0000000020000000
trusted.glusterfs.shard.file-size=0x0000000000100000000000000000000000000000000008000000000000000000

getfattr: Removing leading '/' from absolute path names
# file: gluster/engine/brick/.shard/8aa74564-6740-403e-ad51-f56d9ca5d7a7.64
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.bit-rot.version=0x050000000000000059662c390006b836
trusted.gfid=0x9ef88647cfe64a35a38ca5173c9e8fc0
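As a reading aid for the getfattr output: each `trusted.afr.<volume>-client-N` value is 12 bytes, usually read as three big-endian 32-bit counters of pending data, metadata, and entry operations that the local brick records against brick N. A minimal Python sketch that decodes them (the helper name `decode_afr` is hypothetical, not part of any gluster tool):

```python
import struct

def decode_afr(hex_value):
    """Split a trusted.afr.* hex value into (data, metadata, entry) pending counts."""
    digits = hex_value[2:] if hex_value.startswith("0x") else hex_value
    raw = bytes.fromhex(digits)
    if len(raw) != 12:
        raise ValueError("expected 12 bytes, got %d" % len(raw))
    # Three unsigned 32-bit big-endian integers
    return struct.unpack(">III", raw)

# Values copied from the output for shard ...d7a7.68 in this thread
print(decode_afr("0x000000120000000000000000"))  # node01, engine-client-2 -> (18, 0, 0)
print(decode_afr("0x0000001a0000000000000000"))  # node02, engine-client-2 -> (26, 0, 0)
```

Read this way, node01 and node02 both hold non-zero data counters against engine-client-2, which would suggest they still consider data heals to the third brick pending.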
[root@node01 ~]# sestatus
SELinux status: disabled
[root@node02 ~]# sestatus
SELinux status: disabled
[root@node04 ~]# sestatus
SELinux status: disabled
Thanks,
Ravi
2. Are these 12 files also present in the 3rd data brick?
I've just checked: all the files exist on all 3 nodes.
3. Can you provide the output of `gluster volume info` for this volume?
Volume Name: engine
Type: Replicate
Volume ID: d19c19e3-910d-437b-8ba7-4f2a23d17515
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: node01:/gluster/engine/brick
Brick2: node02:/gluster/engine/brick
Brick3: node04:/gluster/engine/brick
Options Reconfigured:
nfs.disable: on
performance.readdir-ahead: on
transport.address-family: inet
storage.owner-uid: 36
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
performance.low-prio-threads: 32
network.remote-dio: off
cluster.eager-lock: enable
cluster.quorum-type: auto
cluster.server-quorum-type: server
cluster.data-self-heal-algorithm: full
cluster.locking-scheme: granular
cluster.shd-max-threads: 8
cluster.shd-wait-qlength: 10000
features.shard: on
user.cifs: off
storage.owner-gid: 36
features.shard-block-size: 512MB
network.ping-timeout: 30
performance.strict-o-direct: on
cluster.granular-entry-heal: on
auth.allow: *
server.allow-insecure: on
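The `features.shard-block-size: 512MB` option can be cross-checked against the `trusted.glusterfs.shard.block-size` attribute from the getfattr output. The sketch below decodes that attribute and, as an assumption about the on-disk layout rather than something confirmed in this thread, treats `trusted.glusterfs.shard.file-size` as four big-endian 64-bit fields with the logical size in the first and a 512-byte block count in the third. The function names are hypothetical, and the file-size value is the wrapped hex from the output rejoined into one string.

```python
import struct

def decode_shard_block_size(hex_value):
    """Decode the 8-byte shard block size attribute into bytes."""
    (size,) = struct.unpack(">Q", bytes.fromhex(hex_value[2:]))
    return size

def decode_shard_file_size(hex_value):
    """Decode the 32-byte file-size attribute (assumed layout: 4 x uint64)."""
    fields = struct.unpack(">4Q", bytes.fromhex(hex_value[2:]))
    # Assumption: field 0 is the logical size, field 2 the 512-byte block count
    return {"size_bytes": fields[0], "blocks_512": fields[2]}

block_size = decode_shard_block_size("0x0000000020000000")
print(block_size // 2**20, "MiB")

# Attribute of the image file 7a215635-..., split into its four 64-bit fields
image_attr = ("0x" + "0000000c80000000" + "0000000000000000"
              + "0000000000d4f229" + "0000000000000000")
info = decode_shard_file_size(image_attr)
print(info["size_bytes"] // 2**30, "GiB logical size")
```

Under that assumed layout the block size decodes to 512 MiB, matching the volume option, and the image file to a logical size of 50 GiB.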
Some extra info:
We recently changed the gluster setup from 2 (fully replicated) + 1 arbiter to a 3-way fully replicated cluster.
Just curious, how did you do this? `remove-brick` of arbiter brick followed by an `add-brick` to increase to replica-3?
Yes
#gluster volume remove-brick engine replica 2 node03:/gluster/data/brick force (OK!)
#gluster volume heal engine info (no entries!)
#gluster volume add-brick engine replica 3 node04:/gluster/engine/brick (OK!)
After some minutes
[root@node01 ~]# gluster volume heal engine info
Brick node01:/gluster/engine/brick
Status: Connected
Number of entries: 0

Brick node02:/gluster/engine/brick
Status: Connected
Number of entries: 0

Brick node04:/gluster/engine/brick
Status: Connected
Number of entries: 0
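When repeating this check while waiting for heals, the per-brick counts can be pulled out of the `gluster volume heal engine info` text with a few lines of Python. This is only a sketch based on the output format shown in this thread; `parse_heal_info` is a hypothetical helper, and a non-numeric count is mapped to `None` rather than guessed.

```python
def parse_heal_info(text):
    """Return {brick: pending_entry_count} parsed from heal info output."""
    counts = {}
    brick = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Brick "):
            brick = line[len("Brick "):]
        elif line.startswith("Number of entries:") and brick is not None:
            value = line.split(":", 1)[1].strip()
            # Keep None when the count is not a plain number
            counts[brick] = int(value) if value.isdigit() else None
    return counts

sample = """\
Brick node01:/gluster/engine/brick
Status: Connected
Number of entries: 0

Brick node02:/gluster/engine/brick
Status: Connected
Number of entries: 0

Brick node04:/gluster/engine/brick
Status: Connected
Number of entries: 0
"""

result = parse_heal_info(sample)
print(result)
print("all clean:", all(n == 0 for n in result.values()))
```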
Thanks,
Ravi
One more piece of information (I don't know whether it is related to the problem): five days ago a blackout suddenly shut down the network switch (which also carries the gluster network) for nodes 03 and 04. However, I don't know whether the problem started after this blackout.
Thank you!