Volume Name: data
Type: Replicate
Volume ID: c7a5dfc9-3e72-4ea1-843e-c8275d4a7c2d
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: gdnode01:/gluster/data/brick
Brick2: gdnode02:/gluster/data/brick
Brick3: gdnode04:/gluster/data/brick
Options Reconfigured:
nfs.disable: on
performance.readdir-ahead: on
transport.address-family: inet
storage.owner-uid: 36
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
performance.low-prio-threads: 32
network.remote-dio: enable
cluster.eager-lock: enable
cluster.quorum-type: auto
cluster.server-quorum-type: server
cluster.data-self-heal-algorithm: full
cluster.locking-scheme: granular
cluster.shd-max-threads: 8
cluster.shd-wait-qlength: 10000
features.shard: on
user.cifs: off
storage.owner-gid: 36
features.shard-block-size: 512MB
network.ping-timeout: 30
performance.strict-o-direct: on
cluster.granular-entry-heal: on
auth.allow: *
server.allow-insecure: on
Volume Name: engine
Type: Replicate
Volume ID: d19c19e3-910d-437b-8ba7-4f2a23d17515
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: gdnode01:/gluster/engine/brick
Brick2: gdnode02:/gluster/engine/brick
Brick3: gdnode04:/gluster/engine/brick
Options Reconfigured:
nfs.disable: on
performance.readdir-ahead: on
transport.address-family: inet
storage.owner-uid: 36
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
performance.low-prio-threads: 32
network.remote-dio: off
cluster.eager-lock: enable
cluster.quorum-type: auto
cluster.server-quorum-type: server
cluster.data-self-heal-algorithm: full
cluster.locking-scheme: granular
cluster.shd-max-threads: 8
cluster.shd-wait-qlength: 10000
features.shard: on
user.cifs: off
storage.owner-gid: 36
features.shard-block-size: 512MB
network.ping-timeout: 30
performance.strict-o-direct: on
cluster.granular-entry-heal: on
auth.allow: *
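(For reference, the two listings above look like 'gluster volume info' output; one visible difference between the volumes is network.remote-dio, which is enabled on data but off on engine. Assuming a GlusterFS release that provides 'gluster volume get', a minimal way to compare a single option per volume is:)

    # print one reconfigured option for each volume to compare them side by side
    gluster volume get data network.remote-dio
    gluster volume get engine network.remote-dio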
2017-07-20 14:48 GMT+02:00 Ravishankar N <ravishankar@redhat.com>:

But it does say something. All these gfids of completed heals in the log below are for the ones that you have given the getfattr output of. So what is likely happening is that there is an intermittent connection problem between your mount and the brick process, leading to pending heals again after the heal gets completed, which is why the numbers are varying each time. You would need to check why that is the case.
Hope this helps,
Ravi
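(A minimal sketch of how such mount-to-brick disconnects could be checked, assuming the standard GlusterFS log location under /var/log/glusterfs/ on each node; exact log file names depend on the mount point and may differ on your hosts:)

    # list the clients currently connected to each brick of the engine volume
    gluster volume status engine clients

    # show which entries still need healing
    gluster volume heal engine info

    # look for disconnect messages in the fuse mount and brick logs
    grep -iE "disconnect|connection refused" /var/log/glusterfs/*.log /var/log/glusterfs/bricks/*.log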
[2017-07-20 09:58:46.573079] I [MSGID: 108026] [afr-self-heal-common.c:1254:afr_log_selfheal] 0-engine-replicate-0: Completed data selfheal on e6dfd556-340b-4b76-b47b-7b6f5bd74327. sources=[0] 1 sinks=2
[2017-07-20 09:59:22.995003] I [MSGID: 108026] [afr-self-heal-metadata.c:51:__afr_selfheal_metadata_do] 0-engine-replicate-0: performing metadata selfheal on f05b9742-2771-484a-85fc-5b6974bcef81
[2017-07-20 09:59:22.999372] I [MSGID: 108026] [afr-self-heal-common.c:1254:afr_log_selfheal] 0-engine-replicate-0: Completed metadata selfheal on f05b9742-2771-484a-85fc-5b6974bcef81. sources=[0] 1 sinks=2

Hi,
following your suggestion, I've checked the peer status and I found that there are too many names for the hosts; I don't know if this can be the problem or part of it:

gluster peer status on NODE01:
Number of Peers: 2

Hostname: dnode02.localdomain.local
Uuid: 7c0ebfa3-5676-4d3f-9bfa-7fff6afea0dd
State: Peer in Cluster (Connected)
Other names:
192.168.10.52
dnode02.localdomain.local
10.10.20.90
10.10.10.20

gluster peer status on NODE02:
Number of Peers: 2

Hostname: dnode01.localdomain.local
Uuid: a568bd60-b3e4-4432-a9bc-996c52eaaa12
State: Peer in Cluster (Connected)
Other names:
gdnode01
10.10.10.10

Hostname: gdnode04
Uuid: ce6e0f6b-12cf-4e40-8f01-d1609dfc5828
State: Peer in Cluster (Connected)
Other names:
192.168.10.54
10.10.10.40

gluster peer status on NODE04:
Number of Peers: 2

Hostname: dnode02.neridom.dom
Uuid: 7c0ebfa3-5676-4d3f-9bfa-7fff6afea0dd
State: Peer in Cluster (Connected)
Other names:
10.10.20.90
gdnode02
192.168.10.52
10.10.10.20

Hostname: dnode01.localdomain.local
Uuid: a568bd60-b3e4-4432-a9bc-996c52eaaa12
State: Peer in Cluster (Connected)
Other names:
gdnode01
10.10.10.10

All these IPs are pingable and the hostnames are resolvable across all 3 nodes, but only the 10.10.10.0 network is the dedicated network for gluster (resolved using the gdnode* host names) ... Do you think that removing the other entries can fix the problem? And, sorry, but how can I remove the other entries?
And what about SELinux?
Thank you
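(For context on where those extra names come from: glusterd keeps one file per peer under /var/lib/glusterd/peers/, named after the peer UUID, and the alternate names are typically recorded there as hostname1=, hostname2=, ... entries; the exact format may vary by GlusterFS version. A minimal way to inspect them on each node, as a sketch only and not a recommendation to edit these files by hand without confirmation, is:)

    # list the peer info files (one per peer UUID)
    ls /var/lib/glusterd/peers/

    # show the hostnames recorded for each peer
    grep -H "hostname" /var/lib/glusterd/peers/*

    # cross-check against what glusterd itself reports
    gluster peer status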