I have a 3-way replica HCI setup. I recently placed one host in maintenance to perform work on it. After re-activating it, I noticed that many of my gluster volumes are not completing the heal process.
heal info shows shard files pending heal. I looked up the files, and they exist on the other two hosts (the ones that remained active) but not on the host that was in maintenance.
I tried running a manual heal on one of the volumes, and then a full heal, but there are still unhealed shards, and the shard files are still missing on the host that was in maintenance. Here is an example from one of my volumes:
# gluster volume heal prod_a info
Brick gluster0:/gluster_bricks/prod_a/prod_a
Status: Connected
Number of entries: 0
Brick gluster1:/gluster_bricks/prod_a/prod_a
/.shard/a746f8d2-5044-4d20-b525-24456e6f6f16.177
/.shard/a746f8d2-5044-4d20-b525-24456e6f6f16.178
Status: Connected
Number of entries: 2
Brick gluster2:/gluster_bricks/prod_a/prod_a
/.shard/a746f8d2-5044-4d20-b525-24456e6f6f16.177
/.shard/a746f8d2-5044-4d20-b525-24456e6f6f16.178
Status: Connected
Number of entries: 2
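For anyone comparing many volumes, the output above can be tallied per brick with a small script. This is only a rough sketch, assuming the plain-text layout shown here (a "Brick" line, then entries, then "Status:" and "Number of entries:" lines); parse_heal_info is a hypothetical helper name, not part of gluster.

```python
def parse_heal_info(text):
    """Map each brick to the list of entries it reports as pending heal.

    Assumes the plain-text layout of `gluster volume heal <vol> info`
    as pasted in this post; other gluster versions may differ.
    """
    pending = {}
    brick = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Brick "):
            # Start of a new brick section.
            brick = line[len("Brick "):]
            pending[brick] = []
        elif brick is None or not line:
            # Skip blank lines and anything before the first brick.
            continue
        elif line.startswith(("Status:", "Number of entries:")):
            # Metadata lines, not heal entries.
            continue
        else:
            pending[brick].append(line)
    return pending


# Sample input copied from the heal info output in this post.
SAMPLE = """\
Brick gluster0:/gluster_bricks/prod_a/prod_a
Status: Connected
Number of entries: 0
Brick gluster1:/gluster_bricks/prod_a/prod_a
/.shard/a746f8d2-5044-4d20-b525-24456e6f6f16.177
/.shard/a746f8d2-5044-4d20-b525-24456e6f6f16.178
Status: Connected
Number of entries: 2
Brick gluster2:/gluster_bricks/prod_a/prod_a
/.shard/a746f8d2-5044-4d20-b525-24456e6f6f16.177
/.shard/a746f8d2-5044-4d20-b525-24456e6f6f16.178
Status: Connected
Number of entries: 2
"""

if __name__ == "__main__":
    for name, entries in parse_heal_info(SAMPLE).items():
        print(f"{name}: {len(entries)} pending")
```

In practice the text would come from the real command output rather than the pasted sample.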
host0:
# ls -al /gluster_bricks/prod_a/prod_a/.shard/a746f8d2-5044-4d20-b525-24456e6f6f16.177
ls: cannot access /gluster_bricks/prod_a/prod_a/.shard/a746f8d2-5044-4d20-b525-24456e6f6f16.177: No such file or directory
host1:
# ls -al /gluster_bricks/prod_a/prod_a/.shard/a746f8d2-5044-4d20-b525-24456e6f6f16.177
-rw-rw----. 2 root root 67108864 Jan 13 16:57 /gluster_bricks/prod_a/prod_a/.shard/a746f8d2-5044-4d20-b525-24456e6f6f16.177
host2:
# ls -al /gluster_bricks/prod_a/prod_a/.shard/a746f8d2-5044-4d20-b525-24456e6f6f16.177
-rw-rw----. 2 root root 67108864 Jan 13 16:57 /gluster_bricks/prod_a/prod_a/.shard/a746f8d2-5044-4d20-b525-24456e6f6f16.177
How can I heal these volumes?
Thanks!