Hello Colleagues,
I want to share my experience, and especially how I recovered, after a situation where
Gluster refused to heal several files in my oVirt lab.
Most probably the situation was caused by the fact that I didn't check whether all files
were healed before starting the upgrade on the next node, which in a 'replica 2
arbiter 1' setup led to multiple files going missing/conflicting, with healing failing
to complete.
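In hindsight, the check I should have run between node upgrades can be scripted. Below is a minimal sketch (the helper function name is mine, not a gluster command): it sums the "Number of entries" lines from 'gluster volume heal <vol> info' output, so an upgrade loop can refuse to continue while anything is pending.

```shell
# Sum all "Number of entries: N" lines from 'heal info' output read on stdin.
# Prints 0 when every brick reports zero pending entries.
pending_heals() {
    awk -F': ' '/^Number of entries:/ { sum += $2 } END { print sum + 0 }'
}

# Intended usage on a live cluster (requires the gluster CLI):
# for vol in $(gluster volume list); do
#     n=$(gluster volume heal "$vol" info | pending_heals)
#     [ "$n" -eq 0 ] || echo "WARNING: $vol still has $n entries to heal - do not upgrade yet"
# done
```

The parsing is split into its own function so it can also be fed captured output when auditing after the fact.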
Background:
1. I powered off my HostedEngine VM and made a snapshot of the gluster volume.
2. Started the update, but without screen - I messed it up and decided to revert from the snapshot.
3. Powered off the volume, restored from the snapshot and started it again (and snapshotted the volume once more).
4. The upgrade of the HostedEngine was successful.
5. Upgraded the arbiter (ovirt3).
6. I forgot to check the heal status on the arbiter and upgraded ovirt1/gluster1 (which may have been the cause of the issue).
7. After gluster1 was healed, I saw that some files were left for healing, but expected healing to finish by the time ovirt2/gluster2 was patched.
8. Sadly my assumption was wrong, and after the reboot of ovirt2/gluster2 I noticed that some files never healed.
Symptoms:
- 2 files per volume never heal (even using 'full' mode), even after I 'stat'-ed every file/dir in the volume.
- The oVirt Dashboard reported multiple errors (50+) that it cannot update the OVF metadata for the volume/VM.
Here are my notes from recovering from the situation. As this is my lab, I shut down
all VMs (including the HostedEngine), since downtime was not an issue:
Heals never complete:
# gluster volume heal data_fast4 info
Brick gluster1:/gluster_bricks/data_fast4/data_fast4
Status: Connected
Number of entries: 0
Brick gluster2:/gluster_bricks/data_fast4/data_fast4
<gfid:d21a6512-eaf6-4859-90cf-eeef2cc0cab8>
<gfid:95bc2cd2-8a1e-464e-a384-6b128780d370>
Status: Connected
Number of entries: 2
Brick ovirt3:/gluster_bricks/data_fast4/data_fast4
<gfid:d21a6512-eaf6-4859-90cf-eeef2cc0cab8>
<gfid:95bc2cd2-8a1e-464e-a384-6b128780d370>
Status: Connected
Number of entries: 2
Mount with the aux-gfid-mount option to resolve the gfid-to-path relationship:
# mount -t glusterfs -o aux-gfid-mount gluster1:/data_fast4 /mnt
# getfattr -n trusted.glusterfs.pathinfo -e text /mnt/.gfid/d21a6512-eaf6-4859-90cf-eeef2cc0cab8
getfattr: Removing leading '/' from absolute path names
# file: mnt/.gfid/d21a6512-eaf6-4859-90cf-eeef2cc0cab8
trusted.glusterfs.pathinfo="(<REPLICATE:data_fast4-replicate-0> <POSIX(/gluster_bricks/data_fast4/data_fast4):ovirt2.localdomain:/gluster_bricks/data_fast4/data_fast4/.glusterfs/d2/1a/d21a6512-eaf6-4859-90cf-eeef2cc0cab8> <POSIX(/gluster_bricks/data_fast4/data_fast4):ovirt3.localdomain:/gluster_bricks/data_fast4/data_fast4/.glusterfs/d2/1a/d21a6512-eaf6-4859-90cf-eeef2cc0cab8>)"
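As an aside, the backend location of a gfid can also be computed directly: gluster stores each file under <brick>/.glusterfs/<first two hex chars>/<next two hex chars>/<gfid>. A small sketch (the function name is mine; the brick path and GFID are the ones from this volume):

```shell
# Build the .glusterfs hard-link path of a file on a brick from its GFID.
# Layout: <brick>/.glusterfs/<gfid[0:2]>/<gfid[2:4]>/<gfid>
gfid_to_backend_path() {
    brick=$1
    gfid=$2
    printf '%s/.glusterfs/%s/%s/%s\n' "$brick" \
        "$(printf '%s' "$gfid" | cut -c1-2)" \
        "$(printf '%s' "$gfid" | cut -c3-4)" \
        "$gfid"
}

gfid_to_backend_path /gluster_bricks/data_fast4/data_fast4 \
    d21a6512-eaf6-4859-90cf-eeef2cc0cab8
```

This is handy when the aux-gfid mount is not available and you only have the <gfid:...> entries from 'heal info'.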
The local brick (that is supposed to be healed) is missing some data:
# ls -l /gluster_bricks/data_fast4/data_fast4/.glusterfs/d2/1a/d21a6512-eaf6-4859-90cf-eeef2cc0cab8
ls: cannot access /gluster_bricks/data_fast4/data_fast4/.glusterfs/d2/1a/d21a6512-eaf6-4859-90cf-eeef2cc0cab8: No such file or directory
Remote is OK:
# ssh gluster2 'ls -l /gluster_bricks/data_fast4/data_fast4/.glusterfs/d2/1a/d21a6512-eaf6-4859-90cf-eeef2cc0cab8'
-rw-r--r--. 2 vdsm kvm 436 Nov  9 19:34 /gluster_bricks/data_fast4/data_fast4/.glusterfs/d2/1a/d21a6512-eaf6-4859-90cf-eeef2cc0cab8
Arbiter is also OK:
# ssh ovirt3 'ls -l /gluster_bricks/data_fast4/data_fast4/.glusterfs/d2/1a/d21a6512-eaf6-4859-90cf-eeef2cc0cab8'
-rw-r--r--. 2 vdsm kvm 0 Nov  9 19:34 /gluster_bricks/data_fast4/data_fast4/.glusterfs/d2/1a/d21a6512-eaf6-4859-90cf-eeef2cc0cab8
Rsync the file/directory from a good brick to the broken one:
# rsync -avP gluster2:/gluster_bricks/data_fast4/data_fast4/.glusterfs/d2/1a/ /gluster_bricks/data_fast4/data_fast4/.glusterfs/d2/1a/
# ls -l /gluster_bricks/data_fast4/data_fast4/.glusterfs/d2/1a/d21a6512-eaf6-4859-90cf-eeef2cc0cab8
-rw-r--r--. 1 vdsm kvm 436 Nov  9 19:34 /gluster_bricks/data_fast4/data_fast4/.glusterfs/d2/1a/d21a6512-eaf6-4859-90cf-eeef2cc0cab8
After a full heal we see our problematic file:
# gluster volume heal data_fast4 full
Launching heal operation to perform full self heal on volume data_fast4 has been
successful
Use heal info commands to check status.
# gluster volume heal data_fast4 info
Brick gluster1:/gluster_bricks/data_fast4/data_fast4
Status: Connected
Number of entries: 0
Brick gluster2:/gluster_bricks/data_fast4/data_fast4
/578bca3d-6540-41cd-8e0e-9e3047026484/images/58e197a6-12df-4432-a643-298d40e44130/535ec7f7-f4d1-4d1e-a988-c1e95b4a38ca.meta
Status: Connected
Number of entries: 1
Brick ovirt3:/gluster_bricks/data_fast4/data_fast4
/578bca3d-6540-41cd-8e0e-9e3047026484/images/58e197a6-12df-4432-a643-298d40e44130/535ec7f7-f4d1-4d1e-a988-c1e95b4a38ca.meta
Status: Connected
Number of entries: 1
This time the copy on gluster1 is an older version (note the 'Last Updated' timestamp below), and the file is missing on the arbiter:
# cat /gluster_bricks/data_fast4/data_fast4/578bca3d-6540-41cd-8e0e-9e3047026484/images/58e197a6-12df-4432-a643-298d40e44130/535ec7f7-f4d1-4d1e-a988-c1e95b4a38ca.meta
CTIME=1558265783
DESCRIPTION={"Updated":true,"Size":256000,"Last
Updated":"Sat Nov 09 19:24:08 EET 2019","Storage
Domains":[{"uuid":"578bca3d-6540-41cd-8e0e-9e3047026484"}],"Disk
Description":"OVF_STORE"}
DISKTYPE=OVFS
DOMAIN=578bca3d-6540-41cd-8e0e-9e3047026484
FORMAT=RAW
GEN=0
IMAGE=58e197a6-12df-4432-a643-298d40e44130
LEGALITY=LEGAL
MTIME=0
PUUID=00000000-0000-0000-0000-000000000000
SIZE=262144
TYPE=PREALLOCATED
VOLTYPE=LEAF
EOF
# ssh gluster2 cat /gluster_bricks/data_fast4/data_fast4/578bca3d-6540-41cd-8e0e-9e3047026484/images/58e197a6-12df-4432-a643-298d40e44130/535ec7f7-f4d1-4d1e-a988-c1e95b4a38ca.meta
CTIME=1558265783
DESCRIPTION={"Updated":true,"Size":256000,"Last
Updated":"Sat Nov 09 19:34:33 EET 2019","Storage
Domains":[{"uuid":"578bca3d-6540-41cd-8e0e-9e3047026484"}],"Disk
Description":"OVF_STORE"}
DISKTYPE=OVFS
DOMAIN=578bca3d-6540-41cd-8e0e-9e3047026484
FORMAT=RAW
GEN=0
IMAGE=58e197a6-12df-4432-a643-298d40e44130
LEGALITY=LEGAL
MTIME=0
PUUID=00000000-0000-0000-0000-000000000000
SIZE=262144
TYPE=PREALLOCATED
VOLTYPE=LEAF
EOF
# ssh ovirt3 cat /578bca3d-6540-41cd-8e0e-9e3047026484/images/58e197a6-12df-4432-a643-298d40e44130/535ec7f7-f4d1-4d1e-a988-c1e95b4a38ca.meta
cat: /578bca3d-6540-41cd-8e0e-9e3047026484/images/58e197a6-12df-4432-a643-298d40e44130/535ec7f7-f4d1-4d1e-a988-c1e95b4a38ca.meta: No such file or directory
As we are missing a good copy of the same file on both a data brick and the arbiter, we take another approach.
First, move the file on gluster1 aside under another name:
# mv /gluster_bricks/data_fast4/data_fast4/578bca3d-6540-41cd-8e0e-9e3047026484/images/58e197a6-12df-4432-a643-298d40e44130/535ec7f7-f4d1-4d1e-a988-c1e95b4a38ca.meta /gluster_bricks/data_fast4/data_fast4/578bca3d-6540-41cd-8e0e-9e3047026484/images/58e197a6-12df-4432-a643-298d40e44130/535ec7f7-f4d1-4d1e-a988-c1e95b4a38ca.meta_old
Then we copy the file from gluster2 (the good brick):
# rsync -avP gluster2:/gluster_bricks/data_fast4/data_fast4/578bca3d-6540-41cd-8e0e-9e3047026484/images/58e197a6-12df-4432-a643-298d40e44130/535ec7f7-f4d1-4d1e-a988-c1e95b4a38ca.meta /gluster_bricks/data_fast4/data_fast4/578bca3d-6540-41cd-8e0e-9e3047026484/images/58e197a6-12df-4432-a643-298d40e44130/535ec7f7-f4d1-4d1e-a988-c1e95b4a38ca.meta
receiving incremental file list
535ec7f7-f4d1-4d1e-a988-c1e95b4a38ca.meta
436 100% 425.78kB/s 0:00:00 (xfr#1, to-chk=0/1)
sent 43 bytes received 571 bytes 1,228.00 bytes/sec
total size is 436 speedup is 0.71
Run full heal:
# gluster volume heal data_fast4 full
Launching heal operation to perform full self heal on volume data_fast4 has been
successful
Use heal info commands to check status
# gluster volume heal data_fast4 info
Brick gluster1:/gluster_bricks/data_fast4/data_fast4
Status: Connected
Number of entries: 0
Brick gluster2:/gluster_bricks/data_fast4/data_fast4
Status: Connected
Number of entries: 0
Brick ovirt3:/gluster_bricks/data_fast4/data_fast4
Status: Connected
Number of entries: 0
And of course, don't forget to umount /mnt.
I did the above for all oVirt storage domains and then rebooted all nodes (simultaneously)
after stopping the whole stack. This should not be necessary, but I wanted to be
sure that the cluster would become operational again after a power outage:
systemctl stop ovirt-ha-agent ovirt-ha-broker vdsmd supervdsmd sanlock glusterd
/usr/share/glusterfs/scripts/stop-all-gluster-processes.sh
Verification: After the reboot, I tried to set each oVirt storage domain to
'Maintenance' (which confirms that the engine can update the OVF metadata) and then set it
back to Active. Without downtime, this would not have been possible.
I hope this long post will help someone.
PS: I have collected some data for some of the files, which I have omitted as this e-mail
is already very long.
Best Regards,
Strahil Nikolov