My hyperconverged cluster was running out of space.
The reason is a good problem to have - I've grown more in the last 4 months than in the past 4-5 years combined.
The downside was that I had to upgrade my storage, and it became urgent to do so.
I began that process last week.
I have 3 volumes:
[root@cha2-storage dwhite]# gluster volume list
data
engine
vmstore
I did the following on all 3 of my volumes:
1) Converted the cluster from Replica 3 to Replica 2, arbiter 1
- I did run into an issue where some VMs were paused, but I was able to power those VMs off and back on again with no issue
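(I'm not including the exact commands I used for that conversion, but the general Gluster CLI shape is roughly the following - data volume shown as an example, and the old brick / arbiter host and brick paths are just placeholders; newer Gluster versions express the arbiter as "replica 3 arbiter 1":)
# gluster volume remove-brick data replica 2 <old-brick-host>:/gluster_bricks/data/data force
# gluster volume add-brick data replica 3 arbiter 1 <arbiter-host>:/gluster_bricks/data/arbiter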
2) Ran the following on Host 2:
# gluster volume remove-brick data replica 1 cha1-storage.mgt.my-domain.com:/gluster_bricks/data/data force
# gluster volume remove-brick vmstore replica 1 cha1-storage.mgt.my-domain.com:/gluster_bricks/vmstore/vmstore force
# gluster volume remove-brick engine replica 1 cha1-storage.mgt.my-domain.com:/gluster_bricks/engine/engine force
(rebuilt the array & rebooted the server -- when I rebooted, I commented out the original UUIDs for the gluster storage in /etc/fstab)
# lvcreate -L 2157G --zero n -T gluster_vg_sdb/gluster_thinpool_gluster_vg_sdb
# lvcreate -L 75G -n gluster_lv_engine gluster_vg_sdb
# lvcreate -V 600G --thin -n gluster_lv_data gluster_vg_sdb/gluster_thinpool_gluster_vg_sdb
# lvcreate -V 1536G --thin -n gluster_lv_vmstore gluster_vg_sdb/gluster_thinpool_gluster_vg_sdb
# mkfs.xfs /dev/gluster_vg_sdb/gluster_lv_engine
# mkfs.xfs /dev/gluster_vg_sdb/gluster_lv_data
# mkfs.xfs /dev/gluster_vg_sdb/gluster_lv_vmstore
At this point, I ran lsblk --fs to get the new UUIDs and put them into /etc/fstab (example entries below)
# mount -a
# gluster volume add-brick engine replica 2 cha2-storage.mgt.my-domain.com:/gluster_bricks/engine/engine force
# gluster volume add-brick vmstore replica 2 cha2-storage.mgt.my-domain.com:/gluster_bricks/vmstore/vmstore force
# gluster volume add-brick data replica 2 cha2-storage.mgt.my-domain.com:/gluster_bricks/data/data force
So far so good.
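(For completeness, the /etc/fstab entries I'm referring to have this shape - the UUIDs are placeholders for what lsblk --fs reported, and the mount options should stay whatever the original lines used:)
UUID=<uuid-of-gluster_lv_engine>   /gluster_bricks/engine   xfs  defaults  0 0
UUID=<uuid-of-gluster_lv_data>     /gluster_bricks/data     xfs  defaults  0 0
UUID=<uuid-of-gluster_lv_vmstore>  /gluster_bricks/vmstore  xfs  defaults  0 0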
3) I was running critically low on disk space on the replica, so I:
- Let the gluster volumes heal
- Then removed the Host 1 bricks from the volumes
Before removing the Host 1 bricks, I made sure that there were no outstanding heal tasks, then ran:
# gluster volume remove-brick data replica 1 cha1-storage.mgt.my-domain.com:/gluster_bricks/data/data force
# gluster volume remove-brick vmstore replica 1 cha1-storage.mgt.my-domain.com:/gluster_bricks/vmstore/vmstore force
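(The heal check amounted to something like this for each volume, waiting for "Number of entries: 0" on every brick:)
# gluster volume heal engine info
# gluster volume heal data info
# gluster volume heal vmstore info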
At this point, I ran into problems.
All of my VMs went into a Paused state, and I had to reboot all of them.
All VMs came back online, but two VMs were corrupted, and I wound up rebuilding them from scratch.
Unfortunately, we lost a lot of data on one of the VMs, as I didn't realize that backups were broken on that particular VM.
Is there a way for me to go into those disks (which are still mounted on Host 1), examine the Gluster content, and somehow mount / recover the data from the VM that we lost? Roughly what I have in mind is sketched below.
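(This is just what I'm picturing, not something I've tried yet - the storage domain and image UUIDs below are placeholders for whatever is actually on the brick. oVirt file-based storage domains lay disks out as <sd-uuid>/images/<image-uuid>/<volume-uuid>:)
# ls -l /gluster_bricks/data/data/
# ls -l /gluster_bricks/data/data/<sd-uuid>/images/
# qemu-img info /gluster_bricks/data/data/<sd-uuid>/images/<image-uuid>/<volume-uuid>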