Hi David,
I hope you manage to recover the VM or most of the data. If you have multiple disks in
that VM (easily observable in the oVirt UI), you might need to repeat the procedure for
the rest of the disks.
Check the inode size (isize) with xfs_info: the default used to be 256, but I have
noticed that in some cases mkfs.xfs picked a higher value (EL7). Also, check the
Gluster logs, or at least keep them for a later check. Usually, a smaller inode size
can cause a lot of really awkward issues in Gluster, but this needs to be verified.
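As a minimal sketch of the isize check, the snippet below pulls the isize field out of text captured from xfs_info. The sample output is abbreviated and purely illustrative (it is not taken from this thread), and the function name is hypothetical.

```python
import re


def xfs_inode_size(xfs_info_output: str) -> int:
    """Return the isize value found in text printed by xfs_info."""
    match = re.search(r"\bisize=(\d+)", xfs_info_output)
    if match is None:
        raise ValueError("no isize field found in xfs_info output")
    return int(match.group(1))


# Abbreviated, illustrative sample (assumption: not real output from this cluster).
SAMPLE = """\
meta-data=/dev/sdb1              isize=256    agcount=4, agsize=655360 blks
         =                       sectsz=512   attr=2, projid32bit=1
"""

if __name__ == "__main__":
    size = xfs_inode_size(SAMPLE)
    print(f"isize={size}")
    if size < 512:
        # 512 is the value discussed in this thread (mkfs.xfs -i size=512).
        print("inode size is below the 512 bytes discussed in this thread")
```

In practice the text would come from running xfs_info against the brick's mount point on each host.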
Once the RAID is fully rebuilt, you will have to add both the HW RAID brick and the
arbiter brick (add-brick replica 3 arbiter 1). As you will be reusing the arbiter brick,
the safest approach is to recreate it with mkfs.xfs and also increase the inode ratio to 90%.
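A rough sketch of what those two commands might look like, assembled as argument lists so they can be reviewed before anything is run. The volume name, host names, brick paths and device are hypothetical placeholders. Reading "inode ratio" as the mkfs.xfs maxpct sub-option is an assumption, as is combining it with size=512.

```python
import shlex


def mkfs_xfs_cmd(device, inode_size=512, inode_maxpct=90):
    """Build a mkfs.xfs command line. -f overwrites any existing filesystem."""
    # Assumption: "inode ratio" in the advice above means the maxpct sub-option.
    return ["mkfs.xfs", "-f", "-i",
            f"size={inode_size},maxpct={inode_maxpct}", device]


def add_brick_cmd(volume, data_brick, arbiter_brick):
    """Build the add-brick command; the arbiter brick is listed last."""
    return ["gluster", "volume", "add-brick", volume,
            "replica", "3", "arbiter", "1", data_brick, arbiter_brick]


if __name__ == "__main__":
    # All names below are hypothetical placeholders.
    print(shlex.join(mkfs_xfs_cmd("/dev/mapper/gluster_vg-arbiter_lv")))
    print(shlex.join(add_brick_cmd(
        "data",
        "host2:/gluster_bricks/data/data",
        "host3:/gluster_bricks/data/arbiter",
    )))
```

The helpers only print the commands. Running mkfs.xfs with -f destroys whatever is on the device, so double-check the target before executing anything.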
Can you provide your volume info? The default shard size is just 64MB and transfer is
quite fast, so there should be no locking, nor the symptoms you reported.
Once the healing is over, you should be ready for the rebuild of the other node.
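One possible way to confirm that healing is over is to sum the "Number of entries" lines that `gluster volume heal <VOLNAME> info` prints per brick. The sketch below works on captured text. The sample output, host names and paths are illustrative assumptions, and the function name is hypothetical.

```python
import re


def pending_heal_entries(heal_info_output: str) -> int:
    """Sum the per-brick 'Number of entries' counts from heal info text."""
    values = re.findall(r"^Number of entries:\s*(\S+)\s*$",
                        heal_info_output, flags=re.MULTILINE)
    if not values:
        raise ValueError("no 'Number of entries' lines found")
    # A non-numeric count (e.g. '-') suggests a brick could not be queried.
    if not all(v.isdigit() for v in values):
        raise ValueError("at least one brick did not report a numeric count")
    return sum(int(v) for v in values)


# Illustrative sample (assumption: hypothetical hosts and paths).
SAMPLE = """\
Brick host1:/gluster_bricks/data/data
Status: Connected
Number of entries: 0

Brick host2:/gluster_bricks/data/data
/images/example-shard
Status: Connected
Number of entries: 1
"""

if __name__ == "__main__":
    pending = pending_heal_entries(SAMPLE)
    print("heal finished" if pending == 0 else f"{pending} entries still pending")
```

A total of zero across all bricks, with every brick reporting a number, is the condition to wait for before touching the other node.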
Best Regards,
Strahil Nikolov
Ok, so right now, my production cluster is operating off of a single brick. I was planning
on expanding the storage on the 2nd host next week, and adding that back into the cluster,
and getting the Replica 2, Arbiter 1 redundancy working again.
How would you recommend I proceed with that plan, knowing that I'm currently operating
off of a single brick in which I did NOT specify the size with `mkfs.xfs -i size=512`?
Should I specify the size on the new brick I build next week, and then once everything is
healed, reformat the current brick?
And then there is a lot of information missing between the lines: I
guess you are using a 3 node HCI setup and were adding new disks (/dev/sdb) on all three
nodes and trying to move the glusterfs to those new bigger disks?
You are correct in that I'm using 3-node HCI. I originally built HCI with Gluster
replication on all 3 nodes (Replica 3). As I'm increasing the storage, I'm also
moving to an architecture of Replica 2/Arbiter 1. So yes, the plan was:
1) Convert FROM Replica 3 TO replica 2/arbiter 1
2) Convert again down to a Replica 1 (so no replication... just operating storage on a
single host)
3) Rebuild the RAID array (with larger storage) on one of the unused hosts, and rebuild
the gluster bricks
4) Add the larger RAID back into gluster, let it heal
5) Now, remove the bricks from the host with the smaller storage -- THIS is where things
went awry, and what caused the data loss on this 1 particular VM
--- This is where I am currently ---
6) Rebuild the RAID array on the remaining host that is now unused (This is what I am /
was planning to do next week)
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Thursday, August 5th, 2021 at 3:12 PM, Thomas Hoberg <thomas(a)hoberg.net> wrote:
If you manage to export the disk image via the GUI, the result should
be a qcow2 format file, which you can mount/attach to anything Linux (well, if the VM was
Linux... it didn't say)
But it's perhaps easier to simply try to attach the disk of the
failed VM as a secondary to a live VM to recover the data.
Users mailing list -- users(a)ovirt.org