Hello,
Many thanks for your email.
I should add that this is a test environment we set up in preparation for a
planned upgrade from CentOS 7 / oVirt 4.3 to CentOS 8 Stream / oVirt 4.5
in one of our old(er) oVirt clusters.
In this case, we blew up the software RAID during the OS replacement
(CentOS 7 -> 8), so we have a host, but no storage.
And as an added bonus, the FS locations are a bit different (due to MD
changes we made during the blowup).
So, essentially the host is alive, but we need to create a new brick using
a known good brick.
A couple of questions:
Assuming I have a known good brick to copy, but the FS location is different,
and given that I cannot simply remove/add the brick, how do I change the
brick path?
Old location:
office-wx-hv1-lab-gfs:/mnt/LogGFSData/brick
New location:
office-wx-hv1-lab-gfs.localdomain:/gluster/brick/data/brick
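(For reference, I'm guessing at something like the following replace-brick
form with our actual paths; treat it as a sketch I haven't verified, and I'm
not sure it even applies while the volume is stopped:)

```shell
# Sketch only: swap the dead brick's entry for the new path in one step.
# "commit force" rewrites the volume config and relies on self-heal to
# repopulate the new brick from the surviving replicas; it copies no data
# itself. The volume may need to be started first.
gluster volume replace-brick GV2Data \
    office-wx-hv1-lab-gfs:/mnt/LogGFSData/brick \
    office-wx-hv1-lab-gfs.localdomain:/gluster/brick/data/brick \
    commit force

# Then trigger a full heal so the new brick gets populated:
gluster volume heal GV2Data full
```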
Thanks again,
Gilboa
On Mon, Jul 18, 2022 at 1:32 AM Patrick Hibbs <hibbsncc1701(a)gmail.com>
wrote:
What you are missing is the fact that Gluster requires more than one set
of bricks to recover from a dead host. I.e., in your setup, you'd need 6
hosts: 4x replicas and 2x arbiters, with at least one set (2x replicas and
1x arbiter) operational at a bare minimum.
Automated commands to fix the volume do not exist otherwise (it's a
Gluster limitation). This can be fixed manually, however.
Standard Disclaimer: Back up your data first! Fixing this issue requires
manual intervention. The reader assumes all responsibility for any action
resulting from the instructions below. Etc.
If it's just a dead brick (i.e., the host is still functional), all you
really need to do is replace the underlying storage:
1. Take the gluster volume offline.
2. Remove the bad storage device, and attach the replacement.
3. rsync / scp / etc. the data from a known good brick (be sure to include
hidden files / preserve file times and ownership / SELinux labels / etc. ).
4. Restart the gluster volume.
Gluster *might* still need to heal everything after all of that, but it
should start the volume and get it running again.
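A rough sketch of steps 1, 3, and 4, assuming rsync is available and
"good-host" stands in for whichever host holds a known good brick (adjust
hostnames and brick paths to match your setup):

```shell
# Sketch: stop the volume, copy the brick from a known good replica,
# then start the volume again.
gluster volume stop GV2Data

# Gluster keeps its metadata in trusted.* extended attributes and uses
# hard links under .glusterfs/, so run as root and preserve xattrs (-X),
# ACLs (-A), hard links (-H), and numeric ownership.
rsync -aAXH --numeric-ids \
    root@good-host:/mnt/LogGFSData/brick/ \
    /mnt/LogGFSData/brick/

gluster volume start GV2Data
```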
If the host itself is dead (and the underlying storage is still
functional), you can just move the underlying storage over to the new host:
1. Take the gluster volume offline.
2. Attach the old storage.
3. Fix up the ids on the volume file. (
https://serverfault.com/questions/631365/rename-a-glusterfs-peer)
4. Restart the gluster volume.
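A sketch of step 3, following the approach in that link; <dead-host-uuid>
is a placeholder for the UUID the surviving hosts show for the dead peer
in `gluster peer status`:

```shell
# Sketch: make the replacement host present the dead host's identity.
# glusterd keeps its own UUID in /var/lib/glusterd/glusterd.info and
# its view of the other peers under /var/lib/glusterd/peers/.
systemctl stop glusterd

# On the new host, adopt the dead host's UUID:
sed -i 's/^UUID=.*/UUID=<dead-host-uuid>/' /var/lib/glusterd/glusterd.info

# On each surviving host, check that the peer file named after that UUID
# (in /var/lib/glusterd/peers/) points at the new host's hostname/IP,
# and edit it if the hostname changed.

systemctl start glusterd
```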
If both the host and underlying storage are dead, you'll need to do both
tasks:
1. Take the gluster volume offline.
2. Attach the new storage.
3. rsync / scp / etc. the data from a known good brick (be sure to
include hidden files / preserve file times and ownership / SELinux labels /
etc. ).
4. Fix up the ids on the volume file.
5. Restart the gluster volume.
Keep one thing in mind, however: if the Gluster host you are replacing is
the one oVirt uses to connect to the volume (i.e., it's the host named in
the volume config in the Admin Portal), the new host will need to retain
the old hostname / IP, or you'll need to update oVirt's config. Otherwise
the VM hosts will wind up in Unassigned / Non-functional status.
- Patrick Hibbs
On Sun, 2022-07-17 at 22:15 +0300, Gilboa Davara wrote:
Hello all,
I'm attempting to replace a dead host in a replica 2 + arbiter Gluster
setup with a new host.
I've already set up the new host (same hostname, plus .localdomain) and got
it into the cluster.
$ gluster peer status
Number of Peers: 2
Hostname: office-wx-hv3-lab-gfs
Uuid: 4e13f796-b818-4e07-8523-d84eb0faa4f9
State: Peer in Cluster (Connected)
Hostname: office-wx-hv1-lab-gfs.localdomain <------ This is a new host.
Uuid: eee17c74-0d93-4f92-b81d-87f6b9c2204d
State: Peer in Cluster (Connected)
$ gluster volume info GV2Data
Volume Name: GV2Data
Type: Replicate
Volume ID: c1946fc2-ed94-4b9f-9da3-f0f1ee90f303
Status: Stopped
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: office-wx-hv1-lab-gfs:/mnt/LogGFSData/brick <------ This is the
dead host.
Brick2: office-wx-hv2-lab-gfs:/mnt/LogGFSData/brick
Brick3: office-wx-hv3-lab-gfs:/mnt/LogGFSData/brick (arbiter)
...
Looking at the docs, it seems that I need to remove the dead brick.
$ gluster volume remove-brick GV2Data
office-wx-hv1-lab-gfs:/mnt/LogGFSData/brick start
Running remove-brick with cluster.force-migration enabled can result in
data corruption. It is safer to disable this option so that files that
receive writes during migration are not migrated.
Files that are not migrated can then be manually copied after the
remove-brick commit operation.
Do you want to continue with your current cluster.force-migration
settings? (y/n) y
volume remove-brick start: failed: Removing bricks from replicate
configuration is not allowed without reducing replica count explicitly
So I guess I need to drop from replica 2 + arbiter to replica 1 + arbiter
(?).
$ gluster volume remove-brick GV2Data replica 1
office-wx-hv1-lab-gfs:/mnt/LogGFSData/brick start
Running remove-brick with cluster.force-migration enabled can result in
data corruption. It is safer to disable this option so that files that
receive writes during migration are not migrated.
Files that are not migrated can then be manually copied after the
remove-brick commit operation.
Do you want to continue with your current cluster.force-migration
settings? (y/n) y
volume remove-brick start: failed: need 2(xN) bricks for reducing replica
count of the volume from 3 to 1
... What am I missing?
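(If I read the error right, reducing the replica count means removing
bricks in multiples matching the replica set; presumably something like the
following would satisfy the syntax, though it would also drop the arbiter,
which isn't what I actually want:)

```shell
# Hypothetical only: going from replica 3 (2 data + 1 arbiter) to replica 1
# means removing two bricks per subvolume in a single command, e.g. the
# dead data brick plus the arbiter. Shown only to illustrate the
# "need 2(xN) bricks" error, not as a recommendation.
gluster volume remove-brick GV2Data replica 1 \
    office-wx-hv1-lab-gfs:/mnt/LogGFSData/brick \
    office-wx-hv3-lab-gfs:/mnt/LogGFSData/brick \
    force
```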
- Gilboa
_______________________________________________
Users mailing list -- users(a)ovirt.org
To unsubscribe send an email to users-leave(a)ovirt.org
Privacy Statement:
https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct:
https://www.ovirt.org/community/about/community-guidelines/
List Archives:
https://lists.ovirt.org/archives/list/users@ovirt.org/message/OIXTFTJREUA...