Thanks Gobinda,
I am in the process of finishing up the 9-node cluster; once done, I will
test this Ansible role...
On Fri, Jun 14, 2019 at 12:45 PM Gobinda Das <godas(a)redhat.com> wrote:
We have an Ansible role to replace a Gluster node. I think it works only
with
the same FQDN.
https://github.com/sac/gluster-ansible-maintenance
I am not sure if it covers all scenarios, but you can try with the same FQDN.
On Fri, Jun 14, 2019 at 7:13 AM Adrian Quintero <adrianquintero(a)gmail.com>
wrote:
> Strahil,
> Thanks for all the follow-up. I will try to reproduce the same scenario
> today: deploy a 9-node cluster, completely kill the initiating node (vmm10)
> and see if I can recover using the extra-server approach (different
> IP/FQDN). If I am able to recover, I will also try to test your
> suggested second approach (using the same IP/FQDN).
> My objective here is to document the possible recovery scenarios without
> any downtime or impact.
>
> I have already documented a few setup and recovery scenarios with 6 and 9 nodes
> in a hyperconverged setup, and I will make them available to the
> community, hopefully this week, including the tests that you have been
> helping me with. Hopefully this will help others who are in the
> same situation as I am, and it will also provide me with feedback from
> more knowledgeable admins out there so that I can get this into production
> in the near future.
>
>
> Thanks again.
>
>
>
> On Wed, Jun 12, 2019 at 11:58 PM Strahil <hunter86_bg(a)yahoo.com> wrote:
>
>> Hi Adrian,
>>
>> Please keep in mind that when a server dies, the easiest way to recover
>> is to get another freshly installed server with a different IP/FQDN.
>> Then you will need to use 'replace-brick', and once Gluster replaces that
>> node you should be able to remove the old entry in oVirt.
>> Once the old entry is gone, you can add the new installation in oVirt
>> via the UI.
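>>
>> For reference, that sequence would look roughly like the sketch below - the
>> hostnames and brick paths are placeholders, so adjust them per volume:
>>
>> # run once per volume that had a brick on the dead node
>> gluster volume replace-brick engine \
>>     dead-host:/gluster_bricks/engine/engine \
>>     new-host:/gluster_bricks/engine/engine commit force
>> gluster volume heal engine full    # then trigger a full heal onto the new brick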
>>
>> Another approach is to have the same IP/FQDN for the fresh install. In
>> this situation, you need to have the same gluster ID (which should be a
>> text file) and the peer IDs. Most probably you can create them on your own,
>> based on the data on the other gluster peers.
>> Once the fresh install is visible in 'gluster peer status', you can initiate
>> a 'reset-brick' (don't forget to set up SELinux, the firewall and the repos) and a
>> full heal.
>> From there you can reinstall the machine from the UI and it should be
>> available for usage.
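>>
>> A minimal sketch of that path, assuming the standard glusterd layout - verify
>> the paths and IDs on your own peers before copying anything:
>>
>> # on a surviving peer: the peer files are named after each node's UUID
>> ls /var/lib/glusterd/peers/
>> # on the freshly installed node: put the old node's UUID into glusterd.info
>> cat /var/lib/glusterd/glusterd.info        # contains a UUID=<gluster ID> line
>> # once 'gluster peer status' shows the node as connected:
>> gluster volume reset-brick engine host:/gluster_bricks/engine/engine \
>>     host:/gluster_bricks/engine/engine commit force
>> gluster volume heal engine full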
>>
>> P.S.: I know that the whole procedure is not so easy :)
>>
>> Best Regards,
>> Strahil Nikolov
>> On Jun 12, 2019 19:02, Adrian Quintero <adrianquintero(a)gmail.com> wrote:
>>
>> Strahil, I don't use the GUI that much, but in this case I need to understand
>> how it all ties together if I want to move to production. As far as Gluster
>> goes, I can do the administration through the CLI; however, my test
>> environment was set up using gdeploy for a hyperconverged
>> setup under oVirt.
>> The initial setup was 3 servers with the same set of physical disks:
>> sdb, sdc, sdd, sde (this last one used for caching as it is an SSD)
>>
>> vmm10.mydomain.com:/gluster_bricks/brick1(/dev/sdb) engine
>> vmm10.mydomain.com:/gluster_bricks/brick2(/dev/sdb) vmstore1
>> vmm10.mydomain.com:/gluster_bricks/brick3(/dev/sdc) data1
>> vmm10.mydomain.com:/gluster_bricks/brick4(/dev/sdd) data2
>>
>> vmm11.mydomain.com:/gluster_bricks/brick1(/dev/sdb) engine
>> vmm11.mydomain.com:/gluster_bricks/brick2(/dev/sdb) vmstore1
>> vmm11.mydomain.com:/gluster_bricks/brick3(/dev/sdc) data1
>> vmm11.mydomain.com:/gluster_bricks/brick4(/dev/sdd) data2
>>
>> vmm12.mydomain.com:/gluster_bricks/brick1(/dev/sdb) engine
>> vmm12.mydomain.com:/gluster_bricks/brick2(/dev/sdb) vmstore1
>> vmm12.mydomain.com:/gluster_bricks/brick3(/dev/sdc) data1
>> vmm12.mydomain.com:/gluster_bricks/brick4(/dev/sdd) data2
>>
>> *As you can see from the above, the engine volume is composed of
>> hosts vmm10 (the initiating cluster server, now dead), vmm11 and vmm12,
>> on block device /dev/sdb (100GB LV); the vmstore1 volume is also on
>> /dev/sdb (2600GB LV).*
>> /dev/mapper/gluster_vg_sdb-gluster_lv_engine xfs
>> 100G 2.0G 98G 2% /gluster_bricks/engine
>> /dev/mapper/gluster_vg_sdb-gluster_lv_vmstore1 xfs
>> 2.6T 35M 2.6T 1% /gluster_bricks/vmstore1
>> /dev/mapper/gluster_vg_sdc-gluster_lv_data1 xfs
>> 2.7T 4.6G 2.7T 1% /gluster_bricks/data1
>> /dev/mapper/gluster_vg_sdd-gluster_lv_data2 xfs
>> 2.7T 9.5G 2.7T 1% /gluster_bricks/data2
>> vmm10.mydomain.com:/engine
>> fuse.glusterfs 300G 9.2G 291G 4%
>> /rhev/data-center/mnt/glusterSD/vmm10.virt.iad3p:_engine
>> vmm10.mydomain.com:/vmstore1
>> fuse.glusterfs 5.1T 53G 5.1T 2%
>> /rhev/data-center/mnt/glusterSD/vmm10.virt.iad3p:_vmstore1
>> vmm10.mydomain.com:/data1
>> fuse.glusterfs 8.0T 95G 7.9T 2%
>> /rhev/data-center/mnt/glusterSD/vmm10.virt.iad3p:_data1
>> vmm10.mydomain.com:/data2
>> fuse.glusterfs 8.0T 112G 7.8T 2%
>> /rhev/data-center/mnt/glusterSD/vmm10.virt.iad3p:_data2
>>
>>
>>
>>
>> *Before any issues occurred, I grew the cluster and the gluster
>> cluster with the following, creating 4 distributed-replicated volumes
>> (engine, vmstore1, data1, data2):*
>>
>> vmm13.mydomain.com:/gluster_bricks/brick1(/dev/sdb) engine
>> vmm13.mydomain.com:/gluster_bricks/brick2(/dev/sdb) vmstore1
>> vmm13.mydomain.com:/gluster_bricks/brick3(/dev/sdc) data1
>> vmm13.mydomain.com:/gluster_bricks/brick4(/dev/sdd) data2
>>
>> vmm14.mydomain.com:/gluster_bricks/brick1(/dev/sdb) engine
>> vmm14.mydomain.com:/gluster_bricks/brick2(/dev/sdb) vmstore1
>> vmm14.mydomain.com:/gluster_bricks/brick3(/dev/sdc) data1
>> vmm14.mydomain.com:/gluster_bricks/brick4(/dev/sdd) data2
>>
>> vmm15.mydomain.com:/gluster_bricks/brick1(/dev/sdb) engine
>> vmm15.mydomain.com:/gluster_bricks/brick2(/dev/sdb) vmstore1
>> vmm15.mydomain.com:/gluster_bricks/brick3(/dev/sdc) data1
>> vmm15.mydomain.com:/gluster_bricks/brick4(/dev/sdd) data2
>>
>> vmm16.mydomain.com:/gluster_bricks/brick1(/dev/sdb) engine
>> vmm16.mydomain.com:/gluster_bricks/brick2(/dev/sdb) vmstore1
>> vmm16.mydomain.com:/gluster_bricks/brick3(/dev/sdc) data1
>> vmm16.mydomain.com:/gluster_bricks/brick4(/dev/sdd) data2
>>
>> vmm17.mydomain.com:/gluster_bricks/brick1(/dev/sdb) engine
>> vmm17.mydomain.com:/gluster_bricks/brick2(/dev/sdb) vmstore1
>> vmm17.mydomain.com:/gluster_bricks/brick3(/dev/sdc) data1
>> vmm17.mydomain.com:/gluster_bricks/brick4(/dev/sdd) data2
>>
>> vmm18.mydomain.com:/gluster_bricks/brick1(/dev/sdb) engine
>> vmm18.mydomain.com:/gluster_bricks/brick2(/dev/sdb) vmstore1
>> vmm18.mydomain.com:/gluster_bricks/brick3(/dev/sdc) data1
>> vmm18.mydomain.com:/gluster_bricks/brick4(/dev/sdd) data2
>>
>>
>> *With your first suggestion I don't think it is possible to recover, as I
>> will lose the engine if I stop the "engine" volume. It might be doable for
>> vmstore1, data1 and data2, but not the engine.*
>> A) If you have space on another gluster volume (or volumes) or on
>> NFS-based storage, you can migrate all VMs live. Once you do it, the
>> simple way will be to stop and remove the storage domain (from the UI) and
>> the gluster volume that corresponds to the problematic brick. Once gone, you
>> can remove the entry in oVirt for the old host and add the newly built
>> one. Then you can recreate your volume and migrate the data back.
>>
>> *I tried removing the brick using the CLI but got the following error:*
>> volume remove-brick start: failed: Host node of the brick
>> vmm10.mydomain.com:/gluster_bricks/engine/engine is down
>>
>> *So I used the force command:*
>> gluster vol remove-brick engine vmm10.mydomain.com:/gluster_bricks/engine/engine
>> vmm11.mydomain.com:/gluster_bricks/engine/engine
>> vmm12.mydomain.com:/gluster_bricks/engine/engine
>> force
>> Remove-brick force will not migrate files from the removed bricks, so
>> they will no longer be available on the volume.
>> Do you want to continue? (y/n) y
>> volume remove-brick commit force: success
>>
>> *So I lost my engine:*
>> Please enter your authentication name: vdsm@ovirt
>> Please enter your password:
>> Id Name State
>> ----------------------------------------------------
>> 3 HostedEngine paused
>>
>> hosted-engine --vm-start
>> The hosted engine configuration has not been retrieved from shared
>> storage. Please ensure that ovirt-ha-agent is running and the storage
>> server is reachable.
>>
>> I guess this failure scenario is more complex than I thought; the hosted engine
>> should have survived. As far as gluster goes, I can get around on the command line;
>> the issue is the engine. Even though it was running on vmm18 and not on
>> any bricks belonging to vmm10, 11, or 12 (the original setup), it still failed...
>> virsh list --all
>> Please enter your authentication name: vdsm@ovirt
>> Please enter your password:
>> Id Name State
>> ----------------------------------------------------
>> - HostedEngine shut off
>>
>> *Now I can't get it to start:*
>> hosted-engine --vm-start
>> The hosted engine configuration has not been retrieved from shared
>> storage. Please ensure that ovirt-ha-agent is running and the storage
>> server is reachable.
>> df -hT is still showing mounts from the old host's bricks; could the problem be
>> that this was the initiating host of the hyperconverged setup?
>> vmm10.mydomain.com:/engine
>> fuse.glusterfs 200G 6.2G 194G 4% /rhev/data-center/mnt/glusterSD/
>> vmm10.mydomain.com:_engine
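>>
>> For reference, the standard first checks when the hosted engine will not start
>> (stock hosted-engine / ovirt-ha-agent commands, nothing specific to this setup):
>>
>> systemctl status ovirt-ha-agent ovirt-ha-broker   # both services must be running
>> hosted-engine --vm-status                         # shows whether the HE storage is reachable
>> hosted-engine --connect-storage                   # retry attaching the hosted-engine storage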
>>
>>
>> I will re-create everything from scratch and simulate this again, and
>> see why it is so complex to recover oVirt's engine with gluster when a
>> server dies completely. Maybe it is my lack of understanding of how
>> oVirt integrates with gluster, though I have a decent enough understanding of
>> Gluster to work with it...
>>
>> I will let you know once I have the cluster recreated, and I will kill the
>> same server and see if I missed anything from the recommendations you
>> provided.
>>
>> Thanks,
>>
>> --
>> Adrian.
>>
>>
>>
>>
>>
>>
>>
>> On Tue, Jun 11, 2019 at 4:13 PM Strahil Nikolov <hunter86_bg(a)yahoo.com>
>> wrote:
>>
>> Do you have empty space to store the VMs? If yes, you can always script
>> the migration of the disks via the API. Even a bash script and curl can do
>> the trick.
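>>
>> A rough sketch of that idea - the credentials, disk ID and storage-domain ID
>> below are placeholders, and the move action should be double-checked against
>> the API documentation for your oVirt version:
>>
>> # move one disk to another storage domain via the REST API
>> curl -s -k -u 'admin@internal:PASSWORD' \
>>     -H 'Content-Type: application/xml' \
>>     -d '<action><storage_domain id="TARGET-SD-UUID"/></action>' \
>>     https://engine.mydomain.com/ovirt-engine/api/disks/DISK-UUID/move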
>>
>> About the /dev/sdb, I still don't get it. A plain "df -hT" from a node
>> will make it much clearer. I guess '/dev/sdb' is a PV and you have 2 LVs
>> on top of it.
>>
>> Note: I should admit that, as an admin, I don't use the UI for gluster
>> management.
>>
>> For now, do not try to remove the brick. The approach is either to
>> migrate the QEMU disks to other storage, or to reset-brick/replace-brick
>> in order to restore the replica count.
>> I will check the file and try to figure it out.
>>
>> Redeployment never fixes the issue, it just speeds up the recovery. If
>> you can afford the time to spend on fixing the issue - then do not redeploy.
>>
>> I would be able to take a look next week, but keep in mind that I'm not
>> that deep into oVirt - I only started playing with it when I deployed my
>> lab.
>>
>> Best Regards,
>> Strahil Nikolov
>>
>> Strahil,
>>
>> Looking at your suggestions I think I need to provide a bit more info on
>> my current setup.
>>
>>
>>
>> 1. I have 9 hosts in total.
>>
>> 2. I have 5 storage domains:
>>    - hosted_storage (Data Master)
>>    - vmstore1 (Data)
>>    - data1 (Data)
>>    - data2 (Data)
>>    - ISO (NFS) // had to create this one because oVirt 4.3.3.1 would not let
>>      me upload disk images to a data domain without an ISO (I think this is
>>      due to a bug)
>>
>> 3. Each volume is of the type "Distributed Replicate" and each one is
>>    composed of 9 bricks.
>>    I started with 3 bricks per volume due to the initial hyperconverged
>>    setup, then I expanded the cluster and the gluster cluster by 3 hosts at a
>>    time until I got to a total of 9 hosts.
>>
>>
>> Disks, bricks and sizes used per volume:
>>
>>    /dev/sdb   engine     100GB
>>    /dev/sdb   vmstore1   2600GB
>>    /dev/sdc   data1      2600GB
>>    /dev/sdd   data2      2600GB
>>    /dev/sde   --------   400GB SSD, used for caching purposes
>>
>> *From the above layout a few questions came up:*
>> 1. Using the web UI, how can I create a 100GB brick and a 2600GB brick to
>>    replace the bad bricks for “engine” and “vmstore1” within the same block
>>    device (sdb)? What about /dev/sde (the caching disk)? When I tried creating a
>>    new brick through the UI I saw that I could use /dev/sde for caching, but only
>>    for 1 brick (i.e. vmstore1), so if I try to create another brick, how would I
>>    specify that it is the same /dev/sde device to be used for caching?
>>    (See the rough CLI sketch after this list for splitting sdb into two bricks.)
>>
>> 2. If I want to remove a brick from a replica 3 volume, I go to
>>    Storage > Volumes > select the volume > Bricks; once in there I can select
>>    the 3 servers that compose the replicated bricks and click Remove. This
>>    gives a pop-up window with the following info:
>>
>>    Are you sure you want to remove the following Brick(s)?
>>    - vmm11:/gluster_bricks/vmstore1/vmstore1
>>    - vmm12.virt.iad3p:/gluster_bricks/vmstore1/vmstore1
>>    - 192.168.0.100:/gluster-bricks/vmstore1/vmstore1
>>    - Migrate Data from the bricks?
>>
>>    If I proceed with this, it means I will have to do this for all
>>    4 volumes; that is just not very efficient, but if that is the only way,
>>    then I am hesitant to put this into a real production environment, as there
>>    is no way I can take that kind of a hit for 500+ VMs :) and I also
>>    won't have that much spare storage or extra volumes to play with in a real
>>    scenario.
>>
>> 3. After modifying /etc/vdsm/vdsm.id yesterday by following
>>    https://stijn.tintel.eu/blog/2013/03/02/ovirt-problem-duplicate-uuids,
>>    I was able to add the server back to the cluster using a new FQDN
>>    and a new IP, and I tested replacing one of the bricks. This is my mistake,
>>    as mentioned above: I used /dev/sdb entirely for 1 brick, because through
>>    the UI I could not split the block device to be used for 2 bricks (one
>>    for the engine and one for vmstore1). So in the "gluster vol info"
>>    you might see vmm102.mydomain.com,
>>    but in reality it is myhost1.mydomain.com.
>>
>> 4. I am also attaching gluster_peer_status.txt; in the last 2
>>    entries of that file you will see an entry for vmm10.mydomain.com (the
>>    old/bad entry) and one for vmm102.mydomain.com (the new entry - the same
>>    server vmm10, but renamed to vmm102).
>>    Please also find the gluster_vol_info.txt file.
>>
>> 5. I am ready to redeploy this environment if needed, but I am also ready to
>>    test any other suggestion. If I can get a good understanding of how to
>>    recover from this, I will be ready to move to production.
>>
>> 6. Would you be willing to have a look at my setup through a shared screen?
>>
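>> A rough CLI sketch for the point raised in question 1 - carving two bricks out
>> of the same /dev/sdb outside the UI. The VG/LV names mirror the existing
>> gluster_vg_sdb layout and the sizes come from the table above; the lvmcache
>> setup for /dev/sde, fstab entries and SELinux contexts are left out:
>>
>> pvcreate /dev/sdb
>> vgcreate gluster_vg_sdb /dev/sdb
>> lvcreate -L 100G -n gluster_lv_engine gluster_vg_sdb
>> lvcreate -L 2600G -n gluster_lv_vmstore1 gluster_vg_sdb
>> mkfs.xfs /dev/gluster_vg_sdb/gluster_lv_engine
>> mkfs.xfs /dev/gluster_vg_sdb/gluster_lv_vmstore1
>> mkdir -p /gluster_bricks/engine /gluster_bricks/vmstore1
>> mount /dev/gluster_vg_sdb/gluster_lv_engine /gluster_bricks/engine
>> mount /dev/gluster_vg_sdb/gluster_lv_vmstore1 /gluster_bricks/vmstore1
>>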
>> *Thanks *
>>
>>
>> *Adrian*
>>
>> On Mon, Jun 10, 2019 at 11:41 PM Strahil <hunter86_bg(a)yahoo.com> wrote:
>>
>> Hi Adrian,
>>
>> You have several options:
>> A) If you have space on another gluster volume (or volumes) or on
>> NFS-based storage, you can migrate all VMs live. Once you do it, the
>> simple way will be to stop and remove the storage domain (from the UI) and
>> the gluster volume that corresponds to the problematic brick. Once gone, you
>> can remove the entry in oVirt for the old host and add the newly built
>> one. Then you can recreate your volume and migrate the data back.
>>
>> B) If you don't have space, you have to use a riskier approach
>> (usually it shouldn't be risky, but I had a bad experience in gluster v3):
>> - The new server has the same IP and hostname:
>> Use the command line and run 'gluster volume reset-brick VOLNAME
>> HOSTNAME:BRICKPATH HOSTNAME:BRICKPATH commit'.
>> Replace VOLNAME with your volume name.
>> A more practical example would be:
>> 'gluster volume reset-brick data ovirt3:/gluster_bricks/data/brick
>> ovirt3:/gluster_bricks/data/brick commit'
>>
>> If it refuses, then you have to clean up '/gluster_bricks/data' (which
>> should be empty).
>> Also check whether the new peer has been probed via 'gluster peer
>> status'. Check that the firewall is allowing gluster communication (you can
>> compare it to the firewalls on another gluster host).
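>>
>> If the hosts use firewalld, a quick way to check and open that - assuming the
>> glusterfs service definition shipped with the gluster packages is present:
>>
>> firewall-cmd --list-services                       # compare with a healthy peer
>> firewall-cmd --permanent --add-service=glusterfs
>> firewall-cmd --reload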
>>
>> The automatic healing will kick in within 10 minutes (if it succeeds) and will
>> stress the other 2 replicas, so pick your time properly.
>> Note: I'm not recommending you use the 'force' option in the previous
>> command ... for now :)
>>
>> - The new server has a different IP/hostname:
>> Instead of 'reset-brick' you can use 'replace-brick':
>> It should be like this:
>> gluster volume replace-brick data old-server:/path/to/brick
>> new-server:/new/path/to/brick commit force
>>
>> In both cases check the status via:
>> gluster volume info VOLNAME
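>> and, to watch the heal progress (standard gluster commands, with VOLNAME as a
>> placeholder):
>> gluster volume status VOLNAME
>> gluster volume heal VOLNAME info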
>>
>> If your cluster is in production , I really recommend you the first
>> option as it is less risky and the chance for unplanned downtime will be
>> minimal.
>>
>> The 'reset-brick' in your previous e-mail shows that one of the servers
>> is not connected. Check the peer status on all servers; if there are fewer
>> peers than there should be, check for network and/or firewall issues.
>> On the new node check if glusterd is enabled and running.
>>
>> In order to debug - you should provide more info like 'gluster volume
>> info' and the peer status from each node.
>>
>> Best Regards,
>> Strahil Nikolov
>>
>> On Jun 10, 2019 20:10, Adrian Quintero <adrianquintero(a)gmail.com> wrote:
>>
>> >
>> > Can you let me know how to fix the gluster and missing brick?
>> > I tried removing it by going to "Storage > Volumes > vmstore > Bricks" and
>> > selected the brick.
>> > However, it is showing an unknown status (which is expected because the
>> > server was completely wiped), so if I try to "remove", "replace brick"
>> > or "reset brick" it won't work.
>> > If I do remove brick: Incorrect bricks selected for removal in
>> > Distributed Replicate volume. Either all the selected bricks should be from
>> > the same sub volume or one brick each for every sub volume!
>> > If I try "replace brick" I can't, because I don't have another server
>> > with extra bricks/disks.
>> > And if I try "reset brick": Error while executing action Start Gluster
>> > Volume Reset Brick: Volume reset brick commit force failed: rc=-1 out=()
>> > err=['Host myhost1_mydomain_com not connected']
>> >
>> > Are you suggesting to try and fix the gluster using the command line?
>> >
>> > Note that I can't "peer detach" the server, so if I force the removal
>> > of the bricks, would I need to force a downgrade to replica 2 instead of 3?
>> > What would happen to oVirt, as it only supports replica 3?
>> >
>> > thanks again.
>> >
>> > On Mon, Jun 10, 2019 at 12:52 PM Strahil <hunter86_bg(a)yahoo.com>
>> wrote:
>>
>> >>
>> >> Hi Adrian,
>> >> Did you fix the issue with the gluster and the missing brick?
>> >> If yes, try to set the 'old' host in maintenance an
>>
>>
>>
>> --
>> Adrian Quintero
>>
>>
>>
>> --
>> Adrian Quintero
>>
>>
>
> --
> Adrian Quintero
> _______________________________________________
> Users mailing list -- users(a)ovirt.org
> To unsubscribe send an email to users-leave(a)ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
> List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/44ITPT3QMJI...
>
--
Thanks,
Gobinda