Replace bad Host from a 9 Node hyperconverged setup 4.3.3

Has anybody had to replace a failed host in a 3, 6, or 9 node hyperconverged setup with Gluster storage? One of my hosts is completely dead and I need to do a fresh install using the oVirt Node ISO. Can anybody point me to the proper steps? Thanks.

Hi Adrian,
I think the steps are:
- reinstall the host
- join it to the virtualisation cluster
And if it was a member of the Gluster cluster as well:
- go to host - storage devices
- create the bricks on the devices, as they are on the other hosts
- go to storage - volumes
- replace each failed brick with the corresponding new one.
Hope it helps.
Cheers,
Leo
On Wed, Jun 5, 2019, 23:09 <adrianquintero@gmail.com> wrote:
> Anybody have had to replace a failed host from a 3, 6, or 9 node hyperconverged setup with gluster storage? [...]
_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-leave@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/RFBYQKWC2KNZVY...
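
The brick replacement that Leo describes for the web UI can also be done with the Gluster CLI. The following is a minimal dry-run sketch, not a tested procedure: it only prints one `gluster volume replace-brick ... commit force` command per volume so the commands can be reviewed before being run on a healthy peer. The volume names, the brick path layout `/gluster_bricks/<volume>/<volume>`, the replacement host name, and the helper name `replace_brick_cmd` are assumptions based on details mentioned elsewhere in this thread; adjust them to what `gluster volume info` actually reports.

```shell
#!/usr/bin/env bash
# Dry run: print the replace-brick commands instead of executing them.
# Host names, volume names and brick paths are placeholders (assumptions).

old_host="host1.mydomain.com"      # dead host still listed in the volumes
new_host="myhost2.mydomain.com"    # reinstalled host with empty bricks

# Build one replace-brick command line for a single volume.
replace_brick_cmd() {
  local volume="$1" from_host="$2" to_host="$3" brick_path="$4"
  printf 'gluster volume replace-brick %s %s:%s %s:%s commit force\n' \
    "$volume" "$from_host" "$brick_path" "$to_host" "$brick_path"
}

# One command per volume that had a brick on the dead host.
for volume in engine vmstore1 data1 data2; do
  replace_brick_cmd "$volume" "$old_host" "$new_host" "/gluster_bricks/$volume/$volume"
done
```

After a real replacement, `gluster volume heal <volume> info` shows how far self-heal has progressed on the new brick.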

Hi Leo,
Yes, this helps a lot; it confirms the plan we had in mind. I will test tomorrow and post the results.
Thanks again,
Adrian
On Wed, Jun 5, 2019 at 11:18 PM Leo David <leoalex@gmail.com> wrote:
> Hi Adrian, I think the steps are: - reinstall the host - join it to virtualisation cluster [...]
-- Adrian Quintero

Leo,
I forgot to mention that I have one SSD disk for caching purposes. How should that part of the setup be done?
Thanks,
Adrian
On Wed, Jun 5, 2019 at 11:25 PM Adrian Quintero <adrianquintero@gmail.com> wrote:
> Hi Leo, yes, this helps a lot, this confirms the plan we had in mind. [...]
-- Adrian Quintero

I tried removing the bad host but ran into the following issue. Any idea?
Operation Canceled
Error while executing action:
host1.mydomain.com
- Cannot remove Host. Server having Gluster volume.
On Thu, Jun 6, 2019 at 11:18 AM Adrian Quintero <adrianquintero@gmail.com> wrote:
> Leo, I forgot to mention that I have 1 SSD disk for caching purposes, wondering how that setup should be achieved? [...]
-- Adrian Quintero

Have you tried ticking the "Force remove" option?
Best Regards,
Strahil Nikolov
On Thursday, June 6, 2019, 21:47:20 GMT+3, Adrian Quintero <adrianquintero@gmail.com> wrote:
> I tried removing the bad host but running into the following issue , any idea? [...]

You will need to remove the storage role from that server first (so that it is no longer part of the Gluster cluster). I cannot test this on production right now, but maybe putting the host into maintenance, even though it is already dead, while ticking the option to ignore the Gluster warning will let you remove it. I may be wrong about the procedure; can anybody advise on this situation?
Cheers,
Leo
On Thu, Jun 6, 2019 at 9:45 PM Adrian Quintero <adrianquintero@gmail.com> wrote:
> I tried removing the bad host but running into the following issue , any idea? [...]
-- Best regards, Leo David

Leo,
I did try putting it under maintenance and ticking the option to ignore Gluster, and it did not work:
Error while executing action:
- Cannot remove host. Server having gluster volume.
Note: the server was already reinstalled, so Gluster will never see the volumes or bricks for this server.
I will rename the server to myhost2.mydomain.com and try to replace the bricks; hopefully that will work. However, it would be good to know whether an existing cluster server can be reinstalled from scratch and put back into the cluster.
Still doing research; hopefully we can find a way.
Thanks again,
Adrian
On Fri, Jun 7, 2019 at 2:39 AM Leo David <leoalex@gmail.com> wrote:
> You will need to remove the storage role from that server first ( not being part of gluster cluster ). [...]
-- Adrian Quintero
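
The error wording suggests that the engine still counts bricks on the dead host as part of existing volumes. One way to see which bricks would have to be replaced is to filter the output of `gluster volume info` on a healthy node. The sketch below is an illustration only: the sample text is hypothetical and merely mimics the `Volume Name:` and `BrickN:` lines of that output, the brick paths are assumptions, and the function names `bricks_on_host` and `sample_volume_info` are made up for this example.

```shell
#!/usr/bin/env bash
# List "<volume> <host>:<brick path>" for every brick that lives on one host.
# Reads text in the style of `gluster volume info` from standard input.
bricks_on_host() {
  local host="$1"
  awk -v host="$host" '
    /^Volume Name:/ { volume = $3 }          # remember the current volume
    /^Brick[0-9]+:/ {                        # e.g. "Brick1: host:/path"
      split($2, parts, ":")
      if (parts[1] == host) print volume, $2
    }
  '
}

# Hypothetical sample input, NOT real output from the cluster in this thread.
sample_volume_info() {
  cat <<'EOF'
Volume Name: engine
Type: Replicate
Bricks:
Brick1: host1.mydomain.com:/gluster_bricks/engine/engine
Brick2: host2.mydomain.com:/gluster_bricks/engine/engine
Brick3: host3.mydomain.com:/gluster_bricks/engine/engine
Volume Name: data1
Type: Replicate
Bricks:
Brick1: host1.mydomain.com:/gluster_bricks/data1/data1
Brick2: host2.mydomain.com:/gluster_bricks/data1/data1
Brick3: host3.mydomain.com:/gluster_bricks/data1/data1
EOF
}

sample_volume_info | bricks_on_host host1.mydomain.com
```

On a live node the same filter would be fed from the real command, for example `gluster volume info | bricks_on_host host1.mydomain.com`.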

OK, I have tried reinstalling the server from scratch with a different name and IP address, and when trying to add it to the cluster I get the following error:
Event details
ID: 505
Time: Jun 10, 2019, 10:00:00 AM
Message: Host myshost2.virt.iad3p installation failed. Host myhost2.mydomain.com reports unique id which already registered for myhost1.mydomain.com
I am at a loss here. I don't have a brand new server for this and need to re-use what I have.
From the oVirt engine log (/var/log/ovirt-engine/engine.log):
2019-06-10 10:57:59,950-04 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-37744) [9b88055] EVENT_ID: VDS_INSTALL_FAILED(505), Host myhost2.mydomain.com installation failed. Host myhost2.mydomain.com reports unique id which already registered for myhost1.mydomain.com
In /var/log/ovirt-engine/host-deploy/ovirt-host-deploy-20190610105759-myhost2.mydomain.com-9b88055.log on the oVirt engine I see that host deploy runs the following command to identify the system. If that is the case, it will never work :( because it identifies each host by the system UUID.
dmidecode -s system-uuid
b64d566e-055d-44d4-83a2-d3b83f25412e
Any suggestions?
Thanks
On Sat, Jun 8, 2019 at 11:23 AM Adrian Quintero <adrianquintero@gmail.com> wrote:
> Leo, I did try putting it under maintenance and checking to ignore gluster and it did not work. [...]
-- Adrian Quintero

At this point I'd go to the engine VM and remove the host from the Postgres DB manually. A bit of a hack, but...
    ssh root@<engine>
    su - postgres
    cd /opt/rh/rh-postgresql10/
    source enable
    psql engine
    select vds_id from vds_static where host_name='myhost1.mydomain.com';
    select DeleteVds('<vds_id from prev statement>');
Of course, keep in mind that editing the database directly is a last resort and not supported in any way.
-- Dmitry Filonov
Linux Administrator
SBGrid Core | Harvard Medical School
250 Longwood Ave, SGM-114
Boston, MA 02115
On Mon, Jun 10, 2019 at 11:16 AM Adrian Quintero <adrianquintero@gmail.com> wrote:
> Ok I have tried reinstalling the server from scratch with a different name and IP address and when trying to add it to cluster I get the following error: [...]

Hi,
I think you can generate and use a new UUID, although I am not sure about the procedure right now.
On Mon, Jun 10, 2019, 18:13 Adrian Quintero <adrianquintero@gmail.com> wrote:
> Ok I have tried reinstalling the server from scratch with a different name and IP address and when trying to add it to cluster I get the following error: [...]
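
For reference, a later message in this thread reports that adding /etc/vdsm/vdsm.id on the reinstalled host resolved the duplicate id error. A minimal sketch of generating such a file follows. It assumes that VDSM prefers the id stored in that file over the hardware UUID reported by `dmidecode -s system-uuid`, which is what that later report implies; the function names `generate_uuid` and `write_host_id` are made up for illustration, and the demo writes to a scratch file rather than to /etc/vdsm/vdsm.id.

```shell
#!/usr/bin/env bash
# Print a random UUID, preferring uuidgen and falling back to the Linux kernel.
generate_uuid() {
  if command -v uuidgen >/dev/null 2>&1; then
    uuidgen
  else
    cat /proc/sys/kernel/random/uuid
  fi
}

# Write a new lowercase UUID to the given file unless it already holds an id.
write_host_id() {
  local id_file="$1"
  if [ -s "$id_file" ]; then
    echo "keeping existing id in $id_file" >&2
    return 0
  fi
  generate_uuid | tr 'A-Z' 'a-z' > "$id_file"
}

# Demo against a scratch file; on a real host the target would be
# /etc/vdsm/vdsm.id (assumption, see the lead-in), written as root.
demo_file="$(mktemp)"
write_host_id "$demo_file"
cat "$demo_file"
```

The function refuses to overwrite a non-empty file on purpose, since changing the id of a host that is already registered would make the engine see it as a different machine.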

https://stijn.tintel.eu/blog/2013/03/02/ovirt-problem-duplicate-uuids
On Mon, Jun 10, 2019, 18:13 Adrian Quintero <adrianquintero@gmail.com> wrote:
> Ok I have tried reinstalling the server from scratch with a different name and IP address and when trying to add it to cluster I get the following error: [...]

Thanks for pointing me in the right direction. I was able to add the server to the cluster by adding /etc/vdsm/vdsm.id.

I will now try to create the new bricks and then try a brick replacement. I think I will have to do this part through the command line, because my hyperconverged setup with replica 3 is laid out as follows:

/dev/sdb = /gluster_bricks/engine 100G
/dev/sdb = /gluster_bricks/vmstore1 2600G
/dev/sdc = /gluster_bricks/data1 2700G
/dev/sdd = /gluster_bricks/data2 2700G
/dev/sde = caching disk

The issue I see here is that there is no option in the web UI to create 2 bricks on the same /dev/sdb (one of 100 GB for the engine and one of 2600 GB for vmstore1).

So if you have any ideas they are most welcome. Thanks again.

On Mon, Jun 10, 2019 at 4:35 PM Leo David <leoalex@gmail.com> wrote:
https://stijn.tintel.eu/blog/2013/03/02/ovirt-problem-duplicate-uuids
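For the disk layout listed in the reply above (two bricks on /dev/sdb), a command-line plan could be sketched roughly as follows. This only prints the commands, it does not run them. It assumes plain (thick) LVM logical volumes formatted with XFS; the volume group and logical volume names are hypothetical, the /dev/sde caching disk is left out, and the sizes may need reducing to fit the real disks. It is not necessarily what the oVirt hyperconverged deployment did on the other hosts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Brick:
    device: str  # backing disk, as listed in the message
    mount: str   # brick mount point, as listed in the message
    size_g: int  # size in G, as listed in the message


# Layout copied from the message; /dev/sde (caching disk) is not handled here.
LAYOUT = [
    Brick("/dev/sdb", "/gluster_bricks/engine", 100),
    Brick("/dev/sdb", "/gluster_bricks/vmstore1", 2600),
    Brick("/dev/sdc", "/gluster_bricks/data1", 2700),
    Brick("/dev/sdd", "/gluster_bricks/data2", 2700),
]


def short(path):
    """Last path component, e.g. '/dev/sdb' -> 'sdb'."""
    return path.rstrip("/").rsplit("/", 1)[-1]


def plan(layout):
    """Return shell commands (as strings) that would build the layout."""
    cmds, seen = [], set()
    for b in layout:
        vg = "gluster_vg_" + short(b.device)  # hypothetical naming scheme
        lv = "gluster_lv_" + short(b.mount)   # hypothetical naming scheme
        if b.device not in seen:
            # One physical volume and one volume group per disk.
            seen.add(b.device)
            cmds += [f"pvcreate {b.device}", f"vgcreate {vg} {b.device}"]
        cmds += [
            f"lvcreate -L {b.size_g}G -n {lv} {vg}",
            f"mkfs.xfs -i size=512 /dev/{vg}/{lv}",
            f"mkdir -p {b.mount}",
            f"mount /dev/{vg}/{lv} {b.mount}",
        ]
    return cmds


if __name__ == "__main__":
    for cmd in plan(LAYOUT):
        print(cmd)
```

Because both /dev/sdb bricks share one volume group, the second brick only adds another logical volume, which is the part the web UI did not offer.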
On Mon, Jun 10, 2019, 18:13 Adrian Quintero <adrianquintero@gmail.com> wrote:
Ok I have tried reinstalling the server from scratch with a different name and IP address and when trying to add it to cluster I get the following error:
Event details
ID: 505
Time: Jun 10, 2019, 10:00:00 AM
Message: Host myshost2.virt.iad3p installation failed. Host myhost2.mydomain.com reports unique id which already registered for myhost1.mydomain.com
I am at a loss here. I don't have a brand new server for this and I need to re-use what I have.
From the oVirt engine log (/var/log/ovirt-engine/engine.log):
2019-06-10 10:57:59,950-04 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-37744) [9b88055] EVENT_ID: VDS_INSTALL_FAILED(505), Host myhost2.mydomain.com installation failed. Host myhost2.mydomain.com reports unique id which already registered for myhost1.mydomain.com
In /var/log/ovirt-engine/host-deploy/ovirt-host-deploy-20190610105759-myhost2.mydomain.com-9b88055.log on the oVirt engine I see that host deploy runs the following command to identify the system. If that is the case then it will never work :( because it identifies each host by its system UUID, and the reinstalled machine still reports the same one.
dmidecode -s system-uuid
b64d566e-055d-44d4-83a2-d3b83f25412e
Any suggestions?
Thanks
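Later in the thread the duplicate-id error was resolved by adding /etc/vdsm/vdsm.id on the reinstalled host. A minimal sketch of that step is below, assuming the file simply holds one UUID that differs from the one already registered in the engine. The helper name is hypothetical, and the demo writes to a temporary path rather than the real file.

```python
import os
import tempfile
import uuid
from pathlib import Path


def write_host_id(path="/etc/vdsm/vdsm.id", new_id=None):
    """Write a host id to `path` and return it.

    A random UUID is generated unless `new_id` is given.
    """
    host_id = str(new_id or uuid.uuid4())
    uuid.UUID(host_id)  # raises ValueError if this is not a well-formed UUID
    Path(path).write_text(host_id + "\n")
    return host_id


if __name__ == "__main__":
    # Demo against a scratch file so the real /etc/vdsm/vdsm.id is untouched.
    demo_path = os.path.join(tempfile.mkdtemp(), "vdsm.id")
    print(write_host_id(demo_path))
```

On the real host this would be run as root before adding the host in the engine again. Whether any service restart is also needed is not covered by the thread.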
On Sat, Jun 8, 2019 at 11:23 AM Adrian Quintero <adrianquintero@gmail.com> wrote:
Leo, I did try putting it under maintenance and checking the option to ignore gluster, and it did not work.
Error while executing action:
- Cannot remove host. Server having gluster volume.
Note: the server was already reinstalled so gluster will never see the volumes or bricks for this server.
I will rename the server to myhost2.mydomain.com and try to replace the bricks; hopefully that will work. However, it would be good to know whether you can re-install an existing cluster server from scratch and put it back into the cluster.
I am still doing research; hopefully we can find a way.
thanks again
Adrian
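The brick replacement mentioned above (bricks of the old host name replaced by bricks on the renamed host) could be planned from the command line roughly like this. The sketch only builds the `gluster volume replace-brick ... commit force` command strings. The volume names and the surviving host names are hypothetical, and it assumes the new bricks use the same paths as the old ones.

```python
def replace_brick_cmds(volume_bricks, old_host, new_host):
    """Build one replace-brick command per brick that lived on old_host.

    volume_bricks maps a volume name to its list of 'host:/path' bricks.
    """
    cmds = []
    for volume, bricks in volume_bricks.items():
        for brick in bricks:
            host, path = brick.split(":", 1)
            if host == old_host:
                cmds.append(
                    f"gluster volume replace-brick {volume} "
                    f"{old_host}:{path} {new_host}:{path} commit force"
                )
    return cmds


if __name__ == "__main__":
    # Hypothetical volumes and peers, for illustration only.
    volumes = {
        "engine": [
            "myhost1.mydomain.com:/gluster_bricks/engine",
            "hostb.mydomain.com:/gluster_bricks/engine",
            "hostc.mydomain.com:/gluster_bricks/engine",
        ],
    }
    for cmd in replace_brick_cmds(
        volumes, "myhost1.mydomain.com", "myhost2.mydomain.com"
    ):
        print(cmd)
```

The new brick directories have to exist and be empty on the new host before these commands are run, and the replaced bricks then need to heal from the surviving replicas.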

It definitely is a challenge trying to replace a bad host.

So let me tell you what I see and have done so far:

1.- I have a host that went bad due to HW issues.
2.- This bad host is still showing in the Compute --> Hosts section.
3.- This host was part of a hyperconverged setup with Gluster.
4.- The gluster bricks for this server show up with a "?" mark inside the volumes under Storage --> Volumes --> Myvolume --> Bricks.
5.- Under Compute --> Hosts --> mybadhost.mydomain.com the host is in maintenance mode.
6.- When I try to remove that host (with "Force Remove" ticked) I keep getting:
    Operation Canceled
    Error while executing action:
    mybadhost.mydomain.com
    - Cannot remove Host. Server having Gluster volume.
    Note: I have also confirmed "host has been rebooted".

Since the bad host was not recoverable (it was fried), I took a brand new server with the same specs, installed oVirt 4.3.3 on it, and have it ready to add to the cluster with the same hostname and IP, but I can't do this until I remove the old entries in the web UI of the Hosted Engine VM.

If this is not possible, would I really need to add this new host with a different name and IP? What would be the correct and best procedure to fix this?

Note that my setup is a 9-node hyperconverged setup with replica 3 bricks in a distributed-replicated volume scenario.

Thanks

I'll presume you didn't fully back up the root file system of the host that was fried. It may be easier to replace it with a new hostname/IP.

I would focus on the gluster config first, since it was hyperconverged. I don't know how the engine UI detects the gluster mount on the missing host and decides not to remove the old host. You probably also have the storage domain "mounted in the data-center" with backup volume servers pointing at the old host details. The remaining gluster peers also notice the outage, and it could be detecting that.

I would try to make the gluster changes first, so that the engine UI may then allow you to remove the old hyperconverged host entry. (The engine UI is really trying to protect your gluster data.) I'd try changing the mount options, and there is a way to tell gluster to use only two hosts and stop trying to connect to the third, but I don't remember the details.

On Thu, Jun 6, 2019 at 4:32 PM <adrianquintero@gmail.com> wrote:
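The mount-option change suggested in the reply above can be illustrated with a small sketch. It assumes the storage domain is mounted with the glusterfs `backup-volfile-servers` mount option (a colon-separated list of fallback servers) and simply rebuilds that value without the dead host. The peer names other than mybadhost.mydomain.com are hypothetical, and this does not cover the gluster-side change that the reply could not recall.

```python
def gluster_mount_target(peers, dead_host):
    """Pick a primary server and a backup list that skip dead_host.

    Returns (primary_server, mount_options_string).
    """
    alive = [p for p in peers if p != dead_host]
    if not alive:
        raise ValueError("no surviving gluster peer to mount from")
    primary, backups = alive[0], alive[1:]
    options = "backup-volfile-servers=" + ":".join(backups) if backups else ""
    return primary, options


if __name__ == "__main__":
    peers = [
        "mybadhost.mydomain.com",
        "hostb.mydomain.com",  # hypothetical surviving peer
        "hostc.mydomain.com",  # hypothetical surviving peer
    ]
    print(gluster_mount_target(peers, "mybadhost.mydomain.com"))
```

The resulting option string would go into the mount options of the storage domain, so that clients stop asking the dead host for the volume file.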

Can you remove the bricks that belong to the fried server, either from the GUI or the CLI? You should be able to do so, and then it should allow you to remove the host from the oVirt setup.

-- Dmitry Filonov
Linux Administrator
SBGrid Core | Harvard Medical School
250 Longwood Ave, SGM-114
Boston, MA 02115

On Thu, Jun 6, 2019 at 4:36 PM <adrianquintero@gmail.com> wrote:
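Before removing or replacing bricks as suggested above, it helps to know exactly which bricks lived on the dead server. A rough sketch is below: it scans text in the style of `gluster volume info` output (lines such as `Volume Name: ...` and `BrickN: host:/path`) and lists the bricks on one host. The sample text is made up for illustration and is not output from this cluster.

```python
import re

# Matches lines like "Brick1: host.example.com:/gluster_bricks/engine/engine"
BRICK_RE = re.compile(r"^Brick\d+:\s*(\S+?):(/\S+)")


def bricks_on_host(volume_info_text, host):
    """Return (volume, brick_path) pairs for bricks located on `host`."""
    found, volume = [], None
    for raw in volume_info_text.splitlines():
        line = raw.strip()
        if line.startswith("Volume Name:"):
            volume = line.split(":", 1)[1].strip()
            continue
        match = BRICK_RE.match(line)
        if match and match.group(1) == host:
            found.append((volume, match.group(2)))
    return found


# Hypothetical sample in the style of `gluster volume info`.
SAMPLE = """\
Volume Name: engine
Type: Replicate
Number of Bricks: 1 x 3 = 3
Bricks:
Brick1: mybadhost.mydomain.com:/gluster_bricks/engine/engine
Brick2: hostb.mydomain.com:/gluster_bricks/engine/engine
Brick3: hostc.mydomain.com:/gluster_bricks/engine/engine

Volume Name: data1
Type: Distributed-Replicate
Bricks:
Brick1: mybadhost.mydomain.com:/gluster_bricks/data1/data1
Brick2: hostb.mydomain.com:/gluster_bricks/data1/data1
"""

if __name__ == "__main__":
    for volume, path in bricks_on_host(SAMPLE, "mybadhost.mydomain.com"):
        print(volume, path)
```

In practice the text would come from running `gluster volume info` on one of the surviving peers.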
participants (6)
- Adrian Quintero
- adrianquintero@gmail.com
- Dmitry Filonov
- Edward Berger
- Leo David
- Strahil Nikolov