oVirt Gluster Hyperconverged problem

Hi Guys,

I encountered an unfortunate circumstance today. Possibly an Achilles' heel.

I have three hypervisors, HV1, HV2, HV3, all running gluster for hosted engine support. Individually they all pointed to HV1:/hosted_engine with backupvol=HV2,HV3...

HV1 lost its boot sector, which was discovered upon a reboot. This had zero impact, as designed, on the VMs.

However, now that HV1 is down, how does one go about replacing the original HV? The backup servers point to HV1, and you cannot re-add the HV through the GUI, and the CLI will not re-add it as it's already there... you cannot remove it as it is down in the GUI...

Pointing the other HVs to their own storage may make sense for multiple instances of the hosted_engine, however it's nice that the gluster volumes are replicated and that one VM can be relaunched when an HV error is detected. It's also consuming fewer resources.

What's the procedure to replace the original HV?

Hi,

Pad [1] contains the procedure to replace a host with the same FQDN when the existing host's OS has to be re-installed.

[1] https://paste.fedoraproject.org/431252/47435076/

Thanks
kasturi.
_______________________________________________ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users

Hello Hanson,

Below is the procedure to replace a host with the same FQDN when the existing host's OS has to be re-installed. If the oVirt version you are running is 4.0, steps 14 and 15 are not required; you can reinstall the host from the UI with the HostedEngine->Deploy option.

1. Move the host (host3) to maintenance in the UI.
2. Re-install the OS, subscribe to channels and install required packages, prepare bricks (if needed).
3. Check gluster peer status from a working node to obtain the UUID of the host being replaced.
4. Create brick directories by running the command mkdir /rhgs/brick{1..3}
5. Put /etc/fstab entries in the new node by copying them from the other nodes.
6. Run mount -a so that the bricks are mounted.
7. Edit the gluster UUID in /var/lib/glusterd/glusterd.info
8. Copy peer info from a working peer to /var/lib/glusterd/peers (without the peer info of the node being replaced, here host3).
9. Create and remove a tmp dir at all volume mount points.
10. Run the command setfattr -n trusted.non-existent-key -v abc <mount point> to set an extended attribute, then remove it by running the command setfattr -x trusted.non-existent-key <mount point>, at all mount points.
11. Restart glusterd.
12. Ensure heal is in progress and completes.
13. Edit the host in the UI and fetch the fingerprint in Advanced details, as the fingerprint has changed due to the reinstallation.
14. Run hosted-engine --deploy --config-append=answers.conf on host3 (this should be seen as an additional host setup; provide the host number as known by the other hosts).
15. hosted-engine deploy fails because the host being installed cannot be added to the engine, with a "hostname already known" error. Reinstalling from the UI and aborting the HE setup seems to fix this. The ovirt-ha-agent and ovirt-ha-broker services had to be started manually.
    a. Go to the UI and click the Reinstall button to reinstall the host. Reinstalling the host might fail because the management network cannot be configured.
    b. Go to the Network Interfaces tab, click Setup Host Networks, and assign the networks ovirtmgmt and glusternw to the correct NICs.
    c. Wait some time for the node to come up, then start the ovirt-ha-agent and ovirt-ha-broker services.

Thanks
kasturi.
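For readers who want to script the command-line parts of the procedure (steps 4 to 12 and the final service start), here is a minimal sketch, not a tested recovery tool. The function names, the DRY_RUN switch, and the tmp-heal-trigger directory name are hypothetical and do not come from the thread; the brick paths are the ones given in step 4. It assumes glusterd.info contains a line starting with UUID=, and it defaults to printing commands instead of running them.

```shell
#!/usr/bin/env bash
# Sketch only: helper functions mirroring the command-line steps.
# DRY_RUN=1 (the default) prints each command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Steps 4-6: recreate the brick directories and mount them from /etc/fstab.
# Assumes the fstab entries were already copied over (step 5).
prepare_bricks() {
    run mkdir -p /rhgs/brick1 /rhgs/brick2 /rhgs/brick3
    run mount -a
}

# Step 7: write the old UUID (obtained in step 3) into glusterd.info.
# Assumption: the file holds a line of the form UUID=<value>.
restore_uuid() {
    local uuid="$1"
    run sed -i "s/^UUID=.*/UUID=${uuid}/" /var/lib/glusterd/glusterd.info
}

# Steps 9-10: create/remove a temporary directory and set/remove a dummy
# extended attribute on one volume mount point.
trigger_heal() {
    local mnt="$1"
    run mkdir "${mnt}/tmp-heal-trigger"   # hypothetical directory name
    run rmdir "${mnt}/tmp-heal-trigger"
    run setfattr -n trusted.non-existent-key -v abc "$mnt"
    run setfattr -x trusted.non-existent-key "$mnt"
}

# Steps 11-12: restart glusterd and list entries still pending heal.
restart_and_check() {
    local volume="$1"
    run systemctl restart glusterd
    run gluster volume heal "$volume" info
}

# Step 15c: start the hosted-engine HA services once the node is up.
start_ha_services() {
    run systemctl start ovirt-ha-agent ovirt-ha-broker
}
```

A cautious way to use it is to source the file, call for example trigger_heal with one of your volume mount points, read the printed commands, and only then rerun with DRY_RUN=0. Step 8 (copying the peer files) is left out on purpose, since which files to copy depends on the cluster.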
participants (2)
- Hanson
- knarra