OK, thanks.  I probably made the mistake of not removing the host from the engine first before doing work on the server.

I reinstalled because this was the machine that had a dead CPU so I had to move the NIC to a new PCI slot serviced by Socket 1 on the motherboard.  This meant the interface names changed and it was easier to reinstall than spend a day working out how to try and modify the node configuration.

Cheers.

On Thu, Apr 23, 2020 at 7:25 AM Yedidyah Bar David <didi@redhat.com> wrote:
On Wed, Apr 22, 2020 at 7:10 PM Strahil Nikolov <hunter86_bg@yahoo.com> wrote:
>
> On April 22, 2020 11:57:31 AM GMT+03:00, Yedidyah Bar David <didi@redhat.com> wrote:
> >On Wed, Apr 22, 2020 at 11:52 AM Shareef Jalloq <shareef@jalloq.co.uk> wrote:
> >>
> >> Thanks for the suggestions everyone.  First up, no, this is not part of a Gluster HCI.
> >>
> >> Secondly, according to the error pop-up, "Confirm Host has been rebooted" seems to require the host to have a status of "Non operational", "Maintenance" or "Connecting".
> >>
> >> Ah, as I'm writing this, I shut down the host and now the status changes to Non-responsive and lets me put it in maintenance so I can remove it.  So it looks like the change in config causes the engine to not be able to communicate with the host.  Is that a bug or expected?  What should my workflow have been here after reinstalling the node?
> >
> >What did you do in practice?
> >
> >I think something like:
> >
> >1. Move to maintenance
> >2. Remove from engine
> >3. Reinstall OS
> >4. Add to engine
> >
> >But this depends on exactly what you wanted to achieve, or IOW why you
> >reinstalled.
> >
> >Best regards,
> >
> >>
> >> Thanks, Shareef.
> >>
> >>
> >>
> >> On Wed, Apr 22, 2020 at 7:46 AM Yedidyah Bar David <didi@redhat.com> wrote:
> >>>
> >>> On Wed, Apr 22, 2020 at 2:17 AM Strahil Nikolov <hunter86_bg@yahoo.com> wrote:
> >>> >
> >>> > On April 22, 2020 12:41:49 AM GMT+03:00, "Maton, Brett" <matonb@ltresources.co.uk> wrote:
> >>> > >Last time I had to forcibly remove a node because it was impossible to do so otherwise, it had never ever had anything to do with gluster, so I STRONGLY dispute your claim that fixing an issue (that was not stated) will fix anything.
> >>> > >
> >>> > >On Tue, 21 Apr 2020 at 22:39, Maton, Brett <matonb@ltresources.co.uk> wrote:
> >>> > >
> >>> > >> I'm sorry, but there was no suggestion that the node had anything to do with gluster; the question clearly stated was how to remove a dead and unmanageable node from the cluster.
> >>> > >>
> >>> > >> On Tue, 21 Apr 2020 at 20:41, Strahil Nikolov <hunter86_bg@yahoo.com> wrote:
> >>> > >>
> >>> > >>> Not a good approach.
> >>> > >>> It's important to know if the node was also a gluster peer in the storage pool - if yes, it needs to be replaced with 'replace-brick' or 'reset-brick' (depending on whether you use the old hostname or not).
> >>> > >>> Once the storage node is replaced, oVirt will allow you to remove it.
> >>> > >>>
> >>> > >>>
> >>> > >>> Best Regards,
> >>> > >>> Strahil Nikolov
> >>> > >>>
> >>> > >>> On Tuesday, 21 April 2020, 19:46:47 GMT+3, Maton, Brett <matonb@ltresources.co.uk> wrote:
> >>> > >>>
> >>> > >>> Last time I had to do this I removed the host from the database.
> >>> > >>>
> >>> > >>> (at your own risk)
> >>> > >>> On ovirt engine switch to the postgres user from root:
> >>> > >>>
> >>> > >>> su - postgres
> >>> > >>>
> >>> > >>> Enable postgres 10 and connect to the engine database:
> >>> > >>>
> >>> > >>> . scl_source enable rh-postgresql10
> >>> > >>> psql -d engine
> >>> > >>>
> >>> > >>> Change <host name to remove> to the name (Name column of the host in the UI) of the host you want to get rid of (leave the '\' and \'' in place):
> >>> > >>>
> >>> > >>> BEGIN;
> >>> > >>> \set host '\'<host name to remove>\''
> >>> > >>> DELETE FROM vds_dynamic WHERE vds_id IN (SELECT vds_id FROM vds_static WHERE vds_name = :host);
> >>> > >>> DELETE FROM vds_statistics WHERE vds_id IN (SELECT vds_id FROM vds_static WHERE vds_name = :host);
> >>> > >>> DELETE FROM vds_static WHERE vds_name = :host;
> >>> > >>> COMMIT;
> >>> > >>> <CTRL d - exit>
> >>> > >>>
> >>> > >>>
> >>> > >>>
> >>> > >>> On Tue, 21 Apr 2020 at 15:19, Shareef Jalloq <shareef@jalloq.co.uk> wrote:
> >>> > >>> > Hi,
> >>> > >>> >
> >>> > >>> > I seem to have got a stale host in my engine that I can't remove.  I recently reinstalled oVirt Node on this host and, while trying to refresh the host in the engine, have got it into some state where I can't do anything.
> >>> > >>> >
> >>> > >>> > The host is listed as Status=Unassigned.  Under the Management pull down I only have Restart and Stop options, both of which error if selected.  The Remove button is not available.
> >>> > >>> >
> >>> > >>> > How do I force a removal of this host from the view so I can reload it?
> >>> > >>> >
> >>> > >>> > Shareef.
> >>> > >>> > _______________________________________________
> >>> > >>> > Users mailing list -- users@ovirt.org
> >>> > >>> > To unsubscribe send an email to users-leave@ovirt.org
> >>> > >>> > Privacy Statement: https://www.ovirt.org/privacy-policy.html
> >>> > >>> > oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
> >>> > >>> > List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/TCBGLVCLMTZSQAURDDVGHKYXXQ36NO5H/
> >>> > >>> >
> >>> > >>>
> >>> > >>
> >>> >
> >>> > Last time I had to remove a host - I didn't have your issues.
> >>> > Yet, I'm not claiming the opposite - just that in case the host is also a gluster node, oVirt will intentionally prevent removal of the node.
> >>> >
> >>> > Your approach could be absolutely valid in case there is no Gluster involved - yet for HCI, the Gluster brick has to be replaced prior to taking any actions for the removal.
> >>>
> >>> Did any of you try to "Confirm 'Host has been rebooted'"? Did this help?
> >>>
> >>> Best regards,
> >>> --
> >>> Didi
> >>>
>
> Hey Didi,
>
> Sometimes the node just dies and I guess Shareef tested that situation.

Obviously.

"Confirm 'Host has been rebooted'" does not imply that you actually rebooted
it. It just means that the engine can be sure it's not running stuff anymore.
If it's indeed up and the engine can connect to it, great. Otherwise, it does
mean that the engine can know that VMs that used to be on it are down, etc.
The engine still might wait until some timeout trying to connect to it, or
find its status (or the status of operations/objects related to it, etc.).
I admit I am not an engine developer myself and am not certain about
the details.
But in such a case (you want to remove a dead host), you can always try to Confirm
and see if this helps. Admittedly, this does not help in every case.
If you run into a reproducible flow in which you think it should be possible
to remove a host but it's not, please open a bug. Thanks!
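
To make the status discussion in this thread easier to follow, here is a rough Python sketch of what was reported above. It is only a summary of the observations in this thread (the statuses named in the error pop-up, and what Shareef saw after shutting the host down), not the engine's real state machine. The names HostStatus, CONFIRM_ALLOWED and suggested_next_step are made up for illustration.

```python
from enum import Enum


class HostStatus(Enum):
    """Host statuses mentioned in this thread (not the engine's full list)."""
    UNASSIGNED = "Unassigned"
    NON_RESPONSIVE = "Non-responsive"
    NON_OPERATIONAL = "Non operational"
    MAINTENANCE = "Maintenance"
    CONNECTING = "Connecting"


# Statuses the error pop-up listed as required for
# "Confirm 'Host has been rebooted'", as reported by Shareef.
CONFIRM_ALLOWED = frozenset({
    HostStatus.NON_OPERATIONAL,
    HostStatus.MAINTENANCE,
    HostStatus.CONNECTING,
})


def suggested_next_step(status: HostStatus) -> str:
    """Return the next manual step this thread suggests for a dead host.

    Hypothetical helper: it encodes only what was observed in this thread,
    not the engine's actual logic.
    """
    if status is HostStatus.MAINTENANCE:
        return "remove the host from the engine"
    if status is HostStatus.NON_RESPONSIVE:
        return "move the host to maintenance"
    if status in CONFIRM_ALLOWED:
        return "try Confirm 'Host has been rebooted'"
    # Unassigned: Shareef got out of this state by shutting the host down.
    return "shut the host down and wait for Non-responsive"


if __name__ == "__main__":
    for status in HostStatus:
        print(f"{status.value}: {suggested_next_step(status)}")
```

Other statuses, and any timeouts the engine applies before changing a host's status, are not covered here.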

Best regards,

>
> Best Regards,
> Strahil Nikolov
>


--
Didi