OK, thanks. I probably made the mistake of not removing the host from the
engine first, before doing work on the server.
I reinstalled because this was the machine that had a dead CPU so I had to
move the NIC to a new PCI slot serviced by Socket 1 on the motherboard.
This meant the interface names changed and it was easier to reinstall than
spend a day working out how to try and modify the node configuration.
Cheers.
On Thu, Apr 23, 2020 at 7:25 AM Yedidyah Bar David <didi(a)redhat.com> wrote:
On Wed, Apr 22, 2020 at 7:10 PM Strahil Nikolov <hunter86_bg(a)yahoo.com> wrote:
>
> On April 22, 2020 11:57:31 AM GMT+03:00, Yedidyah Bar David <didi(a)redhat.com> wrote:
> >On Wed, Apr 22, 2020 at 11:52 AM Shareef Jalloq <shareef(a)jalloq.co.uk> wrote:
> >>
> >> Thanks for the suggestions everyone. First up, no, this is not part
> >> of a Gluster HCI.
> >>
> >> Secondly, "Confirm Host has been rebooted" seems to require the host
> >> have a status of "Non operational", "Maintenance" or "Connecting" from
> >> the error pop up.
> >>
> >> Ah, as I'm writing this, I shut down the host and now the status
> >> changes to Non-responsive and lets me put it in maintenance so I can
> >> remove it. So it looks like the change in config causes the engine to
> >> not be able to communicate with the host. Is that a bug or expected?
> >> What should my workflow have been here after reinstalling the node?
> >
> >What did you do in practice?
> >
> >I think something like:
> >
> >1. Move to maintenance
> >2. Remove from engine
> >3. Reinstall OS
> >4. Add to engine
> >
> >But this depends on exactly what you wanted to achieve, or IOW why you
> >reinstalled.
> >
> >Best regards,
> >
> >>
> >> Thanks, Shareef.
> >>
> >>
> >>
> >> On Wed, Apr 22, 2020 at 7:46 AM Yedidyah Bar David <didi(a)redhat.com> wrote:
> >>>
> >>> On Wed, Apr 22, 2020 at 2:17 AM Strahil Nikolov <hunter86_bg(a)yahoo.com> wrote:
> >>> >
> >>> > On April 22, 2020 12:41:49 AM GMT+03:00, "Maton, Brett" <matonb(a)ltresources.co.uk> wrote:
> >>> > >Last time I had to forcibly remove a node because it was impossible
> >>> > >to do so otherwise, it had never ever had anything to do with gluster,
> >>> > >so I STRONGLY dispute your claim that fixing an issue (that was not
> >>> > >stated) will fix anything.
> >>> > >
> >>> > >On Tue, 21 Apr 2020 at 22:39, Maton, Brett <matonb(a)ltresources.co.uk> wrote:
> >>> > >
> >>> > >> I'm sorry, there was no suggestion that the node had anything to do
> >>> > >> with gluster; the question clearly stated was how to remove a dead
> >>> > >> and unmanageable node from the cluster.
> >>> > >>
> >>> > >> On Tue, 21 Apr 2020 at 20:41, Strahil Nikolov <hunter86_bg(a)yahoo.com> wrote:
> >>> > >>
> >>> > >>> Not a good approach.
> >>> > >>> It's important to know if the node was also a gluster peer in the
> >>> > >>> storage pool - if yes, it needs to be replaced with 'replace-brick'
> >>> > >>> or 'reset-brick' (depending if you use the old hostname or not).
> >>> > >>> Once the storage node is replaced - oVirt will allow you to remove it.
> >>> > >>>
> >>> > >>>
> >>> > >>> Best Regards,
> >>> > >>> Strahil Nikolov
> >>> > >>>
> >>> > >>>
> >>> > >>>
> >>> > >>>
> >>> > >>>
> >>> > >>>
> >>> > >>> On Tuesday, 21 April 2020, 19:46:47 GMT+3, Maton, Brett <matonb(a)ltresources.co.uk> wrote:
> >>> > >>>
> >>> > >>>
> >>> > >>>
> >>> > >>>
> >>> > >>>
> >>> > >>> Last time I had to do this I removed from the database.
> >>> > >>>
> >>> > >>> (at your own risk)
> >>> > >>> On ovirt engine switch to the postgres user from root:
> >>> > >>>
> >>> > >>> su - postgres
> >>> > >>>
> >>> > >>> Enable postgres 10 and connect to the engine database:
> >>> > >>>
> >>> > >>> . scl_source enable rh-postgresql10
> >>> > >>> psql -d engine
> >>> > >>>
> >>> > >>> Change <host name to remove> to the name (Name column of the host in
> >>> > >>> the UI) of the host you want to get rid of ( leave the '\' and \''
> >>> > >>> in place )
> >>> > >>>
> >>> > >>> BEGIN;
> >>> > >>> \set host '\'<host name to remove>\''
> >>> > >>> DELETE FROM vds_dynamic WHERE vds_id IN (SELECT vds_id FROM vds_static WHERE vds_name = :host);
> >>> > >>> DELETE FROM vds_statistics WHERE vds_id IN (SELECT vds_id FROM vds_static WHERE vds_name = :host);
> >>> > >>> DELETE FROM vds_static WHERE vds_name = :host;
> >>> > >>> COMMIT;
> >>> > >>> <CTRL d - exit>
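The ordering of the statements quoted above (dependent rows first, then the
vds_static row, all inside one transaction) can be tried out safely away from
a real engine. The following is only a minimal sketch, assuming an in-memory
SQLite database as a stand-in for the engine's PostgreSQL database. The table
names and the vds_id / vds_name columns come from the quoted message; the
foreign keys, the sample rows, and the helper names make_demo_db and
remove_host are hypothetical and exist purely for illustration.

```python
import sqlite3


def make_demo_db():
    """Build a tiny stand-in schema (NOT the real engine schema)."""
    # isolation_level=None puts sqlite3 in autocommit mode, so the explicit
    # BEGIN / COMMIT / ROLLBACK statements below control the transaction.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE vds_static (vds_id TEXT PRIMARY KEY, vds_name TEXT UNIQUE);
        -- Assumed for the demo: child tables reference vds_static by vds_id.
        CREATE TABLE vds_dynamic (vds_id TEXT REFERENCES vds_static(vds_id));
        CREATE TABLE vds_statistics (vds_id TEXT REFERENCES vds_static(vds_id));
        INSERT INTO vds_static VALUES ('id-1', 'node1'), ('id-2', 'node2');
        INSERT INTO vds_dynamic VALUES ('id-1'), ('id-2');
        INSERT INTO vds_statistics VALUES ('id-1'), ('id-2');
        """
    )
    return conn


def remove_host(conn, host_name):
    """Delete one host's rows, children first, in a single transaction."""
    lookup = "SELECT vds_id FROM vds_static WHERE vds_name = ?"
    cur = conn.cursor()
    cur.execute("BEGIN")
    try:
        # Child rows go first so nothing is left pointing at the host row.
        for child in ("vds_dynamic", "vds_statistics"):
            cur.execute(
                f"DELETE FROM {child} WHERE vds_id IN ({lookup})", (host_name,)
            )
        cur.execute("DELETE FROM vds_static WHERE vds_name = ?", (host_name,))
        removed = cur.rowcount
        if removed != 1:
            # A mistyped name should change nothing at all.
            raise LookupError(f"no host named {host_name!r}")
        cur.execute("COMMIT")
    except Exception:
        cur.execute("ROLLBACK")
        raise
    return removed


if __name__ == "__main__":
    db = make_demo_db()
    print(remove_host(db, "node1"))
```

Unlike the psql variant, this sketch passes the host name as a bound parameter
instead of a quoted \set variable, and it rolls back if the name matches no
host. Neither behaviour is claimed for the original commands; they are design
choices of the sketch.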
> >>> > >>>
> >>> > >>>
> >>> > >>>
> >>> > >>> On Tue, 21 Apr 2020 at 15:19, Shareef Jalloq <shareef(a)jalloq.co.uk> wrote:
> >>> > >>> > Hi,
> >>> > >>> >
> >>> > >>> > I seem to have got a stale host in my engine that I can't remove. I
> >>> > >>> > recently reinstalled oVirt Node on this host and while trying to
> >>> > >>> > refresh the host in the engine, have got it in some state where I
> >>> > >>> > can't do anything.
> >>> > >>> >
> >>> > >>> > The host is listed as Status=Unassigned. Under the Management pull
> >>> > >>> > down I only have Restart and Stop options, both of which error if
> >>> > >>> > selected. The Remove button is not available.
> >>> > >>> >
> >>> > >>> > How do I force a removal of this host from the view so I can reload it?
> >>> > >>> >
> >>> > >>> > Shareef.
> >>> > >>> > _______________________________________________
> >>> > >>> > Users mailing list -- users(a)ovirt.org
> >>> > >>> > To unsubscribe send an email to users-leave(a)ovirt.org
> >>> > >>> > Privacy Statement: https://www.ovirt.org/privacy-policy.html
> >>> > >>> > oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
> >>> > >>> > List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/TCBGLVCLMTZ...
> >>> > >>> >
> >>> > >>>
> >>> > >>
> >>> >
> >>> > Last time I had to remove a host - I didn't have your issues.
> >>> > Yet, I'm not claiming the opposite - just that in case the host is also
> >>> > a gluster node, oVirt will intentionally prevent removal of the node.
> >>> >
> >>> > Your approach could be absolutely valid in case there is no Gluster
> >>> > involved - yet for HCI, the Gluster brick has to be replaced prior to
> >>> > taking any actions for the removal.
> >>>
> >>> Did any of you try to "Confirm 'Host has been rebooted'"? Did this help?
> >>>
> >>> Best regards,
> >>> --
> >>> Didi
> >>>
>
> Hey Didi,
>
> Sometimes the node just dies and I guess Shareef tested that situation.
Obviously.

"Confirm 'Host has been rebooted'" does not imply that you actually
rebooted it. It just means that the engine can be sure it's not running
stuff anymore. If it's indeed up and the engine can connect to it, great.
Otherwise, it does mean that the engine can know that VMs that used to be
on it are down, etc. The engine still might wait until some timeout trying
to connect to it, or find its status (or the status of operations/objects
related to it, etc.). I admit I am not an engine developer myself and am
not certain about the details.

But in such a case (want to remove a dead host), you can always try to
Confirm and see if this helps. Admittedly, this does not help in each and
every case.

If you run into a reproducible flow in which you think it should be
possible to remove a host but it's not, please open a bug. Thanks!

Best regards,
>
> Best Regards,
> Strahil Nikolov
>
--
Didi