On Wed, Apr 22, 2020 at 7:10 PM Strahil Nikolov <hunter86_bg(a)yahoo.com> wrote:
On April 22, 2020 11:57:31 AM GMT+03:00, Yedidyah Bar David <didi(a)redhat.com> wrote:
>On Wed, Apr 22, 2020 at 11:52 AM Shareef Jalloq <shareef(a)jalloq.co.uk> wrote:
>>
>> Thanks for the suggestions everyone. First up, no, this is not part
>> of a Gluster HCI.
>>
>> Secondly, according to the error pop-up, "Confirm Host has been
>> rebooted" seems to require the host to have a status of
>> "Non operational", "Maintenance" or "Connecting".
>>
>> Ah, as I'm writing this, I shut down the host and now the status
>> changes to Non-responsive and lets me put it in maintenance so I can
>> remove it. So it looks like the change in config causes the engine to
>> not be able to communicate with the host. Is that a bug or expected?
>> What should my workflow have been here after reinstalling the node?
>
>What did you do in practice?
>
>I think something like:
>
>1. Move to maintenance
>2. Remove from engine
>3. Reinstall OS
>4. Add to engine
>
>But this depends on exactly what you wanted to achieve, or IOW why you
>reinstalled.
>
>Best regards,
>
>>
>> Thanks, Shareef.
>>
>>
>>
>> On Wed, Apr 22, 2020 at 7:46 AM Yedidyah Bar David <didi(a)redhat.com> wrote:
>>>
>>> On Wed, Apr 22, 2020 at 2:17 AM Strahil Nikolov <hunter86_bg(a)yahoo.com> wrote:
>>> >
>>> > On April 22, 2020 12:41:49 AM GMT+03:00, "Maton, Brett" <matonb(a)ltresources.co.uk> wrote:
>>> > >Last time I had to forcibly remove a node because it was impossible
>>> > >to do so otherwise, it had never ever had anything to do with
>>> > >gluster, so I STRONGLY dispute your claim that fixing an issue (that
>>> > >was not stated) will fix anything.
>>> > >
>>> > >On Tue, 21 Apr 2020 at 22:39, Maton, Brett <matonb(a)ltresources.co.uk> wrote:
>>> > >
>>> > >> I'm sorry, but there was no suggestion that the node had anything
>>> > >> to do with gluster. What was clearly stated was how to remove a
>>> > >> dead and unmanageable node from the cluster.
>>> > >>
>>> > >> On Tue, 21 Apr 2020 at 20:41, Strahil Nikolov <hunter86_bg(a)yahoo.com> wrote:
>>> > >>
>>> > >>> Not a good approach.
>>> > >>> It's important to know if the node was also a gluster peer in the
>>> > >>> storage pool - if yes, it needs to be replaced with 'replace-brick'
>>> > >>> or 'reset-brick' (depending on whether you use the old hostname or
>>> > >>> not).
>>> > >>> Once the storage node is replaced - oVirt will allow you to remove
>>> > >>> it.
>>> > >>>
>>> > >>>
>>> > >>> Best Regards,
>>> > >>> Strahil Nikolov
>>> > >>>
>>> > >>>
>>> > >>>
>>> > >>>
>>> > >>>
>>> > >>>
>>> > >>> On Tuesday, 21 April 2020, 19:46:47 GMT+3, Maton, Brett <matonb(a)ltresources.co.uk> wrote:
>>> > >>>
>>> > >>>
>>> > >>>
>>> > >>>
>>> > >>>
>>> > >>> Last time I had to do this I removed from the database.
>>> > >>>
>>> > >>> (at your own risk)
>>> > >>> On ovirt engine switch to the postgres user from root:
>>> > >>>
>>> > >>> su - postgres
>>> > >>>
>>> > >>> Enable postgres 10 and connect to the engine database:
>>> > >>>
>>> > >>> . scl_source enable rh-postgresql10
>>> > >>> psql -d engine
>>> > >>>
>>> > >>> Change <host name to remove> to the name (Name column of the host
>>> > >>> in the UI) of the host you want to get rid of (leave the '\' and
>>> > >>> \'' in place):
>>> > >>>
>>> > >>> BEGIN;
>>> > >>> \set host '\'<host name to remove>\''
>>> > >>> DELETE FROM vds_dynamic WHERE vds_id IN
>>> > >>>   (SELECT vds_id FROM vds_static WHERE vds_name = :host);
>>> > >>> DELETE FROM vds_statistics WHERE vds_id IN
>>> > >>>   (SELECT vds_id FROM vds_static WHERE vds_name = :host);
>>> > >>> DELETE FROM vds_static WHERE vds_name = :host;
>>> > >>> COMMIT;
>>> > >>> <CTRL d - exit>
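The statements above delete from the two dependent tables first and from vds_static last, all inside one transaction. A minimal sketch of that ordering, with a dry-run switch that rolls back instead of committing, could look like the following. It is only an illustration: it runs against an in-memory SQLite database with a toy three-table schema, not against the real engine database, and the function name `delete_host` and the `dry_run` flag are hypothetical. Only the table names and the vds_id / vds_name columns come from the message above.

```python
import sqlite3

# Toy stand-in for the three engine tables named in the message above.
# The real engine schema has many more columns; this is an assumption
# made purely so the example can run on its own.
SCHEMA = """
CREATE TABLE vds_static (vds_id TEXT PRIMARY KEY, vds_name TEXT);
CREATE TABLE vds_dynamic (vds_id TEXT);
CREATE TABLE vds_statistics (vds_id TEXT);
"""


def delete_host(conn, host_name, dry_run=True):
    """Delete one host's rows, dependent tables first.

    Returns the number of rows matched per table. With dry_run=True the
    transaction is rolled back, so nothing is actually removed.
    """
    counts = {}
    cur = conn.cursor()
    try:
        # Dependent tables are looked up through vds_static, so they must
        # be cleaned before the vds_static row itself disappears.
        for table in ("vds_dynamic", "vds_statistics"):
            cur.execute(
                "DELETE FROM %s WHERE vds_id IN "
                "(SELECT vds_id FROM vds_static WHERE vds_name = ?)" % table,
                (host_name,),
            )
            counts[table] = cur.rowcount
        cur.execute("DELETE FROM vds_static WHERE vds_name = ?", (host_name,))
        counts["vds_static"] = cur.rowcount
    except Exception:
        conn.rollback()
        raise
    if dry_run:
        conn.rollback()  # undo everything, keep only the counts
    else:
        conn.commit()
    return counts
```

Running it once with dry_run=True and checking that exactly one vds_static row is matched is a cheap sanity check before repeating the call with dry_run=False. The same idea applies in psql by ending the block with ROLLBACK instead of COMMIT on the first attempt.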
>>> > >>>
>>> > >>>
>>> > >>>
>>> > >>> On Tue, 21 Apr 2020 at 15:19, Shareef Jalloq <shareef(a)jalloq.co.uk> wrote:
>>> > >>> > Hi,
>>> > >>> >
>>> > >>> > I seem to have got a stale host in my engine that I can't remove.
>>> > >>> > I recently reinstalled oVirt Node on this host and while trying to
>>> > >>> > refresh the host in the engine, have got it in some state where I
>>> > >>> > can't do anything.
>>> > >>> >
>>> > >>> > The host is listed as Status=Unassigned. Under the Management pull
>>> > >>> > down I only have Restart and Stop options, both of which error if
>>> > >>> > selected. The Remove button is not available.
>>> > >>> >
>>> > >>> > How do I force a removal of this host from the view so I can
>>> > >>> > reload it?
>>> > >>> >
>>> > >>> > Shareef.
>>> > >>> > _______________________________________________
>>> > >>> > Users mailing list -- users(a)ovirt.org
>>> > >>> > To unsubscribe send an email to users-leave(a)ovirt.org
>>> > >>> > Privacy Statement: https://www.ovirt.org/privacy-policy.html
>>> > >>> > oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
>>> > >>> > List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/TCBGLVCLMTZSQAURDDVGHKYXXQ36NO5H/
>>> > >>> >
>>> > >>>
>>> > >>
>>> >
>>> > Last time I had to remove a host - I didn't have your issues.
>>> > Yet, I'm not claiming the opposite - just that in case the host is
>>> > also a gluster node, oVirt will intentionally prevent removal of the
>>> > node.
>>> >
>>> > Your approach could be absolutely valid in case there is no Gluster
>>> > involved - yet for HCI, the Gluster brick has to be replaced prior to
>>> > taking any actions for the removal.
>>>
>>> Did any of you try to "Confirm 'Host has been rebooted'"? Did this
>>> help?
>>>
>>> Best regards,
>>> --
>>> Didi
>>>
Hey Didi,
Sometimes the node just dies and I guess Shareef tested that situation.

Obviously.

"Confirm 'Host has been rebooted'" does not imply that you actually
rebooted it. It just means that the engine can be sure it's not running
stuff anymore. If it's indeed up and the engine can connect to it, great.
Otherwise, it does mean that the engine can know that VMs that used to be
on it are down, etc. The engine still might wait until some timeout trying
to connect to it, or find its status (or the status of operations/objects
related to it, etc.).

I admit I am not an engine developer myself and am not certain about the
details. But in such a case (want to remove a dead host), you can always
try to Confirm and see if this helps. Admittedly, this does not help in
each and every case.

If you run into a reproducible flow in which you think it should be
possible to remove a host but it's not, please open a bug. Thanks!
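For anyone who wants to script the first two steps of the workflow quoted above (move to maintenance, then remove from engine), here is a rough sketch. It assumes the oVirt Python SDK (ovirtsdk4), which nobody in this thread mentioned; the method names `list`, `host_service`, `get`, `deactivate` and `remove` follow that SDK as far as I know, while `remove_host` and `_status_name` are hypothetical helpers made up for the example. It will not help with a host stuck in Unassigned, since the engine refuses the maintenance step in that state.

```python
import time


def _status_name(status):
    # Accepts an Enum member (uses its .value) or a plain string.
    return getattr(status, "value", status)


def remove_host(hosts_service, name, timeout=300, poll=5, sleep=time.sleep):
    """Move the named host to maintenance, then remove it from the engine.

    `hosts_service` is assumed to behave like the object returned by
    connection.system_service().hosts_service() in ovirtsdk4.
    """
    matches = hosts_service.list(search="name=%s" % name)
    if not matches:
        raise LookupError("no host named %r" % name)
    host_service = hosts_service.host_service(matches[0].id)

    # Step 1: move to maintenance, unless the host is already there.
    if _status_name(host_service.get().status) != "maintenance":
        host_service.deactivate()

    # Wait until the engine actually reports the maintenance status.
    waited = 0
    while _status_name(host_service.get().status) != "maintenance":
        if waited >= timeout:
            raise TimeoutError("host %r did not reach maintenance" % name)
        sleep(poll)
        waited += poll

    # Step 2: remove from the engine.
    host_service.remove()


# Possible usage against a real engine (URL, credentials and host name
# are placeholders):
#   import ovirtsdk4 as sdk
#   connection = sdk.Connection(
#       url="https://engine.example.com/ovirt-engine/api",
#       username="admin@internal", password="...", ca_file="ca.pem")
#   remove_host(connection.system_service().hosts_service(), "node1")
#   connection.close()
```

Steps 3 and 4 (reinstall the OS, add the host back) are left out on purpose, since they depend on how the node is provisioned.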
Best regards,