On Sun, Feb 6, 2022 at 5:09 PM Gilboa Davara <gilboad(a)gmail.com> wrote:
> Unlike my predecessor, I not only lost my vmengine, I also lost the vdsm services on all
> hosts.
> All seem to be hitting the same issue - read, the certs under /etc/pki/vdsm/certs and
> /etc/pki/ovirt* all expired a couple of days ago.
> As such, the hosted engine cannot go into global maintenance mode,
What do you mean by that? What happens if you 'hosted-engine
--set-maintenance --mode=global'?
> preventing engine-setup --offline from running.
Actually just a few days ago I pushed a patch for:
https://bugzilla.redhat.com/show_bug.cgi?id=1700460
But:
If you really have a problem that you can't set global maintenance,
using this is a risk - HA might intervene in the middle and shut down
the VM. So either make sure global maintenance does work, or stop
all HA services on all hosts.
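
For the second option, a minimal sketch of stopping the HA services on every host. It assumes the HA daemons are the ovirt-ha-agent and ovirt-ha-broker systemd units and that you have root ssh access; the host names are placeholders. It only prints the commands unless you pass run=True.

```python
import subprocess

# Assumption: the hosted-engine HA daemons are these two systemd units.
HA_SERVICES = ["ovirt-ha-agent", "ovirt-ha-broker"]


def stop_ha_commands(hosts, user="root"):
    """Return one ssh argv per host that stops the HA services there."""
    return [
        ["ssh", f"{user}@{host}", "systemctl", "stop", *HA_SERVICES]
        for host in hosts
    ]


def stop_ha(hosts, run=False):
    """Print the commands; execute them only when run=True."""
    for argv in stop_ha_commands(hosts):
        print(" ".join(argv))
        if run:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Hypothetical host names; replace with your own.
    stop_ha(["host1.example.com", "host2.example.com"])
```

Keeping the command list separate from the execution lets you review exactly what would run on each host first.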
> Two questions:
> 1. Is there any automated method to renew the vdsm certificates?
You mean, without an engine?
I think that if you have a functional engine one way or another,
you can automate this somehow, but I didn't check. Try checking e.g. the
Python SDK examples - there might be something there that you can base
this on.
> 2. Assuming the previous answer is "no", assuming I'm
> somewhat versed in using openssl, how can I manually renew them?
I'd rather not try to invent from memory how this is supposed to work,
and doing this methodically and verifying before replying is quite
an effort.
If this is really what you want, I suggest something like:
1. Set up a test env with an engine and one host
2. Backup (or use git on) /etc on both
3. Renew the host cert from the UI
4. Check what changed
You should find, IMO, that the key(s) on the host didn't
change. I guess you might also find CSRs on one or both of them.
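
If git on /etc is not an option, the "check what changed" step can be sketched as below: hash every file under a directory before and after the renewal, then compare. This is just an illustration using the standard library; the function names are mine, not part of any oVirt tool.

```python
import hashlib
import os


def snapshot(root):
    """Map each regular file under root (relative path) to its SHA-256."""
    result = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue  # skip symlinks, sockets, etc.
            try:
                with open(path, "rb") as f:
                    digest = hashlib.sha256(f.read()).hexdigest()
            except OSError:
                continue  # unreadable file; ignored in this sketch
            result[os.path.relpath(path, root)] = digest
    return result


def diff_snapshots(before, after):
    """Return (added, removed, changed) as sorted lists of relative paths."""
    added = sorted(set(after) - set(before))
    removed = sorted(set(before) - set(after))
    changed = sorted(
        p for p in before.keys() & after.keys() if before[p] != after[p]
    )
    return added, removed, changed
```

Run snapshot("/etc") as root on both machines before step 3 and again after it, then pass the two results to diff_snapshots. If the guess above is right, the key files should not show up as changed.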
So basically it should be something like:
1. Create a CSR on the host for the existing key (one or more,
not sure).
2. Copy and sign this on the engine using pki-enroll-request.sh
(I think you can find examples for it scattered around, perhaps
even in the main guides)
3. Copy back the generated certs to the host
4. Perhaps restart one or more services there (vdsm, imageio?,
ovn, etc.)
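
The four steps above could be sketched roughly as follows. Everything here is an assumption that I did not verify on a live setup: the key and cert paths on the host, the requests/ and certs/ directories on the engine, the --name and --subject options of pki-enroll-request.sh, and the vdsmd unit name. The host name, engine name and subject are placeholders. The code only builds and prints the commands, so you can compare them with what the test env shows before running anything.

```python
# Assumptions, not verified against a live setup: the paths below and the
# --name/--subject options of pki-enroll-request.sh.
VDSM_KEY = "/etc/pki/vdsm/keys/vdsmkey.pem"
VDSM_CERT = "/etc/pki/vdsm/certs/vdsmcert.pem"
ENGINE_PKI = "/etc/pki/ovirt-engine"
ENROLL = "/usr/share/ovirt-engine/bin/pki-enroll-request.sh"


def renewal_plan(host, engine, subject):
    """Return the ordered (where, argv) steps for renewing one host cert."""
    req = f"{ENGINE_PKI}/requests/{host}.req"
    cer = f"{ENGINE_PKI}/certs/{host}.cer"
    return [
        # 1. CSR on the host, reusing the existing private key.
        (host, ["openssl", "req", "-new", "-key", VDSM_KEY,
                "-subj", subject, "-out", f"/tmp/{host}.req"]),
        # 2. Copy the CSR to the engine and sign it there.
        (host, ["scp", f"/tmp/{host}.req", f"root@{engine}:{req}"]),
        (engine, [ENROLL, f"--name={host}", f"--subject={subject}"]),
        # 3. Copy the signed cert back to the host.
        (engine, ["scp", cer, f"root@{host}:{VDSM_CERT}"]),
        # 4. Restart vdsm (other services may need the same).
        (host, ["systemctl", "restart", "vdsmd"]),
    ]


if __name__ == "__main__":
    # Hypothetical names and subject; take the real subject from the old cert.
    for where, argv in renewal_plan("host1.example.com", "engine.example.com",
                                    "/O=example.com/CN=host1.example.com"):
        print(f"[{where}] {' '.join(argv)}")
```

This covers only a single vdsm cert; as said, there may be more than one key/cert pair on the host that needs the same treatment.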
You can check the code in
/usr/share/ovirt-engine/ansible-runner-service-project/project
to see how it's done when initiated from the UI.
Good luck and best regards,
--
Didi