I ran into a similar issue on an older version and documented what I did to fix it at https://docs.google.com/document/d/1_sBabK1zTucY_oFdquT2qjE46-iNVfN09PyLXFLIIxg/edit?usp=sharing. I put that together after a lot of back and forth, so I hope it is accurate and that it helps.


 Christopher Cross
Linux Systems Network Administrator
Colorado Public Radio
Bridges Broadcast Center
7409 S. Alton Ct. | Centennial, CO 80112
(303) 871-9191, ext. 4258
www.cpr.org
 


On Mon, Nov 17, 2025 at 2:27 AM Michael Thomas <wart@caltech.edu> wrote:
I seem to have gotten myself into a bit of a pickle.  My ovirt host
certs (4-node 4.4.1 cluster) expired recently and I didn't notice until
one of my hosts got fenced on Friday.  I was finally able to get it back
online by rebooting it, telling the ovirt-engine that the host had been
rebooted (so that it didn't think VMs were running on it), and then
putting it into maintenance mode and using 'enroll certificates' to
renew the certs.  This worked with the second host as well.  But I was
unable to put the third
host into maintenance mode because the engine claimed "Another power
management action is already in progress".  Sure enough, the task list
showed the engine trying to take action against this host.  It was stuck
in that state for over an hour, so I finally poked the ovirt-engine
database directly to fail these tasks:

psql engine -c "UPDATE job SET status = 'FAILED', end_time = NOW()
                WHERE status NOT IN ('FINISHED', 'FAILED');"
psql engine -c "UPDATE step SET status = 'FAILED', end_time = NOW()
                WHERE status NOT IN ('FINISHED', 'FAILED');"

This got the tasks to clear, but the engine UI still claimed that the
stuck host had "Another power management action is already in
progress".  Then I made the fatal mistake of rebooting the ovirt-engine
(which was still running on the fourth host with expired certs).

Now my engine won't start, even when I try to start it on one of the
hosts whose certs were renewed.  The libvirt logs indicate that this is
because the vnc cert has expired:

qemu-kvm: -object
tls-creds-x509,id=vnc-tls-creds0,dir=/etc/pki/vdsm/libvirt-vnc,endpoint=server,verify-peer=no:
The server certificate /etc/pki/vdsm/libvirt-vnc/server-cert.pem has expired
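
The expiry reported in that log line can be confirmed directly on a host
with openssl.  What follows is only a minimal sketch and was not part of
the troubleshooting described here: the function name cert_status is made
up for illustration, and it assumes the openssl command-line tool is
installed.  It relies only on the standard "openssl x509" options -noout
and -checkend.

```shell
# Hypothetical helper (the name is illustrative): report the state of a
# PEM certificate. Prints "missing", "unreadable", "expired", or "valid".
cert_status() {
    cert="$1"
    # No such file, or not readable by the current user.
    if [ ! -r "$cert" ]; then
        echo "missing"
        return 2
    fi
    # Not parseable as an X.509 certificate (or openssl is not installed).
    if ! openssl x509 -in "$cert" -noout >/dev/null 2>&1; then
        echo "unreadable"
        return 2
    fi
    # -checkend 0 exits non-zero if the certificate has already expired.
    if openssl x509 -in "$cert" -noout -checkend 0 >/dev/null 2>&1; then
        echo "valid"
    else
        echo "expired"
    fi
}
```

Running cert_status /etc/pki/vdsm/libvirt-vnc/server-cert.pem on the
affected host would be expected to print "expired" if the log message is
accurate.  Replacing -checkend 0 with -enddate shows the exact notAfter
date instead.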

Since I don't care about vnc access to the engine, is there a way to get
the engine running again without vnc?

Alternately, is there a way that I can manually get some of the more
critical VMs running without an engine running?

FWIW, I tried searching the list archives, but I get a 500 server error
when trying to go to lists.ovirt.org.

--Mike
_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-leave@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/TXLZTLC45SUAJLPK7NOTZJFT4YRZLEL5/

...