Hello,
I am stuck in the following situation.
It looks like the engine certificate (engine.cer) expired a few days ago:
[root@ovirt ~]# openssl x509 -in /etc/pki/ovirt-engine/certs/engine.cer -noout -dates
notBefore=Mar 23 21:34:19 2021 GMT
notAfter=Apr 26 21:34:19 2022 GMT
The CA and the other certificates are still valid.
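For reference, a minimal sketch of how all certificates in that directory could be checked in one pass, using the `-checkend` option of `openssl x509`. The helper name `check_cert` is made up for illustration, and the glob assumes the other certificates sit next to engine.cer with a `.cer` extension.

```shell
# check_cert is a hypothetical helper, not part of oVirt.
# "openssl x509 -checkend 0" exits 0 if the certificate has not expired yet.
check_cert() {
    cert="$1"
    if openssl x509 -in "$cert" -noout -checkend 0 >/dev/null 2>&1; then
        echo "valid: $cert"
    else
        echo "expired or unreadable: $cert"
    fi
}

# Assumption: the other certificates live next to engine.cer as *.cer files.
for cert in /etc/pki/ovirt-engine/certs/*.cer; do
    check_cert "$cert"
done
```

Passing a larger value to `-checkend` (in seconds) would also flag certificates that are about to expire.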
Yesterday I had an outage on one host and the hosted engine (HE) restarted on another host. But it
cannot communicate with the hosts because of the expired certificate:
lnav /var/log/ovirt-engine/engine.log
...
2022-05-02 11:02:29,127+02 ERROR [org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-43) [] Unable to RefreshCapabilities: VDSNetworkException: VDSGenericException: VDSNetworkException: Received fatal alert: certificate_expired
...
There are VMs still running on the hosts.
Is there a way to renew the engine certificate (manually?) and recover from this
situation?
I have tried running engine-setup (selecting certificate renewal during setup):
[root@ovirt ~]# engine-setup --offline
but it fails with:
[ ERROR ] It seems that you are running your engine inside of the hosted-engine VM and are not in "Global Maintenance" mode.
In that case you should put the system into the "Global Maintenance" mode before running engine-setup, or the hosted-engine HA agent might kill the machine, which might corrupt your data.
[ ERROR ] Failed to execute stage 'Setup validation': Hosted Engine setup detected, but Global Maintenance is not set.
However, global maintenance is enabled, as reported on the host:
[root@ovirt06 ~]# hosted-engine --vm-status
!! Cluster is in GLOBAL MAINTENANCE mode !!
--== Host ovirt05.net.slu.cz (id: 1) status ==--
Host ID : 1
Host timestamp : 38627
Score : 3400
Engine status : {"vm": "down_unexpected", "health": "bad", "detail": "Down", "reason": "bad vm status"}
Hostname : ovirt05.net.slu.cz
Local maintenance : False
stopped : False
crc32 : b719664d
conf_on_shared_storage : True
local_conf_timestamp : 38627
Status up-to-date : True
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=38627 (Mon May 2 10:55:43 2022)
host-id=1
score=3400
vm_conf_refresh_time=38627 (Mon May 2 10:55:43 2022)
conf_on_shared_storage=True
maintenance=False
state=EngineDown
stopped=False
--== Host ovirt06.net.slu.cz (id: 2) status ==--
Host ID : 2
Host timestamp : 8858161
Score : 3400
Engine status : {"vm": "up", "health": "good", "detail": "Up"}
Hostname : ovirt06.net.slu.cz
Local maintenance : False
stopped : False
crc32 : 414a980b
conf_on_shared_storage : True
local_conf_timestamp : 8858161
Status up-to-date : True
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=8858161 (Mon May 2 10:55:48 2022)
host-id=2
score=3400
vm_conf_refresh_time=8858161 (Mon May 2 10:55:48 2022)
conf_on_shared_storage=True
maintenance=False
state=GlobalMaintenance
stopped=False
!! Cluster is in GLOBAL MAINTENANCE mode !!
The relevant lines from the ovirt-engine-setup log are:
...
2022-05-02 11:08:02,194+0200 DEBUG otopi.ovirt_engine_setup.engine_common.database database.execute:239 Creating own connection
2022-05-02 11:08:02,233+0200 DEBUG otopi.ovirt_engine_setup.engine_common.database database.execute:284 Result: [{'vm_guid': '96a6b6a7-75a9-472a-9d4f-1502b415470a', 'run_on_vds': 'e24f0dcc-51f3-4d1a-acf5-2833a9dc584a'}]
2022-05-02 11:08:02,234+0200 DEBUG otopi.ovirt_engine_setup.engine_common.database database.execute:234 Database: 'None', Statement: '
SELECT vds_id, ha_global_maintenance
FROM vds_statistics
WHERE vds_id = %(VdsId)s;
', args: {'VdsId': 'e24f0dcc-51f3-4d1a-acf5-2833a9dc584a'}
2022-05-02 11:08:02,234+0200 DEBUG otopi.ovirt_engine_setup.engine_common.database database.execute:239 Creating own connection
2022-05-02 11:08:02,250+0200 DEBUG otopi.ovirt_engine_setup.engine_common.database database.execute:284 Result: [{'vds_id': 'e24f0dcc-51f3-4d1a-acf5-2833a9dc584a', 'ha_global_maintenance': False}]
2022-05-02 11:08:02,250+0200 ERROR otopi.plugins.ovirt_engine_common.ovirt_engine.system.he he._validate:114 It seems that you are running your engine inside of the hosted-engine VM and are not in "Global Maintenance" mode.
In that case you should put the system into the "Global Maintenance" mode before running engine-setup, or the hosted-engine HA agent might kill the machine, which might corrupt your data.
...
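So engine-setup seems to take the maintenance flag from the engine database rather than from the HA agent, and the database still says False. A rough sketch of how the same query could be repeated by hand follows. The helper name `ha_flag_query` is made up, and the database name "engine" and access via the local postgres user are assumptions about a default installation, so the psql call is left commented out.

```shell
# ha_flag_query is a hypothetical helper: it prints the query seen in the
# setup log, with the host (VDS) id filled in.
ha_flag_query() {
    printf "SELECT vds_id, ha_global_maintenance FROM vds_statistics WHERE vds_id = '%s';\n" "$1"
}

# Assumption: the database is named "engine" and the local postgres user
# can connect to it; adjust to the actual setup before running.
# ha_flag_query e24f0dcc-51f3-4d1a-acf5-2833a9dc584a | su - postgres -c 'psql engine'
```

This only reads the value. It is meant to confirm what engine-setup sees, not to change it.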
Thanks in advance for any advice,
Jiri