Hi,

On Tue, Aug 8, 2023 at 9:21 PM David Johnson <djohnson@maxistechnology.com> wrote:
Good afternoon all,

We had a confluence of events hit all at once and need help desperately. Our oVirt engine system recently crashed and is unrecoverable. Due to a power maintenance event at the data center, 1/3 of our VMs are offline.

I have recent backups from the engine created with engine-backup.

How do you run engine-backup to create the backups? Which oVirt version, and on which OS?
 

I installed a clean CentOS 9 and followed the directions to install ovirt-engine.

After I restore the backup, the engine-setup fails on the keycloak configuration.

From clean system:

Install: (Observe failed scriptlet during install, but rpm install still succeeds)

[root@ovirt2 administrator]# dnf install -y ovirt-engine
Last metadata expiration check: 2:08:15 ago on Tue 08 Aug 2023 10:11:31 AM CDT.
Dependencies resolved.
=============================================================================================================================================================
 Package                                                      Architecture       Version                            Repository                          Size
=============================================================================================================================================================
Installing:
 ovirt-engine                                                 noarch             4.5.4-1.el9                        centos-ovirt45                      13 M
Installing dependencies:
 SuperLU                                                      x86_64             5.3.0-2.el9                        epel                               182 k
(Snip ...)

  Running scriptlet: ovirt-vmconsole-1.0.9-1.el9.noarch                                                                                               60/425
Failed to resolve allow statement at /var/lib/selinux/targeted/tmp/modules/400/ovirt_vmconsole/cil:539
Failed to resolve AST
/usr/sbin/semodule:  Failed!


This might cause a problem later on, but I do not think it's related to your current issue.
 

(Snip ...)
 xmlrpc-common-3.1.3-1.1.el9.noarch                                          xorg-x11-fonts-ISO8859-1-100dpi-7.5-33.el9.noarch
  zziplib-0.13.71-9.el9.x86_64

Complete!

Engine-restore (no visible issues):

[root@ovirt2 administrator]# engine-backup --mode=restore --log=restore1.log --file=Downloads/engine-2023-08-06.22.00.02.bak --provision-all-databases --restore-permissions
Start of engine-backup with mode 'restore'
scope: all
archive file: Downloads/engine-2023-08-06.22.00.02.bak
log file: restore1.log
Preparing to restore:
- Unpacking file 'Downloads/engine-2023-08-06.22.00.02.bak'
Restoring:
- Files
------------------------------------------------------------------------------
Please note:

Operating system is different from the one used during backup.
Current operating system: centos9
Operating system at backup: centos8

I do not think this is the problem, but you could also try the restore on CentOS 8.
 

Apache httpd configuration will not be restored.
You will be asked about it on the next engine-setup run.
------------------------------------------------------------------------------
Provisioning PostgreSQL users/databases:
- user 'engine', database 'engine'
- user 'ovirt_engine_history', database 'ovirt_engine_history'
- user 'ovirt_engine_history_grafana' on database 'ovirt_engine_history'

 
Restoring:
- Engine database 'engine'
  - Cleaning up temporary tables in engine database 'engine'
  - Updating DbJustRestored VdcOption in engine database
  - Resetting DwhCurrentlyRunning in dwh_history_timekeeping in engine database
  - Resetting HA VM status
------------------------------------------------------------------------------
Please note:

The engine database was backed up at 2023-08-06 22:00:19.000000000 -0500 .

Objects that were added, removed or changed after this date, such as virtual
machines, disks, etc., are missing in the engine, and will probably require
recovery or recreation.
------------------------------------------------------------------------------
- DWH database 'ovirt_engine_history'
- Grafana database '/var/lib/grafana/grafana.db'

No Keycloak DB was restored. I guess it was not backed up, and perhaps not even configured.
 
You should now run engine-setup.
Done.
[root@ovirt2 administrator]#

Engine-setup :

[root@ovirt2 administrator]# engine-setup
[ INFO  ] Stage: Initializing
[ INFO  ] Stage: Environment setup
          Configuration files: /etc/ovirt-engine-setup.conf.d/10-packaging-jboss.conf, /etc/ovirt-engine-setup.conf.d/10-packaging.conf,
          /etc/ovirt-engine-setup.conf.d/20-setup-ovirt-post.conf
          Log file: /var/log/ovirt-engine/setup/ovirt-engine-setup-20230808124501-joveku.log
          Version: otopi-1.10.3 (otopi-1.10.3-1.el9)
[ INFO  ] The engine DB has been restored from a backup
[ ERROR ] Failed to execute stage 'Environment setup': Cannot connect to Keycloak database 'ovirt_engine_keycloak' using existing credentials: ovirt_engine_keycloak@localhost:5432

Despite the above, engine-setup tries to connect to the Keycloak DB. I am not sure why. Please also check the setup log and share the relevant snippets.
Anyway, I think you can stop it from trying by finding and removing a file under /etc/ovirt-engine/engine.conf.d/, which should be named 12-setup-keycloak.conf. I am writing from memory, since I no longer have an engine setup available to check.
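A minimal sketch of that step, which moves the file out of the directory instead of deleting it so it can be put back later. The function name disable_keycloak_conf and the backup location are hypothetical, and the file name 12-setup-keycloak.conf is the one recalled from memory above, so it may differ on your system:

```shell
#!/bin/sh
# Sketch only: move the Keycloak setup config out of engine.conf.d.
# Assumptions: the file is named 12-setup-keycloak.conf (from memory, may differ)
# and $HOME is an acceptable place to keep the copy.
disable_keycloak_conf() {
    conf_dir="${1:-/etc/ovirt-engine/engine.conf.d}"
    backup_dir="${2:-$HOME}"
    conf_file="$conf_dir/12-setup-keycloak.conf"

    if [ -f "$conf_file" ]; then
        # Move rather than delete, so the change is easy to undo.
        mv "$conf_file" "$backup_dir/12-setup-keycloak.conf.bak"
        echo "moved $conf_file to $backup_dir/12-setup-keycloak.conf.bak"
    else
        # The recalled name may be wrong: show any Keycloak-related files instead.
        echo "not found: $conf_file"
        ls "$conf_dir" 2>/dev/null | grep -i keycloak || true
        return 1
    fi
}
```

On the engine host this would be called with no arguments, as root, and then engine-setup run again.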
 
[ INFO  ] Stage: Clean up
          Log file is located at /var/log/ovirt-engine/setup/ovirt-engine-setup-20230808124501-joveku.log
[ INFO  ] Generating answer file '/var/lib/ovirt-engine/setup/answers/20230808124504-setup.conf'
[ INFO  ] Stage: Pre-termination
[ INFO  ] Stage: Termination
[ ERROR ] Execution of setup failed
[root@ovirt2 administrator]#

Engine-cleanup results:
(snip)

The snip removes too much here: the remaining output does not show whether engine-cleanup cleaned the engine DB (and the others).
 
[ INFO  ] Stage: Clean up
          Log file is located at /var/log/ovirt-engine/setup/ovirt-engine-remove-20230808120445-mj4eef.log
[ INFO  ] Generating answer file '/var/lib/ovirt-engine/setup/answers/20230808120508-cleanup.conf'
[ INFO  ] Stage: Pre-termination
[ INFO  ] Stage: Termination
[ INFO  ] Execution of cleanup completed successfully
[root@cen-90-tmpl administrator]#


Engine backup (restore) results:

[root@ovirt2 administrator]# engine-backup --mode=restore --log=restore1.log --file=Downloads/engine-2023-08-06.22.00.02.bak --provision-all-databases --restore-permissions
Start of engine-backup with mode 'restore'
scope: all
archive file: Downloads/engine-2023-08-06.22.00.02.bak
log file: restore1.log
Preparing to restore:
- Unpacking file 'Downloads/engine-2023-08-06.22.00.02.bak'
Restoring:
- Files
------------------------------------------------------------------------------
Please note:

Operating system is different from the one used during backup.
Current operating system: centos9
Operating system at backup: centos8

Apache httpd configuration will not be restored.
You will be asked about it on the next engine-setup run.
------------------------------------------------------------------------------
Provisioning PostgreSQL users/databases:
- user 'engine', database 'engine'
FATAL: Existing database 'engine' or user 'engine' found and temporary ones created - Please clean up everything and try again

Meaning, the engine DB was not cleaned. If it's for production, I recommend reinstalling the OS. If it's just for testing, you can try:

1. Make sure there are no postgresql clients connected (engine, dwh, grafana, etc.)
2. systemctl stop postgresql (postgresql-server is the package name; the service unit is postgresql)
3. rm -rf /var/lib/pgsql/data

Then try the restore and engine-setup again. All of this is also from memory, so sorry for any mistakes.
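The three steps above could be wrapped in a small script that only prints the commands unless asked to apply them. This is a sketch for a test machine only, under several assumptions: the function name reset_pgsql_data and the --apply flag are hypothetical, the clients to stop are the ovirt-engine, ovirt-engine-dwhd and grafana-server services, and the PostgreSQL unit is named postgresql:

```shell
#!/bin/sh
# Sketch only, for a throwaway test machine: this destroys all PostgreSQL data.
# Assumptions: the listed unit names match your installation, and the data
# directory is /var/lib/pgsql/data as in the steps above.
reset_pgsql_data() {
    run="echo"                      # dry run by default: print, do not execute
    if [ "${1:-}" = "--apply" ]; then
        run=""                      # really execute the commands
    fi

    # 1. Stop the services that hold connections to PostgreSQL.
    $run systemctl stop ovirt-engine ovirt-engine-dwhd grafana-server
    # 2. Stop PostgreSQL itself.
    $run systemctl stop postgresql
    # 3. Remove the data directory so the next restore starts from scratch.
    $run rm -rf /var/lib/pgsql/data
}
```

Calling reset_pgsql_data with no arguments prints the three commands for review. Calling it with --apply as root runs them.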

Good luck and best regards,
--
Didi