On Sat, Oct 14, 2023 at 5:53 PM Devin A. Bougie <devin.bougie@cornell.edu> wrote:
Hello,

We have a functioning oVirt 4.5.4 cluster running on fully-updated EL9.2 hosts.  We are trying to migrate the self-hosted engine to a new iSCSI storage domain using the existing hosts, following the documented procedure:
- set the cluster into global maintenance mode
- back up the engine using "engine-backup --scope=all --mode=backup --file=backup.bck --log=backuplog.log"
- shut down the engine
- restore the engine using "hosted-engine --deploy --4 --restore-from-file=backup.bck"

This almost works, but fails with the attached log file.  Any help or suggestions would be greatly appreciated, including alternate procedures for migrating a self-hosted engine from one domain to another.

Many thanks,
Devin


If I'm right, the starting error seems to be this one:

 2023-10-14 11:06:16,529-0400 ERROR otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:113 fatal: [localhost -> 192.168.1.25]: FAILED! => {"changed": true, "cmd": "set -euo pipefail && engine-config -g DisableFenceAtStartupInSec | cut -d' ' -f2 > /root/DisableFenceAtStartupInSec.txt", "delta": "0:00:01.495195", "end": "2023-10-14 11:06:16.184479", "msg": "non-zero return code", "rc": 1, "start": "2023-10-14 11:06:14.689284", "stderr": "Picked up JAVA_TOOL_OPTIONS: -Dcom.redhat.fips=false", "stderr_lines": ["Picked up JAVA_TOOL_OPTIONS: -Dcom.redhat.fips=false"], "stdout": "", "stdout_lines": []}

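The return code is 1 ("rc": 1), and that is what fails the playbook. Because the task runs with "set -o pipefail", a non-zero exit from either engine-config or cut fails the whole pipeline; cut normally succeeds even on empty input, so engine-config itself looks like the more likely culprit. Possibly the old environment doesn't have the DisableFenceAtStartupInSec engine config property set correctly, or there is some other problem with that config parameter. Can you check what was written to /root/DisableFenceAtStartupInSec.txt?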
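
To illustrate why the return code of the whole task can be 1 even though cut has nothing to complain about, here is a minimal sketch of how pipefail changes a pipeline's exit status. It makes assumptions: "false" is only a stand-in for a failing engine-config call (I don't know yet that engine-config is what failed on your side), and bash is assumed to be available.

```shell
# `false` is a stand-in for a failing `engine-config -g ...` call (assumption).

# Default behaviour: the pipeline's status is that of its last command (cut),
# which succeeds even when it receives no input.
rc_default=0
bash -c 'false | cut -d" " -f2 > /dev/null' || rc_default=$?

# With pipefail, as in the failing task: a non-zero exit anywhere in the
# pipeline becomes the status of the whole pipeline.
rc_pipefail=0
bash -c 'set -euo pipefail && false | cut -d" " -f2 > /dev/null' || rc_pipefail=$?

echo "without pipefail: rc=$rc_default"
echo "with pipefail:    rc=$rc_pipefail"
```

So an empty "stdout" together with "rc": 1 is at least consistent with engine-config exiting non-zero before cut ever sees a value.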

I only have a 4.4.10 env at hand; on it I get:

[root@ovengine01 ~]# engine-config -g DisableFenceAtStartupInSec
Picked up JAVA_TOOL_OPTIONS: -Dcom.redhat.fips=false
DisableFenceAtStartupInSec: 300 version: general
[root@ovengine01 ~]#

[root@ovengine01 ~]# set -euo pipefail && engine-config -g DisableFenceAtStartupInSec | cut -d' ' -f 2
Picked up JAVA_TOOL_OPTIONS: -Dcom.redhat.fips=false
300
[root@ovengine01 ~]#

What is the output of this command on your old env?

engine-config -g DisableFenceAtStartupInSec
Are the source and target environments the same version?
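
If it helps, engine-config can also be checked on its own, without the pipe, so that its exit status and the extracted value are visible separately. This is only a sketch: the "report" helper is a name I made up for illustration, and the demo call uses echo to mimic the 4.4.10 output shown above rather than a real engine.

```shell
# Hypothetical helper: run a command by itself (no pipe), then print its exit
# status and the second space-separated field of its stdout.
report() {
    out=$("$@" 2>/dev/null) && rc=0 || rc=$?
    value=$(printf '%s\n' "$out" | cut -d' ' -f2)
    echo "rc=$rc value=$value"
}

# On the old engine this would be:
#   report engine-config -g DisableFenceAtStartupInSec
# Demonstrated here with a stand-in that mimics the 4.4.10 output:
report echo "DisableFenceAtStartupInSec: 300 version: general"
```

A non-zero rc with an empty value would point at engine-config itself rather than at cut.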

If you have access to your old env, could you also run this query on the engine database:

select * from vdc_options where option_name='DisableFenceAtStartupInSec';

e.g. this way:
[root@ovengine01 ~]# su - postgres
[postgres@ovengine01 ~]$ psql engine
psql (12.9)
Type "help" for help.

engine=# select * from vdc_options where option_name='DisableFenceAtStartupInSec';
 option_id |        option_name         | option_value | version | default_value
-----------+----------------------------+--------------+---------+---------------
        40 | DisableFenceAtStartupInSec | 300          | general | 300
(1 row)

engine=#

Gianluca