
On Tue, Jul 14, 2020 at 4:40 PM Anton Louw via Users <users@ovirt.org> wrote:
Hi Andrea,
I have had sleepless nights with the same issue, but eventually figured it out. The two commands I used are below:
1. Backup the configs
engine-backup --scope=all --mode=backup --file=Full --log=Log_Full
2. Restore the configs
engine-backup --mode=restore --file=Full --log=Log_Full --provision-db --provision-dwh-db --restore-permissions
Andrea asked about hosted-engine, where you do not (by default) run restore manually yourself.
I ran the second command after I built a new HE from scratch. You just need to download the backup files from the original and copy them to the new HE before you restore.
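For readers following along, the copy-then-restore step described above could be sketched roughly as follows. The host name new-he.example.org and the /root/ destination are placeholders, not values from this thread. The `run` helper only prints the commands unless RUN=1 is set, so nothing is executed by accident.

```shell
#!/bin/sh
# Hypothetical values: adjust to your environment.
BACKUP_FILE=Full                  # file name used in the backup command above
LOG_FILE=Log_Full                 # log name used in the commands above
NEW_HE=root@new-he.example.org    # assumption: SSH target of the freshly built HE

# Print each command instead of executing it, unless RUN=1 is set.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

# Copy the backup archive to the new engine VM.
run scp "$BACKUP_FILE" "$NEW_HE:/root/"

# Run the restore there, with the same options as in step 2 above.
run ssh "$NEW_HE" engine-backup --mode=restore \
    --file="/root/$BACKUP_FILE" --log="/root/$LOG_FILE" \
    --provision-db --provision-dwh-db --restore-permissions
```

Running the script as-is only echoes the two commands. Setting RUN=1 executes them.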
Give it a bash and see if it works for you.
Cheers
Anton Louw
Cloud Engineer: Storage and Virtualization at Vox
*From:* Andrea Chierici <andrea.chierici@cnaf.infn.it> *Sent:* 14 July 2020 14:54 *To:* users@ovirt.org *Subject:* [ovirt-users] Failing to restore a backup
Dear all, I'm rather new to the list, but not to oVirt, which I have been using profitably since 2014. I have a problem with an oVirt instance and am desperately seeking help.
I run a 4.3 self-hosted engine installation with 8 hypervisors and iSCSI storage. Since the storage is not very reliable, I bought a Dell PowerVault to move everything to. Moving the VMs was no problem; the problem came up with the hosted engine. I have read a lot of documentation, and the procedure I think I must follow involves backing up the current HE, powering it off, and installing a new host on which to create a new HE by restoring the backup. The command I used to generate the backup is:
engine-backup --mode=backup --file=file_name --log=log_file_name
and the command used to restore it on the new HE is:
hosted-engine --deploy --restore-from-file=backup/file_name
The problem comes up while the backup is being restored.
With versions prior to 4.3.11 and also with 4.4.0 I got the error:
2020-06-25 15:17:34,950+0200 ERROR ansible failed {
    "ansible_host": "localhost",
    "ansible_playbook": "/usr/share/ovirt-hosted-engine-setup/ansible/trigger_role.yml",
    "ansible_result": {
        "_ansible_no_log": false,
        "changed": false,
        "invocation": {
            "module_args": {
                "ca_file": null,
                "compress": true,
                "headers": null,
                "hostname": null,
                "insecure": null,
                "kerberos": false,
                "ovirt_auth": {
                    "ansible_facts": {
                        "ovirt_auth": {
                            "ca_file": null,
                            "compress": true,
                            "headers": null,
                            "insecure": true,
                            "kerberos": false,
                            "timeout": 0,
                            "token": "1f5Zoys35sQmLb2MiEg6bhWm2rDJULFan3eBK0juJJR3S-nXtN_b31jac1sZ0KRz3d1KSDmr8tyf7ExNe_pqJg",
                            "url": "https://ovirt-sgsi.cnaf.infn.it/ovirt-engine/api"
                        }
                    },
                    "attempts": 1,
                    "changed": false,
                    "failed": false
                },
                "password": null,
                "state": "absent",
                "timeout": 0,
                "token": null,
                "url": null,
                "username": null
            }
        },
        "msg": "You must specify either 'url' or 'hostname'."
    },
    "ansible_task": "Always revoke the SSO token",
    "ansible_type": "task",
    "status": "FAILED",
    "task_duration": 3
}
I think this is not a critical failure, and is not what failed the restore.
Recently I tried the 4.3.11 beta and 4.4.1 and the error now is different:
[ INFO ] Upgrading CA
[ ERROR ] Failed to execute stage 'Misc configuration': (2, 'No such file or directory')
[ INFO ] DNF Performing DNF transaction rollback
This is part of 'engine-setup' output, which 'hosted-engine' runs inside the engine VM. If you can access the engine VM, you can try finding more information in /var/log/ovirt-engine/setup/* there. Otherwise, the hosted-engine deploy script might have managed to get a copy to /var/log/ovirt-hosted-engine-setup/engine-logs*. Please check/share these. Thanks.
I simply can't figure out what file is missing. If, as a test, I try to install the HE without restoring the backup, the installation goes smoothly to the end, but at that point I can't restore the backup, as far as I can understand.
Another option is to do the restore manually. To find relevant information, search the net for "enginevm_before_engine_setup". Best regards,
Any hint on what I may be missing?
Thanks,
Andrea
--
Andrea Chierici - INFN-CNAF
Viale Berti Pichat 6/2, 40127 BOLOGNA
Office Tel: +39 051 2095463
SkypeID ataruz
--
-- Didi

Hi, thank you for your help.
I think this is not a critical failure, and is not what failed the restore.
Recently I tried the 4.3.11 beta and 4.4.1 and the error now is different:
[ INFO ] Upgrading CA
[ ERROR ] Failed to execute stage 'Misc configuration': (2, 'No such file or directory')
[ INFO ] DNF Performing DNF transaction rollback
This is part of 'engine-setup' output, which 'hosted-engine' runs inside the engine VM. If you can access the engine VM, you can try finding more information in /var/log/ovirt-engine/setup/* there. Otherwise, the hosted-engine deploy script might have managed to get a copy to /var/log/ovirt-hosted-engine-setup/engine-logs*. Please check/share these. Thanks.
Unfortunately, the installation procedure deletes the VM when it exits, so I can't log in. Here are the ERROR messages I found in the logs copied to the host:
engine.log:2020-07-08 15:05:04,178+02 ERROR [org.ovirt.engine.core.bll.pm.FenceProxyLocator] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-89) [45a7e7f3] Can not run fence action on host '<erased_for_privacy>', no suitable proxy host was found.
server.log:2020-07-08 15:09:23,081+02 ERROR [org.jboss.resteasy.resteasy_jaxrs.i18n] (default task-1) RESTEASY002010: Failed to execute: javax.ws.rs.WebApplicationException: HTTP 404 Not Found
server.log:2020-07-08 15:14:19,804+02 ERROR [org.jboss.resteasy.resteasy_jaxrs.i18n] (default task-1) RESTEASY002010: Failed to execute: javax.ws.rs.WebApplicationException: HTTP 404 Not Found
grep: setup: Is a directory
Not very helpful.
I simply can't figure out what file is missing. If, as a test, I try to install the HE without restoring the backup, the installation goes smoothly to the end, but at that point I can't restore the backup, as far as I can understand.
Another option is to do the restore manually. To find relevant information, search the net for "enginevm_before_engine_setup".
Later I will give it a try.
Andrea

On Tue, Jul 14, 2020 at 6:04 PM Andrea Chierici <andrea.chierici@cnaf.infn.it> wrote:
Hi, thank you for your help.
I think this is not a critical failure, and is not what failed the restore.
Recently I tried the 4.3.11 beta and 4.4.1 and the error now is different:
[ INFO ] Upgrading CA
[ ERROR ] Failed to execute stage 'Misc configuration': (2, 'No such file or directory')
[ INFO ] DNF Performing DNF transaction rollback
This is part of 'engine-setup' output, which 'hosted-engine' runs inside the engine VM. If you can access the engine VM, you can try finding more information in /var/log/ovirt-engine/setup/* there. Otherwise, the hosted-engine deploy script might have managed to get a copy to /var/log/ovirt-hosted-engine-setup/engine-logs*. Please check/share these. Thanks.
Unfortunately the installation procedures when exiting, deletes the vm, hence I can't log in.
Are you sure? Did you check with 'ps', searching qemu processes? If it's still up, but still using a local IP address, you can find it by searching the hosted-engine logs for 'local_vm_ip' and login there from the host.
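A minimal sketch of the two checks suggested above, run on the host. It assumes the hosted-engine deploy logs live under /var/log/ovirt-hosted-engine-setup (the directory mentioned earlier in this thread for the copied engine logs). The function names are made up for illustration.

```shell
#!/bin/sh
# Assumption: deploy logs are under this directory; override with LOG_DIR=...
LOG_DIR=${LOG_DIR:-/var/log/ovirt-hosted-engine-setup}

# List qemu processes. The [q] bracket keeps grep from matching itself.
list_qemu() {
    ps -ef | grep '[q]emu'
}

# Print the last few log lines that mention local_vm_ip in the given directory.
find_local_vm_ip() {
    grep -rh 'local_vm_ip' "$1" 2>/dev/null | tail -n 5
}

list_qemu || echo "no qemu process found"
find_local_vm_ip "$LOG_DIR"
```

If a qemu process shows up and the logs contain an address next to local_vm_ip, that address is the one to try logging in to from the host.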
Here are the ERROR messages I got on the logs copied on the host:
engine.log:2020-07-08 15:05:04,178+02 ERROR [org.ovirt.engine.core.bll.pm.FenceProxyLocator] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-89) [45a7e7f3] Can not run fence action on host '<erased_for_privacy>', no suitable proxy host was found.
That's ok.
server.log:2020-07-08 15:09:23,081+02 ERROR [org.jboss.resteasy.resteasy_jaxrs.i18n] (default task-1) RESTEASY002010: Failed to execute: javax.ws.rs.WebApplicationException: HTTP 404 Not Found
server.log:2020-07-08 15:14:19,804+02 ERROR [org.jboss.resteasy.resteasy_jaxrs.i18n] (default task-1) RESTEASY002010: Failed to execute: javax.ws.rs.WebApplicationException: HTTP 404 Not Found
This probably indicates a problem, but I agree it's not very helpful.
grep: setup: Is a directory
Right - so please search inside it. Also please check the hosted-engine deploy logs themselves.
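The "grep: setup: Is a directory" message means grep was given a directory without being told to descend into it. A small sketch of a recursive search follows, assuming the copied logs sit under the engine-logs* path mentioned earlier. The search_errors name is just for illustration.

```shell
#!/bin/sh
# Recursively print every line containing ERROR, with file name and line number.
search_errors() {
    grep -rn 'ERROR' "$1"
}

# The glob comes from the path quoted earlier in this thread; it may match
# nothing on a host where the deploy did not copy any engine logs.
for d in /var/log/ovirt-hosted-engine-setup/engine-logs*; do
    if [ -d "$d" ]; then
        search_errors "$d"
    fi
done
```

With -r, the setup subdirectory is searched along with engine.log and server.log.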
Not very helpful.
I simply can't figure out what file is missing. If, as a test, I try to install the HE without restoring the backup, the installation goes smoothly to the end, but at that point I can't restore the backup, as far as I can understand.
Another option is to do the restore manually. To find relevant information, search the net for "enginevm_before_engine_setup".
Later I will give it a try.
Good luck and best regards, -- Didi

Dear all, I think I finally understood the issue, even if I don't know how to fix it.
Trying to install a new HE from a backup I get the error: "The host has been set in non_operational status, please check engine logs, more info can be found in the engine logs, fix accordingly and re-deploy."
*The host, not the hosted engine*. This is more clear in another log:
Host <removed_for_privacy> is set to Non-Operational, it is missing the following networks: 'iscsi_net,sgsi_iscsi,sgsi_priv,sgsi_vpn'
The fact is that those networks are present on the host:
# ip addr
<CUT>
26: sgsi_priv: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 90:e2:ba:63:2e:bc brd ff:ff:ff:ff:ff:ff
    inet6 fe80::92e2:baff:fe63:2ebc/64 scope link
28: sgsi_vpn: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 90:e2:ba:63:2e:bc brd ff:ff:ff:ff:ff:ff
    inet6 fe80::92e2:baff:fe63:2ebc/64 scope link
       valid_lft forever preferred_lft forever
The other two are configured on ovirt but not configurable on bare metal system, indeed if I issue "ip addr" on a production host I don't see those nets at all: I am puzzled.
The problem is definitely this one, can anyone provide any suggestion on how to proceed? Why is it complaining about sgsi_priv and sgsi_vpn that are not missing at all?
Andrea
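To compare the list in the engine message quoted above with what the host itself sees, something like the following sketch could be used. It only checks whether a link with each name exists under /sys/class/net, which may not be the same thing the engine checks when it decides a network is missing. The check_networks name is made up for illustration.

```shell
#!/bin/sh
# Comma-separated list copied from the engine message quoted above.
REQUIRED='iscsi_net,sgsi_iscsi,sgsi_priv,sgsi_vpn'
# Directory with one entry per network interface; overridable for testing.
NET_DIR=${NET_DIR:-/sys/class/net}

# $1: comma-separated network names, $2: directory listing the interfaces
check_networks() {
    old_ifs=$IFS
    IFS=,
    for net in $1; do
        if [ -e "$2/$net" ]; then
            echo "$net: present"
        else
            echo "$net: missing"
        fi
    done
    IFS=$old_ifs
}

check_networks "$REQUIRED" "$NET_DIR"
```

A name reported as present here but still listed as missing by the engine would suggest the mismatch is in how the network is attached to the host in the engine, not in the link itself.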

On Wed, Jul 15, 2020 at 6:21 PM Andrea Chierici <andrea.chierici@cnaf.infn.it> wrote:
Dear all, I think I finally understood the issue, even if I don't know how to fix it.
Trying to install a new HE from a backup I get the error: "The host has been set in non_operational status, please check engine logs, more info can be found in the engine logs, fix accordingly and re-deploy."
The host, not the hosted engine. This is more clear in another log: Host <removed_for_privacy> is set to Non-Operational, it is missing the following networks: 'iscsi_net,sgsi_iscsi,sgsi_priv,sgsi_vpn'
The fact is that those networks are present on the host:
# ip addr
<CUT>
26: sgsi_priv: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 90:e2:ba:63:2e:bc brd ff:ff:ff:ff:ff:ff
    inet6 fe80::92e2:baff:fe63:2ebc/64 scope link
28: sgsi_vpn: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 90:e2:ba:63:2e:bc brd ff:ff:ff:ff:ff:ff
    inet6 fe80::92e2:baff:fe63:2ebc/64 scope link
       valid_lft forever preferred_lft forever
The other two are configured on ovirt but not configurable on bare metal system, indeed if I issue "ip addr" on a production host I don't see those nets at all: I am puzzled. The problem is definitely this one, can anyone provide any suggestion on how to proceed? Why is it complaining about sgsi_priv and sgsi_vpn that are not missing at all?
If you pass --restore-from-file, you should be prompted, at some point, IMO (copying from the code, didn't test recently):

'Pause the execution after adding this host to the '
'engine?\n'
'You will be able to iteratively connect to '
'the restored engine in order to manually '
'review and remediate its configuration before '
'proceeding with the deployment:\nplease ensure that '
'all the datacenter hosts and storage domain are '
'listed as up or in maintenance mode before '
'proceeding.\nThis is normally not required when '
'restoring an up to date and coherent backup. '
'(@VALUES@)[@DEFAULT@]: '

Were you? If so, you can reply 'Yes', and then, later on, you should get a message:

- name: Pause the execution to let the user interactively reconfigure the host
- name: Let the user connect to the bootstrap engine to manually fix host configuration
  msg: >-
    You can now connect to {{ bootstrap_engine_url }} and check the status of this host
    and eventually remediate it, please continue only when the host is listed as 'up'
- name: Pause execution until {{ he_setup_lock_file.path }} is removed, delete it once ready to proceed

At this point, the deploy process will wait until you remove this file, before continuing. Then, you can login to the engine admin ui, change whatever needed on the host - including configuring networks or whatever, until you manage to bring it 'Up'. Then remove the file.

Good luck and best regards,
-- Didi
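To illustrate the "wait until the file is removed" mechanism described in the reply above, here is a small self-contained sketch. It is not the hosted-engine code. The lock file is a temporary stand-in, and the background job plays the role of the admin who removes the file once the host is 'Up'. The wait_until_removed name is hypothetical.

```shell
#!/bin/sh
# Block until the given file no longer exists.
# $1: lock file path, $2: polling interval in seconds
wait_until_removed() {
    while [ -e "$1" ]; do
        sleep "$2"
    done
}

# Stand-in for the lock file the deploy creates (the real path is printed by the deploy).
lock=$(mktemp)

# Stand-in for the admin: fix the host in the engine UI, then remove the file.
( sleep 2; rm -f "$lock" ) &

echo "waiting for $lock to be removed"
wait_until_removed "$lock" 1
echo "lock removed, continuing"
```

In the real deployment the only manual step is removing the file named in the deploy output once the host is listed as 'Up'.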
Participants (2): Andrea Chierici, Yedidyah Bar David