In engine.log, the first error I see comes 30 minutes after the start:
2019-07-19 12:25:31,563+02 ERROR [org.ovirt.engine.core.common.utils.ansible.AnsibleExecutor] (EE-ManagedThreadFactory-engineScheduled-Thread-64) [2001ddf4] Ansible playbook execution failed: Timeout occurred while executing Ansible playbook.
In the meantime, since the playbook seems to be this one (I run the job from the engine): /usr/share/ovirt-engine/playbooks/ovirt-ova-export.yml
I have, for now, created the file
/etc/ovirt-engine/engine.conf.d/99-ansible-playbook-timeout.conf
with
ANSIBLE_PLAYBOOK_EXEC_DEFAULT_TIMEOUT=80
and restarted the engine, then re-ran the Python script to verify.
This is just to see whether it completes; in my case, with a 30 GB preallocated disk, the underlying problem is that the qemu-img convert command is very slow in I/O.
It reads from iSCSI multipath (2 paths) at 2x3MB/s and writes to NFS.
If I run a dd command from the iSCSI device mapper device to an NFS file, I get a rate of 140MB/s, which is what I expect based on my storage array performance and my network.
I don't understand why the qemu-img command is so slow.
The question still applies in case I have to create an appliance from a VM with a very big disk, where the copy could potentially take more than 30 minutes...
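For sizing the timeout, a rough back-of-the-envelope sketch in Python may help. It assumes that ANSIBLE_PLAYBOOK_EXEC_DEFAULT_TIMEOUT is expressed in minutes (consistent with the failure after 30 minutes, but not confirmed here), that 2x3MB/s means about 6MB/s in aggregate, and that the whole disk is read. The function names and the 1.5 safety margin are hypothetical, chosen only for illustration.

```python
import math

def estimated_minutes(disk_gb, rate_mb_s):
    """Minutes needed to copy disk_gb gigabytes at rate_mb_s megabytes per second."""
    return disk_gb * 1024 / rate_mb_s / 60

def suggested_timeout(disk_gb, rate_mb_s, margin=1.5):
    """Timeout in minutes with a safety margin (margin is an arbitrary assumption)."""
    seconds = disk_gb * 1024 / rate_mb_s
    return math.ceil(seconds * margin / 60)

# 30 GB disk at the slow rate seen with qemu-img convert (assumed 6 MB/s total)
slow = estimated_minutes(30, 6)
# Same disk at the rate measured with dd (140 MB/s)
fast = estimated_minutes(30, 140)

print(f"qemu-img rate: about {slow:.0f} minutes")
print(f"dd rate:       about {fast:.1f} minutes")
print(f"ANSIBLE_PLAYBOOK_EXEC_DEFAULT_TIMEOUT={suggested_timeout(30, 6)}")
```

At the slow rate the estimate is well above the 30 minute default, while at the dd rate the same copy would fit in a few minutes.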
Gianluca
I confirm that setting ANSIBLE_PLAYBOOK_EXEC_DEFAULT_TIMEOUT was the solution.
The OVA export completed:
Starting to export Vm enginecopy1 as a Virtual Appliance 7/19/19 5:53:05 PM
Vm enginecopy1 was exported successfully as a Virtual Appliance to path /save_ova/base/dump/myvm2.ova on Host ov301 7/19/19 6:58:07 PM
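As a quick check of the two event timestamps above, the following Python sketch computes the elapsed time and the implied average throughput. It assumes the full 30 GB of the preallocated disk was read, and the timestamp format string is my own reading of the event log lines.

```python
from datetime import datetime

# Format of the timestamps in the two event lines (assumed: month/day/year, 12-hour clock)
FMT = "%m/%d/%y %I:%M:%S %p"

start = datetime.strptime("7/19/19 5:53:05 PM", FMT)
end = datetime.strptime("7/19/19 6:58:07 PM", FMT)
elapsed = end - start

# Average rate, assuming the whole 30 GB preallocated disk was converted
disk_mb = 30 * 1024
rate_mb_s = disk_mb / elapsed.total_seconds()

print(f"elapsed: {elapsed}")
print(f"average rate: {rate_mb_s:.1f} MB/s")
```

The export took a little over an hour, so it would not have fit in the 30 minute default but does fit in the 80 set in the configuration file.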
I still have to understand why the conversion of the preallocated disk is so slow, because simulating I/O from the iSCSI LUN where the VM disks live to the NFS share gives me about 110MB/s.
I'm going to update to 4.3.4, just to see if any related bug has been fixed. The same operation on vSphere takes 5 minutes.
What is the ETA for 4.3.5?
One note:
if I manually create a snapshot of the same VM and then clone the snapshot, the process is this one:
vdsm 5713 20116 6 10:50 ? 00:00:04 /usr/bin/qemu-img convert -p -t none -T none -f raw /rhev/data-center/mnt/blockSD/fa33df49-b09d-4f86-9719-ede649542c21/images/59a4a324-4c99-4ff5-abb1-e9bbac83292a/0420ef47-0ad0-4cf9-babd-d89383f7536b -O raw -W /rhev/data-center/mnt/blockSD/fa33df49-b09d-4f86-9719-ede649542c21/images/d13a5c43-0138-4cbb-b663-f3ad5f9f5983/fd4e1b08-15fe-45ee-ab12-87dea2d29bc4
and its speed is much better (up to 100MB/s read and 100MB/s write), with a total elapsed time of 6 minutes and 30 seconds.
During the OVA generation, the process was instead:
vdsm 13505 13504 3 14:24 ? 00:01:26 qemu-img convert -T none -O qcow2 /rhev/data-center/mnt/blockSD/fa33df49-b09d-4f86-9719-ede649542c21/images/59a4a324-4c99-4ff5-abb1-e9bbac83292a/0420ef47-0ad0-4cf9-babd-d89383f7536b /dev/loop0
Could the "-O qcow2" be the reason? Why qcow2 if the source is preallocated (raw)?
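To make the comparison easier, here is a small Python sketch that lists the option differences between the two qemu-img convert command lines. SRC_VOLUME and DST_VOLUME are placeholders standing in for the long /rhev/... paths, and the parser only knows the options that appear in these two commands. It shows which options differ, not which one is responsible for the slowdown.

```python
import shlex

# Options seen in the two command lines: those taking a value, and bare flags.
VALUE_OPTS = {"-t", "-T", "-f", "-O"}
FLAG_OPTS = {"-p", "-W"}

def parse_convert_opts(cmdline):
    """Return a dict of the qemu-img convert options found in cmdline."""
    tokens = shlex.split(cmdline)
    args = tokens[tokens.index("convert") + 1:]  # skip binary and subcommand
    opts = {}
    i = 0
    while i < len(args):
        tok = args[i]
        if tok in VALUE_OPTS:
            opts[tok] = args[i + 1]
            i += 2
        elif tok in FLAG_OPTS:
            opts[tok] = True
            i += 1
        else:
            i += 1  # positional argument: source or destination path
    return opts

# SRC_VOLUME / DST_VOLUME are placeholders for the real volume paths
clone = "/usr/bin/qemu-img convert -p -t none -T none -f raw SRC_VOLUME -O raw -W DST_VOLUME"
ova = "qemu-img convert -T none -O qcow2 SRC_VOLUME /dev/loop0"

clone_opts = parse_convert_opts(clone)
ova_opts = parse_convert_opts(ova)

# Keep only the options whose value differs between the two runs
diff = {
    key: (clone_opts.get(key), ova_opts.get(key))
    for key in sorted(set(clone_opts) | set(ova_opts))
    if clone_opts.get(key) != ova_opts.get(key)
}

for key, (in_clone, in_ova) in diff.items():
    print(f"{key}: clone={in_clone} ova={in_ova}")
```

Besides the output format (-O raw versus -O qcow2), the OVA run also lacks -t none (destination cache mode) and -W (out-of-order writes), so the output format is not the only candidate.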
Gianluca