Hello,
it has happened twice, always with VMs composed of more than one disk.
Now I have a VM with 9 disks and a total of about 370 GB:
1 x 90 GB
5 x 50 GB
3 x 10 GB
I started exporting the VM to an export domain 2 hours and 10 minutes ago,
at 15:46.
At the beginning, the write rate on the export domain was 120 MB/s (in line
with the I/O capabilities of the storage subsystem).
It seems 5 disks completed OK, while 4 have not completed, even though there
still seems to be some read activity:
With the command
iotop -d 3 -k -o -P
I get this on the hypervisor where the qemu-img convert commands are executing:
Total DISK READ : 5712.75 K/s | Total DISK WRITE : 746.58 K/s
Actual DISK READ: 6537.11 K/s | Actual DISK WRITE: 6.89 K/s
PID PRIO USER DISK READ> DISK WRITE SWAPIN IO COMMAND
21238 idle vdsm 1344.18 K/s 183.12 K/s 0.00 % 0.15 % qemu-img convert -p -t non~1d23-4829-8350-f81fa16ea8b0
21454 idle vdsm 1344.18 K/s 183.12 K/s 0.00 % 0.06 % qemu-img convert -p -t non~443d-4563-8879-3dbd711b6936
21455 idle vdsm 1344.18 K/s 189.68 K/s 0.00 % 0.08 % qemu-img convert -p -t non~b67d-4acd-9bb5-485bc3ffdf2e
21548 idle vdsm 1344.18 K/s 190.67 K/s 0.00 % 0.06 % qemu-img convert -p -t non~7fc
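Since iotop truncates the command lines, a quick way to see which source and
destination each of the four remaining qemu-img convert processes is working
on, and how long each has been running, is (a minimal sketch, run on the same
hypervisor):

ps -eo pid,etime,args | grep [q]emu-img

The [q] in the pattern just keeps the grep process itself out of the output.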
On the export share:
[root@xfer ~]# ll -td
/export/ovirt/a6a289ea-f160-4d35-b3aa-e59d171b4633/images/*
drwxr-xr-x 2 36 36 4096 Jan 21 17:21
/export/ovirt/a6a289ea-f160-4d35-b3aa-e59d171b4633/images/ac33a8fe-f10e-4cbb-b121-2956f9925ade
drwxr-xr-x 2 36 36 4096 Jan 21 17:06
/export/ovirt/a6a289ea-f160-4d35-b3aa-e59d171b4633/images/20c07082-d5ed-4804-867d-1f3f7c202b0e
drwxr-xr-x 2 36 36 4096 Jan 21 17:02
/export/ovirt/a6a289ea-f160-4d35-b3aa-e59d171b4633/images/f8b1c6bc-de13-4416-9fe6-d7c26fb082b1
drwxr-xr-x 2 36 36 4096 Jan 21 17:02
/export/ovirt/a6a289ea-f160-4d35-b3aa-e59d171b4633/images/1b5313ef-0ee5-4f9a-b2a6-baa153ff0984
drwxr-xr-x 2 36 36 4096 Jan 21 17:02
/export/ovirt/a6a289ea-f160-4d35-b3aa-e59d171b4633/images/3f21c566-bc00-4dab-acbb-db9a3a1d76fa
drwxr-xr-x 2 36 36 4096 Jan 21 15:47
/export/ovirt/a6a289ea-f160-4d35-b3aa-e59d171b4633/images/ed3a721e-fd1b-4fea-ad41-7ca56e94890e
drwxr-xr-x 2 36 36 4096 Jan 21 15:46
/export/ovirt/a6a289ea-f160-4d35-b3aa-e59d171b4633/images/213f9ce9-6c34-4911-820e-4d1a96ba1791
drwxr-xr-x 2 36 36 4096 Jan 21 15:46
/export/ovirt/a6a289ea-f160-4d35-b3aa-e59d171b4633/images/28fef57c-4254-4469-9f3f-0d1f114d4f78
drwxr-xr-x 2 36 36 4096 Jan 21 15:46
/export/ovirt/a6a289ea-f160-4d35-b3aa-e59d171b4633/images/ea4fa5b1-a93a-4cb9-9552-e8ec67c5ff75
[root@xfer ~]# du -sh
/export/ovirt/a6a289ea-f160-4d35-b3aa-e59d171b4633/images/*
11G
/export/ovirt/a6a289ea-f160-4d35-b3aa-e59d171b4633/images/1b5313ef-0ee5-4f9a-b2a6-baa153ff0984
51G
/export/ovirt/a6a289ea-f160-4d35-b3aa-e59d171b4633/images/20c07082-d5ed-4804-867d-1f3f7c202b0e
51G
/export/ovirt/a6a289ea-f160-4d35-b3aa-e59d171b4633/images/213f9ce9-6c34-4911-820e-4d1a96ba1791
51G
/export/ovirt/a6a289ea-f160-4d35-b3aa-e59d171b4633/images/28fef57c-4254-4469-9f3f-0d1f114d4f78
11G
/export/ovirt/a6a289ea-f160-4d35-b3aa-e59d171b4633/images/3f21c566-bc00-4dab-acbb-db9a3a1d76fa
85G
/export/ovirt/a6a289ea-f160-4d35-b3aa-e59d171b4633/images/ac33a8fe-f10e-4cbb-b121-2956f9925ade
51G
/export/ovirt/a6a289ea-f160-4d35-b3aa-e59d171b4633/images/ea4fa5b1-a93a-4cb9-9552-e8ec67c5ff75
51G
/export/ovirt/a6a289ea-f160-4d35-b3aa-e59d171b4633/images/ed3a721e-fd1b-4fea-ad41-7ca56e94890e
11G
/export/ovirt/a6a289ea-f160-4d35-b3aa-e59d171b4633/images/f8b1c6bc-de13-4416-9fe6-d7c26fb082b1
[root@xfer ~]#
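To see how far along each destination image actually is, a hedged check on the
export host (assuming qemu-img is installed there, and that each image
directory holds the raw image plus oVirt .meta/.lease files) is to compare the
virtual size with the disk size reported by qemu-img info:

for d in /export/ovirt/a6a289ea-f160-4d35-b3aa-e59d171b4633/images/*; do
  echo "== $d"
  for f in "$d"/*; do
    case "$f" in
      *.meta|*.lease) continue ;;   # skip oVirt metadata/lease files
    esac
    # on qemu versions with image locking, -U (--force-share) may be needed
    qemu-img info "$f" | grep -E 'virtual size|disk size'
  done
done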
It seems there has been no more progress for more than 30 minutes...
I see no errors in the hypervisor's vdsm.log, and in engine.log I keep getting
this message every 10 seconds:
2019-01-21 18:08:19,060+01 INFO
[org.ovirt.engine.core.bll.ConcurrentChildCommandsExecutionCallback]
(EE-ManagedThreadFactory-engineScheduled-Thread-10)
[a1c0e39f-1924-4ae8-8b48-30bcce669278] Command 'ExportVm' (id:
'dc0a58bd-e8ca-42a9-8de0-dd3ccabe0512') waiting on child command id:
'29d40db8-a0b9-4408-9e62-1abf3bccca65' type:'CopyImageGroup' to complete
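To narrow down which copy the engine is actually waiting on, a simple check
(the id is taken from the engine.log line above; standard log paths assumed)
is:

# on the engine host: history of the child CopyImageGroup command
grep 29d40db8-a0b9-4408-9e62-1abf3bccca65 /var/log/ovirt-engine/engine.log | tail -20

# on the hypervisor: the qemu-img convert jobs as vdsm started them
grep -i qemu-img /var/log/vdsm/vdsm.log | tail -20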
All the disks except the "apparently" incomplete one (the 90 GB disk) are
preallocated.
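To double check the allocation policy of the disks still being copied, their
sparse/format attributes can also be read from the REST API, something like
(engine hostname, credentials and disk id here are placeholders, not taken
from this setup):

# returns the disk representation, including <sparse> and <format>
curl -s -k -u admin@internal:password \
  https://engine.example.com/ovirt-engine/api/disks/<disk-id>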
Any other hints on what to check?
Thanks,
Gianluca