On Sun, Jul 4, 2021 at 1:01 PM Nir Soffer <nsoffer@redhat.com> wrote:

On Sun, Jul 4, 2021 at 11:30 AM Strahil Nikolov <hunter86_bg@yahoo.com> wrote:
>
> Isn't it better to strace it before killing qemu-img .

It may be too late, but it may help to understand why this qemu-img
run got stuck.

Hi, thanks for your answers and suggestions.

That env was a production one and so I was forced to power off the hypervisor and power on it again (it was a maintenance window with all the VMs powered down anyway). I was also unable to put the host into maintenance because it replied that there were some tasks running, even after the kill, because the 2 processes (the VM had 2 disks to export and so two qemu-img processes) remained in defunct and after several minutes no change in web admin feedback about the process....

My first suspicion was something related to fw congestion because the hypervisor network and the nas appliance were in different networks and I wasn't sure if a fw was in place for it....

But on a test oVirt environment with same oVirt version and with the same network for hypervisors I was able to put a Linux server with the same network as the nas and configure it as nfs server.

And the export went with a throughput of about 50MB/s, so no fw problem.

A VM with 55Gb disk exported in 19 minutes.

So I got the rights to mount the nas on the test env and mounted it as export domain and now I have the same problems I can debug.

The same VM with only one disk (55Gb). The process:

vdsm 14342 3270 0 11:17 ? 00:00:03 /usr/bin/qemu-img convert -p -t none -T none -f raw /rhev/data-center/mnt/blockSD/679c0725-75fb-4af7-bff1-7c447c5d789c/images/530b3e7f-4ce4-4051-9cac-1112f5f9e8b5/d2a89b5e-7d62-4695-96d8-b762ce52b379 -O raw -o preallocation=falloc /rhev/data-center/mnt/172.16.1.137:_nas_EXPORT-DOMAIN/20433d5d-9d82-4079-9252-0e746ce54106/images/530b3e7f-4ce4-4051-9cac-1112f5f9e8b5/d2a89b5e-7d62-4695-96d8-b762ce52b379

On the hypervisor the ls commands quite hang, so from another hypervisor I see that the disk size seems to remain at 4Gb even if timestamp updates...

# ll /rhev/data-center/mnt/172.16.1.137\:_nas_EXPORT-DOMAIN/20433d5d-9d82-4079-9252-0e746ce54106/images/530b3e7f-4ce4-4051-9cac-1112f5f9e8b5/
total 4260941
-rw-rw----. 1 nobody nobody 4363202560 Jul 5 11:23 d2a89b5e-7d62-4695-96d8-b762ce52b379
-rw-r--r--. 1 nobody nobody 261 Jul 5 11:17 d2a89b5e-7d62-4695-96d8-b762ce52b379.meta

On host console I see a throughput of 4mbit/s...

# strace -p 14342
strace: Process 14342 attached
ppoll([{fd=9, events=POLLIN|POLLERR|POLLHUP}], 1, NULL, NULL, 8

# ll /proc/14342/fd

hangs...

# nfsstat -v
Client packet stats:
packets udp tcp tcpconn
0 0 0 0

Client rpc stats:
calls retrans authrefrsh
31171856 0 31186615

Client nfs v4:
null read write commit open open_conf
0 0% 2339179 7% 14872911 47% 7233 0% 74956 0% 2 0%
open_noat open_dgrd close setattr fsinfo renew
2312347 7% 0 0% 2387293 7% 24 0% 23 0% 5 0%
setclntid confirm lock lockt locku access
3 0% 3 0% 8 0% 8 0% 5 0% 1342746 4%
getattr lookup lookup_root remove rename link
3031001 9% 71551 0% 7 0% 74590 0% 6 0% 0 0%
symlink create pathconf statfs readlink readdir
0 0% 9 0% 16 0% 4548231 14% 0 0% 98506 0%
server_caps delegreturn getacl setacl fs_locations rel_lkowner
39 0% 14 0% 0 0% 0 0% 0 0% 0 0%
secinfo exchange_id create_ses destroy_ses sequence get_lease_t
0 0% 0 0% 4 0% 2 0% 1 0% 0 0%
reclaim_comp layoutget getdevinfo layoutcommit layoutreturn getdevlist
0 0% 2 0% 0 0% 0 0% 0 0% 0 0%
(null)
5 0%

# vmstat 3
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
3 1 0 82867112 437548 7066580 0 0 54 1 0 0 0 0 100 0 0
0 1 0 82867024 437548 7066620 0 0 1708 0 3720 8638 0 0 95 4 0
4 1 0 82868728 437552 7066616 0 0 875 9 3004 8457 0 0 95 4 0
0 1 0 82869600 437552 7066636 0 0 1785 6 2982 8359 0 0 95 4 0

I see the blocked process that is my qemu-img one...

In messages of hypervisor

Jul 5 11:33:06 node4 kernel: INFO: task qemu-img:14343 blocked for more than 120 seconds.
Jul 5 11:33:06 node4 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 5 11:33:06 node4 kernel: qemu-img D ffff9d960e7e1080 0 14343 3328 0x00000080
Jul 5 11:33:06 node4 kernel: Call Trace:
Jul 5 11:33:06 node4 kernel: [<ffffffffa72de185>] ? sched_clock_cpu+0x85/0xc0
Jul 5 11:33:06 node4 kernel: [<ffffffffa72da830>] ? try_to_wake_up+0x190/0x390
Jul 5 11:33:06 node4 kernel: [<ffffffffa7988089>] schedule_preempt_disabled+0x29/0x70
Jul 5 11:33:06 node4 kernel: [<ffffffffa7985ff7>] __mutex_lock_slowpath+0xc7/0x1d0
Jul 5 11:33:06 node4 kernel: [<ffffffffa79853cf>] mutex_lock+0x1f/0x2f
Jul 5 11:33:06 node4 kernel: [<ffffffffc0db5489>] nfs_start_io_write+0x19/0x40 [nfs]
Jul 5 11:33:06 node4 kernel: [<ffffffffc0dad0d1>] nfs_file_write+0x81/0x1e0 [nfs]
Jul 5 11:33:06 node4 kernel: [<ffffffffa744d063>] do_sync_write+0x93/0xe0
Jul 5 11:33:06 node4 kernel: [<ffffffffa744db50>] vfs_write+0xc0/0x1f0
Jul 5 11:33:06 node4 kernel: [<ffffffffa744eaf2>] SyS_pwrite64+0x92/0xc0
Jul 5 11:33:06 node4 kernel: [<ffffffffa7993ec9>] ? system_call_after_swapgs+0x96/0x13a
Jul 5 11:33:06 node4 kernel: [<ffffffffa7993f92>] system_call_fastpath+0x25/0x2a
Jul 5 11:33:06 node4 kernel: [<ffffffffa7993ed5>] ? system_call_after_swapgs+0xa2/0x13a

Possibly problems with NFSv4? I see that it mounts as nfsv4:

# mount

. . .

172.16.1.137:/nas/EXPORT-DOMAIN on /rhev/data-center/mnt/172.16.1.137:_nas_EXPORT-DOMAIN type nfs4 (rw,relatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,soft,nosharecache,proto=tcp,timeo=600,retrans=6,sec=sys,clientaddr=192.168.50.52,local_lock=none,addr=172.16.1.137)

This is a test oVirt env so I can wait and eventually test something...

Let me know your suggestions

Gianluca