On Sun, Jul 4, 2021 at 1:01 PM Nir Soffer <nsoffer@redhat.com> wrote:
On Sun, Jul 4, 2021 at 11:30 AM Strahil Nikolov <hunter86_bg@yahoo.com> wrote:
>
> Isn't it better to strace it before killing qemu-img?

It may be too late, but it may help to understand why this qemu-img
run got stuck.


Hi, thanks for your answers and suggestions.
That env was a production one, so I was forced to power the hypervisor off and on again (it was a maintenance window with all the VMs powered down anyway). I was also unable to put the host into maintenance, because it reported that some tasks were still running even after the kill: the two qemu-img processes (the VM had two disks to export, hence two processes) stayed defunct, and after several minutes there was still no change in the web admin feedback about the task.

My first suspicion was firewall congestion, because the hypervisor network and the NAS appliance are on different networks and I wasn't sure whether a firewall was in the path.
But on a test oVirt environment with the same oVirt version and the same hypervisor network, I set up a Linux server on the same network as the NAS and configured it as an NFS server.
That export ran at a throughput of about 50 MB/s, so no firewall problem.
A VM with a 55 GB disk exported in 19 minutes (55 * 1024 MB / (19 * 60 s) ≈ 49 MB/s, consistent with that figure).

So I got the rights to mount the NAS on the test env and mounted it there as an export domain, and now I can reproduce the same problem and debug it.
The same VM, this time with only one disk (55 GB). The process:

vdsm     14342  3270  0 11:17 ?        00:00:03 /usr/bin/qemu-img convert -p -t none -T none -f raw /rhev/data-center/mnt/blockSD/679c0725-75fb-4af7-bff1-7c447c5d789c/images/530b3e7f-4ce4-4051-9cac-1112f5f9e8b5/d2a89b5e-7d62-4695-96d8-b762ce52b379 -O raw -o preallocation=falloc /rhev/data-center/mnt/172.16.1.137:_nas_EXPORT-DOMAIN/20433d5d-9d82-4079-9252-0e746ce54106/images/530b3e7f-4ce4-4051-9cac-1112f5f9e8b5/d2a89b5e-7d62-4695-96d8-b762ce52b379

On the hypervisor the ls command pretty much hangs, so I checked from another hypervisor: the disk size seems to stay at about 4 GB even though the timestamp keeps updating...

# ll /rhev/data-center/mnt/172.16.1.137\:_nas_EXPORT-DOMAIN/20433d5d-9d82-4079-9252-0e746ce54106/images/530b3e7f-4ce4-4051-9cac-1112f5f9e8b5/
total 4260941
-rw-rw----. 1 nobody nobody 4363202560 Jul  5 11:23 d2a89b5e-7d62-4695-96d8-b762ce52b379
-rw-r--r--. 1 nobody nobody        261 Jul  5 11:17 d2a89b5e-7d62-4695-96d8-b762ce52b379.meta
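
If useful, from the working hypervisor I can also compare the apparent size with the blocks actually allocated for the destination file (same directory as above), to confirm it is barely growing; I guess something like:

# cd /rhev/data-center/mnt/172.16.1.137\:_nas_EXPORT-DOMAIN/20433d5d-9d82-4079-9252-0e746ce54106/images/530b3e7f-4ce4-4051-9cac-1112f5f9e8b5/
# du -h --apparent-size d2a89b5e-7d62-4695-96d8-b762ce52b379
# du -h d2a89b5e-7d62-4695-96d8-b762ce52b379
# stat d2a89b5e-7d62-4695-96d8-b762ce52b379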

On the host console I see a throughput of about 4 Mbit/s...
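
To put a number on it directly on the NFS mount, I suppose nfsiostat / mountstats (from nfs-utils) can show the per-mount write throughput; something like:

# nfsiostat 5 /rhev/data-center/mnt/172.16.1.137:_nas_EXPORT-DOMAIN
# mountstats --nfs /rhev/data-center/mnt/172.16.1.137:_nas_EXPORT-DOMAIN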

# strace -p 14342
strace: Process 14342 attached
ppoll([{fd=9, events=POLLIN|POLLERR|POLLHUP}], 1, NULL, NULL, 8
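
If I understand correctly, this ppoll is probably just the qemu-img main loop and the actual writes happen in worker threads, so I could re-attach following all threads and also dump their kernel stacks; something like:

# strace -f -tt -T -p 14342 -e trace=pwrite64,fdatasync,fallocate
# cat /proc/14342/task/*/stack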

# ll /proc/14342/fd
hangs...

# nfsstat -v
Client packet stats:
packets    udp        tcp        tcpconn
0          0          0          0      

Client rpc stats:
calls      retrans    authrefrsh
31171856   0          31186615

Client nfs v4:
null         read         write        commit       open         open_conf    
0         0% 2339179   7% 14872911 47% 7233      0% 74956     0% 2         0%
open_noat    open_dgrd    close        setattr      fsinfo       renew        
2312347   7% 0         0% 2387293   7% 24        0% 23        0% 5         0%
setclntid    confirm      lock         lockt        locku        access      
3         0% 3         0% 8         0% 8         0% 5         0% 1342746   4%
getattr      lookup       lookup_root  remove       rename       link        
3031001   9% 71551     0% 7         0% 74590     0% 6         0% 0         0%
symlink      create       pathconf     statfs       readlink     readdir      
0         0% 9         0% 16        0% 4548231  14% 0         0% 98506     0%
server_caps  delegreturn  getacl       setacl       fs_locations rel_lkowner  
39        0% 14        0% 0         0% 0         0% 0         0% 0         0%
secinfo      exchange_id  create_ses   destroy_ses  sequence     get_lease_t  
0         0% 0         0% 4         0% 2         0% 1         0% 0         0%
reclaim_comp layoutget    getdevinfo   layoutcommit layoutreturn getdevlist  
0         0% 2         0% 0         0% 0         0% 0         0% 0         0%
(null)      
5         0%


# vmstat 3
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 3  1      0 82867112 437548 7066580    0    0    54     1    0    0  0  0 100  0  0
 0  1      0 82867024 437548 7066620    0    0  1708     0 3720 8638  0  0 95  4  0
 4  1      0 82868728 437552 7066616    0    0   875     9 3004 8457  0  0 95  4  0
 0  1      0 82869600 437552 7066636    0    0  1785     6 2982 8359  0  0 95  4  0

I can see the blocked process (the "b" column), which is my qemu-img one...
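
To list exactly which tasks are in uninterruptible sleep and where they are waiting, I think something like this should do:

# ps -eLo pid,tid,stat,wchan:32,cmd | awk '$3 ~ /D/'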

In /var/log/messages on the hypervisor:

Jul  5 11:33:06 node4 kernel: INFO: task qemu-img:14343 blocked for more than 120 seconds.
Jul  5 11:33:06 node4 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul  5 11:33:06 node4 kernel: qemu-img        D ffff9d960e7e1080     0 14343   3328 0x00000080
Jul  5 11:33:06 node4 kernel: Call Trace:
Jul  5 11:33:06 node4 kernel: [<ffffffffa72de185>] ? sched_clock_cpu+0x85/0xc0
Jul  5 11:33:06 node4 kernel: [<ffffffffa72da830>] ? try_to_wake_up+0x190/0x390
Jul  5 11:33:06 node4 kernel: [<ffffffffa7988089>] schedule_preempt_disabled+0x29/0x70
Jul  5 11:33:06 node4 kernel: [<ffffffffa7985ff7>] __mutex_lock_slowpath+0xc7/0x1d0
Jul  5 11:33:06 node4 kernel: [<ffffffffa79853cf>] mutex_lock+0x1f/0x2f
Jul  5 11:33:06 node4 kernel: [<ffffffffc0db5489>] nfs_start_io_write+0x19/0x40 [nfs]
Jul  5 11:33:06 node4 kernel: [<ffffffffc0dad0d1>] nfs_file_write+0x81/0x1e0 [nfs]
Jul  5 11:33:06 node4 kernel: [<ffffffffa744d063>] do_sync_write+0x93/0xe0
Jul  5 11:33:06 node4 kernel: [<ffffffffa744db50>] vfs_write+0xc0/0x1f0
Jul  5 11:33:06 node4 kernel: [<ffffffffa744eaf2>] SyS_pwrite64+0x92/0xc0
Jul  5 11:33:06 node4 kernel: [<ffffffffa7993ec9>] ? system_call_after_swapgs+0x96/0x13a
Jul  5 11:33:06 node4 kernel: [<ffffffffa7993f92>] system_call_fastpath+0x25/0x2a
Jul  5 11:33:06 node4 kernel: [<ffffffffa7993ed5>] ? system_call_after_swapgs+0xa2/0x13a

Could this be a problem with NFSv4? I see that it is mounted as NFSv4:

# mount
. . .
172.16.1.137:/nas/EXPORT-DOMAIN on /rhev/data-center/mnt/172.16.1.137:_nas_EXPORT-DOMAIN type nfs4 (rw,relatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,soft,nosharecache,proto=tcp,timeo=600,retrans=6,sec=sys,clientaddr=192.168.50.52,local_lock=none,addr=172.16.1.137)
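
If NFSv4 is a suspect, I can try a manual mount with vers=3 (or 4.1, if the NAS supports it) outside oVirt and repeat a plain write test for comparison; something like:

# mkdir -p /mnt/nfstest
# mount -t nfs -o vers=3,proto=tcp 172.16.1.137:/nas/EXPORT-DOMAIN /mnt/nfstest
# dd if=/dev/zero of=/mnt/nfstest/testfile bs=1M count=1024 oflag=direct
# umount /mnt/nfstest

And if that behaves better, if I remember correctly I can force the NFS version for the export domain in the advanced parameters in the web admin.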

This is a test oVirt env, so I can wait and test things if needed...
Let me know your suggestions.

Gianluca