Hi oVirt gurus,
This is an interesting issue, one I never expected to have.
When I push high volumes of writes to my NAS, VMs go into a paused state. I'm looking at this from a number of angles, including upgrades on the NAS appliance.
I can reproduce this problem at will by running a CentOS 7.9 VM on oVirt 4.5.
1. Is my analysis of the failure (below) reasonable/correct?
2. What am I looking for to validate this?
3. Is there a configuration that I can set to make it a little more robust while I acquire the hardware to improve the NAS?
Standard test of file write speed:
[root@cen-79-pgsql-01 ~]# dd if=/dev/zero of=./test bs=512k count=4096 oflag=direct
4096+0 records in
4096+0 records out
2147483648 bytes (2.1 GB) copied, 1.68431 s, 1.3 GB/s
Give it more data:
[root@cen-79-pgsql-01 ~]# dd if=/dev/zero of=./test bs=512k count=12228 oflag=direct
12228+0 records in
12228+0 records out
6410993664 bytes (6.4 GB) copied, 7.22078 s, 888 MB/s
The odds are about 50/50 that a 6 GB write will pause the VM, and it happens 100% of the time when I hit 8 GB.
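To make the reproduction repeatable, the runs above can be wrapped in a small helper. This is only a sketch: blocks_for_gib and write_test are names I made up for illustration, and it uses the same dd flags as the runs above. It works in whole GiB, so a 6 GiB run comes out as 12288 blocks rather than the 12228 I typed above.

```shell
#!/bin/bash
# Hypothetical helper: number of 512 KiB blocks needed to write N GiB.
blocks_for_gib() {
    local gib=$1
    echo $(( gib * 1024 * 1024 * 1024 / (512 * 1024) ))
}

# Hypothetical helper: one direct-I/O write of N GiB to a target file
# (defaults to ./test, the same file used in the manual runs).
write_test() {
    local gib=$1
    local target=${2:-./test}
    dd if=/dev/zero of="$target" bs=512k count="$(blocks_for_gib "$gib")" oflag=direct
}
```

Calling write_test 2, then write_test 6, then write_test 8 inside the guest steps through the same sizes as above, which should make it easier to see at which size the VM pauses.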
What I think is happening is that the intent cache on the NAS is on an SSD, and my VMs are pushing data about three times as fast as the SSD can handle. When the SSD gets queued up beyond a certain point, the NAS (which places reliability over speed) says "Whoa, Nellie!", and the VM chokes.
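As a rough back-of-envelope for that theory: if the SSD drains at a steady one third of the incoming rate and nothing else buffers the writes (both are assumptions on my part, not measurements), then about two thirds of what was written is still queued when the run ends. backlog_bytes below is a made-up helper name, just to show the arithmetic.

```shell
#!/bin/bash
# Hypothetical helper: bytes still queued after writing $1 bytes,
# assuming the writer is $2 times faster than the device draining the queue.
# Uses integer arithmetic, so the result is rounded to whole bytes.
backlog_bytes() {
    local written=$1
    local ratio=$2
    echo $(( written - written / ratio ))
}
```

For the 6.4 GB run above, backlog_bytes 6410993664 3 gives 4273995776, so roughly 4.3 GB would be queued under those assumptions.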