
I replaced the SSD intent log drive with an NVMe drive, and the system is much more stable now.
*David Johnson*
*Director of Development, Maxis Technology*
On Thu, May 27, 2021 at 5:57 PM David Johnson <djohnson@maxistechnology.com> wrote:
Hi oVirt gurus,
This is an interesting issue, one I never expected to have.
When I push high volumes of writes to my NAS, VMs go into a paused state. I'm looking at this from a number of angles, including upgrades on the NAS appliance.
I can reproduce this problem at will by running a CentOS 7.9 VM on oVirt 4.5.
*Questions:*
1. Is my analysis of the failure (below) reasonable/correct?
2. What am I looking for to validate this?
3. Is there a configuration that I can set to make it a little more robust while I acquire the hardware to improve the NAS?
*Reproduction:*
Standard test of file write speed:
[root@cen-79-pgsql-01 ~]# dd if=/dev/zero of=./test bs=512k count=4096 oflag=direct
4096+0 records in
4096+0 records out
2147483648 bytes (2.1 GB) copied, 1.68431 s, 1.3 GB/s
Give it more data:
[root@cen-79-pgsql-01 ~]# dd if=/dev/zero of=./test bs=512k count=12228 oflag=direct
12228+0 records in
12228+0 records out
6410993664 bytes (6.4 GB) copied, 7.22078 s, 888 MB/s
The odds are about 50/50 that a 6 GB write will pause the VM, and it pauses 100% of the time when I hit 8 GB.
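To narrow down where the failures start, the two runs above could be turned into a stepped sweep. This is only a minimal sketch: the intermediate counts, the 30-second pause, and the function names are my own illustrative choices, and it assumes the same GNU dd options used above. Nothing runs until `sweep` is called explicitly on the test VM.

```shell
#!/usr/bin/env bash
# Sketch: repeat the direct-I/O dd test at increasing sizes.
# The count list and the pause length are illustrative, not tuned values.

BLOCK_KIB=512                    # matches bs=512k in the runs above
TEST_FILE=${TEST_FILE:-./test}   # same target file as the manual test

# Bytes written for a given block count.
bytes_for_count() {
    echo $(( $1 * BLOCK_KIB * 1024 ))
}

# Run one dd pass and print the size plus dd's summary line.
run_pass() {
    local count=$1
    echo "count=$count bytes=$(bytes_for_count "$count")"
    dd if=/dev/zero of="$TEST_FILE" bs="${BLOCK_KIB}k" count="$count" \
        oflag=direct 2>&1 | tail -n 1
}

# Step from the size that succeeded toward the size that sometimes failed.
sweep() {
    local count
    for count in 4096 6144 8192 10240 12228; do
        run_pass "$count"
        sleep 30   # arbitrary pause so the NAS can drain between passes
    done
}
```

Running `sweep` prints one size line and one dd summary line per pass, so the last line printed before the VM pauses shows roughly which size triggered it.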
*Analysis:*
What I think is happening is that the intent cache on the NAS is on an SSD, and my VMs are pushing data about three times as fast as the SSD can handle. When the SSD gets queued up beyond a certain point, the NAS (which places reliability over speed) says "Whoa Nellie!", and the VM chokes.
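The arithmetic behind that guess can be sketched roughly as follows. It assumes the "about three times as fast" ratio holds exactly and that nothing else buffers the writes, neither of which has been measured; `RATIO` and `backlog_bytes` are names made up for the illustration.

```shell
# Rough backlog estimate under the guess above.
# RATIO=3 is the "about three times as fast" estimate, not a measurement.
RATIO=3

# Bytes still queued when a burst of $1 bytes finishes arriving, if the
# intent device drains at 1/RATIO of the incoming rate and nothing else
# absorbs the data.
backlog_bytes() {
    echo $(( $1 * (RATIO - 1) / RATIO ))
}

backlog_bytes 2147483648   # the 2.1 GB run that completed
backlog_bytes 6410993664   # the 6.4 GB run
```

Under those assumptions, about two thirds of each burst is still waiting on the SSD when dd finishes, so the queued amount grows in proportion to the size of the write.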
*David Johnson*