I replaced the SSD intent log drive with an NVMe drive, and the system is much more stable now.

David Johnson
Director of Development, Maxis Technology
844.696.2947 ext 702 (o) | 479.531.3590 (c)

 


On Thu, May 27, 2021 at 5:57 PM David Johnson <djohnson@maxistechnology.com> wrote:
Hi oVirt gurus,

This is an interesting issue, one I never expected to have.

When I push high volumes of writes to my NAS, VMs go into a paused state. I'm looking at this from a number of angles, including upgrades to the NAS appliance.

I can reproduce this problem at will by running a CentOS 7.9 VM on oVirt 4.5.

Questions:

1. Is my analysis of the failure (below) reasonable/correct?

2. What am I looking for to validate this?

3. Is there a configuration that I can set to make it a little more robust while I acquire the hardware to improve the NAS?


Reproduction:

Standard test of file write speed:

[root@cen-79-pgsql-01 ~]# dd if=/dev/zero of=./test bs=512k count=4096 oflag=direct
4096+0 records in
4096+0 records out
2147483648 bytes (2.1 GB) copied, 1.68431 s, 1.3 GB/s

Give it more data

[root@cen-79-pgsql-01 ~]# dd if=/dev/zero of=./test bs=512k count=12228 oflag=direct
12228+0 records in
12228+0 records out
6410993664 bytes (6.4 GB) copied, 7.22078 s, 888 MB/s


The odds are about 50/50 that a 6 GB write will pause the VM, and it happens 100% of the time when I hit 8 GB.

Analysis:

I think what is happening is that the intent log on the NAS is on an SSD, and my VMs are pushing data about three times as fast as the SSD can handle. When the SSD's queue backs up beyond a certain point, the NAS (which places reliability over speed) says "Whoa Nellie!", and the VM chokes.
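
To put rough numbers on that, here is a small sketch that recomputes the byte counts and throughput from the two dd runs above and derives the SSD rate implied by the "about three times" estimate. It is pure arithmetic and does no disk I/O. The helper names dd_bytes and mb_per_s are made up for this illustration, and the ratio of 3 is the estimate from the analysis, not a measured value.

```shell
# Sanity-check the dd figures quoted in the reproduction (no disk I/O).

# Total bytes written by dd for a block size in KiB and a block count.
dd_bytes() {
    local bs_kib=$1 count=$2
    echo $(( bs_kib * 1024 * count ))
}

# Throughput in decimal MB/s (the unit dd reports), rounded to an integer.
mb_per_s() {
    local bytes=$1 seconds=$2
    awk -v b="$bytes" -v s="$seconds" 'BEGIN { printf "%.0f\n", b / s / 1e6 }'
}

small=$(dd_bytes 512 4096)     # first run:  bs=512k count=4096
large=$(dd_bytes 512 12228)    # second run: bs=512k count=12228

echo "small run: $small bytes at $(mb_per_s "$small" 1.68431) MB/s"
echo "large run: $large bytes at $(mb_per_s "$large" 7.22078) MB/s"

# ASSUMPTION: "about three times as fast" is an estimate, not a measurement.
ratio=3
echo "implied SSD rate: $(( $(mb_per_s "$large" 7.22078) / ratio )) MB/s"
```

If the estimate holds, the SSD would be sustaining somewhere around 300 MB/s while the VM writes at close to 900 MB/s, so the backlog grows for as long as the write continues. That would fit with larger writes being more likely to pause the VM.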

