On Tue, Oct 8, 2019 at 4:06 PM Gianluca Cecchi <gianluca.cecchi(a)gmail.com>
wrote:
Hello,
I'm running some tests where I deliberately create storage latency or
outages, to debug and manage how hosts and VMs react.
Which subsystem/process/daemon is responsible for pausing a VM when
storage problems arise on the host where the VM is running?
How is the timeout for putting the VM in pause mode determined?
After clearing the problems, I sometimes see the VM automatically
un-paused and sometimes not: how is this managed? Are there any counters,
so that if a VM has been paused and the problems are not solved within a
certain timeframe, it can only be unpaused manually by the sysadmin?
Thanks in advance,
Gianluca
I have noticed that when the virtual disk is virtio, the VM cannot be
unpaused after storage has been unreachable for many seconds, while if I
use virtio-scsi and set a high virtual disk timeout (as vSphere does on VMs
when VMware Tools are installed), the VM can be resumed.
The udev rule I have put into a CentOS 7 VM, in
/etc/udev/rules.d/99-ovirt.rules, is this one:
# Set timeout of virtio-SCSI disks to 180 seconds like vSphere vmware tools
#
ACTION=="add", SUBSYSTEMS=="scsi", ATTRS{vendor}=="QEMU*", ATTRS{model}=="QEMU HARDDISK*", ENV{DEVTYPE}=="disk", RUN+="/bin/sh -c 'echo 180 > /sys$DEVPATH/device/timeout'"
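To check that the rule took effect, the timeout can be read back from sysfs inside the guest. This is only a minimal sketch: the function name show_scsi_timeouts and its optional sysfs-root argument are made up for illustration, and it only looks at sd* devices.

```shell
# Print the SCSI command timeout (in seconds) of every sd* disk.
# $1 is an optional sysfs root (defaults to /sys); it exists only so the
# function can be tried against a fake directory tree.
show_scsi_timeouts() {
    root="${1:-/sys}"
    for f in "$root"/block/sd*/device/timeout; do
        [ -e "$f" ] || continue          # glob did not match anything
        dev="${f#"$root"/block/}"        # strip the leading path
        dev="${dev%%/*}"                 # keep only the device name
        echo "$dev $(cat "$f")"
    done
}
```

Called with no arguments inside the guest, it should report 180 for each virtio-SCSI disk that was added after the rule was installed.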
What I have not understood is whether it is possible to prevent vdsm (is
it the responsible component?) from suddenly putting the VM in paused
state at all.
E.g., as an experiment, I have iSCSI-based storage domains and put this in
multipath.conf:
devices {
    device {
        all_devs yes
        # Set timeout of queuing of 5*28 = 140 seconds
        # similar to vSphere APD timeout
        # no_path_retry fail
        no_path_retry 28
        polling_interval 5
    }
}
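The 140 seconds in the comment above come from multiplying no_path_retry by polling_interval. A quick sketch of that arithmetic, assuming the queueing window really is the plain product as the comment states (the function name queue_window is mine):

```shell
# Queueing window in seconds = no_path_retry * polling_interval,
# following the formula in the multipath.conf comment.
queue_window() {
    echo $(( $1 * $2 ))
}

queue_window 28 5    # no_path_retry=28, polling_interval=5 -> prints 140
```

With these values a 100-second outage is shorter than the queueing window, and the 180-second disk timeout set in the guest is longer than it.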
Then I create an iptables rule that prevents the host from reaching the
storage for 100 seconds, and a dd task that writes to disk inside the VM.
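For reference, a test of this kind could be sketched as below. These are not the exact commands used here: the portal address 192.0.2.10, the output file path, the dd size and the helper names (run, block_storage, write_load) are placeholders, and DRY_RUN is set to 1 so that the commands are only printed.

```shell
# Placeholder values: adjust to your environment.
PORTAL_IP="192.0.2.10"   # hypothetical iSCSI portal address
OUTAGE=100               # seconds of simulated outage
DRY_RUN=1                # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# On the host: drop iSCSI traffic (TCP port 3260) to the portal,
# wait for the outage time, then remove the rule again.
block_storage() {
    run iptables -I OUTPUT -d "$PORTAL_IP" -p tcp --dport 3260 -j DROP
    run sleep "$OUTAGE"
    run iptables -D OUTPUT -d "$PORTAL_IP" -p tcp --dport 3260 -j DROP
}

# Inside the VM: write with direct I/O so the writes go to the virtual
# disk instead of sitting in the guest page cache.
write_load() {
    run dd if=/dev/zero of="${1:-/var/tmp/ddtest.img}" bs=1M count=10000 oflag=direct
}
```

block_storage is meant for the host (as root) and write_load for the guest; with DRY_RUN=1 both just show what would be run.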
The effect is that the VM is paused and, after about 100 seconds, resumes
(events listed newest first):
VM mydbsrv has recovered from paused back to up. 10/9/19 1:59:02 PM
VM mydbsrv has been paused due to storage I/O problem. 10/9/19 1:57:32 PM
VM mydbsrv has been paused. 10/9/19 1:57:32 PM
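The time spent paused can be worked out from the two event timestamps. A small sketch, assuming GNU date (for the -d option) and treating both timestamps as UTC only to keep the subtraction free of timezone effects; the function name seconds_between is mine:

```shell
# Seconds between two timestamps in the format used by the events above.
# Relies on GNU date's -d parsing of "M/D/YY H:MM:SS PM".
seconds_between() {
    start=$(date -u -d "$1" +%s)
    end=$(date -u -d "$2" +%s)
    echo $(( end - start ))
}

seconds_between "10/9/19 1:57:32 PM" "10/9/19 1:59:02 PM"
```

For the events above this gives 90 seconds between the pause and the recovery.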
Any hint on how to prevent the VM from being paused?
Thanks,
Gianluca