
You can always check the queue_if_no_path multipath.conf option and give it a try. But if the system that is queueing gets rebooted, that means data loss, so use it at your own risk. Don't forget that the higher you go in the I/O chain, the higher the timeout needs to be, so your VM should also use multipath with that option, in addition to the host. Still, we can't help you if you use that feature and you lose data.
Best Regards,
Strahil Nikolov

On Oct 10, 2019 10:55, Francesco Romani <fromani@redhat.com> wrote:
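A minimal sketch of what enabling unbounded queueing could look like in multipath.conf (the device section is illustrative, not taken from the thread; check `man multipath.conf` on your distribution before using it):

```
devices {
    device {
        all_devs yes
        # Queue I/O indefinitely while no path is available.
        # Equivalent to "no_path_retry queue"; risks data loss if the
        # queueing host reboots while I/O is still held in the queue.
        features "1 queue_if_no_path"
    }
}
```

With a bounded alternative (`no_path_retry <N>`), I/O errors are passed up after N polling intervals instead of being held forever.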
On 10/10/19 9:07 AM, Gianluca Cecchi wrote:
How is determined the timeout to use to put the VM in pause mode?
The VM is paused immediately, as soon as libvirt (through QEMU) reports an I/O error, to avoid data corruption. When libvirt reports this error
depends largely on the timeout set in the storage configuration, which is done at host level using system tools (i.e. it is not a Vdsm tunable).
For testing I have set this in the host's multipath.conf:
devices {
    device {
        all_devs yes
        # Set queueing timeout of 5*28 = 140 seconds,
        # similar to the vSphere APD timeout
        # no_path_retry fail
        no_path_retry 28
        polling_interval 5
    }
}
So it should wait at least 140 seconds before passing the error to the upper layer, correct?
AFAICT yes
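That 140-second figure follows directly from the two settings; a quick sketch of the arithmetic (values taken from the config above):

```shell
# Effective multipath queueing window before I/O errors are passed up:
# no_path_retry (number of retry intervals) * polling_interval (seconds).
no_path_retry=28
polling_interval=5
queue_window=$((no_path_retry * polling_interval))
echo "queue window: ${queue_window}s"   # prints "queue window: 140s"
```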
Sometimes I see that after the problems clear the VM is automatically un-paused, and sometimes not: how is this managed?
I noticed that if I set the disk as virtio-scsi (virtio seems to have no definable timeout and passes the error straight to the upper layer) and set the VM disk timeout to 180 seconds through a udev rule, I can block access to the storage for, say, 100 seconds, the host is able to reinstate the paths, and the VM is always unpaused afterwards. But I would like to prevent the VM from pausing at all. What else is there to tweak?
The only way Vdsm will not pause the VM is if libvirt+qemu never reports any ioerror, which is something I'm not sure is possible and that I'd never recommend anyway.
Vdsm always tries hard to be super-careful with respect to possible data corruption.
Bests,
--
Francesco Romani
Senior SW Eng., Virtualization R&D
Red Hat
IRC: fromani github: @fromanirh

On Fri, Oct 11, 2019 at 6:04 AM Strahil <hunter86_bg@yahoo.com> wrote:
You can always check the *queue_if_no_path* multipath.conf option and give it a try.
This setting would be at the host side, where it is fine to put a timeout of X seconds using entries such as:

devices {
    device {
        all_devs yes
        # Set queueing timeout of 5*28 = 140 seconds,
        # similar to the vSphere APD timeout
        # no_path_retry fail
        no_path_retry 28
        polling_interval 5
    }
}
Don't forget that the higher in I/O chain you go - the higher the timeout is needed, so your VM should also use multipath with that option, in addition to the host.
Yes, in fact at the guest side I put a udev rule:

ACTION=="add", SUBSYSTEMS=="scsi", ATTRS{vendor}=="QEMU*", ATTRS{model}=="QEMU HARDDISK*", ENV{DEVTYPE}=="disk", RUN+="/bin/sh -c 'echo 180 > /sys$DEVPATH/device/timeout'"

so that the guest disk timeout is longer than the host storage timeout. Any other settings?

Gianluca
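The rule of thumb in this thread, a guest disk timeout strictly greater than the host's multipath queueing window, can be sketched as follows (the 40-second headroom is an assumption for illustration, not a value from the thread):

```shell
# Host-side queueing window from multipath.conf:
# no_path_retry * polling_interval.
host_queue_window=$((28 * 5))        # 140s on the host
headroom=40                          # extra margin for path recovery (assumed)
guest_disk_timeout=$((host_queue_window + headroom))
echo "$guest_disk_timeout"           # 180, the value the udev rule writes
```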
participants (2)
- Gianluca Cecchi
- Strahil