On 10/10/19 10:44 AM, Gianluca Cecchi wrote:
On Thu, Oct 10, 2019 at 9:56 AM Francesco Romani <fromani@redhat.com> wrote:

The only way Vdsm will not pause the VM is if libvirt+qemu never reports any ioerror, which is something I'm not sure is possible and that I'd never recommend anyway.

Vdsm always tries hard to be super-careful with respect to possible data corruption.


OK.
When storage is not accessible for a few seconds, it is more a matter of blocked I/O than of data corruption.


True, but we can know only ex post that the storage was just temporarily unavailable, can't we?


If no other host powers on the VM I think there is no risk of data corruption itself, or at least no more than when you have a physical server and for some reason the I/O operations to its physical disks (local or on a SAN) are blocked for some tens of seconds.


IMO, storage that is unresponsive for tens of seconds is something which should be uncommon and very alarming in every circumstance, especially for physical servers.

What I'm trying to say is that yes, there probably are ways to sidestep this behaviour, but I think this is the wrong direction and adds fragility rather than convenience to the system.


The host could always power off the VM itself, instead of leaving control to the underlying libvirt+qemu.

I see that by default the qemu-kvm process in my oVirt 4.3.6 is spawned for every disk with the options:
...,werror=stop,rerror=stop,...

Only for the ide channel of the CD device I have:
...,werror=report,rerror=report,readonly=on

and the manual page for qemu-kvm says:

           werror=action,rerror=action
               Specify which action to take on write and read errors. Valid actions are: "ignore"
               (ignore the error and try to continue), "stop" (pause QEMU), "report" (report the
               error to the guest), "enospc" (pause QEMU only if the host disk is full; report
               the error to the guest otherwise).  The default setting is werror=enospc and
               rerror=report.
 
So I think that if I want to modify this behavior in any way, I have to change the options so that "report" is used for both write and read errors on virtual disks.
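To check which actions a given drive actually ends up with, something like the following could help. It is only a minimal sketch: it splits one comma-separated drive option string and falls back to the defaults quoted from the manual page above. The function name and the example option strings are made up for illustration, and the sketch does not handle qemu's escaped commas (",,") inside values.

```python
# Defaults as quoted from the qemu-kvm manual page excerpt above.
DEFAULTS = {"werror": "enospc", "rerror": "report"}


def error_policies(drive_opts):
    """Return the effective werror/rerror actions for one drive option string.

    Hypothetical helper for illustration; it ignores escaped commas (",,").
    """
    policies = dict(DEFAULTS)
    for item in drive_opts.split(","):
        key, sep, value = item.partition("=")
        # Only override when the item is a key=value pair we care about.
        if sep and key in policies:
            policies[key] = value
    return policies


if __name__ == "__main__":
    # Made-up option strings, not copied from a real qemu-kvm command line.
    print(error_policies("format=raw,werror=stop,rerror=stop"))
    print(error_policies("readonly=on"))
```

The second call shows the case where neither option is given, so the manual page defaults apply.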


Yep. I don't remember what Engine allows. Worst case you can use a hook, but once again this is making things a bit more fragile.
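For reference, the core of such a hook could look roughly like the sketch below. It assumes the policy is expressed in the libvirt domain XML through the error_policy and rerror_policy attributes of each disk's driver element, and it only rewrites an XML string. The function name is hypothetical, and the wiring into an actual Vdsm hook (reading and writing the domain XML at VM start) is deliberately left out and untested here.

```python
import xml.etree.ElementTree as ET


def set_disk_error_policy(domxml, policy="report"):
    """Return domain XML with `policy` set for write and read errors on disks.

    Hypothetical helper: only <disk device='disk'> elements are touched, so
    CD-ROM and floppy devices keep whatever policy they already have.
    """
    root = ET.fromstring(domxml)
    for disk in root.findall("./devices/disk"):
        if disk.get("device") != "disk":
            continue  # leave cdrom/floppy untouched
        driver = disk.find("driver")
        if driver is None:
            continue  # nothing to annotate on this disk
        driver.set("error_policy", policy)   # write errors
        driver.set("rerror_policy", policy)  # read errors
    return ET.tostring(root, encoding="unicode")
```

Keeping the change limited to two attributes makes it easy to see what the hook did, but it still overrides a default that Vdsm relies on, so the fragility concern above applies.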


I'm only experimenting to see possible different options to manage "temporary" problems at storage level, which often resolve without manual actions in tens of seconds and are sometimes due to incorrect operations at levels managed by other teams (network, storage, etc.).


I think the best option is to improve the current behaviour: learn why Vdsm fails to unpause the VM and improve that.




-- 
Francesco Romani
Senior SW Eng., Virtualization R&D
Red Hat
IRC: fromani github: @fromanirh