On Thu, Oct 10, 2019 at 9:56 AM Francesco Romani <fromani(a)redhat.com> wrote:
The only way Vdsm will not pause the VM is if libvirt+qemu never reports
any ioerror, which is something I'm not sure is possible and that I'd never
recommend anyway.
Vdsm always tries hard to be super-careful with respect to possible data
corruption.
OK.
When storage is inaccessible for a few seconds, it is more a matter of
blocked I/O than of data corruption.
If no other host powers on the VM I think there is no risk of data
corruption itself, or at least no more than when you have a physical server
and for some reason the I/O operations to its physical disks (local or on a
SAN) are blocked for some tens of seconds.
The host could always power off the VM itself, instead of leaving
control to the underlying libvirt+qemu.
I see that by default, in my oVirt 4.3.6, the qemu-kvm process is started
with these options for every disk:
...,werror=stop,rerror=stop,...
Only the IDE channel of the CD device has different options:
...,werror=report,rerror=report,readonly=on
and the manual page for qemu-kvm says:
werror=action,rerror=action
    Specify which action to take on write and read errors. Valid
    actions are: "ignore" (ignore the error and try to continue),
    "stop" (pause QEMU), "report" (report the error to the guest),
    "enospc" (pause QEMU only if the host disk is full; report the
    error to the guest otherwise). The default setting is
    werror=enospc and rerror=report.
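To make the manual page text concrete, here is a minimal Python sketch that reads the werror/rerror values out of a single drive option string and falls back to the documented defaults when they are absent. The function name error_policy and the shortened option strings are hypothetical and only for illustration; real command lines carry many more fields.

```python
# Meaning of each action, paraphrased from the qemu-kvm manual page excerpt.
ACTIONS = {
    "ignore": "ignore the error and try to continue",
    "stop": "pause QEMU",
    "report": "report the error to the guest",
    "enospc": "pause QEMU only if the host disk is full; report otherwise",
}

# Defaults as stated in the manual page excerpt.
DEFAULTS = {"werror": "enospc", "rerror": "report"}


def error_policy(drive_opts):
    """Return the effective werror/rerror actions for one drive option string."""
    policy = dict(DEFAULTS)
    for item in drive_opts.split(","):
        key, sep, value = item.partition("=")
        # Only key=value pairs for werror/rerror override the defaults.
        if sep and key in policy:
            policy[key] = value
    return policy


if __name__ == "__main__":
    # Hypothetical, shortened option strings (assumption, not real output).
    samples = (
        "if=none,werror=stop,rerror=stop",
        "if=none,werror=report,rerror=report,readonly=on",
        "if=none",
    )
    for opts in samples:
        for key, action in error_policy(opts).items():
            print(f"{opts}: {key}={action} ({ACTIONS.get(action, 'unknown')})")
```

The third sample has neither option set, so it shows what QEMU would do on its own if nothing above it chose a policy.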
So I think that if I want to modify this behavior at all, I have to change
the options so that "report" is used for both write and read errors on
virtual disks.
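As a rough sketch of what such a change could look like one level up, the snippet below rewrites a libvirt domain XML so that every plain disk uses "report" for both write and read errors. It assumes the libvirt disk driver attributes error_policy and rerror_policy are what end up as werror/rerror on the qemu side. The function name set_disk_error_policy and the sample XML are hypothetical, and how (or whether) Vdsm would accept such a modified definition is not covered here.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily shortened domain XML (assumption, for illustration).
SAMPLE = """<domain type='kvm'>
  <devices>
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw' error_policy='stop'/>
    </disk>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw' error_policy='report'/>
    </disk>
  </devices>
</domain>"""


def set_disk_error_policy(domain_xml, policy="report"):
    """Set write and read error policy on every <disk device='disk'> driver."""
    root = ET.fromstring(domain_xml)
    for disk in root.findall("./devices/disk"):
        # libvirt treats a missing device attribute as a plain disk.
        if disk.get("device", "disk") != "disk":
            continue  # leave CD-ROM and other device types untouched
        driver = disk.find("driver")
        if driver is None:
            continue
        driver.set("error_policy", policy)   # write errors
        driver.set("rerror_policy", policy)  # read errors
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(set_disk_error_policy(SAMPLE))
```

This only shows the shape of the change; it does not address the data-safety concerns raised earlier in the thread.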
I'm only experimenting to see which options are available for handling
"temporary" problems at the storage level. These often resolve within tens
of seconds without manual action, and are sometimes caused by incorrect
operations at levels managed by other teams (network, storage, etc.).
In these circumstances, experience has taught me that it is better to "do
nothing and wait" than to try to take an action that will fail anyway until
the "external" problem has been solved (automatically, thanks to logic
outside oVirt's control, or manually).
It would be nice to "mimic" the behavior of vSphere in this respect, and I'm
investigating possible ways to achieve it...
I hope this clarifies the reasoning behind my actions a bit...
Thanks,
Gianluca