[ovirt-users] Re: owner of vm paused/unpaused operation

9 Oct 2019

On 10/8/19 4:06 PM, Gianluca Cecchi wrote:

Hi Gianluca
...
Hello,
I'm doing some tests related to storage latency or problems manually 
created to debug and manage reactions of hosts and VMs.
What is the subsystem/process/daemon responsible for pausing a VM when 
problems arise on storage for the host where the VM is running?
It's Vdsm itself.
...
How is the timeout for putting the VM in pause mode determined?
The VM is paused as soon as libvirt, through QEMU, reports IOError, to 
avoid data corruption. When libvirt reports this error depends largely 
on the timeouts set in the storage configuration, which is done at host 
level, using system tools (i.e. it is not a Vdsm tunable).
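
As one illustration of a host-level setting, the sketch below reads the 
SCSI command timeout that the Linux kernel exposes in sysfs for a block 
device. Which timeout actually matters is an assumption here: depending 
on the storage type, other layers (multipath, iSCSI, NFS mount options) 
may dominate. The function name and the device name "sda" are 
placeholders, not something taken from Vdsm.

```python
from pathlib import Path


def scsi_command_timeout(device, sysfs_root="/sys/block"):
    """Return the SCSI command timeout (seconds) for a block device.

    Returns None when the sysfs attribute does not exist, for example
    on non-SCSI devices. The sysfs_root parameter exists only to make
    the function easy to try against a test directory.
    """
    attr = Path(sysfs_root) / device / "device" / "timeout"
    try:
        return int(attr.read_text().strip())
    except FileNotFoundError:
        return None


if __name__ == "__main__":
    # "sda" is a placeholder; substitute a device backing the storage domain.
    print(scsi_command_timeout("sda"))
```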
...
Sometimes I see after clearing the problems that the VM is 
automatically un-paused, sometimes not: how is this managed?
It depends on the error condition that happens. Vdsm tries to recover 
automatically when it is safe to do so. When in doubt, Vdsm always plays 
it safe wrt user data.
...
Are there any counters so that if the VM has been paused and 
problems are not solved in a certain timeframe the unpause can be done 
only manually by the sysadmin?
AFAIR no, because if Vdsm can't be sure, the only real option is to let 
the sysadmin check and decide.
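
For the manual check, a minimal sketch of how a sysadmin could ask 
libvirt why a VM is paused, using the libvirt Python bindings. This is 
not how Vdsm implements it; the function names are hypothetical, and the 
numeric constants are assumed to mirror libvirt's virDomainState and 
virDomainPausedReason enums.

```python
# Assumed to match libvirt's VIR_DOMAIN_PAUSED and
# VIR_DOMAIN_PAUSED_IOERROR enum values.
VIR_DOMAIN_PAUSED = 3
VIR_DOMAIN_PAUSED_IOERROR = 5


def paused_on_io_error(state, reason):
    """True if a (state, reason) pair means 'paused because of an I/O error'."""
    return state == VIR_DOMAIN_PAUSED and reason == VIR_DOMAIN_PAUSED_IOERROR


def check_domain(name, uri="qemu:///system"):
    """Query libvirt read-only and report whether the VM is paused on I/O error."""
    import libvirt  # imported lazily so the helper above works without it

    conn = libvirt.openReadOnly(uri)
    try:
        # virDomain.state() returns a [state, reason] pair.
        state, reason = conn.lookupByName(name).state()
        return paused_on_io_error(state, reason)
    finally:
        conn.close()
```

On the host running the VM, check_domain("myvm") (with "myvm" as a 
placeholder name) would then tell whether the pause was caused by an 
I/O error or by something else, such as a user request.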

Bests,

-- 
Francesco Romani
Senior SW Eng., Virtualization R&D
Red Hat
IRC: fromani github: @fromanirh