[ovirt-devel] [vdsm] VM recovery now depends on HSM

Fri Jul 11 15:44:31 UTC 2014

----- Original Message -----
> From: "Michal Skrivanek" <michal.skrivanek at redhat.com>
> To: "Adam Litke" <alitke at redhat.com>
> Cc: devel at ovirt.org, "Nir Soffer" <nsoffer at redhat.com>, "Federico Simoncelli" <fsimonce at redhat.com>
> Sent: Thursday, July 10, 2014 8:40:58 AM
> Subject: Re: [ovirt-devel] [vdsm] VM recovery now depends on HSM
> 
> 
> On Jul 9, 2014, at 15:38 , Nir Soffer <nsoffer at redhat.com> wrote:
> 
> > ----- Original Message -----
> >> From: "Adam Litke" <alitke at redhat.com>
> >> To: "Michal Skrivanek" <michal.skrivanek at redhat.com>
> >> Cc: devel at ovirt.org
> >> Sent: Wednesday, July 9, 2014 4:19:09 PM
> >> Subject: Re: [ovirt-devel] [vdsm] VM recovery now depends on HSM
> >> 
> >> On 09/07/14 13:11 +0200, Michal Skrivanek wrote:
> >>> 
> >>> On Jul 8, 2014, at 22:36 , Adam Litke <alitke at redhat.com> wrote:
> >>> 
> >>>> Hi all,
> >>>> 
> >>>> As part of the new live merge feature, when vdsm starts and has to
> >>>> recover existing VMs, it calls VM._syncVolumeChain to ensure that
> >>>> vdsm's view of the volume chain matches libvirt's.  This involves two
> >>>> kinds of operations: 1) sync VM object, 2) sync underlying storage
> >>>> metadata via HSM.
> >>>> 
> >>>> This means that HSM must be up (and the storage domain(s) that the VM
> >>>> is using must be accessible).  When testing some rather eccentric error
> >>>> flows, I am finding this to not always be the case.
> >>>> 
> >>>> Is there a way to have VM recovery wait on HSM to come up?  How should
> >>>> we respond if a required storage domain cannot be accessed?  Is there
> >>>> a mechanism in vdsm to schedule an operation to be retried at a later
> >>>> time?  Perhaps I could just schedule the sync and it could be retried
> >>>> until the required resources are available.
> >>> 
> >>> I briefly discussed with Federico some time ago that IMHO the
> >>> syncVolumeChain needs to be changed. It must not be part of VM's create
> >>> flow, as I expect it to be quite a bottleneck in big-scale environments (it
> >>> is now in fact executing not only on recovery but on all 4 create flows!).
> >>> I don't know how yet, but we need to find a different way. Now you just
> >>> added yet another reason.
> >>> 
> >>> So…I too ask for more insights:-)
> >> 
> >> Sure, so... We switched to running syncVolumeChain at all times to
> >> cover a very rare scenario:
> >> 
> >> 1. VM is running on host A
> >> 2. User initiates Live Merge on VM
> >> 3. Host A experiences a catastrophic hardware failure before engine
> >> can determine if the merge succeeded or failed
> >> 4. VM is restarted on Host B
> >> 
> >> Since (in this case) the host cannot know if a live merge was in
> >> progress on the previous host, it needs to always check.
> >> 
> >> 
> >> Some ideas to mitigate:
> >> 1. When engine recreates a VM on a new host and a Live Merge was in
> >> progress, engine could call a verb to ask the host to synchronize the
> >> volume chain.  This way, it only happens when engine knows it's needed
> >> and engine can be sure that the required resources (storage
> >> connections and domains) are present.
> > 
> > This seems like the right approach.
> 
> +1
> I like the "only when needed", since indeed we can assume the scenario is
> unlikely to happen most of the time (but very real indeed)

I agree on the assumptions but I disagree on the implementation (new API).
The verb that should be called to trigger the synchronization is the very
same live merge command that was called on host A.
The operation will either resume (fix the inconsistency) or just finish
right away successfully as there was nothing to be done.
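
A minimal sketch of that behaviour, to make the "resume or finish right away"
semantics concrete. Everything here is hypothetical: merge(), the chain lists
and the status strings are illustrative names, not vdsm or engine APIs, and
the real merge would of course go through libvirt and HSM rather than plain
list manipulation.

```python
# Hypothetical sketch only: none of these names are real vdsm APIs.

def merge(recorded_chain, actual_chain, top_volume):
    """Idempotent live-merge request.

    recorded_chain: volume IDs as recorded in storage metadata (assumed input).
    actual_chain:   volume IDs the running VM really uses (assumed input).
    top_volume:     the volume the merge is meant to remove from the chain.

    Returns (status, chain), where chain is the metadata chain afterwards.
    """
    if top_volume in actual_chain:
        # The merge never completed: (re)run it. The actual merge is
        # simulated here by dropping the volume from the chain.
        return "merged", [v for v in actual_chain if v != top_volume]
    if top_volume in recorded_chain:
        # The merge completed but the metadata was never updated:
        # fix the inconsistency by adopting the actual chain.
        return "synced", list(actual_chain)
    # Both views already agree that the volume is gone: nothing to do.
    return "noop", list(recorded_chain)


if __name__ == "__main__":
    # Engine re-issues the same request on host B after host A died.
    print(merge(["base", "snap1", "snap2"], ["base", "snap2"], "snap1"))
```

Because the same call covers all three cases, engine can simply re-send the
original merge request without knowing how far host A got.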

Let's keep in mind that fixing this discrepancy needs to be added to the
list of things to verify when we import a data storage domain in order to
sanitize the domain.
(There is no such list anywhere yet, I know, but there should be, because
this is not the only thing that requires it.)

-- 
Federico