[ovirt-devel] [VDSM] Correct implementation of virt-sysprep job

Tue Dec 6 21:12:30 UTC 2016

On 06/12/16 22:06 +0200, Arik Hadas wrote:
>Adam,

:)  You seem upset.  Sorry if I touched on a nerve...

>Just out of curiosity: when you write "v2v has promised" - what exactly do you
>mean? the tool? Richard Jones (the maintainer of virt-v2v)? Shahar and I that
>implemented the integration with virt-v2v? I'm not aware of such a promise by
>any of these options :)

Some history...

Earlier this year Nir, Francesco (added), Shahar, and I began
discussing the similarities between what storage needed to do with
external commands and what was designed specifically for v2v.  I am
not sure if you were involved in the project at that time.  The plan
was to create common infrastructure that could be extended to fit the
unique needs of the verticals.  The v2v code was going to be moved
over to the new infrastructure (see [1]) and the only thing that
stopped the initial patch was lack of a VMWare testing environment for
verification.

At that time storage refocused on developing verbs that used the new
infrastructure and have been maintaining its suitability for general
use.  Conversion of v2v -> Host Jobs is obviously a lower priority
item and much more difficult now due to the early missed opportunity.

>Anyway, let's say that you were given such a promise by someone and thus
>consider that mechanism to be deprecated - it doesn't really matter.

I may be biased but I think my opinion does matter.

>The current implementation doesn't well fit to this flow (it requires
>per-volume job, it creates leases that are not needed for template's disks,
>...) and with the "next-gen API" with proper support for virt flows not even
>being discussed with us (and iiuc also not with the infra team) yet, I don't
>understand what do you suggest except for some strong, though irrelevant,
>statements.

If you are willing to engage in a good-faith technical discussion I am
sure I can help you to understand.  These operations to storage demand
some form of locking protection.  If volume leases aren't appropriate then
perhaps we should use the VM Leases / xleases that Nir is finishing
off for 4.1 now.

>I suggest loud and clear to reuse (not to add dependencies, not to enhance, ..)
>an existing mechanism for a very similar flow of virt-v2v that works well and
>simple.

I clearly remember discussions involving infra (hello Oved), virt
(hola Michal), and storage where we decided that new APIs performing
async operations involving external commands should use the HostJobs
infrastructure instead of adding more information to Host Stats.
These were the "famous" entity polling meetings.

Of course plans can change but I have never been looped into any such
discussions.

>Do you "promise" to implement your "next gen API" for 4.1 as an alternative?

I guess we need the design first.

>On Tue, Dec 6, 2016 at 5:04 PM, Adam Litke <alitke at redhat.com> wrote:
>
>    On 05/12/16 11:17 +0200, Arik Hadas wrote:
>
>
>
>        On Mon, Dec 5, 2016 at 10:05 AM, Nir Soffer <nsoffer at redhat.com> wrote:
>
>           On Sun, Dec 4, 2016 at 8:50 PM, Shmuel Melamud <smelamud at redhat.com>
>        wrote:
>           >
>           > Hi!
>           >
>           > I'm currently working on integration of virt-sysprep into oVirt.
>           >
>           > Usually, if user creates a template from a regular VM, and then
>        creates
>           new VMs from this template, these new VMs inherit all configuration
>        of the
>           original VM, including SSH keys, UDEV rules, MAC addresses, system
>        ID,
>           hostname etc. It is unfortunate, because you cannot have two network
>           devices with the same MAC address in the same network, for example.
>           >
>           > To avoid this, user must clean all machine-specific configuration
>        from
>           the original VM before creating a template from it. You can do this
>           manually, but there is virt-sysprep utility that does this
>        automatically.
>           >
>           > Ideally, virt-sysprep should be seamlessly integrated into
>        template
>           creation process. But the first step is to create a simple button:
>        user
>           selects a VM, clicks the button and oVirt executes virt-sysprep on
>        the VM.
>           >
>           > virt-sysprep works directly on VM's filesystem. It accepts list of
>        all
>           disks of the VM as parameters:
>           >
>           > virt-sysprep -a disk1.img -a disk2.img -a disk3.img
>           >
>           > The architecture is as follows: command on the Engine side runs a
>        job on
>           VDSM side and tracks its success/failure. The job on VDSM side runs
>           virt-sysprep.
>           >
>           > The question is how to implement the job correctly?
>           >
>           > I thought about using storage jobs, but they are designed to work
>        only
>           with a single volume, correct?
>
>           New storage verbs are volume based. This make it easy to manage
>           them on the engine side, and will allow parallelizing volume
>        operations
>           on single or multiple hosts.
>
>           A storage volume job is using sanlock lease on the modified volume
>           and volume generation number. If a host running pending jobs becomes
>           non-responsive and cannot be fenced, we can detect the state of
>           the job, fence the job, and start the job on another host.
>
>           In the SPM task, if a host becomes non-responsive and cannot be
>           fenced, the whole setup is stuck, there is no way to perform any
>           storage operation.
>             > Is is possible to use them with operation that is performed on
>        multiple
>           volumes?
>           > Or, alternatively, is it possible to use some kind of 'VM jobs' -
>        that
>           work on VM at whole?
>
>           We can do:
>
>           1. Add jobs with multiple volumes leases - can make error handling
>        very
>               complex. How do tell a job state if you have multiple leases?
>        which
>               volume generation you use?
>
>           2. Use volume job using one of the volumes (the boot volume?). This
>        does
>               not protect the other volumes from modification but engine is
>           responsible
>               for this.
>
>           3. Use new "vm jobs", using a vm lease (should be available this
>        week
>           on master).
>               This protects a vm during sysprep from starting the vm.
>               We still need a generation to detect the job state, I think we
>        can
>           use the sanlock
>               lease generation for this.
>
>           I like the last option since sysprep is much like running a vm.
>             > How v2v solves this problem?
>
>           It does not.
>
>           v2v predates storage volume jobs. It does not use volume leases and
>           generation
>           and does have any way to recover if a host running v2v becomes
>           non-responsive
>           and cannot be fenced.
>
>           It also does not use the jobs framework and does not use a thread
>        pool for
>           v2v jobs, so it has no limit on the number of storage operations on
>        a host.
>         
>
>        Right, but let's be fair and present the benefits of v2v-jobs as well:
>        1. it is the simplest "infrastructure" in terms of LOC
>
>
>    It is also deprecated.  V2V has promised to adopt the richer Host Jobs
>    API in the future.
>
>
>        2. it is the most efficient mechanism in terms of interactions between
>        the
>        engine and VDSM (it doesn't require new verbs/call, the data is
>        attached to
>        VdsStats; probably the easiest mechanism to convert to events)
>
>
>    Engine is already polling the host jobs API so I am not sure I agree
>    with you here.
>
>
>        3. it is the most efficient implementation in terms of interaction with
>        the
>        database (no date is persisted into the database, no polling is done)
>
>
>    Again, we're already using the Host Jobs API.  We'll gain efficiency
>    by migrating away from the old v2v API and having a single, unified
>    approach (Host Jobs).
>
>
>        Currently we have 3 mechanisms to report jobs:
>        1. VM jobs - that is currently used for live-merge. This requires the
>        VM entity
>        to exist in VDSM, thus not suitable for virt-sysprep.
>
>
>    Correct, not appropriate for this application.
>
>
>        2. storage jobs - complicated infrastructure, targeted for recovering
>        from
>        failures to maintain storage consistency. Many of the things this
>        infrastructure knows to handle is irrelevant for virt-sysprep flow, and
>        the
>        fact that virt-sysprep is invoked on VM rather than particular disk
>        makes it
>        less suitable.
>
>
>    These are more appropriately called HostJobs and the have the
>    following semantics:
>    - They represent an external process running on a single host
>    - They are not persisted.  If the host or vdsm restarts, the job is
>      aborted
>    - They operate on entities.  Currently storage is the first adopter
>      of the infrastructure but virt was going to adopt these for the
>      next-gen API.  Entities can be volumes, storage domains, vms,
>      network interfaces, etc.
>    - Job status and progress is reported by the Host Jobs API.  If a job
>      is not present, then the underlying entitie(s) must be polled by
>      engine to determine the actual state.
>
>
>        3. V2V jobs - no mechanism is provided to resume failed jobs, no
>        leases, etc
>
>
>    This is the old infra upon which Host Jobs are built.  v2v has
>    promised to move to Host Jobs in the future so we should not add new
>    dependencies to this code.
>
>
>        I have some arguments for using V2V-like jobs [1]:
>        1. creating template from vm is rarely done - if host goes unresponsive
>        or any
>        other failure is detected we can just remove the template and report
>        the error
>
>
>    We can chose this error handling with Host Jobs as well.
>
>
>        2. the phase of virt-sysprep is, unlike typical storage operation,
>        short -
>        reducing the risk of failures during the process 
>
>
>    Reduced risk of failures is never an excuse to have lax error
>    handling.  The storage flavored host jobs provide tons of utilities
>    for making error handling standardized, easy to implement, and
>    correct.
>
>
>        3. during the operation the VM is down - by locking the VM/template and
>        its
>        disks on the engine side, we render leases-like mechanism redundant
>
>
>    Eventually we want to protect all operations on storage with sanlock
>    leases.  This is safer and allows for a more distributed approach to
>    management.  Again, the use of leases correctly in host jobs requires
>    about 5 lines of code.  The benefits of standardization far outweigh
>    any perceived simplification resulting from omitting it.
>
>
>        4. in the worst case - the disk will not be corrupted (only some of the
>        data
>        might be removed).
>
>
>    Again, the way engine chooses to handle job failures is independent of
>    the mechanism.  Let's separate that from this discussion.
>
>
>        So I think that the mechanism for storage jobs is an over-kill for this
>        case.
>        We can keep it simple by generalise the V2V-job for other virt-tools
>        jobs, like
>        virt-sysprep.
>
>
>    I think we ought to standardize on the Host Jobs framework where we
>    can collaborate on unit tests, standardized locking and error
>    handling, abort logic, etc.  When v2v moves to host jobs then we will
>    have a unified method of handling ephemeral jobs that are tied to
>    entities.
>
>    --
>    Adam Litke
>
>

-- 
Adam Litke