On Dec 7, 2016 16:00, "Nir Soffer" <nsoffer(a)redhat.com> wrote:
On Wed, Dec 7, 2016 at 10:17 AM, Oved Ourfali <oourfali(a)redhat.com> wrote:
> On Tue, Dec 6, 2016 at 11:12 PM, Adam Litke <alitke(a)redhat.com> wrote:
>
>> On 06/12/16 22:06 +0200, Arik Hadas wrote:
>>
>>> Adam,
>
>
>> :) You seem upset. Sorry if I touched on a nerve...
>
>>> Just out of curiosity: when you write "v2v has promised" - what exactly
>>> do you mean? the tool? Richard Jones (the maintainer of virt-v2v)?
>>> Shahar and I, who implemented the integration with virt-v2v? I'm not
>>> aware of such a promise by any of these options :)
>
>
>> Some history...
>
>> Earlier this year Nir, Francesco (added), Shahar, and I began
>> discussing the similarities between what storage needed to do with
>> external commands and what was designed specifically for v2v. I am
>> not sure if you were involved in the project at that time. The plan
>> was to create common infrastructure that could be extended to fit the
>> unique needs of the verticals. The v2v code was going to be moved
>> over to the new infrastructure (see [1]) and the only thing that
>> stopped the initial patch was lack of a VMWare testing environment for
>> verification.
>
>> At that time storage refocused on developing verbs that used the new
>> infrastructure and have been maintaining its suitability for general
>> use. Conversion of v2v -> Host Jobs is obviously a lower priority
>> item and much more difficult now due to the early missed opportunity.
>
>>> Anyway, let's say that you were given such a promise by someone and thus
>>> consider that mechanism to be deprecated - it doesn't really matter.
>
>
>> I may be biased but I think my opinion does matter.
>
>>> The current implementation doesn't fit this flow well (it requires a
>>> per-volume job, it creates leases that are not needed for template's
>>> disks, ...) and with the "next-gen API" with proper support for virt flows
>>> not even being discussed with us (and iiuc also not with the infra team)
>>> yet, I don't understand what you suggest except for some strong, though
>>> irrelevant, statements.
>
>
>> If you are willing to engage in a good-faith technical discussion I am
>> sure I can help you to understand. These operations on storage demand
>> some form of locking protection. If volume leases aren't appropriate then
>> perhaps we should use the VM Leases / xleases that Nir is finishing off
>> for 4.1 now.
>
>>> I suggest loud and clear to reuse (not to add dependencies, not to
>>> enhance, ...) an existing mechanism for a very similar flow of virt-v2v
>>> that works well and is simple.
>
>
>> I clearly remember discussions involving infra (hello Oved), virt
>> (hola Michal), and storage where we decided that new APIs performing
>> async operations involving external commands should use the HostJobs
>> infrastructure instead of adding more information to Host Stats.
>> These were the "famous" entity polling meetings.
We discussed these issues behind closed doors, not in the public mailing
list, so it is not surprising that people do not know about the agreements
we had.
The core team was there. So it is surprising.
> >
> >> Of course plans can change but I have never been looped into any such
> >> discussions.
> >
>
> >
> Well, I think that when someone builds a good infrastructure he first needs
> to talk to all consumers and make sure it fits.
> In this case it seems like most work was done to fit the storage use-case,
> and now you check whether it can fit others as well...
> The jobs framework is generic and can be used for any subsystem,
> there is nothing related to storage about it. But modifying disks *is*
> a storage operation, even if someone from the virt team worked on it.
>
> V2v is also a storage operation - if we compare it with copying disks:
> - we create a new volume that nobody is using yet
> - if the operation fails, the disk must be in illegal state
> - if the operation fails we delete the disks
> - if the operation succeeds the volume must be legal
> - we need to limit the number of operations on a host
> - we need to detect the job state if the host becomes non-responsive
> - we may want to fence the job if the host becomes non-responsive;
>   in volume jobs, we can increment the volume generation and run
>   the same job on another host.
> - we want to take a lease on storage to ensure that other hosts cannot
>   access the same entity, or that the job will fail if someone else is
>   using this entity
> - we want to take a lease on storage, ensuring that a job cannot get
>   stuck for a long time - sanlock kills the owner of a lease when storage
>   becomes inaccessible.
> - we want to report progress
>
> sysprep is less risky because the operation is faster, but on storage even
> a fast operation can get stuck for minutes.
>
> We need to agree on a standard way to do such operations that is safe
> enough and can be managed on the engine side.
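
For concreteness, here is a rough sketch (Python, with hypothetical names -
this is not the existing vdsm jobs API) of what such a standard host job
could look like, combining a per-host concurrency limit, an exclusive lease
and a generation bump on success:

    # Hypothetical sketch only: HostJob, entity.lease() and bump_generation()
    # are illustrative names, not the actual vdsm API.
    import threading

    _MAX_JOBS = threading.BoundedSemaphore(10)  # limit concurrent jobs per host


    class HostJob(object):
        """A host-side job protected by a storage lease and a generation."""

        def __init__(self, job_id, entity):
            self.id = job_id
            self.entity = entity          # e.g. a volume or a vm lease object
            self.progress = 0
            self.status = "pending"

        def run(self):
            with _MAX_JOBS:                     # limit operations on a host
                with self.entity.lease():       # exclusive, sanlock-style lease
                    self.status = "running"
                    try:
                        self._execute()         # subclass does the real work
                    except Exception:
                        self.status = "failed"  # engine may then delete disks
                        raise
                    else:
                        self.entity.bump_generation()  # mark success
                        self.status = "done"

        def _execute(self):
            raise NotImplementedError
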
> > IMO it makes much more sense to use events where possible (and you've
> > promised to use those as well, but I don't see you doing that...). v2v
> > should use events for sure, and they have promised to do that in the past,
> > instead of using the v2v jobs. The reason events weren't used originally
> > with the v2v feature was that it was too risky and the events
> > infrastructure was added too late in the game.
> Events are not replacing the need for managing jobs on the vdsm side.
> Engine must have a way to query the current jobs before subscribing
> to events from these jobs, otherwise you will lose events and engine
> will never notice a completed job after network errors.
>
> The jobs framework supports events, see
> https://gerrit.ovirt.org/67118
>
> We are waiting for review from the infra team, maybe you can
> get someone to review this?
It would have been great to review the design for this before it reaches
gerrit.
Anyway, I get a permissions error when opening it. Any clue why?
> Nir
>
>
>
>>
> >>> Do you "promise" to implement your "next gen API" for 4.1 as an
> >>> alternative?
> >
> >
> >> I guess we need the design first.
> >
> >
> >>> On Tue, Dec 6, 2016 at 5:04 PM, Adam Litke <alitke(a)redhat.com> wrote:
> >>
> >>> On 05/12/16 11:17 +0200, Arik Hadas wrote:
> >>
> >>
> >>
>
> >>> On Mon, Dec 5, 2016 at 10:05 AM, Nir Soffer <nsoffer(a)redhat.com> wrote:
> >>
> >>> On Sun, Dec 4, 2016 at 8:50 PM, Shmuel Melamud <smelamud(a)redhat.com>
> >>> wrote:
> >>>
> >>> > Hi!
> >>>
> >>> > I'm currently working on integration of virt-sysprep into oVirt.
> >>>
> >>> > Usually, if a user creates a template from a regular VM, and then
> >>> > creates new VMs from this template, these new VMs inherit all the
> >>> > configuration of the original VM, including SSH keys, UDEV rules,
> >>> > MAC addresses, system ID, hostname etc. It is unfortunate, because
> >>> > you cannot have two network devices with the same MAC address in the
> >>> > same network, for example.
> >>>
> >>> > To avoid this, the user must clean all machine-specific configuration
> >>> > from the original VM before creating a template from it. You can do
> >>> > this manually, but there is the virt-sysprep utility that does this
> >>> > automatically.
> >>>
> >>> > Ideally, virt-sysprep should be seamlessly integrated into the
> >>> > template creation process. But the first step is to create a simple
> >>> > button: the user selects a VM, clicks the button and oVirt executes
> >>> > virt-sysprep on the VM.
> >>>
> >>> > virt-sysprep works directly on the VM's filesystem. It accepts the
> >>> > list of all disks of the VM as parameters:
> >>> >
> >>> > virt-sysprep -a disk1.img -a disk2.img -a disk3.img
> >>>
> >>> > The architecture is as follows: a command on the Engine side runs a
> >>> > job on the VDSM side and tracks its success/failure. The job on the
> >>> > VDSM side runs virt-sysprep.
> >>>
> >>> > The question is how to implement the job correctly?
> >>>
> >>> > I thought about using storage jobs, but they are designed to work
> >>> > only with a single volume, correct?
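
For illustration, the host-side body of such a job could be as small as the
following sketch (how the disk paths are obtained and how errors are handled
is left out; nothing here is the real vdsm API):

    # Sketch only: run virt-sysprep over all disks of a VM from a host-side job.
    import subprocess


    def sysprep_vm(disk_paths):
        """Run virt-sysprep once, covering every disk of the VM."""
        cmd = ["virt-sysprep"]
        for path in disk_paths:
            cmd.extend(["-a", path])    # one -a option per disk image
        # virt-sysprep must run while the VM is down; the caller is
        # responsible for guaranteeing that (engine-side locking or a vm lease).
        subprocess.check_call(cmd)
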
> >>
> >>> New storage verbs are volume based. This makes it easy to manage
> >>> them on the engine side, and will allow parallelizing volume
> >>> operations on single or multiple hosts.
> >>
> >>> A storage volume job is using a sanlock lease on the modified volume
> >>> and the volume generation number. If a host running pending jobs
> >>> becomes non-responsive and cannot be fenced, we can detect the state
> >>> of the job, fence the job, and start the job on another host.
> >>
> >>> In the SPM task, if a host becomes non-responsive and cannot be
> >>> fenced, the whole setup is stuck; there is no way to perform any
> >>> storage operation.
> >>> > Is it possible to use them with an operation that is performed on
> >>> > multiple volumes?
> >>> > Or, alternatively, is it possible to use some kind of 'VM jobs' that
> >>> > work on the VM as a whole?
> >>
> >>> We can do:
> >>
> >>> 1. Add jobs with multiple volume leases - this can make error handling
> >>>    very complex. How do you tell the job state if you have multiple
> >>>    leases? Which volume generation do you use?
> >>
> >>> 2. Use a volume job using one of the volumes (the boot volume?). This
> >>>    does not protect the other volumes from modification but engine is
> >>>    responsible for this.
> >>
> >>> 3. Use new "vm jobs", using a vm lease (should be available this week
> >>>    on master). This protects the vm from being started during sysprep.
> >>>    We still need a generation to detect the job state; I think we can
> >>>    use the sanlock lease generation for this.
> >>
> >>> I like the last option since sysprep is much like running a vm.
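
If I follow option 3 correctly, the sysprep job would be shaped roughly like
this (a sketch only - Lease and acquire_vm_lease() are hypothetical stand-ins,
since the vm lease support is only now landing on master):

    # Sketch of option 3: a "vm job" that holds the vm lease while sysprep runs.
    # Lease and acquire_vm_lease() are placeholders, not real vdsm code.
    from contextlib import contextmanager


    class Lease(object):
        def __init__(self, vm_id, generation):
            self.vm_id = vm_id
            self.generation = generation  # used to detect stale/failed jobs


    @contextmanager
    def acquire_vm_lease(vm_id):
        # Placeholder: acquire the sanlock-backed vm lease here. Holding it
        # prevents the VM from being started anywhere while sysprep runs.
        lease = Lease(vm_id, generation=1)
        try:
            yield lease
        finally:
            pass  # Placeholder: release the lease here.


    def run_sysprep_job(vm_id, disk_paths):
        with acquire_vm_lease(vm_id) as lease:
            sysprep_vm(disk_paths)        # see the virt-sysprep sketch above
            return lease.generation
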
> >>> > How does v2v solve this problem?
> >>
> >>> It does not.
> >>
> >>> v2v predates storage volume jobs. It does not use volume leases and
> >>> generation and does not have any way to recover if a host running v2v
> >>> becomes non-responsive and cannot be fenced.
> >>
> >>> It also does not use the jobs framework and does not use a thread
> >>> pool for v2v jobs, so it has no limit on the number of storage
> >>> operations on a host.
> >>
> >>
> >>> Right, but let's be fair and present the benefits of v2v-jobs as well:
> >>> 1. it is the simplest "infrastructure" in terms of LOC
> >>
> >>
> >>> It is also deprecated. V2V has promised to adopt the richer Host Jobs
> >>> API in the future.
> >>
> >>
> >>> 2. it is the most efficient mechanism in terms of interactions between
> >>>    the engine and VDSM (it doesn't require new verbs/calls, the data is
> >>>    attached to VdsStats; probably the easiest mechanism to convert to
> >>>    events)
> >>
> >>
> >>> Engine is already polling the host jobs API so I am not sure I agree
> >>> with you here.
> >>
> >>
> >>> 3. it is the most efficient implementation in terms of interaction with
> >>>    the database (no data is persisted into the database, no polling is
> >>>    done)
> >>
> >>
> >>> Again, we're already using the Host Jobs API. We'll gain efficiency
> >>> by migrating away from the old v2v API and having a single, unified
> >>> approach (Host Jobs).
> >>
> >>
> >>> Currently we have 3 mechanisms to report jobs:
> >>> 1. VM jobs - currently used for live-merge. This requires the VM entity
> >>>    to exist in VDSM, thus not suitable for virt-sysprep.
> >>
> >>
> >>> Correct, not appropriate for this application.
> >>
> >>
> >>> 2. storage jobs - complicated infrastructure, targeted at recovering
> >>>    from failures to maintain storage consistency. Many of the things
> >>>    this infrastructure knows how to handle are irrelevant to the
> >>>    virt-sysprep flow, and the fact that virt-sysprep is invoked on a VM
> >>>    rather than on a particular disk makes it less suitable.
> >>
> >>
> >>> These are more appropriately called HostJobs and they have the
> >>> following semantics:
> >>> - They represent an external process running on a single host
> >>> - They are not persisted. If the host or vdsm restarts, the job is
> >>>   aborted
> >>> - They operate on entities. Currently storage is the first adopter
> >>>   of the infrastructure but virt was going to adopt these for the
> >>>   next-gen API. Entities can be volumes, storage domains, vms,
> >>>   network interfaces, etc.
> >>> - Job status and progress is reported by the Host Jobs API. If a job
> >>>   is not present, then the underlying entity (or entities) must be
> >>>   polled by engine to determine the actual state.
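
To make that last point concrete, the engine-side rule is roughly the
following (helper names are hypothetical and engine itself is Java; this only
shows the shape of the algorithm in Python):

    # Sketch of the engine-side rule for Host Jobs (hypothetical helper names).
    def resolve_job_state(job_id, get_host_jobs, poll_entity):
        """Decide the outcome of a host job after polling the host.

        get_host_jobs() returns the jobs reported by the Host Jobs API as a
        dict of {job_id: {"status": ..., "progress": ...}}; poll_entity()
        inspects the underlying entity (e.g. volume metadata and generation)
        when the job is gone.
        """
        jobs = get_host_jobs()
        if job_id in jobs:
            # The job is still known to the host: trust its reported status.
            return jobs[job_id]
        # Job not present (vdsm restarted, job finished and was cleaned up,
        # ...): fall back to polling the entity itself to learn what happened.
        return poll_entity()
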
> >>
> >>
> >>> 3. V2V jobs - no mechanism is provided to resume failed jobs, no
> >>>    leases, etc.
> >>
> >>
> >>> This is the old infra upon which Host Jobs are built. v2v has promised
> >>> to move to Host Jobs in the future so we should not add new
> >>> dependencies to this code.
> >>
> >>
> >>> I have some arguments for using V2V-like jobs [1]:
> >>> 1. creating a template from a vm is rarely done - if the host goes
> >>>    unresponsive or any other failure is detected we can just remove the
> >>>    template and report the error
> >>
> >>
> >>> We can choose this error handling with Host Jobs as well.
> >>
> >>
> >>> 2. the phase of virt-sysprep is, unlike a typical storage operation,
> >>>    short - reducing the risk of failures during the process
> >>
> >>
> >>> Reduced risk of failures is never an excuse to have lax error
> >>> handling. The storage flavored host jobs provide tons of utilities
> >>> for making error handling standardized, easy to implement, and
> >>> correct.
> >>
> >>
> >>> 3. during the operation the VM is down - by locking the VM/template
> >>>    and its disks on the engine side, we render a leases-like mechanism
> >>>    redundant
> >>
> >>
> >>> Eventually we want to protect all operations on storage with sanlock
> >>> leases. This is safer and allows for a more distributed approach to
> >>> management. Again, the correct use of leases in host jobs requires
> >>> about 5 lines of code. The benefits of standardization far outweigh
> >>> any perceived simplification resulting from omitting it.
> >>
> >>
> >>> 4. in the worst case - the disk will not be corrupted (only some of
> >>>    the data might be removed).
> >>
> >>
> >>> Again, the way engine chooses to handle job failures is independent of
> >>> the mechanism. Let's separate that from this discussion.
> >>
> >>
> >>> So I think that the mechanism for storage jobs is an overkill for this
> >>> case. We can keep it simple by generalising the V2V-job for other
> >>> virt-tools jobs, like virt-sysprep.
> >>
> >>
> >>> I think we ought to standardize on the Host Jobs framework where we
> >>> can collaborate on unit tests, standardized locking and error
> >>> handling, abort logic, etc. When v2v moves to host jobs then we will
> >>> have a unified method of handling ephemeral jobs that are tied to
> >>> entities.
> >>
> >>> --
> >>> Adam Litke
> >>
> >>
> >
>
>> --
>> Adam Litke
>
>