On Tue, Nov 29, 2016 at 5:09 PM, Arik Hadas <ahadas(a)redhat.com> wrote:
----- Original Message -----
> On Tue, Nov 29, 2016 at 4:30 PM, Arik Hadas <ahadas(a)redhat.com> wrote:
> >
> >
> > ----- Original Message -----
> >> On Tue, Nov 29, 2016 at 2:21 PM, Arik Hadas <ahadas(a)redhat.com> wrote:
> >> >
> >> >
> >> > ----- Original Message -----
> >> >> On Wed, Nov 23, 2016 at 07:59:40AM -0500, Arik Hadas wrote:
> >> >> > Hi All,
> >> >> >
> >> >> > We are working on something that is expected to have a big impact,
> >> >> > hence this heads-up.
> >> >> > First, we want you to be aware of this change and provide your
> >> >> > feedback to make it as good as possible.
> >> >> > Second, until the proposed mechanism is fully merged there will be a
> >> >> > chase to cover all features unless new features are also implemented
> >> >> > with the new mechanism. So please, if you are working on something
> >> >> > that adds/changes something in the Libvirt's domain xml, do it with
> >> >> > this new mechanism as well (first version would be merged soon).
> >> >> >
> >> >> > * Goal
> >> >> > Creating Libvirt XML in the engine rather than in VDSM.
> >> >> > ** Today's flow
> >> >> > Engine: VM business entity -> VM properties map
> >> >> > VDSM: VM properties map -> Libvirt XML
> >> >> > ** Desired flow
> >> >> > Engine: VM business entity -> Libvirt XML
> >> >> >
> >> >> > * Potential Benefits
> >> >> > 1. Reduce the number of conversions from 2 to 1, reducing chances for
> >> >> > mistakes in the process.
> >> >> > 2. Reduce the amount of code in VDSM.
> >> >> > 3. Make VM related changes easier - today many of these changes need
> >> >> > to be reviewed in 2 projects, this will eliminate the one that tends
> >> >> > to take longer.
> >> >> > 4. Prevent shortcuts in the form of VDSM-only changes that should be
> >> >> > better reflected in the engine.
> >> >> > 5. Not to re-generate the XML on each rerun attempt of VM
> >> >> > run/migration.
> >> >> > 6. Future - not to re-generate the XML on each attempt to auto-start
> >> >> > HA VM when using vm-leases (need to make sure we're using the
> >> >> > up-to-date VM configuration though).
> >> >> > 7. We already found improvements and cleanups that could be made
> >> >> > while touching this area (e.g., remove the boot order from devices in
> >> >> > the database).
> >> >> >
> >> >> > * Challenges
> >> >> > 1. Not to move host-specific information to the engine. For example,
> >> >> > path to storage domain or sockets of channels.
> >> >> > The solution is to use place-holders that will be replaced by VDSM.
> >> >> > 2. Backward compatibility.
> >> >> > 3. The more challenging part is the other direction - that will be
> >> >> > the next phase.
> >> >> >
> >> >> > * Status
> >> >> > As a first step, we began with producing the Libvirt XML in the
> >> >> > engine by converting the VM properties map to XML in the engine [1]
> >> >> > And using the XML that is received as an input in VDSM [2]
> >> >> >
> >> >> >
> >> >> > [1] https://gerrit.ovirt.org/#/c/64473/
> >> >> > [2] https://gerrit.ovirt.org/#/c/65182/
> >> >>
> >> >> I should start by saying that I love libvirt's domxml standard. Unlike
> >> >> Vdsm's API, it is a real *standard* for defining VMs. In this regard,
> >> >> you are suggesting a positive step.
> >> >>
> >> >> However, Engine is much more complex than Vdsm. It is also our
> >> >> single-point-of-failure, and where CPU is the most scarce. I am worried
> >> >> that in the foreseeable future it would only make Engine bigger,
> >> >> without reducing the size and complexity of Vdsm.
> >> >>
> >> >> Before taking this move, we must map what Vdsm does, because that logic
> >> >> would have to be copied into Engine. A few things come to mind:
> >> >>
> >> >> - pci addresses. would Vdsm report back the libvirt-assigned addresses
> >> >> in XML format, or would it keep parsing them?
> >> >
> >> > Ideally, VDSM will report back the devices in XML format.
> >> > The engine will then add the unmanaged devices and update the pci
> >> > addresses.
> >> > Need to put some more thoughts into this, though.
> >> >
> >> >>
> >> >> - hot plug. Device xml should be generated by Engine, much like in the
> >> >> vm create flow
> >> >
> >> > Good point, I didn't think of hot plugs - right, they could be changed
> >> > as well later on.
> >> >
> >> >>
> >> >> - network rewiring. Vdsm uses the "dummy bridge" to implement a vNIC
> >> >> that is connected nowhere. Engine would need to care about what was
> >> >> up until now a vdsm-side implementation detail.
> >> >
> >> > Right, I have almost finished copying the creation of the network
> >> > interfaces to the engine.
> >> > The knowledge that you refer to will only be in the module that creates
> >> > the XML, so it doesn't seem to be much of an issue.
> >> >
> >> >>
> >> >> - storage path. this was mentioned above, but actually, the paths are
> >> >> the same on all hosts. We intended to have an abstraction layer there,
> >> >> but we never ever used it. All volumes sit under
> >> >> /rhev/data-center/poolID/domainID/imageID/volumeID
> >> >> Basically, Engine can hard-code this in the domxml, and no one would
> >> >> notice.
> >>
> >> This is wrong, and engine cannot hard code this or anything else.
> >>
> >> Engine can describe only what it knows about disks; only vdsm
> >> can add the disk xml.
> >
> > Of course, the engine will describe only the information it knows, but that
> > seems to be most of the disk-related data.
> > Let's say that the engine manages to generate something like:
> > <disk type='file' device='disk' snapshot='no'>
>
> file may be "block" or "network"; with glusterfs, engine may send "file"
> and vdsm will replace it with "network".
Right right, that's what I meant by "and similar structures for the other kind
of disks" - basically we'll move the same logic that determines this to the
engine in order to produce the right elements.
>
> > <driver name='qemu' type='qcow2' cache='none'/>
> > <source file='$PATH:PDIV or other representations$'/>
>
> file may need to be "dev"
Same, we'll check the diskType in the engine in order to determine the right
attributes.

The diskType that engine sends is wrong: it is correct only for network
disks (ceph, glusterfs); engine sends an incorrect type for file- and
block-based disks.
>
> > <target dev='hda' bus='ide'/>
> > <serial>54-a672-23e5b495a9ea</serial>
> > </disk>
>
> Basically you must inspect the code in vdsm to understand the difference
> and what engine can and cannot do.
>
> Also, how will vdsm fix the xml without the pool, domain, image and volume
> ids?
>
> How do you want to send them if we send xml?
Note that the placeholder contains this information ($PATH:PDIV or other
representations$)
Let's not invent such placeholders.
If vdsm needs a list of drives, it must get them properly,
for example as a list of dicts, documented in the schema,
or as data added inside the xml:
<vdsm_data>
<sd_id>xxxyyy</sd_id>
</vdsm_data>
Or move the needed logic to engine. If vdsm reports to engine the storage
repo path (/rhev/data-center or /run/vdsm/data-center), engine can build
the paths. This path must be the same on all the hosts, otherwise you
could not migrate vms from host to host.
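
As a rough illustration of "engine can build the paths": a minimal sketch,
assuming the volume layout quoted earlier in this thread
(/rhev/data-center/poolID/domainID/imageID/volumeID). The function name
volume_path and its parameters are hypothetical, not part of engine or vdsm.

```python
import posixpath


def volume_path(repo_path, pool_id, domain_id, image_id, volume_id):
    """Join the reported storage repo path with the four volume ids.

    Assumes the layout repo/poolID/domainID/imageID/volumeID quoted in
    this thread; the real on-disk layout may differ.
    """
    return posixpath.join(repo_path, pool_id, domain_id, image_id, volume_id)


# The repo path would be whatever vdsm reports to engine.
print(volume_path("/rhev/data-center", "poolID", "domainID", "imageID", "volumeID"))
```

The only host-supplied input here is the repo path; everything else is
already known to engine.
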
Same for LUNs: vdsm reports today the unique name of the device, but it
could report the full path (/dev/mapper/xxxyyy). This path is always
the same on all the hosts in the cluster (by multipath configuration).
The reason why a glusterfs disk may change from file to network
is that we do not manage glusterfs properly on engine. Instead, we have code
in vdsm contacting the gluster server and fetching the information needed
to create the network disk xml.
If this was managed properly on engine, like other storage types, engine
would have all the information needed to create the xml.
Same for other details, like using file or dev - engine does not send
the type to vdsm although engine knows the storage domain type,
so vdsm has to go and check if the path is a file or block device.
This makes no sense: files do not become block devices, so there
is no need to perform this check on vdsm.
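
To make the file/dev point concrete, here is a minimal sketch of choosing
the disk type and source attribute from a known storage kind instead of
probing the path. The kind names "file" and "block" and the function
disk_element are assumptions for illustration; network disks need more
than a path and are left out.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping: storage kind -> (disk type, <source> attribute).
DISK_KIND = {
    "file": ("file", "file"),
    "block": ("block", "dev"),
}


def disk_element(storage_kind, path):
    """Build a <disk> element whose type follows the storage kind."""
    disk_type, source_attr = DISK_KIND[storage_kind]
    disk = ET.Element("disk", type=disk_type, device="disk", snapshot="no")
    ET.SubElement(disk, "source", {source_attr: path})
    return disk


print(ET.tostring(disk_element("block", "/dev/mapper/xxxyyy"), encoding="unicode"))
```

With the kind sent explicitly, no check of the path on the host is needed
to pick between source file= and source dev=.
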
It will be the engine's responsibility to provide all the data that VDSM
needs in order to produce the right replacements within the placeholders.
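
For reference, the placeholder replacement being discussed could look
roughly like the sketch below. The placeholder syntax is not defined in
this thread ("$PATH:PDIV or other representations$"), so the concrete form
$PATH:pool,domain,image,volume$ and the function fill_paths are purely
hypothetical.

```python
import re

# Hypothetical placeholder form: $PATH:<pool>,<domain>,<image>,<volume>$
PLACEHOLDER = re.compile(r"\$PATH:([^,$]+),([^,$]+),([^,$]+),([^,$]+)\$")


def fill_paths(xml_text, repo_path="/rhev/data-center"):
    """Replace every path placeholder with a concrete volume path."""
    def resolve(match):
        # Join the host's repo path with the four ids from the placeholder.
        return "/".join((repo_path,) + match.groups())
    return PLACEHOLDER.sub(resolve, xml_text)


print(fill_paths("<source file='$PATH:p,d,i,v$'/>"))
```

Text without a placeholder passes through unchanged, so the same function
can be run over the whole domain xml.
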
>
> >
> > (and similar structures for the other kind of disks).
> >
> > Then, VDSM can simply replace the placeholder $PATH...$ with the concrete
> > path of the disk.
> > For VDSM it would save a lot of code (most of the code in
> > vdsm/virt/vmdevices/storage.py I suppose)
> >
> > What else doesn't the engine know? And couldn't this data be set by VDSM
> > by replacing place-holders as in the example above?
> >
> >>
> >> >
> >> > But I see that LUN and cinder disks are represented differently (not
> >> > as PDIV) - I'll check this.
> >>
> >> Of course, and even disks using DIV can be modified by vdsm, for
> >> example glusterfs using libgfapi.
> >>
> >> >
> >> >>
> >> >> - OvS. Recently, we have changed how VMs can be connected to their
> >> >> network. It is possible (albeit not recommended yet!) to connect a VM
> >> >> to an OvS instead of Linux bridges. This is done without Engine
> >> >> really caring, or knowing how the domxml is modified.
> >> >
> >> > Yeah, I saw that. The only complication I see at this point is that
> >> > for OvS we create more elements than only the 'source' element.
> >> > I believe that we could use a place-holder that contains the network
> >> > name and replace it with the tags that are needed for SR-IOV, OvS and
> >> > ordinary interfaces, no?
> >> > This seems to be the only thing that is difficult to generate on the
> >> > engine's side (related to the network interfaces).
> >> >
> >> >>
> >> >> - minor tweaks. exposing a new feature into Engine's UI is hard. Over
> >> >> the years, a few tweaks have been pushed as custom properties.
> >> >> there are not many (I see now only sndbuf, queues, viodiskcache,
> >> >> vhost) but the implementation should make sure they are not
> >> >> forgotten.
> >> >
> >> > Sure.
> >> >
> >> >>
> >> >> Maybe, Vdsm should consider Engine's domxml only as a "recommendation"
> >> >> and modify it based on its hooks and custom properties. This can
> >> >> surprise Engine, and defies the purpose of having xml-building logic
> >> >> moved away from Vdsm.
> >> >>
> >>
>