[ovirt-devel] [vdsm] Engine XML: metadata and devices from XML

Francesco Romani fromani at redhat.com
Wed Mar 22 16:00:15 UTC 2017


Please note that the approach described here is outdated: xmlpickle was
too easy to use and too generic, which led to bloated XML and too high a
risk of misusing the metadata. The new approach
(https://gerrit.ovirt.org/#/c/74206/) tries to balance things, making
usage convenient but disallowing arbitrary nesting.
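
To give an idea of what "convenient but no arbitrary nesting" can mean, here
is a minimal sketch. It is NOT the code in the gerrit change above: the
function name dump_flat, the per-key element layout and the "type" attribute
are assumptions made up for this illustration, and the "port" key in the
usage lines is hypothetical too.

```python
import xml.etree.ElementTree as ET

# Namespace URI and prefix as they appear in the sample XML quoted below;
# everything else in this block is hypothetical.
NS = "http://ovirt.org/vm/1.0"
PREFIX = "ovirt-vm"


def dump_flat(name, data):
    """Encode a flat dict of scalars as children of one namespaced element.

    Any value that is not exactly a str, int or float is rejected, so
    dicts and lists (that is, nesting) cannot sneak into the metadata.
    """
    ET.register_namespace(PREFIX, NS)
    root = ET.Element("{%s}%s" % (NS, name))
    for key in sorted(data):
        value = data[key]
        if type(value) not in (str, int, float):
            raise TypeError(
                "unsupported value for %r: nesting is not allowed" % key)
        item = ET.SubElement(root, "{%s}%s" % (NS, key))
        item.set("type", type(value).__name__)
        item.text = str(value)
    return root


elem = dump_flat("device", {"displayIp": "192.168.1.53", "port": 5900})
print(ET.tostring(elem).decode("utf-8"))
```

Passing a dict or a list as a value raises TypeError, which is the whole
point: the caller gets convenience for simple values and an early failure
for anything else.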


On 03/18/2017 03:27 PM, Nir Soffer wrote:
> On Wed, Mar 15, 2017 at 2:28 PM Francesco Romani <fromani at redhat.com> wrote:
>
>     Hi everyone,
>
>     This is both a report of the current state of my Vdsm patches for
>     Engine XML support, and a proposal for how to move forward and solve
>     the current open issues.
>
>     TL;DR:
>     1. we can and IMO should reuse the current JSON schema to describe the
>     structure (layout) and the types of the metadata section.
>     2. we don't need a priori validation of stuff in the metadata section.
>     We will just raise in the creation flow if data is missing, or wrong,
>     according to our schema.
>     3. we will add *few* items to the metadata section, only things we
>     can't express clearly, or at all, in the libvirt XML. Redundancy and
>     verbosity will thus be kept at bay.
>     4. I believe [3] is the best tool to (de)serialize data to the
>     metadata section. Existing tools fit poorly in our very specific
>     use case.
>
>     Examples below
>
>     +++
>
>     Long(er) discussion:
>
>
>     I have working code[1][2] to encode any custom, picklable, python
>     object in the metadata section.
>
>     We should decide which module will do the actual python<=>XML
>     transformation.
>     Please note that this also influences what the data in the
>     metadata section looks like, so the two things are a bit coupled.
>
>     I'm not eager to reinvent another wheel, but after
>     initial evaluation I honestly think that my pyxmlpickle[3] is the best
>     tool for the job over the current alternatives: plistlib[4] and
>     xmltodict[5].
>
>     I added the initial rationale here:
>     https://gerrit.ovirt.org/#/c/73790/4//COMMIT_MSG
>
>     I have completed the initial draft of patches to make it possible to
>     initialize devices from their XML representation [6]. This is the bare
>     minimum we need to support the Engine XML, and we *need* this
>     anyway to
>     unlock the cleanup we planned and I outlined in my google doc.
>
>     So we are progressing, but I'd like to speed up things. Those [6]
>     patches are not yet complete, many flows are not covered or
>     tested;  but
>     they are good enough to demonstrate that there *are* pieces of
>     information we need to properly initialize the devices, but that we
>     can't easily extract from the XML.
>
>     First examples that come to my mind are the storage.Drive UUIDs; there
>     could also be some ambiguity I'm investigating right now for
>     displayIp/displayNetwork in Graphics devices. In [6] there are various
>     TODOs marking more of those cases. Most likely, a few more cases will
>     pop up as I cover all the flows we support.
>
>     Long story short: it is hard to correctly rebuild the device conf from
>     the XML. This is why in [6] I added the 'meta' argument to
>     from_xml_tree
>     classmethod in [7].
>
>     'meta' is supposed to be the device metadata: extra data related to a
>     device which doesn't (yet) fit in the libvirt XML representation.
>     For example, we can store 'displayIp' and 'displayNetwork' here and be
>     done with that: using both per-device metadata and the XML
>     representation of one graphic device, we will have everything we
>     need to
>     properly build one graphics.Graphics device.
>     This example may (hopefully) be bogus, but I'm keeping it because
>     it is
>     one case easy to follow.
>
>     The device metadata is going to be stored in the vm metadata for the
>     short/mid term future. Even if the per-device metadata idea/RFE is
>     accepted (no answer yet, but we are working on it), we will not
>     have it in 7.4, and it is unlikely in 7.5.
>
>     As it stands today, I believe there are two open questions:
>
>     1. do we need a schema for the metadata section?
>     2. how do we bind the metadata to the devices? How do we know which
>     metadata belongs to which device, if we have neither aliases nor
>     addresses to match? (e.g. very first time the VM is created!)
>
>     My current stance is the following
>     1. In general, one schema gives us two benefits: 1.a. we document how
>     the layout of the data should be, including types; 1.b. we can
>     validate
>     the data we receive.
>     So yes, we need a schema, but we don't need a *new* schema. I think we
>     are in good enough shape with the current Vdsm schema: we can just
>     translate the python object layout to a XML layout.
>
>     An example is probably clearer. With my pyxmlpickle module, some
>     actual data may look like this:
>
>     <domain type='kvm' id='5'>
>       <name>a0</name>
>       <uuid>ccd945c8-8069-4f31-8471-bbb58e9dd6ea</uuid>
>       <metadata xmlns:ovirt-tune="http://ovirt.org/vm/tune/1.0"
>
>
> The url is broken - do we have this?

I guess no. This is old code, dating back to 3.6, which I'm taking
without changes.

>  
>
>     xmlns:ovirt-vm="http://ovirt.org/vm/1.0"
>     xmlns:ovirt-instance="http://ovirt.org/vm/instance/1.0">
>         <ovirt-tune:qos/> 
>
>
> What do we keep here? why does it need its own namespace?
>  

yes, because of how we use the metadata:

http://libvirt.org/html/libvirt-libvirt-domain.html#virDomainGetMetadata
http://libvirt.org/html/libvirt-libvirt-domain.html#virDomainSetMetadata

we use type =
http://libvirt.org/html/libvirt-libvirt-domain.html#VIR_DOMAIN_METADATA_ELEMENT

-- I don't think we should change that

>         <ovirt-vm:vm/>
>
>
> What do we keep here? why does it need its own namespace?

same as above

>
> Can we merge all namespaces into one generic namespace?

Probably, but we have the nice property that code can trivially access
one subset of the metadata, e.g. only the qos subtree or only the
container subtree. Furthermore, if we don't change that, we stay backward
compatible at the XML level (inbound migrations care about that).
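
To make the "one subset per namespace" point concrete, here is a minimal
sketch using only ElementTree on a trimmed copy of the domain XML quoted in
this thread. The helper name metadata_subtree is made up for this example;
in Vdsm the subtree would come from libvirt through the metadata calls
linked above, which select the element by its namespace URI.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample domain XML from this thread.
DOMAIN_XML = """
<domain type='kvm'>
  <name>a0</name>
  <metadata xmlns:ovirt-tune="http://ovirt.org/vm/tune/1.0"
            xmlns:ovirt-vm="http://ovirt.org/vm/1.0">
    <ovirt-tune:qos/>
    <ovirt-vm:vm/>
  </metadata>
</domain>
"""


def metadata_subtree(domain_xml, uri):
    """Return the first metadata child in namespace `uri`, or None."""
    root = ET.fromstring(domain_xml)
    for child in root.findall("./metadata/*"):
        # ElementTree spells qualified names as "{uri}localname".
        if child.tag.startswith("{%s}" % uri):
            return child
    return None


qos = metadata_subtree(DOMAIN_XML, "http://ovirt.org/vm/tune/1.0")
print(qos.tag)
```

Code that only cares about qos never has to look at, or know about, the
other subtrees; merging everything into one namespace would lose that.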

>  
>
>         <ovirt-instance:instance>
>
>
> What is instance?

Gone, it duplicated vm without a good reason

>  
>
>           <ovirt-instance:value type="dict">  
>
>             <ovirt-instance:item key="devices" type="list"> 
>
>               <ovirt-instance:item index="0" type="dict">
>
>
> Isn't index redundant?

It is, but it is gone now

>  
>
>                 <ovirt-instance:item key="device"
>     type="str">vnc</ovirt-instance:item>
>                 <ovirt-instance:item key="specParams" type="dict">
>                   <ovirt-instance:item key="displayNetwork"
>     type="str">ovirtmgmt</ovirt-instance:item>
>                   <ovirt-instance:item key="displayIp"
>     type="str">192.168.1.53</ovirt-instance:item>
>                 </ovirt-instance:item>
>                 <ovirt-instance:item key="type"
>     type="str">graphics</ovirt-instance:item>
>               </ovirt-instance:item>
>               <ovirt-instance:item index="1" type="dict">
>                 <ovirt-instance:item key="device"
>     type="str">spice</ovirt-instance:item>
>                 <ovirt-instance:item key="specParams" type="dict">
>                   <ovirt-instance:item key="displayNetwork"
>     type="str">ovirtmgmt</ovirt-instance:item>
>                   <ovirt-instance:item key="displayIp"
>     type="str">192.168.1.53</ovirt-instance:item>
>                 </ovirt-instance:item>
>                 <ovirt-instance:item key="type"
>     type="str">graphics</ovirt-instance:item>
>               </ovirt-instance:item>
>               <ovirt-instance:item index="2" type="dict">
>                 <ovirt-instance:item key="poolID"
>     type="str">5890a292-0390-01d2-01ed-00000000029a</ovirt-instance:item>
>                 <ovirt-instance:item key="imageID"
>     type="str">66441539-f7ac-4946-8a25-75e422f939d4</ovirt-instance:item>
>                 <ovirt-instance:item key="domainID"
>     type="str">c578566d-bc61-420c-8f1e-8dfa0a18efd5</ovirt-instance:item>
>                 <ovirt-instance:item key="device"
>     type="str">disk</ovirt-instance:item>
>                 <ovirt-instance:item key="path"
>     type="str">/rhev/data-center/5890a292-0390-01d2-01ed-00000000029a/c578566d-bc61-420c-8f1e-8dfa0a18efd5/images/66441539-f7ac-4946-8a25-75e422f939d4/5c4eeed4-f2a7-490a-ab57-a0d6f3a711cc</ovirt-instance:item>
>                 <ovirt-instance:item key="volumeID"
>     type="str">5c4eeed4-f2a7-490a-ab57-a0d6f3a711cc</ovirt-instance:item>
>               </ovirt-instance:item>
>             </ovirt-instance:item>
>           </ovirt-instance:value>
>         </ovirt-instance:instance>
>       </metadata>
>      <!-- omitted for brevity -->
>     </domain>
>
>
> How about a generic json namespace?
>
> json:
>
> {
>     "devices": [
>        {
>            "foo": "bar"
>        }
>    ]
> }
>
> xml:
>
> <json:object>
>     <json:key>devices</json:key>
>     <json:list>
>        <json:object>
>            <json:key>foo</json:key>
>            <json:string>bar</json:string>
>        </json:object>
>     </json:list>
> </json:object>
>
> We can query and modify this xml using xpath, like this:
>
> >>> r.findall("./metadata/json:object/[json:key='devices']/json:list/json:object/",
> ...           namespaces={"json": "http://ovirt.org/json/1.0"})
> [<Element '{http://ovirt.org/json/1.0}key' at 0x7fd3386a69d0>,
>  <Element '{http://ovirt.org/json/1.0}string' at 0x7fd3386a6a50>]
>
> Not sure this format will be easy to modify; I need to tinker more
> with this.
>
> It will probably be easier to parse the entire metadata, change it,
> and serialize it back.
>
> Maybe using key and type attributes as you suggest makes it simpler to
> use, and since we convert from xml to python or python to xml, we can
> have a "py" namespace.
>
> <py:item>
>     <py:item key="devices" type="list">
>         <py:item type="dict">
>             <py:item key="foo" type="str">bar</py:item>
>         </py:item>
>     </py:item>
> </py:item>
>
> With this we can have:
>
> >>> r.findall("./metadata/py:item/py:item[@key='devices']/py:item[1]/py:item[@key='foo']",
> ...           namespaces={"py": "http://ovirt.org/py/1.0"})[0].text
> 'bar'
>
> Something like this can be useful for others as well.

Great suggestions, I like the py namespace, but concerns were raised
about the solution here being too generic, so I limited it, see
https://gerrit.ovirt.org/#/c/74206/11

but I will add my pyxmlpickle module
(https://github.com/fromanirh/pyxmlpickle) anyway in the future
according to your comments.
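
For reference, the py:item layout suggested above is easy to turn back into
Python objects. The following is only a sketch, not what pyxmlpickle or the
gerrit change does: the decode function is made up for this example, it
handles just the three types used in the sample (dict, list, str), and it
assumes that an item without a type attribute (like the outermost one in the
sample) is a dict.

```python
import xml.etree.ElementTree as ET

PY_NS = "http://ovirt.org/py/1.0"  # URI from the xpath example above
ITEM = "{%s}item" % PY_NS


def decode(elem):
    """Turn one py:item element into the Python value it describes."""
    # Assumption: an item with no "type" attribute is treated as a dict.
    kind = elem.get("type", "dict")
    children = elem.findall(ITEM)
    if kind == "dict":
        return {child.get("key"): decode(child) for child in children}
    if kind == "list":
        return [decode(child) for child in children]
    if kind == "str":
        return elem.text or ""
    raise ValueError("unsupported type: %r" % kind)


SAMPLE = """
<py:item xmlns:py="http://ovirt.org/py/1.0">
    <py:item key="devices" type="list">
        <py:item type="dict">
            <py:item key="foo" type="str">bar</py:item>
        </py:item>
    </py:item>
</py:item>
"""

print(decode(ET.fromstring(SAMPLE)))  # {'devices': [{'foo': 'bar'}]}
```

The recursion is what makes the format fully generic: any depth of dicts and
lists decodes without extra code, which is also why it is so easy to end up
with bloated metadata.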


-- 
Francesco Romani
Red Hat Engineering Virtualization R & D
IRC: fromani
