[ovirt-devel] [vdsm] Engine XML: metadata and devices from XML

Nir Soffer nsoffer at redhat.com
Sat Mar 18 19:26:20 UTC 2017


On Wed, Mar 15, 2017 at 2:28 PM Francesco Romani <fromani at redhat.com> wrote:

> Hi everyone,
>
> This is both a report of the current state of my Vdsm patches for Engine
> XML support, and a proposal on how to move forward and solve
> the current open issues.
>
> TL;DR:
> 1. we can and IMO should reuse the current JSON schema to describe the
> structure (layout) and the types of the metadata section.
> 2. we don't need a priori validation of stuff in the metadata section.
> We will just raise in the creation flow if data is missing, or wrong,
> according to our schema.
> 3. we will add *few* items to the metadata section, only things we can't
> express clearly, or at all, in the libvirt XML. Redundancy and verbosity
> will thus be kept at bay.
> 4. I believe [3] is the best tool to (de)serialize data to the
> metadata section. Existing tools fit poorly in our very specific use case.
>
> Examples below
>
> +++
>
> Long(er) discussion:
>
>
> I have working code[1][2] to encode any custom, picklable, python
> object in the metadata section.
>
> We should decide which module will do the actual python<=>XML
> transformation.
> Please note that this also influences what the data in the
> metadata section looks like, so the two things are a bit coupled.
>
> I'm not eager to reinvent another wheel, but after
> initial evaluation I honestly think that my pyxmlpickle[3] is the best
> tool for the job over the current alternatives: plistlib[4] and
> xmltodict[5].
>
> I added the initial rationale here:
> https://gerrit.ovirt.org/#/c/73790/4//COMMIT_MSG
>
> I have completed the initial draft of patches to make it possible to
> initialize devices from their XML representation [6]. This is the bare
> minimum we need to support the Engine XML, and we *need* this anyway to
> unlock the cleanup we planned and I outlined in my google doc.
>
> So we are progressing, but I'd like to speed up things. Those [6]
> patches are not yet complete, many flows are not covered or tested;  but
> they are good enough to demonstrate that there *are* pieces of
> information we need to properly initialize the devices, but we can't
> easily extract from the XML.
>
> First examples that come to my mind are the storage.Drive UUIDs; there
> could also be some ambiguity I'm investigating right now for
> displayIp/displayNetwork in Graphics devices. In [6] there are various
> TODOs marking more of those cases. Most likely, a few more cases will pop
> up as I cover all the flows we support.
>
> Long story short: it is hard to correctly rebuild the device conf from
> the XML. This is why in [6] I added the 'meta' argument to from_xml_tree
> classmethod in [7].
>
> 'meta' is supposed to be the device metadata: extra data related to a
> device which doesn't (yet) fit in the libvirt XML representation.
> For example, we can store 'displayIp' and 'displayNetwork' here and be
> done with that: using both per-device metadata and the XML
> representation of one graphic device, we will have everything we need to
> properly build one graphics.Graphics device.
> This example may (hopefully) be bogus, but I'm keeping it because it is
> one case easy to follow.
>
> The device metadata is going to be stored in the vm metadata for the
> short/mid term future. Even if the per-device metadata idea/RFE is
> accepted (no answer yet, but we are working on it), we will not have it in
> 7.4, and it is unlikely in 7.5.
>
> As it stands today, I believe there are two open questions:
>
> 1. do we need a schema for the metadata section?
> 2. how do we bind the metadata to the devices? How do we know which
> metadata belongs to which device, if we have neither aliases nor
> addresses to match? (e.g. the very first time the VM is created!)
>
> My current stance is the following
> 1. In general, one schema gives us two benefits: 1.a. we document how
> the layout of the data should be, including types; 1.b. we can validate
> the data we receive.
> So yes, we need a schema, but we don't need a *new* schema. I think we
> are in good enough shape with the current Vdsm schema: we can just
> translate the python object layout to an XML layout.
>
> An example probably explains this better. Using my pyxmlpickle module,
> some actual data may look like this:
>
> <domain type='kvm' id='5'>
>   <name>a0</name>
>   <uuid>ccd945c8-8069-4f31-8471-bbb58e9dd6ea</uuid>
>   <metadata xmlns:ovirt-tune="http://ovirt.org/vm/tune/1.0"
> xmlns:ovirt-vm="http://ovirt.org/vm/1.0"
> xmlns:ovirt-instance="http://ovirt.org/vm/instance/1.0">
>     <ovirt-tune:qos/>
>     <ovirt-vm:vm/>
>     <ovirt-instance:instance>
>       <ovirt-instance:value type="dict">
>         <ovirt-instance:item key="devices" type="list">
>           <ovirt-instance:item index="0" type="dict">
>             <ovirt-instance:item key="device"
> type="str">vnc</ovirt-instance:item>
>             <ovirt-instance:item key="specParams" type="dict">
>               <ovirt-instance:item key="displayNetwork"
> type="str">ovirtmgmt</ovirt-instance:item>
>               <ovirt-instance:item key="displayIp"
> type="str">192.168.1.53</ovirt-instance:item>
>             </ovirt-instance:item>
>             <ovirt-instance:item key="type"
> type="str">graphics</ovirt-instance:item>
>           </ovirt-instance:item>
>           <ovirt-instance:item index="1" type="dict">
>             <ovirt-instance:item key="device"
> type="str">spice</ovirt-instance:item>
>             <ovirt-instance:item key="specParams" type="dict">
>               <ovirt-instance:item key="displayNetwork"
> type="str">ovirtmgmt</ovirt-instance:item>
>               <ovirt-instance:item key="displayIp"
> type="str">192.168.1.53</ovirt-instance:item>
>             </ovirt-instance:item>
>             <ovirt-instance:item key="type"
> type="str">graphics</ovirt-instance:item>
>           </ovirt-instance:item>
>           <ovirt-instance:item index="2" type="dict">
>             <ovirt-instance:item key="poolID"
> type="str">5890a292-0390-01d2-01ed-00000000029a</ovirt-instance:item>
>             <ovirt-instance:item key="imageID"
> type="str">66441539-f7ac-4946-8a25-75e422f939d4</ovirt-instance:item>
>             <ovirt-instance:item key="domainID"
> type="str">c578566d-bc61-420c-8f1e-8dfa0a18efd5</ovirt-instance:item>
>             <ovirt-instance:item key="device"
> type="str">disk</ovirt-instance:item>
>             <ovirt-instance:item key="path"
> type="str">/rhev/data-center/5890a292-0390-01d2-01ed-00000000029a/c578566d-bc61-420c-8f1e-8dfa0a18efd5/images/66441539-f7ac-4946-8a25-75e422f939d4/5c4eeed4-f2a7-490a-ab57-a0d6f3a711cc</ovirt-instance:item>
>             <ovirt-instance:item key="volumeID"
> type="str">5c4eeed4-f2a7-490a-ab57-a0d6f3a711cc</ovirt-instance:item>
>           </ovirt-instance:item>
>         </ovirt-instance:item>
>       </ovirt-instance:value>
>     </ovirt-instance:instance>
>   </metadata>
>  <!-- omitted for brevity -->
> </domain>
>
>
> Please note that yes, this is still verbose, but we don't want to add
> much data here: for most of the information, the most reliable source
> will be the domain XML. We will add here only the extra info we can't
> really fetch from that.
>
> 2. I don't think we need explicit validation: we could just raise along
> the way in the creation flow if we don't find some extra metadata we
> need. This will also solve the issue that, if we reuse the current schema
> and omit most of the data, we will lack quite a lot of elements
> marked mandatory.
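
A minimal sketch of the "raise along the way" idea, to make it concrete.
Everything here is hypothetical: the function name, the MissingMetadata
exception and the conf layout are made up for illustration and are not
the from_xml_tree code in [7]. It borrows the displayNetwork/displayIp
example from above, which may well turn out to be bogus:

```python
import xml.etree.ElementTree as ET


class MissingMetadata(Exception):
    """Hypothetical error: a device needs metadata that is not there."""


def graphics_from_xml_tree(dev, meta):
    """Build a graphics device conf from its XML element plus metadata.

    dev is an etree element, meta is a plain dict holding the extra
    data that the libvirt XML cannot express.
    """
    conf = {"type": "graphics", "device": dev.attrib["type"]}
    spec_params = {}
    # No upfront schema validation: fail here, in the creation flow,
    # as soon as a piece of metadata we really need is missing.
    for key in ("displayNetwork", "displayIp"):
        try:
            spec_params[key] = meta[key]
        except KeyError:
            raise MissingMetadata(
                "graphics device %r: missing %r" % (conf["device"], key))
    conf["specParams"] = spec_params
    return conf


dev = ET.fromstring('<graphics type="vnc" port="5900"/>')
meta = {"displayNetwork": "ovirtmgmt", "displayIp": "192.168.1.53"}
conf = graphics_from_xml_tree(dev, meta)
```

With this shape, metadata marked mandatory in the schema but not needed
by a given device never has to be present.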


> Once we reach agreement, I will update my
>
> https://docs.google.com/document/d/1eD8KSLwwyo2Sk64MytbmE0wBxxMlpIyEI1GRcHDkp7Y/edit#heading=h.hqdqzmmm9i77
> accordingly.
>
> Final note: while devices take the lion's share, we will likely also need
> the metadata section to store extra VM info, but all of the above
> discussion applies there too.
>
> +++
>
> [1]
> https://gerrit.ovirt.org/#/q/status:open+project:vdsm+branch:master+topic:virt-metadata3
> - uses xmltodict
> [2]
> https://gerrit.ovirt.org/#/q/status:open+project:vdsm+branch:master+topic:virt-metadata-pyxmlpickle
> - ported the 'virt-metadata3' topic to pyxmlpickle
> [3] https://github.com/fromanirh/pyxmlpickle


Looks good, I like the simple loads() and dumps().

Issues:
- the index attribute seems unneeded
- not sure why we need the value element; it seems that everything can be
  an item
- not sure a special root element is needed; why not parse the contents of
  any element, and raise ValueError if there is no type info?
- strange name; here are some alternative names:
  - xmlon (xml object notation)
  - pxon (python xml object notation)

If there is no other library that we can use, this seems to be
the best direction.

But I think the plist XML format is much nicer; it should be easy to
support this format instead of the type attributes:

pyxmlpickle:

<pyxmlpickle>
    <value type="dict">
        <item key="foo" type="str">bar</item>
        <item key="list" type="list">
           <item index="0" type="int">42</item>
           <item index="1" type="float">3.14</item>
        </item>
    </value>
</pyxmlpickle>

plist:

<plist>
    <dict>
       <key>foo</key>
       <string>bar</string>
       <key>list</key>
       <array>
          <integer>42</integer>
          <real>3.14</real>
       </array>
    </dict>
</plist>

Less noise, more readable, easier to parse?

Since we cannot use plistlib as is, we can make
the format nicer and more pythonic:

<py>
    <dict>
       <key>foo</key>
       <str>bar</str>
       <key>list</key>
       <list>
          <int>42</int>
          <float>3.14</float>
       </list>
    </dict>
</py>

Adding a namespace will ruin this, but if it is required by libvirt
we have no choice.
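
To make the idea concrete, here is a rough sketch of converting a dict
to an etree element and back using the tag-per-type format above. The
function names to_element() and from_element() are made up, the "none"
and "bool" tags are my assumption (they are not in the example above),
and namespaces are ignored. It is not the pyxmlpickle code:

```python
import xml.etree.ElementTree as ET

# Tag names follow the <py> example; "none" and "bool" are assumptions.
_SCALARS = {"str": str, "int": int, "float": float}


def to_element(obj):
    """Convert a Python object to an etree element."""
    if obj is None:
        return ET.Element("none")
    if isinstance(obj, bool):  # check before int: bool subclasses int
        elem = ET.Element("bool")
        elem.text = "true" if obj else "false"
        return elem
    if isinstance(obj, dict):
        elem = ET.Element("dict")
        for key, value in obj.items():
            if not isinstance(key, str):
                raise TypeError("keys must be strings: %r" % (key,))
            ET.SubElement(elem, "key").text = key
            elem.append(to_element(value))
        return elem
    if isinstance(obj, list):
        elem = ET.Element("list")
        for item in obj:
            elem.append(to_element(item))
        return elem
    for tag, type_ in _SCALARS.items():
        if type(obj) is type_:
            elem = ET.Element(tag)
            elem.text = str(obj)
            return elem
    raise TypeError("unsupported type: %r" % type(obj))


def from_element(elem):
    """Convert an etree element created by to_element() back."""
    tag = elem.tag
    if tag == "none":
        return None
    if tag == "bool":
        return elem.text == "true"
    if tag in _SCALARS:
        return _SCALARS[tag](elem.text or "")
    if tag == "list":
        return [from_element(child) for child in elem]
    if tag == "dict":
        children = list(elem)
        if len(children) % 2:
            raise ValueError("dict needs key/value pairs")
        result = {}
        for key, value in zip(children[::2], children[1::2]):
            if key.tag != "key":
                raise ValueError("expected key, got %r" % key.tag)
            result[key.text or ""] = from_element(value)
        return result
    raise ValueError("no type info in element %r" % tag)


elem = to_element({"foo": "bar", "list": [42, 3.14, None]})
print(ET.tostring(elem).decode("utf-8"))
```

Since both functions work on elements, the result can be appended
under any parent element, and no special root element is needed.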


> [4] https://docs.python.org/2/library/plistlib.html


We cannot use it as is, since it does not support reading and writing
elements, only complete documents. We need integration with etree:
converting a dict to an etree element and an etree element to a dict.
Finally, it does not support None.
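
Both limitations are easy to see. This sketch uses the Python 3 plistlib
API (dumps()); the Python 2 module linked above names its functions
differently, so take it as an illustration only:

```python
import plistlib

# dumps() always produces a complete document: XML declaration,
# DOCTYPE and a <plist> root, not an element we could put under
# the <metadata> element.
doc = plistlib.dumps({"foo": "bar", "list": [42, 3.14]})
print(doc.decode("utf-8"))

# None has no plist representation, so serializing it fails.
try:
    plistlib.dumps({"alias": None})
except TypeError as e:
    print("cannot serialize None:", e)
```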

We can borrow code from this module; it is well tested, and its author
also wrote simplejson and other nice stuff.


>
> [5] https://github.com/martinblech/xmltodict


xmltodict converts everything to strings and does not parse the
types from the XML; this is very far from the json module.

I also found dicttoxml, which does keep the types, but does
not support parsing XML.

I don't see any value in including this dependency and hacking
around it to make it do what we want.

There is also:
https://pypi.python.org/pypi/xmljson

Using the parker module seems nice:

>>> from xml.etree.ElementTree import tostring, fromstring
>>> d = {'devices': [{'int': 42, 'none': None, 'float': 3.14,
...                   'string': 'value'}]}
>>> from xmljson import parker
>>> root = parker.etree(d)[0]
>>> tostring(root)
'<devices><int>42</int><none>None</none><float>3.14</float><string>value</string></devices>'
>>> parker.data(root)
OrderedDict([('int', 42), ('none', 'None'), ('float', 3.14), ('string', 'value')])

But the list was converted to a dict, and None came back as the
string 'None' :-)

It seems that this project focuses on converting any XML to JSON,
while we care about converting JSON to XML and back; we
don't care about arbitrary XML.

The integration with etree is nice, we should have this.

> [6]
> https://gerrit.ovirt.org/#/q/status:open+project:vdsm+branch:master+topic:vm-devs-xml
> [7] https://gerrit.ovirt.org/#/c/72880/15/lib/vdsm/virt/vmdevices/core.py
>
> --
> Francesco Romani
> Red Hat Engineering Virtualization R & D
> IRC: fromani
>
> _______________________________________________
> Devel mailing list
> Devel at ovirt.org
> http://lists.ovirt.org/mailman/listinfo/devel
>