<div dir="ltr"><br><div class="gmail_quote"><div dir="ltr">On Wed, Mar 15, 2017 at 2:28 PM Francesco Romani <<a href="mailto:fromani@redhat.com">fromani@redhat.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi everyone,<br class="gmail_msg">
<br class="gmail_msg">
This is both a report of the current state of my Vdsm patches for Engine<br class="gmail_msg">
XML support, and a proposal on how to move forward and solve<br class="gmail_msg">
the current open issues.<br class="gmail_msg">
<br class="gmail_msg">
TL;DR:<br class="gmail_msg">
1. we can and IMO should reuse the current JSON schema to describe the<br class="gmail_msg">
structure (layout) and the types of the metadata section.<br class="gmail_msg">
2. we don't need a priori validation of stuff in the metadata section.<br class="gmail_msg">
We will just raise in the creation flow if data is missing, or wrong,<br class="gmail_msg">
according to our schema.<br class="gmail_msg">
3. we will add *few* items to the metadata section, only things we can't<br class="gmail_msg">
express clearly, or at all, in the libvirt XML. Redundancy and verbosity<br class="gmail_msg">
will thus be kept at bay<br class="gmail_msg">
4. I believe [3] is the best tool to (de)serialize data to the<br class="gmail_msg">
metadata section. Existing tools fit poorly in our very specific use case<br class="gmail_msg">
<br class="gmail_msg">
Examples below<br class="gmail_msg">
<br class="gmail_msg">
+++<br class="gmail_msg">
<br class="gmail_msg">
Long(er) discussion:<br class="gmail_msg">
<br class="gmail_msg">
<br class="gmail_msg">
I have working code[1][2] to encode any custom, picklable, python<br class="gmail_msg">
object in the metadata section.<br class="gmail_msg">
<br class="gmail_msg">
We should decide which module will do the actual python<=>XML<br class="gmail_msg">
transformation.<br class="gmail_msg">
Please note that this also influences what the data in the<br class="gmail_msg">
metadata section looks like, so the two things are a bit coupled.<br class="gmail_msg">
<br class="gmail_msg">
I'm not eager to reinvent another wheel, but after<br class="gmail_msg">
initial evaluation I honestly think that my pyxmlpickle[3] is the best<br class="gmail_msg">
tool for the job over the current alternatives: plistlib[4] and<br class="gmail_msg">
xmltodict[5].<br class="gmail_msg">
<br class="gmail_msg">
I added the initial rationale here:<br class="gmail_msg">
<a href="https://gerrit.ovirt.org/#/c/73790/4//COMMIT_MSG" rel="noreferrer" class="gmail_msg" target="_blank">https://gerrit.ovirt.org/#/c/73790/4//COMMIT_MSG</a><br class="gmail_msg">
<br class="gmail_msg">
I have completed the initial draft of patches to make it possible to<br class="gmail_msg">
initialize devices from their XML representation [6]. This is the bare<br class="gmail_msg">
minimum we need to support the Engine XML, and we *need* this anyway to<br class="gmail_msg">
unlock the cleanup we planned and I outlined in my google doc.<br class="gmail_msg">
<br class="gmail_msg">
So we are progressing, but I'd like to speed up things. Those [6]<br class="gmail_msg">
patches are not yet complete, many flows are not covered or tested; but<br class="gmail_msg">
they are good enough to demonstrate that there *are* pieces of<br class="gmail_msg">
information we need to properly initialize the devices, but we can't<br class="gmail_msg">
easily extract from the XML.<br class="gmail_msg">
<br class="gmail_msg">
First examples that come to my mind are the storage.Drive UUIDs; there<br class="gmail_msg">
could also be some ambiguity I'm investigating right now for<br class="gmail_msg">
displayIp/displayNetwork in Graphics devices. In [6] there are various<br class="gmail_msg">
TODOs to mark more of those cases. Most likely, a few more cases will pop<br class="gmail_msg">
up as I cover all the flows we support.<br class="gmail_msg">
<br class="gmail_msg">
Long story short: it is hard to correctly rebuild the device conf from<br class="gmail_msg">
the XML alone. This is why in [6] I added the 'meta' argument to the<br class="gmail_msg">
from_xml_tree classmethod in [7].<br class="gmail_msg">
<br class="gmail_msg">
'meta' is supposed to be the device metadata: extra data related to a<br class="gmail_msg">
device which doesn't (yet) fit in the libvirt XML representation.<br class="gmail_msg">
For example, we can store 'displayIp' and 'displayNetwork' here and be<br class="gmail_msg">
done with that: using both per-device metadata and the XML<br class="gmail_msg">
representation of one graphic device, we will have everything we need to<br class="gmail_msg">
properly build one graphics.Graphics device.<br class="gmail_msg">
This example may (hopefully) be bogus, but I'm keeping it because it is<br class="gmail_msg">
one case easy to follow.<br class="gmail_msg">
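A minimal sketch of the idea above, under stated assumptions: the Graphics<br class="gmail_msg">
class below is a simplified, hypothetical stand-in, and its constructor<br class="gmail_msg">
arguments and the exact from_xml_tree signature are illustrative only; they<br class="gmail_msg">
are not the code in [7].<br class="gmail_msg">

```python
import xml.etree.ElementTree as ET


class Graphics(object):
    """Hypothetical stand-in for graphics.Graphics, for illustration only."""

    def __init__(self, device, port, display_network=None, display_ip=None):
        self.device = device
        self.port = port
        self.display_network = display_network
        self.display_ip = display_ip

    @classmethod
    def from_xml_tree(cls, dev, meta):
        # Values libvirt knows about are read from the device XML element;
        # values it cannot express are read from the per-device metadata.
        return cls(
            device=dev.attrib['type'],
            port=dev.attrib.get('port'),
            display_network=meta.get('displayNetwork'),
            display_ip=meta.get('displayIp'),
        )


# Assumed inputs: one <graphics> element and its metadata as a plain dict.
dev = ET.fromstring("<graphics type='vnc' port='5900'/>")
meta = {'displayNetwork': 'ovirtmgmt', 'displayIp': '192.168.1.53'}
gfx = Graphics.from_xml_tree(dev, meta)
```
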
<br class="gmail_msg">
The device metadata is going to be stored in the vm metadata for the<br class="gmail_msg">
short/mid term future. Even if the per-device metadata idea/RFE is<br class="gmail_msg">
accepted (no answer yet, but we are working on it), we will not have it in<br class="gmail_msg">
7.4, and probably not in 7.5.<br class="gmail_msg">
<br class="gmail_msg">
As it stands today, I believe there are two open questions:<br class="gmail_msg">
<br class="gmail_msg">
1. do we need a schema for the metadata section?<br class="gmail_msg">
2. how do we bind the metadata to the devices? How do we know which<br class="gmail_msg">
metadata belongs to which device, if we have neither aliases nor<br class="gmail_msg">
addresses to match? (e.g. the very first time the VM is created!)<br class="gmail_msg">
<br class="gmail_msg">
My current stance is the following:<br class="gmail_msg">
1. In general, one schema gives us two benefits: 1.a. we document how<br class="gmail_msg">
the layout of the data should be, including types; 1.b. we can validate<br class="gmail_msg">
the data we receive.<br class="gmail_msg">
So yes, we need a schema, but we don't need a *new* schema. I think we<br class="gmail_msg">
are in good enough shape with the current Vdsm schema: we can just<br class="gmail_msg">
translate the python object layout to an XML layout.<br class="gmail_msg">
<br class="gmail_msg">
An example is probably clearer. Using my pyxmlpickle module, some actual<br class="gmail_msg">
data may look like this:<br class="gmail_msg">
<br class="gmail_msg">
<domain type='kvm' id='5'><br class="gmail_msg">
<name>a0</name><br class="gmail_msg">
<uuid>ccd945c8-8069-4f31-8471-bbb58e9dd6ea</uuid><br class="gmail_msg">
<metadata xmlns:ovirt-tune="<a href="http://ovirt.org/vm/tune/1.0" rel="noreferrer" class="gmail_msg" target="_blank">http://ovirt.org/vm/tune/1.0</a>"<br class="gmail_msg">
xmlns:ovirt-vm="<a href="http://ovirt.org/vm/1.0" rel="noreferrer" class="gmail_msg" target="_blank">http://ovirt.org/vm/1.0</a>"<br class="gmail_msg">
xmlns:ovirt-instance="<a href="http://ovirt.org/vm/instance/1.0" rel="noreferrer" class="gmail_msg" target="_blank">http://ovirt.org/vm/instance/1.0</a>"><br class="gmail_msg">
<ovirt-tune:qos/><br class="gmail_msg">
<ovirt-vm:vm/><br class="gmail_msg">
<ovirt-instance:instance><br class="gmail_msg">
<ovirt-instance:value type="dict"><br class="gmail_msg">
<ovirt-instance:item key="devices" type="list"><br class="gmail_msg">
<ovirt-instance:item index="0" type="dict"><br class="gmail_msg">
<ovirt-instance:item key="device"<br class="gmail_msg">
type="str">vnc</ovirt-instance:item><br class="gmail_msg">
<ovirt-instance:item key="specParams" type="dict"><br class="gmail_msg">
<ovirt-instance:item key="displayNetwork"<br class="gmail_msg">
type="str">ovirtmgmt</ovirt-instance:item><br class="gmail_msg">
<ovirt-instance:item key="displayIp"<br class="gmail_msg">
type="str">192.168.1.53</ovirt-instance:item><br class="gmail_msg">
</ovirt-instance:item><br class="gmail_msg">
<ovirt-instance:item key="type"<br class="gmail_msg">
type="str">graphics</ovirt-instance:item><br class="gmail_msg">
</ovirt-instance:item><br class="gmail_msg">
<ovirt-instance:item index="1" type="dict"><br class="gmail_msg">
<ovirt-instance:item key="device"<br class="gmail_msg">
type="str">spice</ovirt-instance:item><br class="gmail_msg">
<ovirt-instance:item key="specParams" type="dict"><br class="gmail_msg">
<ovirt-instance:item key="displayNetwork"<br class="gmail_msg">
type="str">ovirtmgmt</ovirt-instance:item><br class="gmail_msg">
<ovirt-instance:item key="displayIp"<br class="gmail_msg">
type="str">192.168.1.53</ovirt-instance:item><br class="gmail_msg">
</ovirt-instance:item><br class="gmail_msg">
<ovirt-instance:item key="type"<br class="gmail_msg">
type="str">graphics</ovirt-instance:item><br class="gmail_msg">
</ovirt-instance:item><br class="gmail_msg">
<ovirt-instance:item index="2" type="dict"><br class="gmail_msg">
<ovirt-instance:item key="poolID"<br class="gmail_msg">
type="str">5890a292-0390-01d2-01ed-00000000029a</ovirt-instance:item><br class="gmail_msg">
<ovirt-instance:item key="imageID"<br class="gmail_msg">
type="str">66441539-f7ac-4946-8a25-75e422f939d4</ovirt-instance:item><br class="gmail_msg">
<ovirt-instance:item key="domainID"<br class="gmail_msg">
type="str">c578566d-bc61-420c-8f1e-8dfa0a18efd5</ovirt-instance:item><br class="gmail_msg">
<ovirt-instance:item key="device"<br class="gmail_msg">
type="str">disk</ovirt-instance:item><br class="gmail_msg">
<ovirt-instance:item key="path"<br class="gmail_msg">
type="str">/rhev/data-center/5890a292-0390-01d2-01ed-00000000029a/c578566d-bc61-420c-8f1e-8dfa0a18efd5/images/66441539-f7ac-4946-8a25-75e422f939d4/5c4eeed4-f2a7-490a-ab57-a0d6f3a711cc</ovirt-instance:item><br class="gmail_msg">
<ovirt-instance:item key="volumeID"<br class="gmail_msg">
type="str">5c4eeed4-f2a7-490a-ab57-a0d6f3a711cc</ovirt-instance:item><br class="gmail_msg">
</ovirt-instance:item><br class="gmail_msg">
</ovirt-instance:item><br class="gmail_msg">
</ovirt-instance:value><br class="gmail_msg">
</ovirt-instance:instance><br class="gmail_msg">
</metadata><br class="gmail_msg">
<!-- omitted for brevity --><br class="gmail_msg">
</domain><br class="gmail_msg">
<br class="gmail_msg">
<br class="gmail_msg">
Please note that yes, this is still verbose, but we don't want to add<br class="gmail_msg">
much data here: for most of the information, the most reliable source will<br class="gmail_msg">
be the domain XML. We will add here only the extra info we can't really<br class="gmail_msg">
fetch from that.<br class="gmail_msg">
<br class="gmail_msg">
2. I don't think we need explicit validation: we could just raise along<br class="gmail_msg">
the way in the creation flow if we don't find some extra metadata we<br class="gmail_msg">
need. This will also solve the issue that, if we reuse the current schema<br class="gmail_msg">
and we omit most of the data, we will lack quite a lot of elements<br class="gmail_msg">
marked mandatory. </blockquote><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
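A minimal sketch of "raising along the way" instead of validating up<br class="gmail_msg">
front. The MissingMetadata exception and the require() helper are<br class="gmail_msg">
hypothetical names used only for illustration; they are not existing Vdsm code.<br class="gmail_msg">

```python
class MissingMetadata(Exception):
    """Hypothetical error type; not an existing Vdsm exception."""


def require(meta, key, expected_type):
    """Fetch one metadata value, raising if it is missing or mistyped."""
    try:
        value = meta[key]
    except KeyError:
        raise MissingMetadata('missing metadata key: %r' % (key,))
    if not isinstance(value, expected_type):
        raise MissingMetadata(
            'bad type for %r: expected %s, got %s'
            % (key, expected_type.__name__, type(value).__name__))
    return value


# Only the keys a given flow actually needs are checked, at the point of use,
# so elements marked mandatory in the schema but unused here can be omitted.
drive_meta = {'poolID': '5890a292-0390-01d2-01ed-00000000029a'}
pool_id = require(drive_meta, 'poolID', str)
```
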
<br class="gmail_msg">
Once we reach agreement, I will update my<br class="gmail_msg">
<a href="https://docs.google.com/document/d/1eD8KSLwwyo2Sk64MytbmE0wBxxMlpIyEI1GRcHDkp7Y/edit#heading=h.hqdqzmmm9i77" rel="noreferrer" class="gmail_msg" target="_blank">https://docs.google.com/document/d/1eD8KSLwwyo2Sk64MytbmE0wBxxMlpIyEI1GRcHDkp7Y/edit#heading=h.hqdqzmmm9i77</a><br class="gmail_msg">
accordingly.<br class="gmail_msg">
<br class="gmail_msg">
Final note: while devices take the lion's share, we will likely need help<br class="gmail_msg">
from the metadata section also to store VM extra info, but all the above<br class="gmail_msg">
discussion also applies here.<br class="gmail_msg">
<br class="gmail_msg">
+++<br class="gmail_msg">
<br class="gmail_msg">
[1]<br class="gmail_msg">
<a href="https://gerrit.ovirt.org/#/q/status:open+project:vdsm+branch:master+topic:virt-metadata3" rel="noreferrer" class="gmail_msg" target="_blank">https://gerrit.ovirt.org/#/q/status:open+project:vdsm+branch:master+topic:virt-metadata3</a><br class="gmail_msg">
- uses xmltodict</blockquote><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">[2]<br class="gmail_msg">
<a href="https://gerrit.ovirt.org/#/q/status:open+project:vdsm+branch:master+topic:virt-metadata-pyxmlpickle" rel="noreferrer" class="gmail_msg" target="_blank">https://gerrit.ovirt.org/#/q/status:open+project:vdsm+branch:master+topic:virt-metadata-pyxmlpickle</a><br class="gmail_msg">
ported the 'virt-metadata3' topic to pyxmlpickle </blockquote><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
[3] <a href="https://github.com/fromanirh/pyxmlpickle" rel="noreferrer" class="gmail_msg" target="_blank">https://github.com/fromanirh/pyxmlpickle</a></blockquote><div><br></div><div>Looks good, I like the simple loads() and dumps().</div><div><br></div><div>Issues:</div><div>- index attribute seems unneeded</div><div>- not sure why we need the value element, seems that everything can be an item<br></div><div>- not sure special root element is needed, why not parse the contents of</div><div> any element, and raise ValueError if there is no type info?</div><div><div>- strange name - here are some alternative names:</div><div> - xmlon (xml object notation)</div><div> - pxon (python xml object notation)</div><br class="inbox-inbox-Apple-interchange-newline"></div><div>If there is no other library that we can use, this seems to be</div><div>the best direction.</div><div><br></div><div>But I think plist xml format is much nicer - it should be easy to</div><div>support this format instead of the type attributes:</div><div><br></div><div>pyxmlpickle:</div><div><br></div><div><pyxmlpickle></div><div> <value type="dict"></div><div> <item key="foo" type="str">bar</item></div><div> <item key="list" type="list"></div><div> <item index="0" type="int">42</item></div><div> <item index="1" type="float">3.14</item></div><div> </item></div><div> </value></div><div></pyxmlpickle></div><div><br></div><div>plist:</div><div><br></div><div><plist></div><div> <dict></div><div> <key>foo</key></div><div> <string>bar</string></div><div> <key>list</key></div><div> <array></div><div> <integer>42</integer></div><div> <real>3.14</real></div><div> </array></div><div> </dict></div><div></plist></div><div><br></div><div>Less noise, more readable, easier to parse?</div><div><br></div><div>Since we cannot use plistlib as is, we can make</div><div>the format nicer and more pythonic:</div><div><br></div><div><div><py></div><div> <dict></div><div> <key>foo</key></div><div> <str>bar</str></div><div> 
<key>list</key></div><div> <list></div><div> <int>42</int></div><div> <float>3.14</float></div><div> </list></div><div> </dict></div><div></py></div></div><div><br></div><div>Adding a namespace will ruin this, but if this is required by libvirt</div><div>we have no choice.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><br class="gmail_msg">
[4] <a href="https://docs.python.org/2/library/plistlib.html" rel="noreferrer" class="gmail_msg" target="_blank">https://docs.python.org/2/library/plistlib.html</a></blockquote><div><br></div><div>We cannot use it as is, since it does not support reading and writing</div><div>elements, only complete documents. We need integration with etree</div><div>- convert dict to etree element and etree element to dict. Finally, it does</div><div>not support None.</div><div><br></div><div>We can borrow code from this module; it is well tested and its author</div><div>is also the author of simplejson and other nice stuff.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><br class="gmail_msg">
[5] <a href="https://github.com/martinblech/xmltodict" rel="noreferrer" class="gmail_msg" target="_blank">https://github.com/martinblech/xmltodict</a></blockquote><div><br></div><div>xmltodict converts everything to strings, and does not parse the</div><div>type from the xml; this is very far from the json module.</div><div><br></div><div>I also found dicttoxml, which does keep the types, but does</div><div>not support parsing xml.</div><div><br></div><div>I don't see any value in including this dependency and hacking</div><div>around it to make it do what we want.</div><div><br></div><div>There is also:</div><div><a href="https://pypi.python.org/pypi/xmljson">https://pypi.python.org/pypi/xmljson</a></div><div><br></div><div>Using the parker module seems nice:</div><div><br></div><div>>>> from xml.etree.ElementTree import tostring, fromstring</div><div><div>>>> d = {'devices': [{'int': 42, 'none': None, 'float': 3.14, 'string': 'value'}]}</div></div><div><div>>>> from xmljson import parker</div><div>>>> root = parker.etree(d)[0]<br></div><div>>>> tostring(root)</div><div>'<devices><int>42</int><none>None</none><float>3.14</float><string>value</string></devices>'</div><div>>>> parker.data(root)</div><div>OrderedDict([('int', 42), ('none', 'None'), ('float', 3.14), ('string', 'value')])</div></div><div><br></div><div>But the list was converted to a dict :-)</div><div><br></div><div>Seems that this project focuses on converting any xml to json</div><div>while we care about converting json to xml and back; we </div><div>don't care about any xml.</div><div><br></div><div>The integration with etree is nice, we should have this.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
[6]<br class="gmail_msg">
<a href="https://gerrit.ovirt.org/#/q/status:open+project:vdsm+branch:master+topic:vm-devs-xml" rel="noreferrer" class="gmail_msg" target="_blank">https://gerrit.ovirt.org/#/q/status:open+project:vdsm+branch:master+topic:vm-devs-xml</a><br class="gmail_msg">
[7] <a href="https://gerrit.ovirt.org/#/c/72880/15/lib/vdsm/virt/vmdevices/core.py" rel="noreferrer" class="gmail_msg" target="_blank">https://gerrit.ovirt.org/#/c/72880/15/lib/vdsm/virt/vmdevices/core.py</a><br class="gmail_msg">
<br class="gmail_msg">
--<br class="gmail_msg">
Francesco Romani<br class="gmail_msg">
Red Hat Engineering Virtualization R & D<br class="gmail_msg">
IRC: fromani<br class="gmail_msg">
<br class="gmail_msg">
</blockquote></div></div>
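<div dir="ltr">A rough sketch of the plist-like &lt;py&gt; layout discussed in the reply above, converting directly to and from etree elements and supporting None. The function names to_element() and from_element(), and the "none" and "bool" tags, are assumptions made for illustration; this is not code from pyxmlpickle, plistlib, or Vdsm, and it ignores namespaces.</div>

```python
import xml.etree.ElementTree as ET


def to_element(obj):
    """Convert a Python object to an etree element (hypothetical helper)."""
    if obj is None:
        return ET.Element('none')
    if isinstance(obj, dict):
        elem = ET.Element('dict')
        for key, value in obj.items():
            # Assumption: dict keys are always strings.
            ET.SubElement(elem, 'key').text = key
            elem.append(to_element(value))
        return elem
    if isinstance(obj, list):
        elem = ET.Element('list')
        for item in obj:
            elem.append(to_element(item))
        return elem
    # bool must be checked before int: bool is a subclass of int.
    for tag, kind in (('bool', bool), ('int', int),
                      ('float', float), ('str', str)):
        if isinstance(obj, kind):
            elem = ET.Element(tag)
            elem.text = str(obj)
            return elem
    raise TypeError('unsupported type: %s' % type(obj).__name__)


def from_element(elem):
    """Convert an etree element back to a Python object."""
    tag = elem.tag
    if tag == 'none':
        return None
    if tag == 'dict':
        children = list(elem)
        if len(children) % 2:
            raise ValueError('dict element needs key/value pairs')
        result = {}
        for key, value in zip(children[::2], children[1::2]):
            if key.tag != 'key':
                raise ValueError('expected key element, got %r' % key.tag)
            result[key.text or ''] = from_element(value)
        return result
    if tag == 'list':
        return [from_element(child) for child in elem]
    if tag == 'bool':
        return elem.text == 'True'
    if tag == 'int':
        return int(elem.text)
    if tag == 'float':
        return float(elem.text)
    if tag == 'str':
        return elem.text or ''
    # No recognizable type info: refuse to guess.
    raise ValueError('unknown type tag: %r' % tag)


data = {'foo': 'bar', 'list': [42, 3.14, None]}
elem = to_element(data)
roundtrip = from_element(ET.fromstring(ET.tostring(elem)))
```

<div dir="ltr">Because both helpers work on elements rather than whole documents, the result could be appended under an existing metadata element; how namespaces would be applied is left open here.</div>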