[PATCH v6 0/4] Host device passthrough: Summary

Hi,

This patch series enables Kimchi to assign host devices directly to a VM, which greatly improves VM performance.

Currently we support assigning PCI devices, USB devices and SCSI LUNs. For example, we can assign a NIC to a VM to improve guest network throughput, or pass through a USB camera so that the guest OS can record video.

Host devices form a tree. We can assign most of the devices in the tree to a VM. Assigning a device also assigns all the devices in its sub-tree. It might not make sense to assign a USB controller, because the host may be using one of the devices connected to the controller. Instead, Kimchi presents only the "leaf" devices for assignment to the guest.

Recent Linux kernels and KVM can recognize the IOMMU group of a PCI device. The "leaf" PCI devices in the same IOMMU group should be assigned and dismissed together, because the IOMMU group is the smallest actual isolation granularity for PCI devices.

The first patch lists information on all host devices. It is also useful on its own for showing host device information.

The second patch lists all host devices eligible for assignment, as well as the "affected" devices in the same IOMMU group.

The third patch creates a sub-collection "hostdevs" under the VM resource, and deals with assigning and dismissing devices.

The fourth patch adds a sub-collection "vm_holders" to the host device resource. It lists all VMs that are holding the device.

I'll update the API and unit tests once everyone is happy with the interface and logic.

v6:
  Do not pass through PCI devices of class code 0x07. This class might contain system devices that are not suitable for assignment to a guest.

v5:
  Filter eligible PCI devices according to PCI class.
  When assigning a device to a VM, check whether other VMs are holding it.
  Use "kimchi.model.utils.get_vm_config_flag()" to correctly set the device attaching API flag.

v4:
  Add a new sub-collection to the host device to list the VMs holding the device.

v3:
  Fix a small naming error introduced by rebase.

v2:
  Handle the devices in the VM's sub-collection "hostdevs".

v1:
  Handle the devices in the VM template.

Zhou Zheng Sheng (4):
  Host device passthrough: List all types of host devices
  Host device passthrough: List eligible device to passthrough
  Host device passthrough: Directly assign and dissmis host device from VM
  Host device passthrough: List VMs that are holding a host device

 docs/API.md                            |  11 +-
 src/kimchi/control/host.py             |  16 ++
 src/kimchi/control/vm/hostdevs.py      |  44 +++++
 src/kimchi/featuretests.py             |  10 +-
 src/kimchi/hostdev.py                  | 316 +++++++++++++++++++++++++++++++
 src/kimchi/i18n.py                     |   9 +
 src/kimchi/mockmodel.py                |   7 +-
 src/kimchi/model/config.py             |   2 +
 src/kimchi/model/host.py               |  32 ++--
 src/kimchi/model/libvirtstoragepool.py |  18 +-
 src/kimchi/model/vmhostdevs.py         | 327 +++++++++++++++++++++++++++++++++
 src/kimchi/rollbackcontext.py          |   3 +
 src/kimchi/xmlutils.py                 |  26 ++-
 tests/test_rest.py                     |   6 +-
 tests/test_storagepool.py              |   7 +-
 15 files changed, 795 insertions(+), 39 deletions(-)
 create mode 100644 src/kimchi/control/vm/hostdevs.py
 create mode 100644 src/kimchi/hostdev.py
 create mode 100644 src/kimchi/model/vmhostdevs.py

--
1.9.3
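
The "leaf" device and IOMMU group rules from the cover letter can be illustrated with a small sketch. This is not code from the series: the helper names `leaf_devices` and `affected_devices` and the sample data are hypothetical, and the dictionaries only borrow the `name`, `parent`, `device_type` and `iommuGroup` keys that the first patch produces. It assumes Python 3.

```python
def leaf_devices(dev_infos):
    """Return the devices that are not the parent of any other device."""
    parents = {dev['parent'] for dev in dev_infos if dev['parent'] is not None}
    return [dev for dev in dev_infos if dev['name'] not in parents]


def affected_devices(dev_infos, name):
    """Return the names of the other leaf PCI devices that share an
    IOMMU group with the device called `name`."""
    by_name = {dev['name']: dev for dev in dev_infos}
    group = by_name[name].get('iommuGroup')
    if group is None:
        # Not a PCI device, or the host has no IOMMU group support.
        return []
    return sorted(
        dev['name'] for dev in leaf_devices(dev_infos)
        if dev['name'] != name
        and dev.get('device_type') == 'pci'
        and dev.get('iommuGroup') == group)


# Hypothetical device list: one PCI bridge with two functions behind it,
# plus a USB device.
SAMPLE = [
    {'name': 'computer', 'parent': None, 'device_type': 'system'},
    {'name': 'pci_0000_00_1c_0', 'parent': 'computer',
     'device_type': 'pci', 'iommuGroup': 5},
    {'name': 'pci_0000_03_00_0', 'parent': 'pci_0000_00_1c_0',
     'device_type': 'pci', 'iommuGroup': 7},
    {'name': 'pci_0000_03_00_1', 'parent': 'pci_0000_00_1c_0',
     'device_type': 'pci', 'iommuGroup': 7},
    {'name': 'usb_1_2', 'parent': 'computer', 'device_type': 'usb_device'},
]

leaf_names = [dev['name'] for dev in leaf_devices(SAMPLE)]
siblings = affected_devices(SAMPLE, 'pci_0000_03_00_0')
```

In this sample the bridge is skipped because it has children, and assigning `pci_0000_03_00_0` would also affect `pci_0000_03_00_1`, which sits in the same IOMMU group.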

The URI /host/devices only presents scsi_host (particularly fc_host) device information. To implement host PCI passthrough, we should list all types of host devices.

This patch adds support for parsing information on various host devices and listing them on /host/devices, so the user is free to choose any listed PCI device to pass through to a guest.

Since the patch changes the device information dictionary format, the existing code consuming the device information is changed accordingly.

To get all types of host devices, access the following URL.

curl -k -u root -H "Content-Type: application/json" \
    -H "Accept: application/json" \
    https://127.0.0.1:8001/host/devices

To get only fc_host devices, change the URL to
"https://127.0.0.1:8001/host/devices?_cap=fc_host"

To get only PCI devices, change the URL to
"https://127.0.0.1:8001/host/devices?_cap=pci"

v1:
  Parse the node device XML using xpath.

v2:
  Write a "dictize" function and parse the node device XML using dictize.

v3:
  Fix a naming mistake.

v4:
  It was observed that sometimes a parent device is not listed by libvirt although its child device is listed. Previous versions caught the resulting exception and ignored it. The root cause is unknown and we failed to reproduce the problem, so v4 does not catch it. It seems to be related to USB removable disks, and the problem disappeared after we upgraded the Linux kernel.

Signed-off-by: Zhou Zheng Sheng <zhshzhou@linux.vnet.ibm.com>
---
 docs/API.md                            |  11 +-
 src/kimchi/hostdev.py                  | 209 +++++++++++++++++++++++++++++++++
 src/kimchi/mockmodel.py                |   7 +-
 src/kimchi/model/host.py               |  15 +--
 src/kimchi/model/libvirtstoragepool.py |  18 +--
 src/kimchi/xmlutils.py                 |  26 +++-
 tests/test_rest.py                     |   6 +-
 tests/test_storagepool.py              |   7 +-
 8 files changed, 262 insertions(+), 37 deletions(-)
 create mode 100644 src/kimchi/hostdev.py

diff --git a/docs/API.md b/docs/API.md
index ef5bf14..5026771 100644
--- a/docs/API.md
+++ b/docs/API.md
@@ -869,12 +869,11 @@ stats history
 * **GET**: Retrieve information of a single pci device.
            Currently only scsi_host devices are supported:
     * name: The name of the device.
-    * adapter_type: The capability type of the scsi_host device (fc_host).
-                    Empty if pci device is not scsi_host.
-    * wwnn: The HBA Word Wide Node Name.
-            Empty if pci device is not scsi_host.
-    * wwpn: The HBA Word Wide Port Name
-            Empty if pci device is not scsi_host.
+    * path: Path of device in sysfs.
+    * adapter: Host adapter information. Empty if pci device is not scsi_host.
+        * type: The capability type of the scsi_host device (fc_host, vport_ops).
+        * wwnn: The HBA Word Wide Node Name. Empty if pci device is not fc_host.
+        * wwpn: The HBA Word Wide Port Name. Empty if pci device is not fc_host.
 
 ### Collection: Host Packages Update
 
diff --git a/src/kimchi/hostdev.py b/src/kimchi/hostdev.py
new file mode 100644
index 0000000..d4c142d
--- /dev/null
+++ b/src/kimchi/hostdev.py
@@ -0,0 +1,209 @@
+#
+# Kimchi
+#
+# Copyright IBM Corp, 2014
+#
+# This library is free software; you can redistribute it and/or
+# modify it under the terms of the GNU Lesser General Public
+# License as published by the Free Software Foundation; either
+# version 2.1 of the License, or (at your option) any later version.
+#
+# This library is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+# Lesser General Public License for more details.
+#
+# You should have received a copy of the GNU Lesser General Public
+# License along with this library; if not, write to the Free Software
+# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+
+from kimchi.model.libvirtconnection import LibvirtConnection
+from kimchi.utils import kimchi_log
+from kimchi.xmlutils import dictize
+
+
+def _get_all_host_dev_infos():
+    libvirt_conn = LibvirtConnection('qemu:///system').get()
+    node_devs = libvirt_conn.listAllDevices()
+    return [get_dev_info(node_dev) for node_dev in node_devs]
+
+
+def _get_dev_info_tree(dev_infos):
+    devs = dict([(dev_info['name'], dev_info) for dev_info in dev_infos])
+    root = None
+    for dev_info in dev_infos:
+        if dev_info['parent'] is None:
+            root = dev_info
+            continue
+        parent = devs[dev_info['parent']]
+
+        try:
+            children = parent['children']
+        except KeyError:
+            parent['children'] = [dev_info]
+        else:
+            children.append(dev_info)
+    return root
+
+
+def get_dev_info(node_dev):
+    ''' Parse the node device XML string into dict according to
+    http://libvirt.org/formatnode.html. '''
+
+    def shift_subdict(d, toshift):
+        subdict = d.pop(toshift)
+        d.update(subdict)
+        return d
+
+    xmlstr = node_dev.XMLDesc()
+    info = dictize(xmlstr)['device']
+    dev_type = info['capability'].pop('type')
+    info['device_type'] = dev_type
+    shift_subdict(info, 'capability')
+    info['parent'] = node_dev.parent()
+
+    get_dev_type_info = {
+        'net': _get_net_dev_info,
+        'pci': _get_pci_dev_info,
+        'scsi': _get_scsi_dev_info,
+        'scsi_generic': _get_scsi_generic_dev_info,
+        'scsi_host': _get_scsi_host_dev_info,
+        'scsi_target': _get_scsi_target_dev_info,
+        'storage': _get_storage_dev_info,
+        'system': _get_system_dev_info,
+        'usb': _get_usb_dev_info,
+        'usb_device': _get_usb_device_dev_info,
+    }
+    try:
+        get_detail_info = get_dev_type_info[dev_type]
+    except KeyError:
+        kimchi_log.error("Unknown device type: %s", dev_type)
+        return info
+
+    return get_detail_info(info)
+
+
+def _get_net_dev_info(info):
+    cap = info.pop('capability')
+    links = {"80203": "IEEE 802.3", "80211": "IEEE 802.11"}
+    link_raw = cap['type']
+    info['link_type'] = links.get(link_raw, link_raw)
+
+    return info
+
+
+def _get_pci_dev_info(info):
+    for k in ('vendor', 'product'):
+        info[k]['description'] = info[k].pop('pyval')
+    try:
+        info['iommuGroup'] = info['iommuGroup']['number']
+    except KeyError:
+        # No IOMMU group support.
+        pass
+    return info
+
+
+def _get_scsi_dev_info(info):
+    return info
+
+
+def _get_scsi_generic_dev_info(info):
+    # scsi_generic is not documented in libvirt official website. Try to
+    # parse scsi_generic according to the following libvirt path series.
+    # https://www.redhat.com/archives/libvir-list/2013-June/msg00014.html
+    return info
+
+
+def _get_scsi_host_dev_info(info):
+    try:
+        cap_info = info.pop('capability')
+    except KeyError:
+        # kimchi.model.libvirtstoragepool.ScsiPoolDef assumes
+        # info['adapter']['type'] always exists.
+        info['adapter'] = {'type': ''}
+        return info
+    info['adapter'] = cap_info
+    return info
+
+
+def _get_scsi_target_dev_info(info):
+    # scsi_target is not documented in libvirt official website. Try to
+    # parse scsi_target according to the libvirt commit db19834a0a.
+    return info
+
+
+def _get_storage_dev_info(info):
+    try:
+        cap_info = info.pop('capability')
+    except KeyError:
+        return info
+
+    if cap_info['type'] == 'removable':
+        cap_info['available'] = bool(cap_info.pop('media_available'))
+        if cap_info['available']:
+            cap_info.update({'size': cap_info.pop('media_size'),
+                             'label': cap_info.pop('media_label')})
+    info['media'] = cap_info
+    return info
+
+
+def _get_system_dev_info(info):
+    return info
+
+
+def _get_usb_dev_info(info):
+    return info
+
+
+def _get_usb_device_dev_info(info):
+    for k in ('vendor', 'product'):
+        try:
+            info[k]['description'] = info[k].pop('pyval')
+        except KeyError:
+            # Some USB devices don't provide vendor/product description.
+            pass
+    return info
+
+
+# For test and debug
+def _print_host_dev_tree():
+    dev_infos = _get_all_host_dev_infos()
+    root = _get_dev_info_tree(dev_infos)
+    if root is None:
+        print "No device found"
+        return
+    print '-----------------'
+    print '\n'.join(_format_dev_node(root))
+
+
+def _format_dev_node(node):
+    from pprint import pformat
+
+    try:
+        children = node['children']
+        del node['children']
+    except KeyError:
+        children = []
+
+    lines = []
+    lines.extend([' ~' + line for line in pformat(node).split('\n')])
+
+    count = len(children)
+    for i, child in enumerate(children):
+        if count == 1:
+            lines.append(' \-----------------')
+        else:
+            lines.append(' +-----------------')
+        clines = _format_dev_node(child)
+        if i == count - 1:
+            p = ' '
+        else:
+            p = ' |'
+        lines.extend([p + cline for cline in clines])
+    lines.append('')
+
+    return lines
+
+
+if __name__ == '__main__':
+    _print_host_dev_tree()
diff --git a/src/kimchi/mockmodel.py b/src/kimchi/mockmodel.py
index 05720f4..7420e7e 100644
--- a/src/kimchi/mockmodel.py
+++ b/src/kimchi/mockmodel.py
@@ -493,9 +493,10 @@ class MockModel(object):
     def device_lookup(self, nodedev_name):
         return {
             'name': nodedev_name,
-            'adapter_type': 'fc_host',
-            'wwnn': uuid.uuid4().hex[:16],
-            'wwpn': uuid.uuid4().hex[:16]}
+            'adapter': {
+                'type': 'fc_host',
+                'wwnn': uuid.uuid4().hex[:16],
+                'wwpn': uuid.uuid4().hex[:16]}}
 
     def isopool_lookup(self, name):
         return {'state': 'active',
diff --git a/src/kimchi/model/host.py b/src/kimchi/model/host.py
index 5844f4b..1ea97f0 100644
--- a/src/kimchi/model/host.py
+++ b/src/kimchi/model/host.py
@@ -28,6 +28,7 @@ import psutil
 from cherrypy.process.plugins import BackgroundTask
 
 from kimchi import disks
+from kimchi import hostdev
 from kimchi import netinfo
 from kimchi import xmlutils
 from kimchi.basemodel import Singleton
@@ -279,20 +280,10 @@ class DeviceModel(object):
     def lookup(self, nodedev_name):
         conn = self.conn.get()
         try:
-            dev_xml = conn.nodeDeviceLookupByName(nodedev_name).XMLDesc(0)
+            dev = conn.nodeDeviceLookupByName(nodedev_name)
         except:
             raise NotFoundError('KCHHOST0003E', {'name': nodedev_name})
-        cap_type = xmlutils.xpath_get_text(
-            dev_xml, '/device/capability/capability/@type')
-        wwnn = xmlutils.xpath_get_text(
-            dev_xml, '/device/capability/capability/wwnn')
-        wwpn = xmlutils.xpath_get_text(
-            dev_xml, '/device/capability/capability/wwpn')
-        return {
-            'name': nodedev_name,
-            'adapter_type': cap_type[0] if len(cap_type) >= 1 else '',
-            'wwnn': wwnn[0] if len(wwnn) == 1 else '',
-            'wwpn': wwpn[0] if len(wwpn) == 1 else ''}
+        return hostdev.get_dev_info(dev)
 
 
 class PackagesUpdateModel(object):
diff --git a/src/kimchi/model/libvirtstoragepool.py b/src/kimchi/model/libvirtstoragepool.py
index 47b239b..b15bf1a 100644
--- a/src/kimchi/model/libvirtstoragepool.py
+++ b/src/kimchi/model/libvirtstoragepool.py
@@ -180,34 +180,34 @@ class ScsiPoolDef(StoragePoolDef):
         self.poolArgs['source']['name'] = tmp_name.replace('scsi_', '')
         # fc_host adapters type are only available in libvirt >= 1.0.5
         if not self.poolArgs['fc_host_support']:
-            self.poolArgs['source']['adapter_type'] = 'scsi_host'
+            self.poolArgs['source']['adapter']['type'] = 'scsi_host'
             msg = "Libvirt version <= 1.0.5. Setting SCSI host name as '%s'; "\
                   "setting SCSI adapter type as 'scsi_host'; "\
                   "ignoring wwnn and wwpn." % tmp_name
             kimchi_log.info(msg)
         # Path for Fibre Channel scsi hosts
         self.poolArgs['path'] = '/dev/disk/by-path'
-        if not self.poolArgs['source']['adapter_type']:
-            self.poolArgs['source']['adapter_type'] = 'scsi_host'
+        if not self.poolArgs['source']['adapter']['type']:
+            self.poolArgs['source']['adapter']['type'] = 'scsi_host'
 
     @property
     def xml(self):
         # Required parameters
         # name:
-        # source[adapter_type]:
+        # source[adapter][type]:
         # source[name]:
-        # source[wwnn]:
-        # source[wwpn]:
+        # source[adapter][wwnn]:
+        # source[adapter][wwpn]:
         # path:
 
         xml = """
         <pool type='scsi'>
           <name>{name}</name>
           <source>
-            <adapter type='{source[adapter_type]}'\
+            <adapter type='{source[adapter][type]}'\
                      name='{source[name]}'\
-                     wwnn='{source[wwnn]}'\
-                     wwpn='{source[wwpn]}'/>
+                     wwnn='{source[adapter][wwnn]}'\
+                     wwpn='{source[adapter][wwpn]}'/>
           </source>
           <target>
             <path>{path}</path>
diff --git a/src/kimchi/xmlutils.py b/src/kimchi/xmlutils.py
index 76f0696..56517f2 100644
--- a/src/kimchi/xmlutils.py
+++ b/src/kimchi/xmlutils.py
@@ -18,6 +18,7 @@
 # Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
 
 import libxml2
+from lxml import objectify
 
 
 from xml.etree import ElementTree
@@ -26,7 +27,7 @@ from xml.etree import ElementTree
 def xpath_get_text(xml, expr):
     doc = libxml2.parseDoc(xml)
     res = doc.xpathEval(expr)
-    ret = [None if x.children == None else x.children.content for x in res]
+    ret = [None if x.children is None else x.children.content for x in res]
 
     doc.freeDoc()
     return ret
@@ -37,3 +38,26 @@ def xml_item_update(xml, xpath, value):
     item = root.find(xpath)
     item.text = value
     return ElementTree.tostring(root, encoding="utf-8")
+
+
+def dictize(xmlstr):
+    root = objectify.fromstring(xmlstr)
+    return {root.tag: _dictize(root)}
+
+
+def _dictize(e):
+    d = {}
+    if e.text is not None:
+        if not e.attrib and e.countchildren() == 0:
+            return e.pyval
+        d['pyval'] = e.pyval
+    d.update(e.attrib)
+    for child in e.iterchildren():
+        if child.tag in d:
+            continue
+        if len(child) > 1:
+            d[child.tag] = [
+                _dictize(same_tag_child) for same_tag_child in child]
+        else:
+            d[child.tag] = _dictize(child)
+    return d
diff --git a/tests/test_rest.py b/tests/test_rest.py
index 7ed94cb..cb1ae9a 100644
--- a/tests/test_rest.py
+++ b/tests/test_rest.py
@@ -158,9 +158,9 @@ class RestTests(unittest.TestCase):
         nodedev = json.loads(self.request('/host/devices/scsi_host4').read())
         # Mockmodel generates random wwpn and wwnn
         self.assertEquals('scsi_host4', nodedev['name'])
-        self.assertEquals('fc_host', nodedev['adapter_type'])
-        self.assertEquals(16, len(nodedev['wwpn']))
-        self.assertEquals(16, len(nodedev['wwnn']))
+        self.assertEquals('fc_host', nodedev['adapter']['type'])
+        self.assertEquals(16, len(nodedev['adapter']['wwpn']))
+        self.assertEquals(16, len(nodedev['adapter']['wwnn']))
 
     def test_get_vms(self):
         vms = json.loads(self.request('/vms').read())
diff --git a/tests/test_storagepool.py b/tests/test_storagepool.py
index 22b4943..3e3ad83 100644
--- a/tests/test_storagepool.py
+++ b/tests/test_storagepool.py
@@ -145,9 +145,10 @@ class storagepoolTests(unittest.TestCase):
              'path': '/dev/disk/by-path',
              'source': {
                  'name': 'scsi_host3',
-                 'adapter_type': 'fc_host',
-                 'wwpn': '0123456789abcdef',
-                 'wwnn': 'abcdef0123456789'}},
+                 'adapter': {
+                     'type': 'fc_host',
+                     'wwpn': '0123456789abcdef',
+                     'wwnn': 'abcdef0123456789'}}},
              'xml':
              """
              <pool type='scsi'>
--
1.9.3
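
For readers who want to try the XML-to-dictionary idea behind `dictize` without lxml, here is a rough stdlib-only sketch. It is not the patch's implementation: `dictize_sketch` and `_to_dict` are hypothetical names, values stay plain strings rather than lxml's typed `pyval`, element text is stored under a `text` key, and repeated child tags are collected into a list. The sample XML is made up and only loosely follows the libvirt node device format. It assumes Python 3.

```python
import xml.etree.ElementTree as ET

# Made-up node device XML, loosely following the libvirt node device format.
SAMPLE_XML = """
<device>
  <name>pci_0000_03_00_0</name>
  <parent>pci_0000_00_1c_0</parent>
  <capability type='pci'>
    <domain>0</domain>
    <bus>3</bus>
    <product id='0x10d3'>82574L Gigabit Network Connection</product>
    <vendor id='0x8086'>Intel Corporation</vendor>
    <iommuGroup number='7'/>
  </capability>
</device>
"""


def _to_dict(elem):
    """Convert one element into a string, None, or a nested dict."""
    text = (elem.text or '').strip()
    children = list(elem)
    if not elem.attrib and not children:
        # Plain leaf such as <name>...</name>: return its text directly.
        return text or None
    d = dict(elem.attrib)
    if text:
        d['text'] = text
    for child in children:
        value = _to_dict(child)
        if child.tag not in d:
            d[child.tag] = value
        elif isinstance(d[child.tag], list):
            d[child.tag].append(value)
        else:
            # Second occurrence of the same key: switch to a list.
            d[child.tag] = [d[child.tag], value]
    return d


def dictize_sketch(xmlstr):
    root = ET.fromstring(xmlstr.strip())
    return {root.tag: _to_dict(root)}


info = dictize_sketch(SAMPLE_XML)['device']
```

With this shape, a lookup such as `info['capability']['iommuGroup']['number']` mirrors the way `_get_pci_dev_info` reads the IOMMU group once the capability sub-dictionary has been merged into the top level.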

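The dictionary format change in this patch moves `type`, `wwnn` and `wwpn` under a nested `adapter` key, and the pool XML template follows by using chained index lookups in `str.format` fields. The sketch that follows shows that mechanism in isolation. The template is a condensed, hypothetical variant of the one in `ScsiPoolDef.xml`, the pool name is invented, and the adapter values are the ones used in tests/test_storagepool.py. It assumes Python 3.

```python
# Condensed, hypothetical variant of the SCSI pool template.
POOL_XML = (
    "<pool type='scsi'>"
    "<name>{name}</name>"
    "<source>"
    "<adapter type='{source[adapter][type]}' name='{source[name]}'"
    " wwnn='{source[adapter][wwnn]}' wwpn='{source[adapter][wwpn]}'/>"
    "</source>"
    "<target><path>{path}</path></target>"
    "</pool>"
)

pool_args = {
    'name': 'example_pool',  # invented pool name
    'path': '/dev/disk/by-path',
    'source': {
        'name': 'host3',
        # New nested format introduced by this patch.
        'adapter': {
            'type': 'fc_host',
            'wwpn': '0123456789abcdef',
            'wwnn': 'abcdef0123456789'}},
}

# Each {source[adapter][...]} field is resolved by successive key lookups.
rendered = POOL_XML.format(**pool_args)
```

A caller that still passes the old flat `adapter_type` key would fail here with a `KeyError`, which is why the consumers of the device information are updated in the same patch.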
Reviewed-by: ShaoHe Feng <shaohef@linux.vnet.ibm.com>

On 06/09/2014 05:28 PM, Zhou Zheng Sheng wrote:
--
Thanks and best regards!

Sheldon Feng(冯少合)<shaohef@linux.vnet.ibm.com>
IBM Linux Technology Center

The URI /host/devices only presents scsi_host (particularly fc_host) device information. To implement host PCI passthrough, we should list all types of host devices. This patch adds support for parsing information about the various host devices and listing them on /host/devices, so the user is free to choose any listed PCI device to pass through to the guest. Since the patch changes the format of the device information dictionary, the existing code consuming the device information is changed accordingly.
To get all types of host devices, access the following URL.
  curl -k -u root -H "Content-Type: application/json" \
    -H "Accept: application/json" \
    https://127.0.0.1:8001/host/devices
To get only fc_host devices, change the URL to "https://127.0.0.1:8001/host/devices?_cap=fc_host"
To get only PCI devices, change the URL to "https://127.0.0.1:8001/host/devices?_cap=pci"
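The URL variants above differ only in their query string, so a client script could build them from one small helper. This is a minimal Python 3 sketch using only the standard library; the helper name `devices_url` is hypothetical, and only the base address and the `_cap` parameter are taken from the commit message.

```python
from urllib.parse import urlencode

BASE = "https://127.0.0.1:8001"  # address used in the curl example


def devices_url(cap=None):
    """Return the /host/devices URL, optionally filtered by capability."""
    url = BASE + "/host/devices"
    if cap is not None:
        # _cap is the filter parameter named in the commit message.
        url += "?" + urlencode({"_cap": cap})
    return url


print(devices_url())
print(devices_url("fc_host"))
print(devices_url("pci"))
```

The helper only formats the URL; authentication and the request itself are left to whatever HTTP client is in use.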
v1: Parse the node device XML using xpath.
v2: Write a "dictize" function and parse the node device XML using dictize.
v3: Fix a naming mistake.
v4: It was observed that sometimes a parent device is not listed by libvirt although its child device is. Previous versions caught this exception and ignored it. The root cause is unknown, and we failed to reproduce the problem, so v4 no longer catches it. It seems to be related to USB removable disks, and the problem went away after we upgraded the Linux kernel.
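To make the v4 note concrete: the tree builder looks each device's parent up by name, so a child whose parent libvirt did not report makes that lookup fail. The following Python 3 sketch is not the patch's code; `build_dev_tree` and the sample device names are hypothetical. It shows one alternative, which is to collect such orphans instead of raising.

```python
def build_dev_tree(dev_infos):
    """Attach each device dict to its parent's 'children' list.

    Returns (root, orphans). `orphans` holds the names of devices whose
    parent is named but missing from dev_infos.
    """
    devs = {info["name"]: info for info in dev_infos}
    root = None
    orphans = []
    for info in dev_infos:
        parent_name = info["parent"]
        if parent_name is None:
            root = info
            continue
        parent = devs.get(parent_name)
        if parent is None:
            # The case described in v4: child listed, parent not listed.
            orphans.append(info["name"])
            continue
        parent.setdefault("children", []).append(info)
    return root, orphans


sample = [
    {"name": "computer", "parent": None},
    {"name": "pci_0000_00_1d_0", "parent": "computer"},
    {"name": "usb_1_1", "parent": "usb_usb1"},  # parent not reported
]
root, orphans = build_dev_tree(sample)
print(root["name"], [c["name"] for c in root["children"]], orphans)
```

Letting the KeyError propagate, as v4 does, has the advantage that the inconsistency is noticed rather than silently hidden.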
Signed-off-by: Zhou Zheng Sheng <zhshzhou@linux.vnet.ibm.com>
---
 docs/API.md                            |  11 +-
 src/kimchi/hostdev.py                  | 209 +++++++++++++++++++++++++++++++++
 src/kimchi/mockmodel.py                |   7 +-
 src/kimchi/model/host.py               |  15 +--
 src/kimchi/model/libvirtstoragepool.py |  18 +--
 src/kimchi/xmlutils.py                 |  26 +++-
 tests/test_rest.py                     |   6 +-
 tests/test_storagepool.py              |   7 +-
 8 files changed, 262 insertions(+), 37 deletions(-)
 create mode 100644 src/kimchi/hostdev.py

diff --git a/docs/API.md b/docs/API.md
index ef5bf14..5026771 100644
--- a/docs/API.md
+++ b/docs/API.md
@@ -869,12 +869,11 @@ stats history
 * **GET**: Retrieve information of a single pci device.
   Currently only scsi_host devices are supported:
     * name: The name of the device.
-    * adapter_type: The capability type of the scsi_host device (fc_host).
-      Empty if pci device is not scsi_host.
-    * wwnn: The HBA Word Wide Node Name.
-      Empty if pci device is not scsi_host.
-    * wwpn: The HBA Word Wide Port Name
-      Empty if pci device is not scsi_host.
+    * path: Path of device in sysfs.
+    * adapter: Host adapter information. Empty if pci device is not scsi_host.
+        * type: The capability type of the scsi_host device (fc_host, vport_ops).
+        * wwnn: The HBA Word Wide Node Name. Empty if pci device is not fc_host.
+        * wwpn: The HBA Word Wide Port Name. Empty if pci device is not fc_host.

 ### Collection: Host Packages Update

diff --git a/src/kimchi/hostdev.py b/src/kimchi/hostdev.py
new file mode 100644
index 0000000..d4c142d
--- /dev/null
+++ b/src/kimchi/hostdev.py
@@ -0,0 +1,209 @@
+#
+# Kimchi
+#
+# Copyright IBM Corp, 2014
+#
+# This library is free software; you can redistribute it and/or
+# modify it under the terms of the GNU Lesser General Public
+# License as published by the Free Software Foundation; either
+# version 2.1 of the License, or (at your option) any later version.
+#
+# This library is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+# Lesser General Public License for more details.
+#
+# You should have received a copy of the GNU Lesser General Public
+# License along with this library; if not, write to the Free Software
+# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+
+from kimchi.model.libvirtconnection import LibvirtConnection
+from kimchi.utils import kimchi_log
+from kimchi.xmlutils import dictize
+
+
+def _get_all_host_dev_infos():
+    libvirt_conn = LibvirtConnection('qemu:///system').get()
+    node_devs = libvirt_conn.listAllDevices()
+    return [get_dev_info(node_dev) for node_dev in node_devs]
+
+
+def _get_dev_info_tree(dev_infos):
+    devs = dict([(dev_info['name'], dev_info) for dev_info in dev_infos])
+    root = None
+    for dev_info in dev_infos:
+        if dev_info['parent'] is None:
+            root = dev_info
+            continue
+        parent = devs[dev_info['parent']]
+
+        try:
+            children = parent['children']
+        except KeyError:
+            parent['children'] = [dev_info]
+        else:
+            children.append(dev_info)
+    return root
+
+
+def get_dev_info(node_dev):
+    ''' Parse the node device XML string into dict according to
+    http://libvirt.org/formatnode.html. '''
+
+    def shift_subdict(d, toshift):

(Inline reply to the patch Zhou Zheng Sheng sent on 06/09/2014 05:28 PM:)
It only has one usage, so we can remove the second parameter to make its purpose clearer. We can call it pop_capability or something similar.

+        subdict = d.pop(toshift)
+        d.update(subdict)
+        return d
+
+    xmlstr = node_dev.XMLDesc()
+    info = dictize(xmlstr)['device']
+    dev_type = info['capability'].pop('type')
+    info['device_type'] = dev_type
+    shift_subdict(info, 'capability')
+    info['parent'] = node_dev.parent()
+
+    get_dev_type_info = {
+        'net': _get_net_dev_info,
+        'pci': _get_pci_dev_info,
+        'scsi': _get_scsi_dev_info,
+        'scsi_generic': _get_scsi_generic_dev_info,
+        'scsi_host': _get_scsi_host_dev_info,
+        'scsi_target': _get_scsi_target_dev_info,
+        'storage': _get_storage_dev_info,
+        'system': _get_system_dev_info,
+        'usb': _get_usb_dev_info,
+        'usb_device': _get_usb_device_dev_info,
+    }
+    try:
+        get_detail_info = get_dev_type_info[dev_type]
+    except KeyError:
+        kimchi_log.error("Unknown device type: %s", dev_type)
+        return info
+
+    return get_detail_info(info)
+
+
+def _get_net_dev_info(info):
+    cap = info.pop('capability')
+    links = {"80203": "IEEE 802.3", "80211": "IEEE 802.11"}
+    link_raw = cap['type']
+    info['link_type'] = links.get(link_raw, link_raw)
+
+    return info
+
+
+def _get_pci_dev_info(info):
+    for k in ('vendor', 'product'):
+        info[k]['description'] = info[k].pop('pyval')
+    try:
+        info['iommuGroup'] = info['iommuGroup']['number']
+    except KeyError:
+        # No IOMMU group support.
+        pass
+    return info
+
+
+def _get_scsi_dev_info(info):
+    return info
+
+
+def _get_scsi_generic_dev_info(info):
+    # scsi_generic is not documented in libvirt official website. Try to
+    # parse scsi_generic according to the following libvirt path series.
+    # https://www.redhat.com/archives/libvir-list/2013-June/msg00014.html
+    return info
+
+
+def _get_scsi_host_dev_info(info):
+    try:
+        cap_info = info.pop('capability')
+    except KeyError:
+        # kimchi.model.libvirtstoragepool.ScsiPoolDef assumes
+        # info['adapter']['type'] always exists.
+        info['adapter'] = {'type': ''}
+        return info
+    info['adapter'] = cap_info
+    return info
+
+
+def _get_scsi_target_dev_info(info):
+    # scsi_target is not documented in libvirt official website. Try to
+    # parse scsi_target according to the libvirt commit db19834a0a.
+    return info
+
+
+def _get_storage_dev_info(info):
+    try:
+        cap_info = info.pop('capability')
+    except KeyError:
+        return info
+
+    if cap_info['type'] == 'removable':
+        cap_info['available'] = bool(cap_info.pop('media_available'))
+        if cap_info['available']:
+            cap_info.update({'size': cap_info.pop('media_size'),
+                             'label': cap_info.pop('media_label')})
+        info['media'] = cap_info
+    return info
+
+
+def _get_system_dev_info(info):
+    return info
+
+
+def _get_usb_dev_info(info):
+    return info
+
+
+def _get_usb_device_dev_info(info):
+    for k in ('vendor', 'product'):
+        try:
+            info[k]['description'] = info[k].pop('pyval')
+        except KeyError:
+            # Some USB devices don't provide vendor/product description.
+            pass
+    return info
+
+
+# For test and debug
+def _print_host_dev_tree():
+    dev_infos = _get_all_host_dev_infos()
+    root = _get_dev_info_tree(dev_infos)
+    if root is None:
+        print "No device found"
+        return
+    print '-----------------'
+    print '\n'.join(_format_dev_node(root))
+
+
+def _format_dev_node(node):
+    from pprint import pformat
+
+    try:
+        children = node['children']
+        del node['children']
+    except KeyError:
+        children = []
+
+    lines = []
+    lines.extend([' ~' + line for line in pformat(node).split('\n')])
+
+    count = len(children)
+    for i, child in enumerate(children):
+        if count == 1:
+            lines.append(' \-----------------')
+        else:
+            lines.append(' +-----------------')
+        clines = _format_dev_node(child)
+        if i == count - 1:
+            p = ' '
+        else:
+            p = ' |'
+        lines.extend([p + cline for cline in clines])
+    lines.append('')
+
+    return lines
+
+
+if __name__ == '__main__':
+    _print_host_dev_tree()
diff --git a/src/kimchi/mockmodel.py b/src/kimchi/mockmodel.py
index 05720f4..7420e7e 100644
--- a/src/kimchi/mockmodel.py
+++ b/src/kimchi/mockmodel.py
@@ -493,9 +493,10 @@ class MockModel(object):
     def device_lookup(self, nodedev_name):
         return {
             'name': nodedev_name,
-            'adapter_type': 'fc_host',
-            'wwnn': uuid.uuid4().hex[:16],
-            'wwpn': uuid.uuid4().hex[:16]}
+            'adapter': {
+                'type': 'fc_host',
+                'wwnn': uuid.uuid4().hex[:16],
+                'wwpn': uuid.uuid4().hex[:16]}}

     def isopool_lookup(self, name):
         return {'state': 'active',
diff --git a/src/kimchi/model/host.py b/src/kimchi/model/host.py
index 5844f4b..1ea97f0 100644
--- a/src/kimchi/model/host.py
+++ b/src/kimchi/model/host.py
@@ -28,6 +28,7 @@ import psutil
 from cherrypy.process.plugins import BackgroundTask

 from kimchi import disks
+from kimchi import hostdev
 from kimchi import netinfo
 from kimchi import xmlutils
 from kimchi.basemodel import Singleton
@@ -279,20 +280,10 @@ class DeviceModel(object):
     def lookup(self, nodedev_name):
         conn = self.conn.get()
         try:
-            dev_xml = conn.nodeDeviceLookupByName(nodedev_name).XMLDesc(0)
+            dev = conn.nodeDeviceLookupByName(nodedev_name)
         except:
             raise NotFoundError('KCHHOST0003E', {'name': nodedev_name})
-        cap_type = xmlutils.xpath_get_text(
-            dev_xml, '/device/capability/capability/@type')
-        wwnn = xmlutils.xpath_get_text(
-            dev_xml, '/device/capability/capability/wwnn')
-        wwpn = xmlutils.xpath_get_text(
-            dev_xml, '/device/capability/capability/wwpn')
-        return {
-            'name': nodedev_name,
-            'adapter_type': cap_type[0] if len(cap_type) >= 1 else '',
-            'wwnn': wwnn[0] if len(wwnn) == 1 else '',
-            'wwpn': wwpn[0] if len(wwpn) == 1 else ''}
+        return hostdev.get_dev_info(dev)


 class PackagesUpdateModel(object):
diff --git a/src/kimchi/model/libvirtstoragepool.py b/src/kimchi/model/libvirtstoragepool.py
index 47b239b..b15bf1a 100644
--- a/src/kimchi/model/libvirtstoragepool.py
+++ b/src/kimchi/model/libvirtstoragepool.py
@@ -180,34 +180,34 @@ class ScsiPoolDef(StoragePoolDef):
         self.poolArgs['source']['name'] = tmp_name.replace('scsi_', '')
         # fc_host adapters type are only available in libvirt >= 1.0.5
         if not self.poolArgs['fc_host_support']:
-            self.poolArgs['source']['adapter_type'] = 'scsi_host'
+            self.poolArgs['source']['adapter']['type'] = 'scsi_host'
             msg = "Libvirt version <= 1.0.5. Setting SCSI host name as '%s'; "\
                   "setting SCSI adapter type as 'scsi_host'; "\
                   "ignoring wwnn and wwpn." % tmp_name
             kimchi_log.info(msg)
         # Path for Fibre Channel scsi hosts
         self.poolArgs['path'] = '/dev/disk/by-path'
-        if not self.poolArgs['source']['adapter_type']:
-            self.poolArgs['source']['adapter_type'] = 'scsi_host'
+        if not self.poolArgs['source']['adapter']['type']:
+            self.poolArgs['source']['adapter']['type'] = 'scsi_host'

     @property
     def xml(self):
         # Required parameters
         # name:
-        # source[adapter_type]:
+        # source[adapter][type]:
         # source[name]:
-        # source[wwnn]:
-        # source[wwpn]:
+        # source[adapter][wwnn]:
+        # source[adapter][wwpn]:
         # path:

         xml = """
         <pool type='scsi'>
           <name>{name}</name>
           <source>
-            <adapter type='{source[adapter_type]}'\
+            <adapter type='{source[adapter][type]}'\
                      name='{source[name]}'\
-                     wwnn='{source[wwnn]}'\
-                     wwpn='{source[wwpn]}'/>
+                     wwnn='{source[adapter][wwnn]}'\
+                     wwpn='{source[adapter][wwpn]}'/>
           </source>
           <target>
             <path>{path}</path>
diff --git a/src/kimchi/xmlutils.py b/src/kimchi/xmlutils.py
index 76f0696..56517f2 100644
--- a/src/kimchi/xmlutils.py
+++ b/src/kimchi/xmlutils.py
@@ -18,6 +18,7 @@
 # Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA

 import libxml2
+from lxml import objectify

 from xml.etree import ElementTree
@@ -26,7 +27,7 @@ from xml.etree import ElementTree
 def xpath_get_text(xml, expr):
     doc = libxml2.parseDoc(xml)
     res = doc.xpathEval(expr)
-    ret = [None if x.children == None else x.children.content for x in res]
+    ret = [None if x.children is None else x.children.content for x in res]

     doc.freeDoc()
     return ret
@@ -37,3 +38,26 @@ def xml_item_update(xml, xpath, value):
     item = root.find(xpath)
     item.text = value
     return ElementTree.tostring(root, encoding="utf-8")
+
+
+def dictize(xmlstr):
+    root = objectify.fromstring(xmlstr)
+    return {root.tag: _dictize(root)}
+
+
+def _dictize(e):
+    d = {}
+    if e.text is not None:
+        if not e.attrib and e.countchildren() == 0:
+            return e.pyval
+        d['pyval'] = e.pyval
+    d.update(e.attrib)
+    for child in e.iterchildren():
+        if child.tag in d:
+            continue
+        if len(child) > 1:
+            d[child.tag] = [
+                _dictize(same_tag_child) for same_tag_child in child]
+        else:
+            d[child.tag] = _dictize(child)
+    return d
diff --git a/tests/test_rest.py b/tests/test_rest.py
index 7ed94cb..cb1ae9a 100644
--- a/tests/test_rest.py
+++ b/tests/test_rest.py
@@ -158,9 +158,9 @@ class RestTests(unittest.TestCase):
         nodedev = json.loads(self.request('/host/devices/scsi_host4').read())
         # Mockmodel generates random wwpn and wwnn
         self.assertEquals('scsi_host4', nodedev['name'])
-        self.assertEquals('fc_host', nodedev['adapter_type'])
-        self.assertEquals(16, len(nodedev['wwpn']))
-        self.assertEquals(16, len(nodedev['wwnn']))
+        self.assertEquals('fc_host', nodedev['adapter']['type'])
+        self.assertEquals(16, len(nodedev['adapter']['wwpn']))
+        self.assertEquals(16, len(nodedev['adapter']['wwnn']))

     def test_get_vms(self):
         vms = json.loads(self.request('/vms').read())
diff --git a/tests/test_storagepool.py b/tests/test_storagepool.py
index 22b4943..3e3ad83 100644
--- a/tests/test_storagepool.py
+++ b/tests/test_storagepool.py
@@ -145,9 +145,10 @@ class storagepoolTests(unittest.TestCase):
          'path': '/dev/disk/by-path',
          'source': {
              'name': 'scsi_host3',
-             'adapter_type': 'fc_host',
-             'wwpn': '0123456789abcdef',
-             'wwnn': 'abcdef0123456789'}},
+             'adapter': {
+                 'type': 'fc_host',
+                 'wwpn': '0123456789abcdef',
+                 'wwnn': 'abcdef0123456789'}}},
          'xml':
          """
          <pool type='scsi'>
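For readers who want to see the shape of the dictionary that the dictize approach in this patch produces, here is a rough Python 3 approximation. It is a sketch under assumptions, not the patch's implementation: it uses the standard library's ElementTree instead of lxml.objectify, so every value stays a string rather than a typed `pyval`, and it gathers repeated child tags into a list. The name `dictize_sketch` and the sample XML values are illustrative only.

```python
import xml.etree.ElementTree as ET

SAMPLE_XML = """
<device>
  <name>pci_0000_00_19_0</name>
  <parent>computer</parent>
  <capability type='pci'>
    <product id='0x1502'>82579LM Gigabit Network Connection</product>
    <iommuGroup number='5'/>
  </capability>
</device>
"""


def _to_dict(elem):
    text = (elem.text or "").strip()
    children = list(elem)
    # A plain leaf such as <name>x</name> collapses to its text.
    if text and not elem.attrib and not children:
        return text
    d = {}
    if text:
        d["pyval"] = text  # same key the patch uses for element text
    d.update(elem.attrib)
    for child in children:
        value = _to_dict(child)
        if child.tag in d:
            # Repeated tag: collect the values in a list.
            if not isinstance(d[child.tag], list):
                d[child.tag] = [d[child.tag]]
            d[child.tag].append(value)
        else:
            d[child.tag] = value
    return d


def dictize_sketch(xmlstr):
    root = ET.fromstring(xmlstr)
    return {root.tag: _to_dict(root)}


info = dictize_sketch(SAMPLE_XML)["device"]
print(info["capability"]["type"], info["capability"]["iommuGroup"])
```

With a dictionary of this shape, get_dev_info() in the patch can pop the `type` attribute of the top-level capability and merge the remaining capability keys into the device dict.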

on 2014/06/16 17:56, Mark Wu wrote:
> On 06/09/2014 05:28 PM, Zhou Zheng Sheng wrote:
>> [...]
>> +    def shift_subdict(d, toshift):
> It only has one usage, so we can remove the second parameter to make its purpose clearer. We can call it pop_capability or something similar.
OK. Thanks.
-- Zhou Zheng Sheng / 周征晟 E-mail: zhshzhou@linux.vnet.ibm.com Telephone: 86-10-82454397

This patch adds a '_passthrough=true' filter to /host/devices, so it can filter and show all devices eligible to pass through to a guest.

Theoretically, all PCI, USB and SCSI devices can be assigned to a guest directly. However, host devices usually form a tree: if we assign a PCI port/SCSI controller/USB controller to a guest, all devices/disks under the controller are assigned as well. In this patch we only present the "leaf" host devices to the user as potential passthrough devices. In other words, the possible devices are wireless network interfaces, SD card readers, cameras, SCSI units (disk or CD), and so on.

The Linux kernel is able to recognize the host IOMMU group layout. If two PCI devices are in the same IOMMU group, there are possible interconnections between the devices, and the devices can talk to each other bypassing the IOMMU. This implies that isolation is not perfect between those devices, so all devices in an IOMMU group must be assigned to the guest together. On a host that recognizes IOMMU groups, accessing the URI /host/devices/deviceX/passthrough_affected_devices returns a list containing the devices in the same IOMMU group as deviceX.

How to test:

List all types of devices to passthrough
  curl -k -u root -H "Content-Type: application/json" \
    -H "Accept: application/json" \
    'https://127.0.0.1:8001/host/devices?_passthrough=true'

List all eligible PCI devices to passthrough
  /host/devices?_passthrough=true&_cap=pci

List all USB devices to passthrough
  /host/devices?_passthrough=true&_cap=usb_device

List all SCSI devices to passthrough
  /host/devices?_passthrough=true&_cap=scsi

List devices in the same IOMMU group as pci_0000_00_19_0
  /host/devices/pci_0000_00_19_0/passthrough_affected_devices

v1: v1 series does not contain this patch.
v2: Deal with the calculation of "leaf" devices and "affected" devices.
v3, v4: No change.
v5: Change _passthrough=1 to _passthrough=true in the URI scheme. Filter PCI devices according to the PCI class.
v6: Don't pass through PCI devices of class code 07. On modern x86 machines, it's possible that "6 Series/C200 Series Chipset Family MEI Controller" and "6 Series/C200 Series Chipset Family KT Controller" are of this class code. These two devices are not suitable to pass through to a guest. We don't have a simple and reliable way to distinguish a normal serial controller from a host chipset XXX controller. This class of PCI devices also includes various serial, parallel, modem and communication controllers. Serial and parallel controllers can be redirected from ttyS0 to QEMU's pty using socat, and there is little performance benefit in assigning them directly to a guest. So it's OK not to pass through PCI devices of class code 07.

Signed-off-by: Zhou Zheng Sheng <zhshzhou@linux.vnet.ibm.com>
---
 src/kimchi/control/host.py |   9 ++++
 src/kimchi/hostdev.py      | 107 +++++++++++++++++++++++++++++++++++++++++++++
 src/kimchi/model/host.py   |  17 ++++++-
 3 files changed, 132 insertions(+), 1 deletion(-)

diff --git a/src/kimchi/control/host.py b/src/kimchi/control/host.py
index ebf1bed..15f2343 100644
--- a/src/kimchi/control/host.py
+++ b/src/kimchi/control/host.py
@@ -103,9 +103,18 @@ class Devices(Collection):
         self.resource = Device


+class PassthroughAffectedDevices(Collection):
+    def __init__(self, model, device_id):
+        super(PassthroughAffectedDevices, self).__init__(model)
+        self.resource = Device
+        self.model_args = (device_id, )
+
+
 class Device(Resource):
     def __init__(self, model, id):
         super(Device, self).__init__(model, id)
+        self.passthrough_affected_devices = \
+            PassthroughAffectedDevices(self.model, id)

     @property
     def data(self):
diff --git a/src/kimchi/hostdev.py b/src/kimchi/hostdev.py
index d4c142d..f70154f 100644
--- a/src/kimchi/hostdev.py
+++ b/src/kimchi/hostdev.py
@@ -17,6 +17,8 @@
 # License along with this library; if not, write to the Free Software
 # Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA

+import os
+
 from kimchi.model.libvirtconnection import LibvirtConnection
 from kimchi.utils import kimchi_log
 from kimchi.xmlutils import dictize
@@ -46,6 +48,108 @@ def _get_dev_info_tree(dev_infos):
     return root


+def _strip_parents(devs, dev):
+    parent = dev['parent']
+    while parent is not None:
+        try:
+            parent_dev = devs.pop(parent)
+        except KeyError:
+            break
+
+        if (parent_dev['device_type'],
+                dev['device_type']) == ('usb_device', 'scsi'):
+            # For USB device containing mass storage, passthrough the
+            # USB device itself, not the SCSI unit.
+            devs.pop(dev['name'])
+            break
+
+        parent = parent_dev['parent']
+
+
+def _is_pci_qualified(pci_dev):
+    # PCI class such as bridge and storage controller are not suitable to
+    # passthrough to VM, so we make a whitelist and only passthrough PCI
+    # class in the list.
+
+    whitelist_pci_classes = {
+        # Refer to Linux Kernel code include/linux/pci_ids.h
+        0x000000: {  # Old PCI devices
+            0x000100: None},  # Old VGA devices
+        0x020000: None,  # Network controller
+        0x030000: None,  # Display controller
+        0x040000: None,  # Multimedia device
+        0x090000: None,  # Input device
+        0x0d0000: None,  # Wireless controller
+        0x0f0000: None,  # Satellite communication controller
+        0x100000: None,  # Encryption controller
+        0x110000: None,  # Signal Processing controller
+    }
+
+    with open(os.path.join(pci_dev['path'], 'class')) as f:
+        pci_class = int(f.read(), 16)
+
+    try:
+        subclass = whitelist_pci_classes[pci_class & 0xff0000]
+    except KeyError:
+        return False
+
+    if subclass is None:
+        return True
+
+    if pci_class & 0xffff00 in subclass:
+        return True
+
+    return False
+
+
+def get_passthrough_dev_infos():
+    ''' Get devices eligible to be passed through to VM. '''
+
+    dev_infos = _get_all_host_dev_infos()
+    devs = dict([(dev_info['name'], dev_info) for dev_info in dev_infos])
+
+    for dev in dev_infos:
+        if dev['device_type'] in ('pci', 'usb_device', 'scsi'):
+            _strip_parents(devs, dev)
+
+    def is_eligible(dev):
+        if dev['device_type'] not in ('pci', 'usb_device', 'scsi'):
+            return False
+        if dev['device_type'] == 'pci':
+            return _is_pci_qualified(dev)
+        return True
+
+    return [dev for dev in devs.itervalues() if is_eligible(dev)]
+
+
+def get_affected_passthrough_devices(passthrough_dev):
+    devs = dict([(dev['name'], dev) for dev in _get_all_host_dev_infos()])
+
+    def get_iommu_group(dev_info):
+        try:
+            return int(dev_info['iommuGroup'])
+        except KeyError:
+            pass
+
+        parent = dev_info['parent']
+        while parent is not None:
+            try:
+                iommuGroup = int(devs[parent]['iommuGroup'])
+            except KeyError:
+                pass
+            else:
+                return iommuGroup
+            parent = devs[parent]['parent']
+
+        return -1
+
+    iommu_group = get_iommu_group(passthrough_dev)
+
+    return [dev for dev in get_passthrough_dev_infos()
+            if dev['name'] != passthrough_dev['name'] and
+            get_iommu_group(dev) == iommu_group]
+
+
 def get_dev_info(node_dev):
     ''' Parse the node device XML string into dict according to
     http://libvirt.org/formatnode.html. '''
@@ -206,4 +310,7 @@ def _format_dev_node(node):


 if __name__ == '__main__':
+    from pprint import pprint
     _print_host_dev_tree()
+    print 'Eligible passthrough devices:'
+    pprint(get_passthrough_dev_infos())
diff --git a/src/kimchi/model/host.py b/src/kimchi/model/host.py
index 1ea97f0..280aa53 100644
--- a/src/kimchi/model/host.py
+++ b/src/kimchi/model/host.py
@@ -247,7 +247,7 @@ class DevicesModel(object):
     def __init__(self, **kargs):
         self.conn = kargs['conn']

-    def get_list(self, _cap=None):
+    def get_list(self, _cap=None, _passthrough=None):
         conn = self.conn.get()
         if _cap is None:
             dev_names = [name.name() for name in conn.listAllDevices(0)]
@@ -256,6 +256,11 @@ class DevicesModel(object):
         else:
             # Get devices with required capability
             dev_names = conn.listDevices(_cap, 0)
+
+        if _passthrough is not None and _passthrough.lower() == 'true':
+            passthrough_names = [
+                dev['name'] for dev in hostdev.get_passthrough_dev_infos()]
+            dev_names = list(set(dev_names) & set(passthrough_names))
         return dev_names

     def _get_devices_fc_host(self):
@@ -273,6 +278,16 @@ class DevicesModel(object):
         return conn.listDevices('fc_host', 0)


+class PassthroughAffectedDevicesModel(object):
+    def __init__(self, **kargs):
+        self.conn = kargs['conn']
+
+    def get_list(self, device_id):
+        dev_info = DeviceModel(conn=self.conn).lookup(device_id)
+        affected = hostdev.get_affected_passthrough_devices(dev_info)
+        return [dev['name'] for dev in affected]
+
+
 class DeviceModel(object):
     def __init__(self, **kargs):
         self.conn = kargs['conn']
--
1.9.3
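For readers who want to see the "leaf" selection in isolation, here is a minimal sketch of the parent-stripping idea on a hand-made device table. The helper name leaf_devices and the toy device names are made up for illustration and are not part of the patch, and the USB mass storage special case from _strip_parents is deliberately left out.

```python
def leaf_devices(dev_infos):
    """Return the sorted names of devices that are not an ancestor of any
    other device in dev_infos (each entry needs 'name' and 'parent')."""
    devs = dict((d['name'], d) for d in dev_infos)
    for dev in dev_infos:
        parent = dev['parent']
        while parent is not None:
            # Drop the parent; if it is already gone, its own ancestors
            # were dropped by an earlier walk, so stop here.
            parent_dev = devs.pop(parent, None)
            if parent_dev is None:
                break
            parent = parent_dev['parent']
    return sorted(devs)


# Hypothetical host tree: a NIC behind a PCI bridge, a camera behind a
# USB controller.
TOY_DEVICES = [
    {'name': 'computer', 'parent': None},
    {'name': 'pci_0000_00_1c_0', 'parent': 'computer'},
    {'name': 'pci_0000_02_00_0', 'parent': 'pci_0000_00_1c_0'},
    {'name': 'pci_0000_00_1d_0', 'parent': 'computer'},
    {'name': 'usb_2_1', 'parent': 'pci_0000_00_1d_0'},
]

print(leaf_devices(TOY_DEVICES))
```

Only the NIC and the camera survive in this toy tree; the bridge, the USB controller and the root node are stripped because something hangs below them.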

On 06/09/2014 05:28 PM, Zhou Zheng Sheng wrote:
This patch adds a '_passthrough=1' filter to /host/devices, so it can

_passthrough=True :-)

filter and show all devices eligible to be passed through to a guest. Theoretically, all PCI, USB and SCSI devices can be assigned to a guest directly. However, host devices usually form a tree: if we assign a PCI port/SCSI controller/USB controller to a guest, all devices/disks under the controller are assigned as well. In this patch we only present the "leaf" host devices to the user as potential passthrough devices. In other words, the possible devices are wireless network interfaces, SD card readers, cameras, SCSI units (disk or CD), and so on.
diff --git a/src/kimchi/control/host.py b/src/kimchi/control/host.py
index ebf1bed..15f2343 100644
--- a/src/kimchi/control/host.py
+++ b/src/kimchi/control/host.py
@@ -103,9 +103,18 @@ class Devices(Collection):
         self.resource = Device


+class PassthroughAffectedDevices(Collection):
+    def __init__(self, model, device_id):
+        super(PassthroughAffectedDevices, self).__init__(model)
+        self.resource = Device
+        self.model_args = (device_id, )
+
+
 class Device(Resource):
     def __init__(self, model, id):
         super(Device, self).__init__(model, id)
+        self.passthrough_affected_devices = \
+            PassthroughAffectedDevices(self.model, id)

     @property
     def data(self):
diff --git a/src/kimchi/hostdev.py b/src/kimchi/hostdev.py
index d4c142d..f70154f 100644
--- a/src/kimchi/hostdev.py
+++ b/src/kimchi/hostdev.py
@@ -17,6 +17,8 @@
 # License along with this library; if not, write to the Free Software
 # Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA

+import os
+
 from kimchi.model.libvirtconnection import LibvirtConnection
 from kimchi.utils import kimchi_log
 from kimchi.xmlutils import dictize
@@ -46,6 +48,108 @@ def _get_dev_info_tree(dev_infos):
     return root


+def _strip_parents(devs, dev):
+    parent = dev['parent']
+    while parent is not None:
+        try:
+            parent_dev = devs.pop(parent)
+        except KeyError:
+            break
+
+        if (parent_dev['device_type'],
+                dev['device_type']) == ('usb_device', 'scsi'):
+            # For USB device containing mass storage, passthrough the
+            # USB device itself, not the SCSI unit.
+            devs.pop(dev['name'])
+            break
+
+        parent = parent_dev['parent']
+
+
+def _is_pci_qualified(pci_dev):
+    # PCI class such as bridge and storage controller are not suitable to
+    # passthrough to VM, so we make a whitelist and only passthrough PCI
+    # class in the list.
+
+    whitelist_pci_classes = {
+        # Refer to Linux Kernel code include/linux/pci_ids.h
+        0x000000: {  # Old PCI devices
+            0x000100: None},  # Old VGA devices

Looks strange?

+        0x020000: None,  # Network controller
+        0x030000: None,  # Display controller
+        0x040000: None,  # Multimedia device
+        0x090000: None,  # Input device
+        0x0d0000: None,  # Wireless controller
+        0x0f0000: None,  # Satellite communication controller
+        0x100000: None,  # Encryption controller
+        0x110000: None,  # Signal Processing controller
+    }
+
+    with open(os.path.join(pci_dev['path'], 'class')) as f:
+        pci_class = int(f.read(), 16)
+
+    try:
+        subclass = whitelist_pci_classes[pci_class & 0xff0000]
+    except KeyError:
+        return False
+
+    if subclass is None:
+        return True
+
+    if pci_class & 0xffff00 in subclass:
+        return True
+
+    return False
+
+
+def get_passthrough_dev_infos():
+    ''' Get devices eligible to be passed through to VM. '''
+
+    dev_infos = _get_all_host_dev_infos()
+    devs = dict([(dev_info['name'], dev_info) for dev_info in dev_infos])
+
+    for dev in dev_infos:
+        if dev['device_type'] in ('pci', 'usb_device', 'scsi'):
+            _strip_parents(devs, dev)
+
+    def is_eligible(dev):
+        if dev['device_type'] not in ('pci', 'usb_device', 'scsi'):
+            return False
+        if dev['device_type'] == 'pci':
+            return _is_pci_qualified(dev)
+        return True
+
+    return [dev for dev in devs.itervalues() if is_eligible(dev)]
+
+
+def get_affected_passthrough_devices(passthrough_dev):
+    devs = dict([(dev['name'], dev) for dev in _get_all_host_dev_infos()])
+
+    def get_iommu_group(dev_info):
+        try:
+            return int(dev_info['iommuGroup'])
+        except KeyError:
+            pass
+
+        parent = dev_info['parent']
+        while parent is not None:
+            try:
+                iommuGroup = int(devs[parent]['iommuGroup'])
+            except KeyError:
+                pass
+            else:
+                return iommuGroup
+            parent = devs[parent]['parent']
+
+        return -1
+
+    iommu_group = get_iommu_group(passthrough_dev)
+
+    return [dev for dev in get_passthrough_dev_infos()
+            if dev['name'] != passthrough_dev['name'] and
+            get_iommu_group(dev) == iommu_group]

Does this mean dev['name'] and passthrough_dev['name'] are in the same IOMMU group?

+
+
 def get_dev_info(node_dev):
     ''' Parse the node device XML string into dict according to
     http://libvirt.org/formatnode.html. '''
@@ -206,4 +310,7 @@ def _format_dev_node(node):


 if __name__ == '__main__':
+    from pprint import pprint
     _print_host_dev_tree()
+    print 'Eligible passthrough devices:'
+    pprint(get_passthrough_dev_infos())
diff --git a/src/kimchi/model/host.py b/src/kimchi/model/host.py
index 1ea97f0..280aa53 100644
--- a/src/kimchi/model/host.py
+++ b/src/kimchi/model/host.py
@@ -247,7 +247,7 @@ class DevicesModel(object):
     def __init__(self, **kargs):
         self.conn = kargs['conn']

-    def get_list(self, _cap=None):
+    def get_list(self, _cap=None, _passthrough=None):
         conn = self.conn.get()
         if _cap is None:
             dev_names = [name.name() for name in conn.listAllDevices(0)]
@@ -256,6 +256,11 @@ class DevicesModel(object):
         else:
             # Get devices with required capability
             dev_names = conn.listDevices(_cap, 0)
+
+        if _passthrough is not None and _passthrough.lower() == 'true':
+            passthrough_names = [
+                dev['name'] for dev in hostdev.get_passthrough_dev_infos()]
+            dev_names = list(set(dev_names) & set(passthrough_names))
         return dev_names

     def _get_devices_fc_host(self):
@@ -273,6 +278,16 @@ class DevicesModel(object):
         return conn.listDevices('fc_host', 0)


+class PassthroughAffectedDevicesModel(object):
+    def __init__(self, **kargs):
+        self.conn = kargs['conn']
+
+    def get_list(self, device_id):
+        dev_info = DeviceModel(conn=self.conn).lookup(device_id)
+        affected = hostdev.get_affected_passthrough_devices(dev_info)
+        return [dev['name'] for dev in affected]
+
+
 class DeviceModel(object):
     def __init__(self, **kargs):
         self.conn = kargs['conn']
-- Thanks and best regards! Sheldon Feng(冯少合)<shaohef@linux.vnet.ibm.com> IBM Linux Technology Center
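On the nested entry for class 0x00 that looked strange: a worked example may help. The sketch below repeats the lookup from _is_pci_qualified on plain integers instead of reading the sysfs 'class' file. The function name is_class_whitelisted, the trimmed table and the sample class codes are assumptions chosen for illustration only.

```python
# Trimmed copy of the whitelist idea: a value of None means "any subclass",
# a dict restricts the base class to the listed subclasses.
WHITELIST = {
    0x000000: {0x000100: None},  # old devices: only the VGA-compatible ones
    0x020000: None,              # network controller
    0x030000: None,              # display controller
}


def is_class_whitelisted(pci_class):
    """pci_class is the 24-bit value (base class, subclass, prog-if)."""
    try:
        subclasses = WHITELIST[pci_class & 0xff0000]
    except KeyError:
        return False
    if subclasses is None:
        return True
    # Parenthesized for readability; '&' already binds tighter than 'in'.
    return (pci_class & 0xffff00) in subclasses


print(is_class_whitelisted(0x020000))  # Ethernet controller
print(is_class_whitelisted(0x000100))  # old VGA-compatible device
print(is_class_whitelisted(0x000000))  # old non-VGA device
print(is_class_whitelisted(0x070002))  # class 07, 16550 serial port
print(is_class_whitelisted(0x060400))  # PCI-to-PCI bridge
```

With this table only the first two calls return True: base class 0x00 is accepted solely for subclass 0x01, while class 07 and bridges are rejected because their base class is not listed at all.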

on 2014/06/09 23:14, Sheldon wrote:
On 06/09/2014 05:28 PM, Zhou Zheng Sheng wrote:
This patch adds a '_passthrough=1' filter to /host/devices, so it can

_passthrough=True :-)

Oops, thank you very much.
filter and show all devices eligible to be passed through to a guest. Theoretically, all PCI, USB and SCSI devices can be assigned to a guest directly. However, host devices usually form a tree: if we assign a PCI port/SCSI controller/USB controller to a guest, all devices/disks under the controller are assigned as well. In this patch we only present the "leaf" host devices to the user as potential passthrough devices. In other words, the possible devices are wireless network interfaces, SD card readers, cameras, SCSI units (disk or CD), and so on.
-- Zhou Zheng Sheng / 周征晟 E-mail: zhshzhou@linux.vnet.ibm.com Telephone: 86-10-82454397
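On the True/true spelling: the check in DevicesModel.get_list lowercases the query value, so either spelling enables the filter, while the old '1' does not. A minimal sketch of that comparison follows; passthrough_requested and filter_names are hypothetical helper names (the real logic is inline in get_list), and the result is sorted here only to make the output deterministic.

```python
def passthrough_requested(value):
    """Mirror the '_passthrough' test: case-insensitive match on 'true'."""
    return value is not None and value.lower() == 'true'


def filter_names(dev_names, passthrough_names, flag):
    """Keep only eligible device names when the flag asks for it."""
    if passthrough_requested(flag):
        return sorted(set(dev_names) & set(passthrough_names))
    return list(dev_names)


all_names = ['computer', 'pci_0000_00_19_0', 'usb_2_1']
eligible = ['pci_0000_00_19_0', 'usb_2_1', 'scsi_0_0_0_0']

print(filter_names(all_names, eligible, 'True'))
print(filter_names(all_names, eligible, '1'))
print(filter_names(all_names, eligible, None))
```

The first call returns only the two names present in both lists; the other two calls return the unfiltered list.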

on 2014/06/09 23:14, Sheldon wrote:
On 06/09/2014 05:28 PM, Zhou Zheng Sheng wrote:
+def _is_pci_qualified(pci_dev):
+    # PCI class such as bridge and storage controller are not suitable to
+    # passthrough to VM, so we make a whitelist and only passthrough PCI
+    # class in the list.
+
+    whitelist_pci_classes = {
+        # Refer to Linux Kernel code include/linux/pci_ids.h
+        0x000000: {  # Old PCI devices
+            0x000100: None},  # Old VGA devices

Looks strange?

So what do you think the problem is here?

+        0x020000: None,  # Network controller
+        0x030000: None,  # Display controller
+        0x040000: None,  # Multimedia device
+        0x090000: None,  # Input device
+        0x0d0000: None,  # Wireless controller
+        0x0f0000: None,  # Satellite communication controller
+        0x100000: None,  # Encryption controller
+        0x110000: None,  # Signal Processing controller
+    }
+
+    with open(os.path.join(pci_dev['path'], 'class')) as f:
+        pci_class = int(f.read(), 16)
+
+    try:
+        subclass = whitelist_pci_classes[pci_class & 0xff0000]
+    except KeyError:
+        return False
+
+    if subclass is None:
+        return True
+
+    if pci_class & 0xffff00 in subclass:
+        return True
+
+    return False
+
+
+def get_passthrough_dev_infos():
+    ''' Get devices eligible to be passed through to VM. '''
+
+    dev_infos = _get_all_host_dev_infos()
+    devs = dict([(dev_info['name'], dev_info) for dev_info in dev_infos])
+
+    for dev in dev_infos:
+        if dev['device_type'] in ('pci', 'usb_device', 'scsi'):
+            _strip_parents(devs, dev)
+
+    def is_eligible(dev):
+        if dev['device_type'] not in ('pci', 'usb_device', 'scsi'):
+            return False
+        if dev['device_type'] == 'pci':
+            return _is_pci_qualified(dev)
+        return True
+
+    return [dev for dev in devs.itervalues() if is_eligible(dev)]
+
+
+def get_affected_passthrough_devices(passthrough_dev):
+    devs = dict([(dev['name'], dev) for dev in _get_all_host_dev_infos()])
+
+    def get_iommu_group(dev_info):
+        try:
+            return int(dev_info['iommuGroup'])
+        except KeyError:
+            pass
+
+        parent = dev_info['parent']
+        while parent is not None:
+            try:
+                iommuGroup = int(devs[parent]['iommuGroup'])
+            except KeyError:
+                pass
+            else:
+                return iommuGroup
+            parent = devs[parent]['parent']
+
+        return -1
+
+    iommu_group = get_iommu_group(passthrough_dev)
+
+    return [dev for dev in get_passthrough_dev_infos()
+            if dev['name'] != passthrough_dev['name'] and
+            get_iommu_group(dev) == iommu_group]

Does this mean dev['name'] and passthrough_dev['name'] are in the same IOMMU group?

Yes.
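To make that answer concrete, here is a small sketch of the group lookup on toy data. It assumes, as the patch does, that a device without its own 'iommuGroup' entry inherits the group of its nearest ancestor that has one. The helper names, the device names and the group numbers are invented for the example.

```python
def get_iommu_group(devs, dev_info):
    """Return the device's IOMMU group, taken from the device itself or
    its nearest ancestor that has one, or -1 when none is found."""
    while dev_info is not None:
        if 'iommuGroup' in dev_info:
            return int(dev_info['iommuGroup'])
        dev_info = devs.get(dev_info['parent'])
    return -1


def affected_devices(devs, candidates, target_name):
    """Names in candidates sharing target_name's group, target excluded."""
    group = get_iommu_group(devs, devs[target_name])
    return sorted(name for name in candidates
                  if name != target_name and
                  get_iommu_group(devs, devs[name]) == group)


# Hypothetical host: two PCI functions share group 5, a USB device sits
# below one of them, and a third PCI device is alone in group 7.
DEVS = dict((d['name'], d) for d in [
    {'name': 'computer', 'parent': None},
    {'name': 'pci_0000_00_19_0', 'parent': 'computer', 'iommuGroup': 5},
    {'name': 'pci_0000_00_1a_0', 'parent': 'computer', 'iommuGroup': 5},
    {'name': 'usb_1_1', 'parent': 'pci_0000_00_1a_0'},
    {'name': 'pci_0000_03_00_0', 'parent': 'computer', 'iommuGroup': 7},
])
LEAVES = ['pci_0000_00_19_0', 'usb_1_1', 'pci_0000_03_00_0']

print(affected_devices(DEVS, LEAVES, 'pci_0000_00_19_0'))
```

In this toy layout the USB device is reported as affected by pci_0000_00_19_0, because it inherits group 5 from its parent controller, while the device in group 7 is not.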

Reviewed-by: ShaoHe Feng <shaohef@linux.vnet.ibm.com>

Looks good to me after discussing it with Zhengsheng.

On 06/09/2014 05:28 PM, Zhou Zheng Sheng wrote:
This patch adds a '_passthrough=1' filter to /host/devices, so it can filter and show all devices eligible to be passed through to a guest. Theoretically, all PCI, USB and SCSI devices can be assigned to a guest directly. However, host devices usually form a tree: if we assign a PCI port/SCSI controller/USB controller to a guest, all devices/disks under the controller are assigned as well. In this patch we only present the "leaf" host devices to the user as potential passthrough devices. In other words, the possible devices are wireless network interfaces, SD card readers, cameras, SCSI units (disk or CD), and so on.
--
Thanks and best regards!

Sheldon Feng(冯少合)<shaohef@linux.vnet.ibm.com>
IBM Linux Technology Center

On 06/09/2014 05:28 PM, Zhou Zheng Sheng wrote:
This patch adds a '_passthrough=true' filter to /host/devices, so it can filter and show all devices eligible to pass through to a guest. Theoretically, all PCI, USB and SCSI devices can be assigned to a guest directly. However, host devices usually form a tree: if we assign a PCI port/SCSI controller/USB controller to a guest, all devices/disks under the controller are assigned as well. In this patch we only present the "leaf" host devices to the user as potential passthrough devices. In other words, the possible devices are wireless network interfaces, SD card readers, cameras, SCSI units (disk or CD), and so on.
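To illustrate the "leaf" rule in the paragraph above, here is a minimal, self-contained sketch. It is not the code from this patch: the device names in the example tree are made up, the dict layout (a 'parent' key per device) is an assumption, and it simply drops every device that is the parent of some other device. The patch's own _strip_parents() additionally restricts itself to PCI/USB/SCSI devices and special-cases USB mass storage.

```python
def leaf_devices(devs):
    """Return the sorted names of devices that are nobody's parent.

    `devs` maps a device name to a dict with a 'parent' key (None for
    the root). This layout is assumed for illustration only.
    """
    parents = {info['parent'] for info in devs.values()
               if info['parent'] is not None}
    return sorted(name for name in devs if name not in parents)


# Hypothetical host device tree: one USB controller with two devices.
example_tree = {
    'computer': {'parent': None},
    'pci_0000_00_1d_0': {'parent': 'computer'},    # USB controller
    'usb_1_1_5': {'parent': 'pci_0000_00_1d_0'},   # e.g. a camera
    'usb_1_1_6': {'parent': 'pci_0000_00_1d_0'},   # e.g. a card reader
}

print(leaf_devices(example_tree))
```

In this toy tree only the two USB devices are offered; the controller and the root node are filtered out because other devices hang below them.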
The Linux kernel is able to recognize the host IOMMU group layout. If two PCI devices are in the same IOMMU group, there are possible interconnections between the devices, and the devices can talk to each other bypassing the IOMMU. This implies isolation is not perfect between those devices, so all devices in an IOMMU group must be assigned to the guest together. On a host that recognizes IOMMU groups, accessing the URI /host/devices/deviceX/passthrough_affected_devices returns a list containing the devices in the same IOMMU group as deviceX.
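As a rough illustration of where the group information comes from, the sketch below reads the IOMMU group of a PCI device straight from sysfs. This is an assumption-laden example rather than what the patch does: the patch takes the group from the libvirt node device info ('iommuGroup'), while this helper (pci_iommu_group, a hypothetical name) follows the 'iommu_group' symlink that Linux exposes under /sys/bus/pci/devices/<address>/ when the IOMMU is enabled. The sysfs_root parameter exists only so the function can be tried against a fake tree.

```python
import os


def pci_iommu_group(pci_addr, sysfs_root='/sys'):
    """Return the IOMMU group number of a PCI device, or None if unknown.

    pci_addr is a full PCI address such as '0000:00:19.0'.
    """
    link = os.path.join(sysfs_root, 'bus', 'pci', 'devices', pci_addr,
                        'iommu_group')
    try:
        # The symlink points at .../kernel/iommu_groups/<group number>
        target = os.readlink(link)
    except OSError:
        # No such device, or the kernel exposes no IOMMU group for it
        return None
    return int(os.path.basename(target))
```

Two devices for which this returns the same number would have to be assigned and dismissed together.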
How to test:
List all types of devices to passthrough
    curl -k -u root -H "Content-Type: application/json" \
      -H "Accept: application/json" \
      'https://127.0.0.1:8001/host/devices?_passthrough=true'

List all eligible PCI devices to passthrough
    /host/devices?_passthrough=true&_cap=pci

List all USB devices to passthrough
    /host/devices?_passthrough=true&_cap=usb_device

List all SCSI devices to passthrough
    /host/devices?_passthrough=true&_cap=scsi

List devices in the same IOMMU group as pci_0000_00_19_0
    /host/devices/pci_0000_00_19_0/passthrough_affected_devices
v1: v1 series does not contain this patch.
v2: Deal with calculating "leaf" devices and "affected" devices.
v3 v4: No change.
v5: Change _passthrough=1 to _passthrough=true in the URI scheme. Filter PCI devices according to the PCI class.
v6: Don't pass through PCI devices of class code 07. On modern x86 machines, it's possible that "6 Series/C200 Series Chipset Family MEI Controller" and "6 Series/C200 Series Chipset Family KT Controller" are of this class code. These two devices are not suitable to pass through to a guest. We don't have a simple and reliable way to distinguish a normal serial controller from a host chipset XXX controller. This class of PCI devices also includes various serial, parallel, modem and communication controllers. Serial and parallel controllers can be redirected from ttyS0 to QEMU's pty using socat, and there is little performance benefit in assigning them directly to the guest. So it's OK not to pass through PCI devices of class code 07.
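For readers unfamiliar with PCI class codes, the following sketch shows how the "class code 07" test in the v6 note can be expressed. It assumes the class is given as the hexadecimal text found in a device's sysfs 'class' file (for example '0x070002'); the helper names are hypothetical and not part of the patch, which instead keeps a whitelist that simply omits base class 0x07.

```python
def pci_base_class(class_text):
    """Return the base class byte from a PCI class string like '0x070002'.

    The 24-bit class code is laid out as base class, sub-class and
    programming interface, one byte each.
    """
    return (int(class_text.strip(), 16) >> 16) & 0xff


def is_communication_controller(class_text):
    # Base class 0x07 covers serial, parallel, modem and other
    # communication controllers.
    return pci_base_class(class_text) == 0x07


print(is_communication_controller('0x070002\n'))
print(is_communication_controller('0x020000\n'))
```

The first call reports a communication controller (base class 0x07), the second a network controller (base class 0x02), which stays eligible.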
Signed-off-by: Zhou Zheng Sheng <zhshzhou@linux.vnet.ibm.com>
---
 src/kimchi/control/host.py |   9 ++++
 src/kimchi/hostdev.py      | 107 +++++++++++++++++++++++++++++++++++++++++++++
 src/kimchi/model/host.py   |  17 ++++++-
 3 files changed, 132 insertions(+), 1 deletion(-)

diff --git a/src/kimchi/control/host.py b/src/kimchi/control/host.py
index ebf1bed..15f2343 100644
--- a/src/kimchi/control/host.py
+++ b/src/kimchi/control/host.py
@@ -103,9 +103,18 @@ class Devices(Collection):
         self.resource = Device


+class PassthroughAffectedDevices(Collection):
+    def __init__(self, model, device_id):
+        super(PassthroughAffectedDevices, self).__init__(model)
+        self.resource = Device
+        self.model_args = (device_id, )
+
+
 class Device(Resource):
     def __init__(self, model, id):
         super(Device, self).__init__(model, id)
+        self.passthrough_affected_devices = \
+            PassthroughAffectedDevices(self.model, id)

     @property
     def data(self):
diff --git a/src/kimchi/hostdev.py b/src/kimchi/hostdev.py
index d4c142d..f70154f 100644
--- a/src/kimchi/hostdev.py
+++ b/src/kimchi/hostdev.py
@@ -17,6 +17,8 @@
 # License along with this library; if not, write to the Free Software
 # Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA

+import os
+
 from kimchi.model.libvirtconnection import LibvirtConnection
 from kimchi.utils import kimchi_log
 from kimchi.xmlutils import dictize
@@ -46,6 +48,108 @@ def _get_dev_info_tree(dev_infos):
     return root


+def _strip_parents(devs, dev):
+    parent = dev['parent']
+    while parent is not None:
+        try:
+            parent_dev = devs.pop(parent)
+        except KeyError:
+            break
+
+        if (parent_dev['device_type'],
+                dev['device_type']) == ('usb_device', 'scsi'):
+            # For USB device containing mass storage, passthrough the
+            # USB device itself, not the SCSI unit.
+            devs.pop(dev['name'])
+            break
+
+        parent = parent_dev['parent']
+
+
+def _is_pci_qualified(pci_dev):
+    # PCI class such as bridge and storage controller are not suitable to
+    # passthrough to VM, so we make a whitelist and only passthrough PCI
+    # class in the list.
+
+    whitelist_pci_classes = {
+        # Refer to Linux Kernel code include/linux/pci_ids.h
+        0x000000: {  # Old PCI devices
+            0x000100: None},  # Old VGA devices
+        0x020000: None,  # Network controller
+        0x030000: None,  # Display controller
+        0x040000: None,  # Multimedia device
+        0x090000: None,  # Inupt device
+        0x0d0000: None,  # Wireless controller
+        0x0f0000: None,  # Satellite communication controller
+        0x100000: None,  # Cryption controller
+        0x110000: None,  # Signal Processing controller
+        }
+
+    with open(os.path.join(pci_dev['path'], 'class')) as f:
+        pci_class = int(f.read(), 16)

Better to use readline and strip the '\n', even though it will not break.

+
+    try:
+        subclass = whitelist_pci_classes[pci_class & 0xff0000]

I would like to suggest you separate the old PCI devices from the list. You can define two lists:

    whitelist_pci_classes = (0x020000, 0x030000, 0x040000, ...)
    whitelist_old_pci_classes = (0x000100,)

    if pci_class & 0xff0000 != 0:
        use whitelist_pci_classes
    else:
        use whitelist_old_pci_classes

+    except KeyError:
+        return False
+
+    if subclass is None:
+        return True
+
+    if pci_class & 0xffff00 in subclass:
+        return True
+
+    return False
+
+
+def get_passthrough_dev_infos():
+    ''' Get devices eligible to be passed through to VM. '''
+
+    dev_infos = _get_all_host_dev_infos()
+    devs = dict([(dev_info['name'], dev_info) for dev_info in dev_infos])
+
+    for dev in dev_infos:
+        if dev['device_type'] in ('pci', 'usb_device', 'scsi') and dev in devs:

"and dev in devs"? Is this to skip the devices already removed from devs?

+            _strip_parents(devs, dev)
+
+    def is_eligible(dev):
+        if dev['device_type'] not in ('pci', 'usb_device', 'scsi'):
+            return False
+        if dev['device_type'] == 'pci':
+            return _is_pci_qualified(dev)
+        return True

    return dev['device_type'] in ('usb_device', 'scsi') or \
        (dev['device_type'] == 'pci' and _is_pci_qualified(dev))
?

+
+    return [dev for dev in devs.itervalues() if is_eligible(dev)]
+
+
+def get_affected_passthrough_devices(passthrough_dev):
+    devs = dict([(dev['name'], dev) for dev in _get_all_host_dev_infos()])
+
+    def get_iommu_group(dev_info):
+        try:
+            return int(dev_info['iommuGroup'])
+        except KeyError:
+            pass
+
+        parent = dev_info['parent']
+        while parent is not None:
+            try:
+                iommuGroup = int(devs[parent]['iommuGroup'])
+            except KeyError:
+                pass
+            else:
+                return iommuGroup
+            parent = devs[parent]['parent']
+
+        return -1
+
+    iommu_group = get_iommu_group(passthrough_dev)
+
+    return [dev for dev in get_passthrough_dev_infos()
+            if dev['name'] != passthrough_dev['name'] and
+            get_iommu_group(dev) == iommu_group]
+
+
 def get_dev_info(node_dev):
     ''' Parse the node device XML string into dict according to
     http://libvirt.org/formatnode.html. '''

@@ -206,4 +310,7 @@ def _format_dev_node(node):

 if __name__ == '__main__':
+    from pprint import pprint
     _print_host_dev_tree()
+    print 'Eligible passthrough devices:'
+    pprint(get_passthrough_dev_infos())
diff --git a/src/kimchi/model/host.py b/src/kimchi/model/host.py
index 1ea97f0..280aa53 100644
--- a/src/kimchi/model/host.py
+++ b/src/kimchi/model/host.py
@@ -247,7 +247,7 @@ class DevicesModel(object):
     def __init__(self, **kargs):
         self.conn = kargs['conn']

-    def get_list(self, _cap=None):
+    def get_list(self, _cap=None, _passthrough=None):
         conn = self.conn.get()
         if _cap is None:
             dev_names = [name.name() for name in conn.listAllDevices(0)]
@@ -256,6 +256,11 @@ class DevicesModel(object):
         else:
             # Get devices with required capability
             dev_names = conn.listDevices(_cap, 0)
+
+        if _passthrough is not None and _passthrough.lower() == 'true':
+            passthrough_names = [
+                dev['name'] for dev in hostdev.get_passthrough_dev_infos()]
+            dev_names = list(set(dev_names) & set(passthrough_names))

Isn't passthrough_names a subset of dev_names?

         return dev_names

     def _get_devices_fc_host(self):
@@ -273,6 +278,16 @@ class DevicesModel(object):
         return conn.listDevices('fc_host', 0)


+class PassthroughAffectedDevicesModel(object):
+    def __init__(self, **kargs):
+        self.conn = kargs['conn']
+
+    def get_list(self, device_id):
+        dev_info = DeviceModel(conn=self.conn).lookup(device_id)
+        affected = hostdev.get_affected_passthrough_devices(dev_info)
+        return [dev['name'] for dev in affected]
+

Not sure whether the client side can tell what a device is from its name alone.

+
 class DeviceModel(object):
     def __init__(self, **kargs):
         self.conn = kargs['conn']

Besides the minor issues in the comments, it looks good to me.

on 2014/06/16 16:36, Mark Wu wrote:
On 06/09/2014 05:28 PM, Zhou Zheng Sheng wrote:
This patch adds a '_passthrough=true' filter to /host/devices, so it can filter and show all devices eligible to pass through to a guest. Theoretically, all PCI, USB and SCSI devices can be assigned to a guest directly. However, host devices usually form a tree: if we assign a PCI port/SCSI controller/USB controller to a guest, all devices/disks under the controller are assigned as well. In this patch we only present the "leaf" host devices to the user as potential passthrough devices. In other words, the possible devices are wireless network interfaces, SD card readers, cameras, SCSI units (disk or CD), and so on.
The Linux kernel is able to recognize the host IOMMU group layout. If two PCI devices are in the same IOMMU group, there are possible interconnections between the devices, and the devices can talk to each other bypassing the IOMMU. This implies isolation is not perfect between those devices, so all devices in an IOMMU group must be assigned to the guest together. On a host that recognizes IOMMU groups, accessing the URI /host/devices/deviceX/passthrough_affected_devices returns a list containing the devices in the same IOMMU group as deviceX.
How to test:
List all types of devices to passthrough
    curl -k -u root -H "Content-Type: application/json" \
      -H "Accept: application/json" \
      'https://127.0.0.1:8001/host/devices?_passthrough=true'

List all eligible PCI devices to passthrough
    /host/devices?_passthrough=true&_cap=pci

List all USB devices to passthrough
    /host/devices?_passthrough=true&_cap=usb_device

List all SCSI devices to passthrough
    /host/devices?_passthrough=true&_cap=scsi

List devices in the same IOMMU group as pci_0000_00_19_0
    /host/devices/pci_0000_00_19_0/passthrough_affected_devices
v1: v1 series does not contain this patch.
v2: Deal with calculating "leaf" devices and "affected" devices.
v3 v4: No change.
v5: Change _passthrough=1 to _passthrough=true in the URI scheme. Filter PCI devices according to the PCI class.
v6: Don't pass through PCI devices of class code 07. On modern x86 machines, it's possible that "6 Series/C200 Series Chipset Family MEI Controller" and "6 Series/C200 Series Chipset Family KT Controller" are of this class code. These two devices are not suitable to pass through to a guest. We don't have a simple and reliable way to distinguish a normal serial controller from a host chipset XXX controller. This class of PCI devices also includes various serial, parallel, modem and communication controllers. Serial and parallel controllers can be redirected from ttyS0 to QEMU's pty using socat, and there is little performance benefit in assigning them directly to the guest. So it's OK not to pass through PCI devices of class code 07.
Signed-off-by: Zhou Zheng Sheng <zhshzhou@linux.vnet.ibm.com>
---
 src/kimchi/control/host.py |   9 ++++
 src/kimchi/hostdev.py      | 107 +++++++++++++++++++++++++++++++++++++++++++++
 src/kimchi/model/host.py   |  17 ++++++-
 3 files changed, 132 insertions(+), 1 deletion(-)

diff --git a/src/kimchi/control/host.py b/src/kimchi/control/host.py
index ebf1bed..15f2343 100644
--- a/src/kimchi/control/host.py
+++ b/src/kimchi/control/host.py
@@ -103,9 +103,18 @@ class Devices(Collection):
         self.resource = Device


+class PassthroughAffectedDevices(Collection):
+    def __init__(self, model, device_id):
+        super(PassthroughAffectedDevices, self).__init__(model)
+        self.resource = Device
+        self.model_args = (device_id, )
+
+
 class Device(Resource):
     def __init__(self, model, id):
         super(Device, self).__init__(model, id)
+        self.passthrough_affected_devices = \
+            PassthroughAffectedDevices(self.model, id)

     @property
     def data(self):
diff --git a/src/kimchi/hostdev.py b/src/kimchi/hostdev.py
index d4c142d..f70154f 100644
--- a/src/kimchi/hostdev.py
+++ b/src/kimchi/hostdev.py
@@ -17,6 +17,8 @@
 # License along with this library; if not, write to the Free Software
 # Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA

+import os
+
 from kimchi.model.libvirtconnection import LibvirtConnection
 from kimchi.utils import kimchi_log
 from kimchi.xmlutils import dictize
@@ -46,6 +48,108 @@ def _get_dev_info_tree(dev_infos):
     return root


+def _strip_parents(devs, dev):
+    parent = dev['parent']
+    while parent is not None:
+        try:
+            parent_dev = devs.pop(parent)
+        except KeyError:
+            break
+
+        if (parent_dev['device_type'],
+                dev['device_type']) == ('usb_device', 'scsi'):
+            # For USB device containing mass storage, passthrough the
+            # USB device itself, not the SCSI unit.
+            devs.pop(dev['name'])
+            break
+
+        parent = parent_dev['parent']
+
+
+def _is_pci_qualified(pci_dev):
+    # PCI class such as bridge and storage controller are not suitable to
+    # passthrough to VM, so we make a whitelist and only passthrough PCI
+    # class in the list.
+
+    whitelist_pci_classes = {
+        # Refer to Linux Kernel code include/linux/pci_ids.h
+        0x000000: {  # Old PCI devices
+            0x000100: None},  # Old VGA devices
+        0x020000: None,  # Network controller
+        0x030000: None,  # Display controller
+        0x040000: None,  # Multimedia device
+        0x090000: None,  # Inupt device
+        0x0d0000: None,  # Wireless controller
+        0x0f0000: None,  # Satellite communication controller
+        0x100000: None,  # Cryption controller
+        0x110000: None,  # Signal Processing controller
+        }
+
+    with open(os.path.join(pci_dev['path'], 'class')) as f:
+        pci_class = int(f.read(), 16)

Better to use readline and strip the '\n', even though it will not break.

Good suggestion.

+
+    try:
+        subclass = whitelist_pci_classes[pci_class & 0xff0000]

I would like to suggest you separate the old PCI devices from the list. You can define two lists:

    whitelist_pci_classes = (0x020000, 0x030000, 0x040000, ...)
    whitelist_old_pci_classes = (0x000100,)

    if pci_class & 0xff0000 != 0:
        use whitelist_pci_classes
    else:
        use whitelist_old_pci_classes

This works because currently only one sub-class of the old PCI devices is eligible. For the other classes, as far as I have investigated, the whole class can be passed through. However, I think the situation may change once the feature is used and tested more. We may discover that some sub-classes should be filtered out in the future. The current data structure and filtering algorithm are more robust to such modifications.
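To make the trade-off in the reply above concrete, here is a small stand-alone sketch of the nested-dict lookup. It mirrors the logic quoted from _is_pci_qualified() but is not the patch itself: the table is shortened, the function takes the class code as an integer instead of reading sysfs, and the "restricted" table at the end is a purely hypothetical future change used to show why a per-class sub-dict is convenient.

```python
# None means "the whole base class is eligible"; a nested dict restricts
# the base class to the listed sub-classes.
WHITELIST = {
    0x000000: {0x000100: None},  # old devices: only old VGA devices
    0x020000: None,              # network controller
    0x030000: None,              # display controller
}


def is_class_whitelisted(pci_class, whitelist=WHITELIST):
    try:
        subclasses = whitelist[pci_class & 0xff0000]
    except KeyError:
        return False
    if subclasses is None:
        return True
    return (pci_class & 0xffff00) in subclasses


# Hypothetical later restriction: accept only sub-class 0x00 of the
# network class, without touching the lookup code.
restricted = dict(WHITELIST)
restricted[0x020000] = {0x020000: None}

print(is_class_whitelisted(0x028000))
print(is_class_whitelisted(0x028000, restricted))
```

With the original table the 0x028000 device is accepted; with the restricted table it is rejected, and only the data changed.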
+    except KeyError:
+        return False
+
+    if subclass is None:
+        return True
+
+    if pci_class & 0xffff00 in subclass:
+        return True
+
+    return False
+
+
+def get_passthrough_dev_infos():
+    ''' Get devices eligible to be passed through to VM. '''
+
+    dev_infos = _get_all_host_dev_infos()
+    devs = dict([(dev_info['name'], dev_info) for dev_info in dev_infos])
+
+    for dev in dev_infos:
+        if dev['device_type'] in ('pci', 'usb_device', 'scsi') and dev in devs:

"and dev in devs"? Is this to skip the devices already removed from devs?

This is a bit weird. I checked my local workspace, and there is no "and dev in devs" in this patch. Let me send a new version. Thanks for catching this.

+            _strip_parents(devs, dev)
+
+    def is_eligible(dev):
+        if dev['device_type'] not in ('pci', 'usb_device', 'scsi'):
+            return False
+        if dev['device_type'] == 'pci':
+            return _is_pci_qualified(dev)
+        return True

    return dev['device_type'] in ('usb_device', 'scsi') or \
        (dev['device_type'] == 'pci' and _is_pci_qualified(dev))
?

Agree.

+
+    return [dev for dev in devs.itervalues() if is_eligible(dev)]
+
+
+def get_affected_passthrough_devices(passthrough_dev):
+    devs = dict([(dev['name'], dev) for dev in _get_all_host_dev_infos()])
+
+    def get_iommu_group(dev_info):
+        try:
+            return int(dev_info['iommuGroup'])
+        except KeyError:
+            pass
+
+        parent = dev_info['parent']
+        while parent is not None:
+            try:
+                iommuGroup = int(devs[parent]['iommuGroup'])
+            except KeyError:
+                pass
+            else:
+                return iommuGroup
+            parent = devs[parent]['parent']
+
+        return -1
+
+    iommu_group = get_iommu_group(passthrough_dev)
+
+    return [dev for dev in get_passthrough_dev_infos()
+            if dev['name'] != passthrough_dev['name'] and
+            get_iommu_group(dev) == iommu_group]
+
+
 def get_dev_info(node_dev):
     ''' Parse the node device XML string into dict according to
     http://libvirt.org/formatnode.html. '''

@@ -206,4 +310,7 @@ def _format_dev_node(node):

 if __name__ == '__main__':
+    from pprint import pprint
     _print_host_dev_tree()
+    print 'Eligible passthrough devices:'
+    pprint(get_passthrough_dev_infos())
diff --git a/src/kimchi/model/host.py b/src/kimchi/model/host.py
index 1ea97f0..280aa53 100644
--- a/src/kimchi/model/host.py
+++ b/src/kimchi/model/host.py
@@ -247,7 +247,7 @@ class DevicesModel(object):
     def __init__(self, **kargs):
         self.conn = kargs['conn']

-    def get_list(self, _cap=None):
+    def get_list(self, _cap=None, _passthrough=None):
         conn = self.conn.get()
         if _cap is None:
             dev_names = [name.name() for name in conn.listAllDevices(0)]
@@ -256,6 +256,11 @@ class DevicesModel(object):
         else:
             # Get devices with required capability
             dev_names = conn.listDevices(_cap, 0)
+
+        if _passthrough is not None and _passthrough.lower() == 'true':
+            passthrough_names = [
+                dev['name'] for dev in hostdev.get_passthrough_dev_infos()]
+            dev_names = list(set(dev_names) & set(passthrough_names))

Isn't passthrough_names a subset of dev_names?

No. When both _cap and _passthrough are given, dev_names is filtered by capability first, which leaves only a subset. So we have to intersect it with the passthrough names.
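The point of the answer above can be shown with plain lists. This is only a sketch of the filtering order, with a hypothetical helper name and made-up inputs (pci_0000_00_1f_2 is invented for the example); it uses sorted() instead of list() so the result is deterministic.

```python
def filter_device_names(cap_names, passthrough_names, passthrough=None):
    """Apply the _passthrough filter on top of an already _cap-filtered list."""
    names = list(cap_names)
    if passthrough is not None and passthrough.lower() == 'true':
        # Neither list is a subset of the other, hence the intersection.
        names = sorted(set(names) & set(passthrough_names))
    return names


pci_only = ['pci_0000_00_19_0', 'pci_0000_00_1f_2']            # after _cap=pci
eligible = ['pci_0000_00_19_0', 'usb_1_1_6', 'scsi_1_0_0_0']   # passthrough set

print(filter_device_names(pci_only, eligible, 'true'))
```

Here the capability filter has already dropped the USB and SCSI names, while the passthrough list lacks the non-eligible PCI device, so only pci_0000_00_19_0 survives.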
         return dev_names

     def _get_devices_fc_host(self):
@@ -273,6 +278,16 @@ class DevicesModel(object):
         return conn.listDevices('fc_host', 0)


+class PassthroughAffectedDevicesModel(object):
+    def __init__(self, **kargs):
+        self.conn = kargs['conn']
+
+    def get_list(self, device_id):
+        dev_info = DeviceModel(conn=self.conn).lookup(device_id)
+        affected = hostdev.get_affected_passthrough_devices(dev_info)
+        return [dev['name'] for dev in affected]
+

Not sure whether the client side can tell what a device is from its name alone.

I think we already provide a host/devices/name URI; the front-end can fetch the details from that URI if necessary.

+
 class DeviceModel(object):
     def __init__(self, **kargs):
         self.conn = kargs['conn']

Besides the minor issues in the comments, it looks good to me.
--
Zhou Zheng Sheng / 周征晟
E-mail: zhshzhou@linux.vnet.ibm.com
Telephone: 86-10-82454397

This patch enables Kimchi's VMs to use host devices directly, which greatly
improves the performance of the assigned devices. The user can assign PCI
devices, USB devices and SCSI LUNs directly to a VM, as long as the host
supports one of Intel VT-d, AMD IOMMU or POWER sPAPR technology and runs a
recent release of the Linux kernel.

This patch adds a sub-collection "hostdevs" to the URI vms/vm-name/. The
front-end can GET
    vms/vm-name/hostdevs
and
    vms/vm-name/hostdevs/dev-name
or POST (assign)
    vms/vm-name/hostdevs
and DELETE (dismiss)
    vms/vm-name/hostdevs/dev-name

The eligible devices to assign are the devices listed by the URI
    host/devices?_passthrough=true

When assigning a host PCI device to a VM, all the eligible PCI devices in the
same IOMMU group are also automatically assigned, and vice versa when
dismissing a host PCI device from the VM.

Some examples:

Assign a USB device:
    curl -k -u root -H "Content-Type: application/json" \
      -H "Accept: application/json" \
      -X POST -d '{"name": "usb_1_1_6"}' \
      'https://127.0.0.1:8001/vms/rhel65/hostdevs'

Assign a PCI device:
    -d '{"name": "pci_0000_0d_00_0"}'

Assign a SCSI LUN:
    -d '{"name": "scsi_1_0_0_0"}'

List assigned devices:
    curl -k -u root -H "Content-Type: application/json" \
      -H "Accept: application/json" \
      'https://127.0.0.1:8001/vms/rhel65/hostdevs'

The above command should print the following.
    [
      { "type":"scsi", "name":"scsi_1_0_0_0" },
      { "type":"usb", "name":"usb_1_1_6" },
      { "type":"pci", "name":"pci_0000_0d_00_0" },
      { "type":"pci", "name":"pci_0000_03_00_0" }
    ]
Notice that the device pci_0000_03_00_0 is also assigned automatically.

The assigned devices are hot-plugged to the VM and also written to the domain
XML. When possible, VFIO is enabled for PCI device assignment.

v1: Handle the devices in the VM template.
v2: Handle the devices in the VM sub-resource "hostdevs".
v3: No change.
v4: Not all domain XMLs contain a hostdev node. Deal with this case.
v5: Change _passthrough='1' to _passthrough='true'. When attaching and
    detaching a device, do not use the VIR_DOMAIN_AFFECT_CURRENT flag;
    instead, use kimchi.model.utils.get_vm_config_flag() to correctly set
    the device flag.

Signed-off-by: Zhou Zheng Sheng <zhshzhou@linux.vnet.ibm.com>
---
 src/kimchi/control/vm/hostdevs.py |  44 ++++++
 src/kimchi/featuretests.py        |  10 +-
 src/kimchi/i18n.py                |   7 +
 src/kimchi/model/config.py        |   2 +
 src/kimchi/model/vmhostdevs.py    | 303 ++++++++++++++++++++++++++++++++++++++
 src/kimchi/rollbackcontext.py     |   3 +
 6 files changed, 368 insertions(+), 1 deletion(-)
 create mode 100644 src/kimchi/control/vm/hostdevs.py
 create mode 100644 src/kimchi/model/vmhostdevs.py

diff --git a/src/kimchi/control/vm/hostdevs.py b/src/kimchi/control/vm/hostdevs.py
new file mode 100644
index 0000000..81fe8ec
--- /dev/null
+++ b/src/kimchi/control/vm/hostdevs.py
@@ -0,0 +1,44 @@
+#
+# Project Kimchi
+#
+# Copyright IBM, Corp. 2014
+#
+# This library is free software; you can redistribute it and/or
+# modify it under the terms of the GNU Lesser General Public
+# License as published by the Free Software Foundation; either
+# version 2.1 of the License, or (at your option) any later version.
+#
+# This library is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+# Lesser General Public License for more details.
+#
+# You should have received a copy of the GNU Lesser General Public
+# License along with this library; if not, write to the Free Software
+# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+
+from kimchi.control.base import Collection, Resource
+from kimchi.control.utils import UrlSubNode
+
+
+@UrlSubNode("hostdevs")
+class VMHostDevs(Collection):
+    def __init__(self, model, vmid):
+        super(VMHostDevs, self).__init__(model)
+        self.resource = VMHostDev
+        self.vmid = vmid
+        self.resource_args = [self.vmid, ]
+        self.model_args = [self.vmid, ]
+
+
+class VMHostDev(Resource):
+    def __init__(self, model, vmid, ident):
+        super(VMHostDev, self).__init__(model, ident)
+        self.vmid = vmid
+        self.ident = ident
+        self.info = {}
+        self.model_args = [self.vmid, self.ident]
+
+    @property
+    def data(self):
+        return self.info
diff --git a/src/kimchi/featuretests.py b/src/kimchi/featuretests.py
index 5192361..74222bf 100644
--- a/src/kimchi/featuretests.py
+++ b/src/kimchi/featuretests.py
@@ -29,7 +29,7 @@ from lxml.builder import E


 from kimchi.rollbackcontext import RollbackContext
-from kimchi.utils import kimchi_log
+from kimchi.utils import kimchi_log, run_command


 ISO_STREAM_XML = """
@@ -206,3 +206,11 @@ class FeatureTests(object):
             return True
         except libvirt.libvirtError:
             return False
+
+    @staticmethod
+    def kernel_support_vfio():
+        out, err, rc = run_command(['modprobe', 'vfio-pci'])
+        if rc != 0:
+            kimchi_log.warning("Unable to load Kernal module vfio-pci.")
+            return False
+        return True
diff --git a/src/kimchi/i18n.py b/src/kimchi/i18n.py
index 452ede2..4757001 100644
--- a/src/kimchi/i18n.py
+++ b/src/kimchi/i18n.py
@@ -90,6 +90,13 @@ messages = {
     "KCHVM0029E": _("Unable to shutdown virtual machine %(name)s. Details: %(err)s"),
     "KCHVM0030E": _("Unable to get access metadata of virtual machine %(name)s. Details: %(err)s"),

+    "KCHVMHDEV0001E": _("VM %(vmid)s does not contain directly assigned host device %(dev_name)s."),
+    "KCHVMHDEV0002E": _("The host device %(dev_name)s is not allowed to directly assign to VM."),
+    "KCHVMHDEV0003E": _("No IOMMU groups found. Host PCI pass through needs IOMMU group to function correctly. "
+                        "Please enable Intel VT-d or AMD IOMMU in your BIOS, then verify the Kernel is compiled with IOMMU support. "
+                        "For Intel CPU, add intel_iommu=on to your Kernel parameter in /boot/grub2/grub.conf. "
+                        "For AMD CPU, add iommu=pt iommu=1."),
+
     "KCHVMIF0001E": _("Interface %(iface)s does not exist in virtual machine %(name)s"),
     "KCHVMIF0002E": _("Network %(network)s specified for virtual machine %(name)s does not exist"),
     "KCHVMIF0003E": _("Do not support guest interface hot plug attachment"),
diff --git a/src/kimchi/model/config.py b/src/kimchi/model/config.py
index 0ef0855..95c8e7e 100644
--- a/src/kimchi/model/config.py
+++ b/src/kimchi/model/config.py
@@ -54,6 +54,7 @@ class CapabilitiesModel(object):
         self.libvirt_stream_protocols = []
         self.fc_host_support = False
         self.metadata_support = False
+        self.kernel_vfio = False

         # Subscribe function to set host capabilities to be run when cherrypy
         # server is up
@@ -67,6 +68,7 @@ class CapabilitiesModel(object):
         self.nfs_target_probe = FeatureTests.libvirt_support_nfs_probe()
         self.fc_host_support = FeatureTests.libvirt_support_fc_host()
         self.metadata_support = FeatureTests.has_metadata_support()
+        self.kernel_vfio = FeatureTests.kernel_support_vfio()

         self.libvirt_stream_protocols = []
         for p in ['http', 'https', 'ftp', 'ftps', 'tftp']:
diff --git a/src/kimchi/model/vmhostdevs.py b/src/kimchi/model/vmhostdevs.py
new file mode 100644
index 0000000..9e59513
--- /dev/null
+++ b/src/kimchi/model/vmhostdevs.py
@@ -0,0 +1,303 @@
+#
+# Project Kimchi
+#
+# Copyright IBM, Corp. 2014
+#
+# This library is free software; you can redistribute it and/or
+# modify it under the terms of the GNU Lesser General Public
+# License as published by the Free Software Foundation; either
+# version 2.1 of the License, or (at your option) any later version.
+#
+# This library is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+# Lesser General Public License for more details.
+#
+# You should have received a copy of the GNU Lesser General Public
+# License along with this library; if not, write to the Free Software
+# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+
+import glob
+
+import libvirt
+from lxml import etree, objectify
+
+from kimchi.exception import InvalidOperation, InvalidParameter, NotFoundError
+from kimchi.model.config import CapabilitiesModel
+from kimchi.model.host import DeviceModel, DevicesModel
+from kimchi.model.host import PassthroughAffectedDevicesModel
+from kimchi.model.utils import get_vm_config_flag
+from kimchi.model.vms import DOM_STATE_MAP, VMModel
+from kimchi.rollbackcontext import RollbackContext
+from kimchi.utils import kimchi_log, run_command
+
+
+class VMHostDevsModel(object):
+    def __init__(self, **kargs):
+        self.conn = kargs['conn']
+
+    def get_list(self, vmid):
+        dom = VMModel.get_vm(vmid, self.conn)
+        xmlstr = dom.XMLDesc(0)
+        root = objectify.fromstring(xmlstr)
+        try:
+            hostdev = root.devices.hostdev
+        except AttributeError:
+            return []
+
+        return [self._deduce_dev_name(e) for e in hostdev]
+
+    @staticmethod
+    def _toint(num_str):
+        if num_str.startswith('0x'):
+            return int(num_str, 16)
+        elif num_str.startswith('0'):
+            return int(num_str, 8)
+        else:
+            return int(num_str)
+
+    def _deduce_dev_name(self, e):
+        dev_types = {
+            'pci': self._deduce_dev_name_pci,
+            'scsi': self._deduce_dev_name_scsi,
+            'usb': self._deduce_dev_name_usb,
+            }
+        return dev_types[e.attrib['type']](e)
+
+    def _deduce_dev_name_pci(self, e):
+        attrib = {}
+        for field in ('domain', 'bus', 'slot', 'function'):
+            attrib[field] = self._toint(e.source.address.attrib[field])
+        return 'pci_%(domain)04x_%(bus)02x_%(slot)02x_%(function)x' % attrib
+
+    def _deduce_dev_name_scsi(self, e):
+        attrib = {}
+        for field in ('bus', 'target', 'unit'):
+            attrib[field] = self._toint(e.source.address.attrib[field])
+        attrib['host'] = self._toint(
+            e.source.adapter.attrib['name'][len('scsi_host'):])
+        return 'scsi_%(host)d_%(bus)d_%(target)d_%(unit)d' % attrib
+
+    def _deduce_dev_name_usb(self, e):
+        dev_names = DevicesModel(conn=self.conn).get_list(_cap='usb_device')
+        usb_infos = [DeviceModel(conn=self.conn).lookup(dev_name)
+                     for dev_name in dev_names]
+
+        unknown_dev = None
+
+        try:
+            evendor = self._toint(e.source.vendor.attrib['id'])
+            eproduct = self._toint(e.source.product.attrib['id'])
+        except AttributeError:
+            evendor = 0
+            eproduct = 0
+        else:
+            unknown_dev = 'usb_vendor_%s_product_%s' % (evendor, eproduct)
+
+        try:
+            ebus = self._toint(e.source.address.attrib['bus'])
+            edevice = self._toint(e.source.address.attrib['device'])
+        except AttributeError:
+            ebus = -1
+            edevice = -1
+        else:
+            unknown_dev = 'usb_bus_%s_device_%s' % (ebus, edevice)
+
+        for usb_info in usb_infos:
+            ivendor = self._toint(usb_info['vendor']['id'])
+            iproduct = self._toint(usb_info['product']['id'])
+            if evendor == ivendor and eproduct == iproduct:
+                return usb_info['name']
+            ibus = usb_info['bus']
+            idevice = usb_info['device']
+            if ebus == ibus and edevice == idevice:
+                return usb_info['name']
+        return unknown_dev
+
+    def _passthrough_device_validate(self, dev_name):
+        eligible_dev_names = \
+            DevicesModel(conn=self.conn).get_list(_passthrough='true')
+        if dev_name not in eligible_dev_names:
+            raise InvalidParameter('KCHVMHDEV0002E', {'dev_name': dev_name})
+
+    def create(self, vmid, params):
+        dev_name = params['name']
+        self._passthrough_device_validate(dev_name)
+        dev_info = DeviceModel(conn=self.conn).lookup(dev_name)
+        attach_device = {
+            'pci': self._attach_pci_device,
+            'scsi': self._attach_scsi_device,
+            'usb_device': self._attach_usb_device,
+            }[dev_info['device_type']]
+        return attach_device(vmid, dev_info)
+
+    def _get_pci_device_xml(self, dev_info):
+        if 'detach_driver' not in dev_info:
+            dev_info['detach_driver'] = 'kvm'
+
+        xmlstr = '''
+        <hostdev mode='subsystem' type='pci' managed='yes'>
+          <source>
+            <address domain='%(domain)s' bus='%(bus)s' slot='%(slot)s'
+             function='%(function)s'/>
+          </source>
+          <driver name='%(detach_driver)s'/>
+        </hostdev>''' % dev_info
+        return xmlstr
+
+    @staticmethod
+    def _validate_pci_passthrough_env():
+        if not glob.glob('/sys/kernel/iommu_groups/*'):
+            raise InvalidOperation("KCHVMHDEV0003E")
+
+        # Enable virt_use_sysfs on RHEL6 and older distributions
+        # In recent Fedora, there is no virt_use_sysfs.
+        out, err, rc = run_command(['getsebool', 'virt_use_sysfs'])
+        if rc == 0 and out.rstrip('\n') != "virt_use_sysfs --> on":
+            out, err, rc = run_command(['setsebool', '-P',
+                                        'virt_use_sysfs=on'])
+            if rc != 0:
+                kimchi_log.warning("Unable to turn on sebool virt_use_sysfs")
+
+    def _attach_pci_device(self, vmid, dev_info):
+        self._validate_pci_passthrough_env()
+
+        dom = VMModel.get_vm(vmid, self.conn)
+        # Due to libvirt limitation, we don't support live assigne device to
+        # vfio driver.
+        driver = ('vfio' if DOM_STATE_MAP[dom.info()[0]] == "shutoff" and
+                  CapabilitiesModel().kernel_vfio else 'kvm')
+
+        # Attach all PCI devices in the same IOMMU group
+        dev_model = DeviceModel(conn=self.conn)
+        affected_devs_model = PassthroughAffectedDevicesModel(conn=self.conn)
+        dev_infos = [dev_model.lookup(dev_name) for dev_name in
+                     affected_devs_model.get_list(dev_info['name'])]
+        pci_infos = [dev_info] + [info for info in dev_infos
+                                  if info['device_type'] == 'pci']
+
+        device_flags = get_vm_config_flag(dom, mode='all')
+
+        with RollbackContext() as rollback:
+            for pci_info in pci_infos:
+                pci_info['detach_driver'] = driver
+                xmlstr = self._get_pci_device_xml(pci_info)
+                try:
+                    dom.attachDeviceFlags(xmlstr, device_flags)
+                except libvirt.libvirtError:
+                    kimchi_log.error(
+                        'Failed to attach host device %s to VM %s: \n%s',
+                        pci_info['name'], vmid, xmlstr)
+                    raise
+                rollback.prependDefer(dom.detachDeviceFlags,
+                                      xmlstr, device_flags)
+            rollback.commitAll()
+
+        return dev_info['name']
+
+    def _get_scsi_device_xml(self, dev_info):
+        xmlstr = '''
+        <hostdev mode='subsystem' type='scsi' sgio='unfiltered'>
+          <source>
+            <adapter name='scsi_host%(host)s'/>
+            <address type='scsi' bus='%(bus)s' target='%(target)s'
+             unit='%(lun)s'/>
+          </source>
+        </hostdev>''' % dev_info
+        return xmlstr
+
+    def _attach_scsi_device(self, vmid, dev_info):
+        xmlstr = self._get_scsi_device_xml(dev_info)
+        dom = VMModel.get_vm(vmid, self.conn)
+        dom.attachDeviceFlags(xmlstr, get_vm_config_flag(dom, mode='all'))
+        return dev_info['name']
+
+    def _get_usb_device_xml(self, dev_info):
+        xmlstr = '''
+        <hostdev mode='subsystem' type='usb' managed='yes'>
+          <source startupPolicy='optional'>
+            <vendor id='%s'/>
+            <product id='%s'/>
+            <address bus='%s' device='%s'/>
+          </source>
+        </hostdev>''' % (dev_info['vendor']['id'], dev_info['product']['id'],
+                         dev_info['bus'], dev_info['device'])
+        return xmlstr
+
+    def _attach_usb_device(self, vmid, dev_info):
+        xmlstr = self._get_usb_device_xml(dev_info)
+ dom = VMModel.get_vm(vmid, self.conn) + dom.attachDeviceFlags(xmlstr, get_vm_config_flag(dom, mode='all')) + return dev_info['name'] + + +class VMHostDevModel(object): + def __init__(self, **kargs): + self.conn = kargs['conn'] + + def lookup(self, vmid, dev_name): + dom = VMModel.get_vm(vmid, self.conn) + xmlstr = dom.XMLDesc(0) + root = objectify.fromstring(xmlstr) + try: + hostdev = root.devices.hostdev + except AttributeError: + raise NotFoundError('KCHVMHDEV0001E', + {'vmid': vmid, 'dev_name': dev_name}) + + devsmodel = VMHostDevsModel(conn=self.conn) + + for e in hostdev: + deduced_name = devsmodel._deduce_dev_name(e) + if deduced_name == dev_name: + return {'name': dev_name, 'type': e.attrib['type']} + + raise NotFoundError('KCHVMHDEV0001E', + {'vmid': vmid, 'dev_name': dev_name}) + + def delete(self, vmid, dev_name): + dom = VMModel.get_vm(vmid, self.conn) + xmlstr = dom.XMLDesc(0) + root = objectify.fromstring(xmlstr) + pci_devs = [] + + try: + hostdev = root.devices.hostdev + except AttributeError: + raise NotFoundError('KCHVMHDEV0001E', + {'vmid': vmid, 'dev_name': dev_name}) + + devsmodel = VMHostDevsModel(conn=self.conn) + + for e in hostdev: + deduced_name = devsmodel._deduce_dev_name(e) + if e.attrib['type'] == 'pci': + pci_devs.append((deduced_name, e)) + if deduced_name == dev_name: + dev_e = e + xmlstr = etree.tostring(e) + dom.detachDeviceFlags( + xmlstr, get_vm_config_flag(dom, mode='all')) + break + else: + raise NotFoundError('KCHVMHDEV0001E', + {'vmid': vmid, 'dev_name': dev_name}) + + if dev_e.attrib['type'] == 'pci': + self._delete_affected_pci_devices(dom, dev_name, pci_devs) + + def _delete_affected_pci_devices(self, dom, dev_name, pci_devs): + dev_model = DeviceModel(conn=self.conn) + try: + dev_model.lookup(dev_name) + except NotFoundError: + return + + affected_names = set( + PassthroughAffectedDevicesModel(conn=self.conn).get_list(dev_name)) + + for pci_name, e in pci_devs: + if pci_name in affected_names: + xmlstr = 
etree.tostring(e) + dom.detachDeviceFlags( + xmlstr, get_vm_config_flag(dom, mode='all')) diff --git a/src/kimchi/rollbackcontext.py b/src/kimchi/rollbackcontext.py index 70fcfeb..ba28999 100644 --- a/src/kimchi/rollbackcontext.py +++ b/src/kimchi/rollbackcontext.py @@ -64,3 +64,6 @@ class RollbackContext(object): def prependDefer(self, func, *args, **kwargs): self._finally.insert(0, (func, args, kwargs)) + + def commitAll(self): + self._finally = [] -- 1.9.3

On 06/09/2014 05:28 PM, Zhou Zheng Sheng wrote:
This patch enables Kimchi VMs to use host devices directly, which greatly improves the performance of those devices. The user can assign PCI devices, USB devices and SCSI LUNs directly to a VM, as long as the host supports one of Intel VT-d, AMD IOMMU or POWER sPAPR technology and runs a recent release of the Linux kernel.
This patch adds a sub-collection "hostdevs" to the URI vms/vm-name/. The front-end can GET vms/vm-name/hostdevs and vms/vm-name/hostdevs/dev-name, POST (assign) to vms/vm-name/hostdevs, and DELETE (dismiss) vms/vm-name/hostdevs/dev-name.
The devices eligible for assignment are those listed by the URI host/devices?_passthrough=true. When assigning a host PCI device to a VM, all the eligible PCI devices in the same IOMMU group are also assigned automatically, and likewise they are all dismissed together when a host PCI device is dismissed from the VM.
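The IOMMU group rule above can be illustrated with a small sketch. It reads the kernel's sysfs layout (/sys/kernel/iommu_groups/<group>/devices/<pci-address>) directly; this is an assumption made for illustration only, since the patch obtains the affected devices through PassthroughAffectedDevicesModel, whose implementation is not shown here. The helper names iommu_groups and group_peers are hypothetical.

```python
import os


def iommu_groups(root='/sys/kernel/iommu_groups'):
    """Map each IOMMU group id to the sorted PCI addresses it contains."""
    groups = {}
    if not os.path.isdir(root):
        return groups  # the kernel exposes no IOMMU groups
    for group in os.listdir(root):
        dev_dir = os.path.join(root, group, 'devices')
        if os.path.isdir(dev_dir):
            groups[group] = sorted(os.listdir(dev_dir))
    return groups


def group_peers(pci_addr, groups):
    """Return the other devices that share an IOMMU group with pci_addr."""
    for members in groups.values():
        if pci_addr in members:
            return [m for m in members if m != pci_addr]
    return []
```

Passing a different root makes the helper easy to try against a fake directory tree; on a real host the default path should be enough.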
Some examples:
Assign a USB device:
    curl -k -u root -H "Content-Type: application/json" \
        -H "Accept: application/json" \
        -X POST -d '{"name": "usb_1_1_6"}' \
        'https://127.0.0.1:8001/vms/rhel65/hostdevs'

Assign a PCI device:
    -d '{"name": "pci_0000_0d_00_0"}'

Assign a SCSI LUN:
    -d '{"name": "scsi_1_0_0_0"}'

List assigned devices:
    curl -k -u root -H "Content-Type: application/json" \
        -H "Accept: application/json" \
        'https://127.0.0.1:8001/vms/rhel65/hostdevs'
The above command should print the following.
[
  {
    "type":"scsi",
    "name":"scsi_1_0_0_0"
  },
  {
    "type":"usb",
    "name":"usb_1_1_6"
  },
  {
    "type":"pci",
    "name":"pci_0000_0d_00_0"
  },
  {
    "type":"pci",
    "name":"pci_0000_03_00_0"
  }
]
Notice that the device pci_0000_03_00_0 is also assigned automatically.
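The PCI and SCSI names in this listing are deduced from the address attributes of each <hostdev> element in the domain XML. The sketch below mirrors the format strings of _deduce_dev_name_pci and _deduce_dev_name_scsi from the diff, but as standalone functions that take plain dictionaries instead of lxml objects; the function names and the dictionary arguments are assumptions made for illustration.

```python
def to_int(num_str):
    """Parse '0x0d' as hex, a leading-zero string as octal, else decimal."""
    if num_str.startswith('0x'):
        return int(num_str, 16)
    elif num_str.startswith('0'):
        return int(num_str, 8)
    return int(num_str)


def pci_dev_name(address):
    """Build a name such as pci_0000_0d_00_0 from <address> attributes."""
    fields = dict((key, to_int(address[key]))
                  for key in ('domain', 'bus', 'slot', 'function'))
    return 'pci_%(domain)04x_%(bus)02x_%(slot)02x_%(function)x' % fields


def scsi_dev_name(adapter_name, address):
    """Build a name such as scsi_1_0_0_0 from <adapter> and <address>."""
    fields = dict((key, to_int(address[key]))
                  for key in ('bus', 'target', 'unit'))
    # The adapter is named scsi_host<N>; keep only the host number.
    fields['host'] = to_int(adapter_name[len('scsi_host'):])
    return 'scsi_%(host)d_%(bus)d_%(target)d_%(unit)d' % fields
```

USB devices are not covered here because the patch resolves them by matching vendor/product or bus/device against the host's device list rather than by formatting an address.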
The assigned devices are hot-plugged to the VM and also written to the domain XML. When possible, VFIO is enabled for PCI device assignment.
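As a rough illustration of what gets written to the domain XML, the sketch below builds the same <hostdev> structure that _get_pci_device_xml in the diff produces from a string template. It uses the standard library's ElementTree instead, and it assumes the address fields are given as integers and rendered as hex strings; the patch itself substitutes whatever strings DeviceModel.lookup() returns, so that formatting is an assumption. The function name pci_hostdev_xml is hypothetical.

```python
import xml.etree.ElementTree as ET


def pci_hostdev_xml(domain, bus, slot, function, driver='kvm'):
    """Build a <hostdev> element for one PCI function and return it as text."""
    hostdev = ET.Element('hostdev', mode='subsystem', type='pci',
                         managed='yes')
    source = ET.SubElement(hostdev, 'source')
    ET.SubElement(source, 'address',
                  domain='0x%04x' % domain, bus='0x%02x' % bus,
                  slot='0x%02x' % slot, function='0x%x' % function)
    # 'vfio' or 'kvm', matching the driver choice made in _attach_pci_device.
    ET.SubElement(hostdev, 'driver', name=driver)
    return ET.tostring(hostdev, encoding='unicode')
```

Building the element tree instead of interpolating a template avoids quoting problems if a value ever contains characters that are special in XML.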
v1: Handle the devices in the VM template.
v2: Handle the devices in the VM sub-resource "hostdevs".
v3: No change.
v4: Not all domain XMLs contain a hostdev node. Handle that case.
v5: Change _passthrough='1' to _passthrough='true'. When attaching and detaching a device, do not use the VIR_DOMAIN_AFFECT_CURRENT flag; instead, use kimchi.model.utils.get_vm_config_flag() to set the device flag correctly.
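The body of get_vm_config_flag() is not part of this patch, so the following is only a guess at the idea behind the v5 change: always write to the persistent configuration, and additionally hot-plug when the VM is running. The function name device_flags and its boolean argument are hypothetical. The two constants are defined locally so the sketch runs without libvirt installed; their values follow libvirt's virDomainModificationImpact enum.

```python
# Values as defined by libvirt (VIR_DOMAIN_AFFECT_CURRENT is 0).
VIR_DOMAIN_AFFECT_LIVE = 1
VIR_DOMAIN_AFFECT_CONFIG = 2


def device_flags(vm_is_active):
    """Pick attach/detach flags: always persistent, live only if running."""
    flags = VIR_DOMAIN_AFFECT_CONFIG
    if vm_is_active:
        flags |= VIR_DOMAIN_AFFECT_LIVE
    return flags
```

With VIR_DOMAIN_AFFECT_CURRENT alone, a change made to a running VM would only affect the live state and would be lost at the next restart, which is presumably why the flag was dropped.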
Signed-off-by: Zhou Zheng Sheng <zhshzhou@linux.vnet.ibm.com>
---
 src/kimchi/control/vm/hostdevs.py |  44 ++++++
 src/kimchi/featuretests.py        |  10 +-
 src/kimchi/i18n.py                |   7 +
 src/kimchi/model/config.py        |   2 +
 src/kimchi/model/vmhostdevs.py    | 303 ++++++++++++++++++++++++++++++++++++++
 src/kimchi/rollbackcontext.py     |   3 +
 6 files changed, 368 insertions(+), 1 deletion(-)
 create mode 100644 src/kimchi/control/vm/hostdevs.py
 create mode 100644 src/kimchi/model/vmhostdevs.py
diff --git a/src/kimchi/control/vm/hostdevs.py b/src/kimchi/control/vm/hostdevs.py new file mode 100644 index 0000000..81fe8ec --- /dev/null +++ b/src/kimchi/control/vm/hostdevs.py @@ -0,0 +1,44 @@ +# +# Project Kimchi +# +# Copyright IBM, Corp. 2014 +# +# This library is free software; you can redistribute it and/or +# modify it under the terms of the GNU Lesser General Public +# License as published by the Free Software Foundation; either +# version 2.1 of the License, or (at your option) any later version. +# +# This library is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +# Lesser General Public License for more details. +# +# You should have received a copy of the GNU Lesser General Public +# License along with this library; if not, write to the Free Software +# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + +from kimchi.control.base import Collection, Resource +from kimchi.control.utils import UrlSubNode + + +@UrlSubNode("hostdevs") +class VMHostDevs(Collection): + def __init__(self, model, vmid): + super(VMHostDevs, self).__init__(model) + self.resource = VMHostDev + self.vmid = vmid + self.resource_args = [self.vmid, ] + self.model_args = [self.vmid, ] + + +class VMHostDev(Resource): + def __init__(self, model, vmid, ident): + super(VMHostDev, self).__init__(model, ident) + self.vmid = vmid + self.ident = ident + self.info = {} + self.model_args = [self.vmid, self.ident] + + @property + def data(self): + return self.info diff --git a/src/kimchi/featuretests.py b/src/kimchi/featuretests.py index 5192361..74222bf 100644 --- a/src/kimchi/featuretests.py +++ b/src/kimchi/featuretests.py @@ -29,7 +29,7 @@ from lxml.builder import E
 from kimchi.rollbackcontext import RollbackContext
-from kimchi.utils import kimchi_log
+from kimchi.utils import kimchi_log, run_command


 ISO_STREAM_XML = """
@@ -206,3 +206,11 @@ class FeatureTests(object):
             return True
         except libvirt.libvirtError:
             return False
+
+    @staticmethod
+    def kernel_support_vfio():
+        out, err, rc = run_command(['modprobe', 'vfio-pci'])
+        if rc != 0:
+            kimchi_log.warning("Unable to load Kernel module vfio-pci.")
+            return False
+        return True
diff --git a/src/kimchi/i18n.py b/src/kimchi/i18n.py
index 452ede2..4757001 100644
--- a/src/kimchi/i18n.py
+++ b/src/kimchi/i18n.py
@@ -90,6 +90,13 @@ messages = {
     "KCHVM0029E": _("Unable to shutdown virtual machine %(name)s. Details: %(err)s"),
     "KCHVM0030E": _("Unable to get access metadata of virtual machine %(name)s. Details: %(err)s"),

+    "KCHVMHDEV0001E": _("VM %(vmid)s does not contain directly assigned host device %(dev_name)s."),
+    "KCHVMHDEV0002E": _("The host device %(dev_name)s is not allowed to directly assign to VM."),
+    "KCHVMHDEV0003E": _("No IOMMU groups found. Host PCI pass through needs IOMMU group to function correctly. "
+                        "Please enable Intel VT-d or AMD IOMMU in your BIOS, then verify the Kernel is compiled with IOMMU support. "
+                        "For Intel CPU, add intel_iommu=on to your Kernel parameter in /boot/grub2/grub.conf. "
+                        "For AMD CPU, add iommu=pt iommu=1."),
+
     "KCHVMIF0001E": _("Interface %(iface)s does not exist in virtual machine %(name)s"),
     "KCHVMIF0002E": _("Network %(network)s specified for virtual machine %(name)s does not exist"),
     "KCHVMIF0003E": _("Do not support guest interface hot plug attachment"),
diff --git a/src/kimchi/model/config.py b/src/kimchi/model/config.py
index 0ef0855..95c8e7e 100644
--- a/src/kimchi/model/config.py
+++ b/src/kimchi/model/config.py
@@ -54,6 +54,7 @@ class CapabilitiesModel(object):
         self.libvirt_stream_protocols = []
         self.fc_host_support = False
         self.metadata_support = False
+        self.kernel_vfio = False

         # Subscribe function to set host capabilities to be run when cherrypy
         # server is up
@@ -67,6 +68,7 @@ class CapabilitiesModel(object):
         self.nfs_target_probe = FeatureTests.libvirt_support_nfs_probe()
         self.fc_host_support = FeatureTests.libvirt_support_fc_host()
         self.metadata_support = FeatureTests.has_metadata_support()
+        self.kernel_vfio = FeatureTests.kernel_support_vfio()
self.libvirt_stream_protocols = [] for p in ['http', 'https', 'ftp', 'ftps', 'tftp']: diff --git a/src/kimchi/model/vmhostdevs.py b/src/kimchi/model/vmhostdevs.py new file mode 100644 index 0000000..9e59513 --- /dev/null +++ b/src/kimchi/model/vmhostdevs.py @@ -0,0 +1,303 @@ +# +# Project Kimchi +# +# Copyright IBM, Corp. 2014 +# +# This library is free software; you can redistribute it and/or +# modify it under the terms of the GNU Lesser General Public +# License as published by the Free Software Foundation; either +# version 2.1 of the License, or (at your option) any later version. +# +# This library is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +# Lesser General Public License for more details. +# +# You should have received a copy of the GNU Lesser General Public +# License along with this library; if not, write to the Free Software +# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + +import glob + +import libvirt +from lxml import etree, objectify + +from kimchi.exception import InvalidOperation, InvalidParameter, NotFoundError +from kimchi.model.config import CapabilitiesModel +from kimchi.model.host import DeviceModel, DevicesModel +from kimchi.model.host import PassthroughAffectedDevicesModel +from kimchi.model.utils import get_vm_config_flag +from kimchi.model.vms import DOM_STATE_MAP, VMModel +from kimchi.rollbackcontext import RollbackContext +from kimchi.utils import kimchi_log, run_command + + +class VMHostDevsModel(object): + def __init__(self, **kargs): + self.conn = kargs['conn'] + + def get_list(self, vmid): + dom = VMModel.get_vm(vmid, self.conn) + xmlstr = dom.XMLDesc(0) + root = objectify.fromstring(xmlstr) + try: + hostdev = root.devices.hostdev + except AttributeError: + return [] + + return [self._deduce_dev_name(e) for e in hostdev] + + @staticmethod + def _toint(num_str): 
+ if num_str.startswith('0x'): + return int(num_str, 16) + elif num_str.startswith('0'): + return int(num_str, 8) + else: + return int(num_str) + + def _deduce_dev_name(self, e): + dev_types = { + 'pci': self._deduce_dev_name_pci, + 'scsi': self._deduce_dev_name_scsi, + 'usb': self._deduce_dev_name_usb, + } + return dev_types[e.attrib['type']](e) + + def _deduce_dev_name_pci(self, e): + attrib = {} + for field in ('domain', 'bus', 'slot', 'function'): + attrib[field] = self._toint(e.source.address.attrib[field]) + return 'pci_%(domain)04x_%(bus)02x_%(slot)02x_%(function)x' % attrib + + def _deduce_dev_name_scsi(self, e): + attrib = {} + for field in ('bus', 'target', 'unit'): + attrib[field] = self._toint(e.source.address.attrib[field]) + attrib['host'] = self._toint( + e.source.adapter.attrib['name'][len('scsi_host'):]) + return 'scsi_%(host)d_%(bus)d_%(target)d_%(unit)d' % attrib + + def _deduce_dev_name_usb(self, e): + dev_names = DevicesModel(conn=self.conn).get_list(_cap='usb_device') + usb_infos = [DeviceModel(conn=self.conn).lookup(dev_name) + for dev_name in dev_names] + + unknown_dev = None + + try: + evendor = self._toint(e.source.vendor.attrib['id']) + eproduct = self._toint(e.source.product.attrib['id']) + except AttributeError: + evendor = 0 + eproduct = 0 + else: + unknown_dev = 'usb_vendor_%s_product_%s' % (evendor, eproduct) + + try: + ebus = self._toint(e.source.address.attrib['bus']) + edevice = self._toint(e.source.address.attrib['device']) + except AttributeError: + ebus = -1 + edevice = -1 + else: + unknown_dev = 'usb_bus_%s_device_%s' % (ebus, edevice) + + for usb_info in usb_infos: + ivendor = self._toint(usb_info['vendor']['id']) + iproduct = self._toint(usb_info['product']['id']) + if evendor == ivendor and eproduct == iproduct: + return usb_info['name'] + ibus = usb_info['bus'] + idevice = usb_info['device'] + if ebus == ibus and edevice == idevice: + return usb_info['name'] + return unknown_dev + + def _passthrough_device_validate(self, 
dev_name): + eligible_dev_names = \ + DevicesModel(conn=self.conn).get_list(_passthrough='true') + if dev_name not in eligible_dev_names: + raise InvalidParameter('KCHVMHDEV0002E', {'dev_name': dev_name}) + + def create(self, vmid, params): + dev_name = params['name'] + self._passthrough_device_validate(dev_name) + dev_info = DeviceModel(conn=self.conn).lookup(dev_name) + attach_device = { + 'pci': self._attach_pci_device, + 'scsi': self._attach_scsi_device, + 'usb_device': self._attach_usb_device, + }[dev_info['device_type']] + return attach_device(vmid, dev_info) + + def _get_pci_device_xml(self, dev_info): + if 'detach_driver' not in dev_info: + dev_info['detach_driver'] = 'kvm' + + xmlstr = ''' + <hostdev mode='subsystem' type='pci' managed='yes'> + <source> + <address domain='%(domain)s' bus='%(bus)s' slot='%(slot)s' + function='%(function)s'/> + </source> + <driver name='%(detach_driver)s'/> + </hostdev>''' % dev_info + return xmlstr + + @staticmethod + def _validate_pci_passthrough_env(): + if not glob.glob('/sys/kernel/iommu_groups/*'): + raise InvalidOperation("KCHVMHDEV0003E") + + # Enable virt_use_sysfs on RHEL6 and older distributions + # In recent Fedora, there is no virt_use_sysfs. + out, err, rc = run_command(['getsebool', 'virt_use_sysfs']) + if rc == 0 and out.rstrip('\n') != "virt_use_sysfs --> on": + out, err, rc = run_command(['setsebool', '-P', + 'virt_use_sysfs=on']) + if rc != 0: + kimchi_log.warning("Unable to turn on sebool virt_use_sysfs") + + def _attach_pci_device(self, vmid, dev_info): + self._validate_pci_passthrough_env() + + dom = VMModel.get_vm(vmid, self.conn) + # Due to libvirt limitation, we don't support live assigne device to + # vfio driver. 
+ driver = ('vfio' if DOM_STATE_MAP[dom.info()[0]] == "shutoff" and + CapabilitiesModel().kernel_vfio else 'kvm') + + # Attach all PCI devices in the same IOMMU group + dev_model = DeviceModel(conn=self.conn) + affected_devs_model = PassthroughAffectedDevicesModel(conn=self.conn) + dev_infos = [dev_model.lookup(dev_name) for dev_name in + affected_devs_model.get_list(dev_info['name'])] + pci_infos = [dev_info] + [info for info in dev_infos + if info['device_type'] == 'pci'] + + device_flags = get_vm_config_flag(dom, mode='all') + + with RollbackContext() as rollback: + for pci_info in pci_infos: + pci_info['detach_driver'] = driver + xmlstr = self._get_pci_device_xml(pci_info) + try: + dom.attachDeviceFlags(xmlstr, device_flags) + except libvirt.libvirtError: + kimchi_log.error( + 'Failed to attach host device %s to VM %s: \n%s', + pci_info['name'], vmid, xmlstr) + raise + rollback.prependDefer(dom.detachDeviceFlags, + xmlstr, device_flags) + rollback.commitAll() + + return dev_info['name'] + + def _get_scsi_device_xml(self, dev_info): + xmlstr = ''' + <hostdev mode='subsystem' type='scsi' sgio='unfiltered'> + <source> + <adapter name='scsi_host%(host)s'/> + <address type='scsi' bus='%(bus)s' target='%(target)s' + unit='%(lun)s'/> + </source> + </hostdev>''' % dev_info + return xmlstr + + def _attach_scsi_device(self, vmid, dev_info): + xmlstr = self._get_scsi_device_xml(dev_info) + dom = VMModel.get_vm(vmid, self.conn) + dom.attachDeviceFlags(xmlstr, get_vm_config_flag(dom, mode='all')) + return dev_info['name'] + + def _get_usb_device_xml(self, dev_info): + xmlstr = ''' + <hostdev mode='subsystem' type='usb' managed='yes'> + <source startupPolicy='optional'> + <vendor id='%s'/> + <product id='%s'/> + <address bus='%s' device='%s'/> + </source> + </hostdev>''' % (dev_info['vendor']['id'], dev_info['product']['id'], + dev_info['bus'], dev_info['device']) + return xmlstr + + def _attach_usb_device(self, vmid, dev_info): + xmlstr = self._get_usb_device_xml(dev_info) 
+ dom = VMModel.get_vm(vmid, self.conn) + dom.attachDeviceFlags(xmlstr, get_vm_config_flag(dom, mode='all')) + return dev_info['name'] + + +class VMHostDevModel(object): + def __init__(self, **kargs): + self.conn = kargs['conn'] + + def lookup(self, vmid, dev_name): + dom = VMModel.get_vm(vmid, self.conn) + xmlstr = dom.XMLDesc(0) + root = objectify.fromstring(xmlstr) + try: + hostdev = root.devices.hostdev + except AttributeError: + raise NotFoundError('KCHVMHDEV0001E', + {'vmid': vmid, 'dev_name': dev_name}) + + devsmodel = VMHostDevsModel(conn=self.conn) + + for e in hostdev: + deduced_name = devsmodel._deduce_dev_name(e) + if deduced_name == dev_name: + return {'name': dev_name, 'type': e.attrib['type']} + + raise NotFoundError('KCHVMHDEV0001E', + {'vmid': vmid, 'dev_name': dev_name}) + + def delete(self, vmid, dev_name): + dom = VMModel.get_vm(vmid, self.conn) + xmlstr = dom.XMLDesc(0) + root = objectify.fromstring(xmlstr) + pci_devs = [] + + try: + hostdev = root.devices.hostdev + except AttributeError: + raise NotFoundError('KCHVMHDEV0001E', + {'vmid': vmid, 'dev_name': dev_name}) + + devsmodel = VMHostDevsModel(conn=self.conn) + + for e in hostdev: + deduced_name = devsmodel._deduce_dev_name(e) + if e.attrib['type'] == 'pci': + pci_devs.append((deduced_name, e)) + if deduced_name == dev_name: + dev_e = e + xmlstr = etree.tostring(e) + dom.detachDeviceFlags( + xmlstr, get_vm_config_flag(dom, mode='all')) + break + else: + raise NotFoundError('KCHVMHDEV0001E', + {'vmid': vmid, 'dev_name': dev_name}) + + if dev_e.attrib['type'] == 'pci': + self._delete_affected_pci_devices(dom, dev_name, pci_devs) + + def _delete_affected_pci_devices(self, dom, dev_name, pci_devs): + dev_model = DeviceModel(conn=self.conn) + try: + dev_model.lookup(dev_name) + except NotFoundError: + return + + affected_names = set( + PassthroughAffectedDevicesModel(conn=self.conn).get_list(dev_name)) + + for pci_name, e in pci_devs: + if pci_name in affected_names: + xmlstr = 
etree.tostring(e) + dom.detachDeviceFlags( + xmlstr, get_vm_config_flag(dom, mode='all')) diff --git a/src/kimchi/rollbackcontext.py b/src/kimchi/rollbackcontext.py index 70fcfeb..ba28999 100644 --- a/src/kimchi/rollbackcontext.py +++ b/src/kimchi/rollbackcontext.py @@ -64,3 +64,6 @@ class RollbackContext(object):
def prependDefer(self, func, *args, **kwargs): self._finally.insert(0, (func, args, kwargs)) + + def commitAll(self): + self._finally = [] Reviewed-by: Mark Wu<wudxw@linux.vnet.ibm.com>
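The new commitAll() lets _attach_pci_device undo a partial attach on failure while keeping all devices attached on success. The sketch below shows that pattern with a minimal stand-in class. Only prependDefer, commitAll and the _finally list come from the diff; the __enter__/__exit__ behaviour (run every deferred call when the block exits) is an assumption about how Kimchi's RollbackContext works, and MiniRollback and attach_all are hypothetical names.

```python
class MiniRollback(object):
    """Stand-in for RollbackContext: runs deferred undo calls on exit."""

    def __init__(self):
        self._finally = []

    def prependDefer(self, func, *args, **kwargs):
        # Newest undo first, so devices are detached in reverse order.
        self._finally.insert(0, (func, args, kwargs))

    def commitAll(self):
        # Everything succeeded: forget the undo actions.
        self._finally = []

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        for func, args, kwargs in self._finally:
            func(*args, **kwargs)
        return False  # never swallow the original exception


def attach_all(devices, attach, detach):
    """Attach every device, or none of them if any attach fails."""
    with MiniRollback() as rollback:
        for dev in devices:
            attach(dev)
            rollback.prependDefer(detach, dev)
        rollback.commitAll()
```

Registering the undo action only after a successful attach means a device that failed to attach is never passed to detach.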

Add a "vm_holders" sub-collection under host device resource, so the front-end can determine if a device is busy or not, and the user can know which VMs are holding the device.

This patch scans all VM XML to check if a device is held by a VM. It also adds a check to keep the host device assigned to only one VM.

Example
    curl -k -u root -H "Content-Type: application/json" \
        -H "Accept: application/json" \
        'https://127.0.0.1:8001/host/devices/usb_1_1_6/vm_holders'
Should output a list like the following.
[
  {
    "state":"shutoff",
    "name":"fedora20"
  },
  {
    "state":"running",
    "name":"f20xfce-slave"
  }
]

If there is no VM holding the device, it prints an empty list [].

v5: When assigning a device to VM, check if there are other VMs holding the device and raise an exception. Move the VMHoldersModel to vmhostdevs.py to avoid circular import problem.

Signed-off-by: Zhou Zheng Sheng <zhshzhou@linux.vnet.ibm.com>
---
 src/kimchi/control/host.py     |  7 +++++++
 src/kimchi/i18n.py             |  2 ++
 src/kimchi/model/vmhostdevs.py | 24 ++++++++++++++++++++++++
 3 files changed, 33 insertions(+)

diff --git a/src/kimchi/control/host.py b/src/kimchi/control/host.py
index 15f2343..efc31ea 100644
--- a/src/kimchi/control/host.py
+++ b/src/kimchi/control/host.py
@@ -110,11 +110,18 @@ class PassthroughAffectedDevices(Collection):
         self.model_args = (device_id, )


+class VMHolders(SimpleCollection):
+    def __init__(self, model, device_id):
+        super(VMHolders, self).__init__(model)
+        self.model_args = (device_id, )
+
+
 class Device(Resource):
     def __init__(self, model, id):
         super(Device, self).__init__(model, id)
         self.passthrough_affected_devices = \
             PassthroughAffectedDevices(self.model, id)
+        self.vm_holders = VMHolders(self.model, id)

     @property
     def data(self):
diff --git a/src/kimchi/i18n.py b/src/kimchi/i18n.py
index 4757001..5dabc26 100644
--- a/src/kimchi/i18n.py
+++ b/src/kimchi/i18n.py
@@ -96,6 +96,8 @@ messages = {
                         "Please enable Intel VT-d or AMD IOMMU in your BIOS, then verify the Kernel is compiled with IOMMU support. "
                         "For Intel CPU, add intel_iommu=on to your Kernel parameter in /boot/grub2/grub.conf. "
                         "For AMD CPU, add iommu=pt iommu=1."),
+    "KCHVMHDEV0004E": _("The host device %(dev_name)s should be assigned to just one VM. "
+                        "Currently the following VM(s) are holding the device: %(names)s."),

     "KCHVMIF0001E": _("Interface %(iface)s does not exist in virtual machine %(name)s"),
     "KCHVMIF0002E": _("Network %(network)s specified for virtual machine %(name)s does not exist"),
diff --git a/src/kimchi/model/vmhostdevs.py b/src/kimchi/model/vmhostdevs.py
index 9e59513..817a054 100644
--- a/src/kimchi/model/vmhostdevs.py
+++ b/src/kimchi/model/vmhostdevs.py
@@ -119,6 +119,11 @@ class VMHostDevsModel(object):
             DevicesModel(conn=self.conn).get_list(_passthrough='true')
         if dev_name not in eligible_dev_names:
             raise InvalidParameter('KCHVMHDEV0002E', {'dev_name': dev_name})
+        holders = VMHoldersModel(conn=self.conn).get_list(dev_name)
+        if holders:
+            names = ', '.join([holder['name'] for holder in holders])
+            raise InvalidOperation('KCHVMHDEV0004E', {'dev_name': dev_name,
+                                                      'names': names})

     def create(self, vmid, params):
         dev_name = params['name']
@@ -301,3 +306,22 @@ class VMHostDevModel(object):
                 xmlstr = etree.tostring(e)
                 dom.detachDeviceFlags(
                     xmlstr, get_vm_config_flag(dom, mode='all'))
+
+
+class VMHoldersModel(object):
+    def __init__(self, **kargs):
+        self.conn = kargs['conn']
+
+    def get_list(self, device_id):
+        devsmodel = VMHostDevsModel(conn=self.conn)
+
+        conn = self.conn.get()
+        doms = conn.listAllDomains(0)
+
+        res = []
+        for dom in doms:
+            dom_name = dom.name()
+            if device_id in devsmodel.get_list(dom_name):
+                state = DOM_STATE_MAP[dom.info()[0]]
+                res.append({"name": dom_name, "state": state})
+        return res
--
1.9.3

On 06/09/2014 05:28 PM, Zhou Zheng Sheng wrote:
Add a "vm_holders" sub-collection under host device resource, so the front-end can determine if a device is busy or not, and the user can know which VMs are holding the device.
This patch scans all VM XML to check if a device is held by a VM. It also adds a check to keep the host device assigned to only one VM.
Example
    curl -k -u root -H "Content-Type: application/json" \
        -H "Accept: application/json" \
        'https://127.0.0.1:8001/host/devices/usb_1_1_6/vm_holders'
Should output a list like the following.
[
  {
    "state":"shutoff",
    "name":"fedora20"
  },
  {
    "state":"running",
    "name":"f20xfce-slave"
  }
]
If there is no VM holding the device, it prints an empty list [].
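The holder scan and the one-VM-only check can be sketched without libvirt as follows. Here each VM is a plain dictionary with 'name', 'state' and 'hostdevs' keys, which is an assumption standing in for the libvirt domain objects and the VMHostDevsModel.get_list() call used by the patch; ValueError likewise stands in for Kimchi's InvalidOperation, and both function names are hypothetical.

```python
def vm_holders(device_id, vms):
    """List the name and state of every VM whose hostdevs include device_id."""
    return [{'name': vm['name'], 'state': vm['state']}
            for vm in vms if device_id in vm['hostdevs']]


def check_not_held(device_id, vms):
    """Refuse to assign a device that some VM already holds."""
    holders = vm_holders(device_id, vms)
    if holders:
        names = ', '.join(holder['name'] for holder in holders)
        raise ValueError('%s is held by: %s' % (device_id, names))
```

Since the scan walks every VM definition for each query, its cost grows with the number of VMs; that seems acceptable for an occasional REST call, though it is worth keeping in mind for hosts with many guests.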
v5: When assigning a device to VM, check if there are other VMs holding the device and raise an exception. Move the VMHoldersModel to vmhostdevs.py to avoid circular import problem.
Signed-off-by: Zhou Zheng Sheng <zhshzhou@linux.vnet.ibm.com>
---
 src/kimchi/control/host.py     |  7 +++++++
 src/kimchi/i18n.py             |  2 ++
 src/kimchi/model/vmhostdevs.py | 24 ++++++++++++++++++++++++
 3 files changed, 33 insertions(+)

diff --git a/src/kimchi/control/host.py b/src/kimchi/control/host.py
index 15f2343..efc31ea 100644
--- a/src/kimchi/control/host.py
+++ b/src/kimchi/control/host.py
@@ -110,11 +110,18 @@ class PassthroughAffectedDevices(Collection):
         self.model_args = (device_id, )


+class VMHolders(SimpleCollection):
+    def __init__(self, model, device_id):
+        super(VMHolders, self).__init__(model)
+        self.model_args = (device_id, )
+
+
 class Device(Resource):
     def __init__(self, model, id):
         super(Device, self).__init__(model, id)
         self.passthrough_affected_devices = \
             PassthroughAffectedDevices(self.model, id)
+        self.vm_holders = VMHolders(self.model, id)

     @property
     def data(self):
diff --git a/src/kimchi/i18n.py b/src/kimchi/i18n.py
index 4757001..5dabc26 100644
--- a/src/kimchi/i18n.py
+++ b/src/kimchi/i18n.py
@@ -96,6 +96,8 @@ messages = {
                         "Please enable Intel VT-d or AMD IOMMU in your BIOS, then verify the Kernel is compiled with IOMMU support. "
                         "For Intel CPU, add intel_iommu=on to your Kernel parameter in /boot/grub2/grub.conf. "
                         "For AMD CPU, add iommu=pt iommu=1."),
+    "KCHVMHDEV0004E": _("The host device %(dev_name)s should be assigned to just one VM. "
+                        "Currently the following VM(s) are holding the device: %(names)s."),

     "KCHVMIF0001E": _("Interface %(iface)s does not exist in virtual machine %(name)s"),
     "KCHVMIF0002E": _("Network %(network)s specified for virtual machine %(name)s does not exist"),
diff --git a/src/kimchi/model/vmhostdevs.py b/src/kimchi/model/vmhostdevs.py
index 9e59513..817a054 100644
--- a/src/kimchi/model/vmhostdevs.py
+++ b/src/kimchi/model/vmhostdevs.py
@@ -119,6 +119,11 @@ class VMHostDevsModel(object):
             DevicesModel(conn=self.conn).get_list(_passthrough='true')
         if dev_name not in eligible_dev_names:
             raise InvalidParameter('KCHVMHDEV0002E', {'dev_name': dev_name})
+        holders = VMHoldersModel(conn=self.conn).get_list(dev_name)
+        if holders:
+            names = ', '.join([holder['name'] for holder in holders])
+            raise InvalidOperation('KCHVMHDEV0004E', {'dev_name': dev_name,
+                                                      'names': names})

     def create(self, vmid, params):
         dev_name = params['name']
@@ -301,3 +306,22 @@ class VMHostDevModel(object):
                 xmlstr = etree.tostring(e)
                 dom.detachDeviceFlags(
                     xmlstr, get_vm_config_flag(dom, mode='all'))
+
+
+class VMHoldersModel(object):
+    def __init__(self, **kargs):
+        self.conn = kargs['conn']
+
+    def get_list(self, device_id):
+        devsmodel = VMHostDevsModel(conn=self.conn)
+
+        conn = self.conn.get()
+        doms = conn.listAllDomains(0)
+
+        res = []
+        for dom in doms:
+            dom_name = dom.name()
+            if device_id in devsmodel.get_list(dom_name):
+                state = DOM_STATE_MAP[dom.info()[0]]
+                res.append({"name": dom_name, "state": state})
+        return res

Reviewed-by: Mark Wu <wudxw@linux.vnet.ibm.com>
participants (3)
- Mark Wu
- Sheldon
- Zhou Zheng Sheng