[Kimchi-devel] [PATCH v12 2/6] Host device passthrough: List eligible device to passthrough

Zhou Zheng Sheng zhshzhou at linux.vnet.ibm.com
Wed Oct 8 09:08:39 UTC 2014


This patch adds a '_passthrough=true' filter to /host/devices, so that it
lists only the devices eligible for passthrough to a guest.
Theoretically, all PCI, USB and SCSI devices can be assigned to a guest
directly.

The Linux kernel is able to recognize the host IOMMU group layout. If two
PCI devices are in the same IOMMU group, there may be interconnections
between them that let the devices talk to each other bypassing the IOMMU.
This implies isolation between those devices is not perfect, so all
devices in an IOMMU group must be assigned to a guest together. On a host
that recognizes IOMMU groups, the URI
/host/devices?_passthrough_affected_by=DEVICE_NAME returns a list
containing the devices in the same IOMMU group as DEVICE_NAME, and all
of their child devices. The front-end can then show all the affected
devices to the user, which helps the user decide which host devices to
assign to a guest.
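
As a rough illustration of the grouping described above (not the code in
this patch, which reads the group from libvirt's node device XML), the
sketch below looks up IOMMU group peers directly in sysfs. The function
name iommu_group_peers and the sysfs_root parameter are hypothetical; the
layout assumed is the usual one, where each PCI device directory has an
iommu_group symlink pointing at /sys/kernel/iommu_groups/<N>.

```python
import os


def iommu_group_peers(pci_addr, sysfs_root="/sys"):
    """Return the other devices sharing an IOMMU group with pci_addr.

    pci_addr is a sysfs-style address such as "0000:00:19.0".
    An empty list is returned when the host exposes no IOMMU group
    information for the device.
    """
    link = os.path.join(sysfs_root, "bus", "pci", "devices", pci_addr,
                        "iommu_group")
    if not os.path.exists(link):
        return []
    # iommu_group is a symlink to the group directory; the members of
    # the group are listed under its devices/ subdirectory.
    group_dir = os.path.realpath(link)
    members = os.listdir(os.path.join(group_dir, "devices"))
    return sorted(m for m in members if m != pci_addr)
```

The sysfs_root parameter exists only so the function can be exercised
against a fake directory tree; on a real host the default is enough.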

How to test:

List all types of devices to passthrough
  curl -k -u root -H "Content-Type: application/json" \
    -H "Accept: application/json" \
    'https://127.0.0.1:8001/host/devices?_passthrough=true'

List all eligible PCI devices to passthrough
  /host/devices?_passthrough=true&_cap=pci

List all USB devices to passthrough
  /host/devices?_passthrough=true&_cap=usb_device

List all SCSI devices to passthrough
  /host/devices?_passthrough=true&_cap=scsi

List devices in the same IOMMU group as pci_0000_00_19_0
  /host/devices?_passthrough_affected_by=pci_0000_00_19_0
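
The query strings above can also be composed programmatically. The helper
below is a hypothetical client-side sketch (devices_url and BASE are not
part of Kimchi); it mirrors the rule in get_list() that
_passthrough_affected_by cannot be combined with _cap or _passthrough.

```python
from urllib.parse import urlencode

# Same host and port as the curl example; adjust for a real deployment.
BASE = "https://127.0.0.1:8001/host/devices"


def devices_url(cap=None, passthrough=False, affected_by=None):
    """Build a /host/devices URL with the flag filters from this patch."""
    if affected_by is not None:
        if cap is not None or passthrough:
            raise ValueError("_passthrough_affected_by cannot be combined "
                             "with _cap or _passthrough")
        query = {"_passthrough_affected_by": affected_by}
        return BASE + "?" + urlencode(query)

    params = {}
    if passthrough:
        params["_passthrough"] = "true"
    if cap is not None:
        params["_cap"] = cap
    return BASE + ("?" + urlencode(params) if params else "")


print(devices_url(cap="pci", passthrough=True))
print(devices_url(affected_by="pci_0000_00_19_0"))
```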

v1:
  v1 series does not contain this patch.

v2:
  Deal with calculating the "leaf" devices and the "affected" devices.

v5:
  Change _passthrough=1 to _passthrough=true in the URI scheme. Filter
  PCI devices according to the PCI class.

v6:
  Don't pass through PCI devices of class code 07. On modern x86
  machines, it's possible that
  "6 Series/C200 Series Chipset Family MEI Controller" and
  "6 Series/C200 Series Chipset Family KT Controller"
  are of this class code. These two devices are not suitable for
  passthrough to a guest. We don't have a simple and reliable way to
  distinguish a normal serial controller from a host chipset XXX
  controller. This class of PCI devices also includes various serial,
  parallel, modem and communication controllers. Serial and parallel
  controllers can be redirected from ttyS0 to QEMU's pty using socat, and
  there is little performance benefit in assigning them directly to a
  guest. So it's OK not to pass through PCI devices of class code 07.
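
  For reference, the "class code 07" above is the top byte of the 24-bit
  value in a device's sysfs 'class' file; the lower two bytes are the
  subclass and the programming interface. A minimal sketch of that
  decoding follows. The helper names are hypothetical and are not taken
  from hostdev.py.

```python
def split_pci_class(raw):
    """Split the contents of a sysfs 'class' file into three bytes.

    raw is a string such as "0x078000\n"; the result is the tuple
    (base_class, subclass, prog_if).
    """
    value = int(raw.strip(), 16)
    return (value >> 16) & 0xff, (value >> 8) & 0xff, value & 0xff


def is_communication_controller(raw):
    """True when the base class is 0x07 (communication controller)."""
    base_class, _subclass, _prog_if = split_pci_class(raw)
    return base_class == 0x07
```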

v8:
  Use a new flag filter "_passthrough_group_by"
    /host/devices?_passthrough_group_by=pci_XXX
  instead of using sub-collection
    /host/devices/pci_XXX/passthrough_affected_devices

v9:
  Use the same LibvirtConnection object as the Model, to avoid
  exhausting connections.

v10:
  Adapt to RHEL 6. RHEL 6 does not provide IOMMU group information in
  sysfs. For now we just ignore this error and live with it. Device
  passthrough for PCI devices will not work, but the basic device
  information is still provided to the user. In the future we'll develop
  code to gather IOMMU group information.

v11:
  In previous commits, we didn't allow passthrough of a device with
  children, and only passed through the children. This proved inflexible
  and less useful. The PCI class code white list also filtered out too
  many types of devices, and there is no way to cleanly differentiate
  devices suitable for passthrough using the class code.

  In this patch, we allow Kimchi to pass through a parent PCI device. The
  front-end can use the existing "_passthrough_group_by" filter to list
  the affected child devices and other PCI devices in the same group,
  as well as the children of those devices. We also drop the class code
  white list, and filter out only PCI bridges and video cards.

  libvirt uses domain, bus, slot and function to encode the device name,
  so sorting by device name has the same effect as sorting by
  domain:bus:slot:function. This patch sorts all the devices by name.
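
  The equivalence relies on the name fields being fixed-width lowercase
  hex. The sketch below checks it on a few sample addresses; pci_name is
  a hypothetical helper that assumes names follow the pattern seen in
  pci_0000_00_19_0, and the equivalence would not hold for a domain wider
  than four hex digits.

```python
def pci_name(domain, bus, slot, function):
    # Assumed naming pattern, modelled on names like pci_0000_00_19_0.
    return "pci_%04x_%02x_%02x_%x" % (domain, bus, slot, function)


# Sample (domain, bus, slot, function) tuples, deliberately unordered.
addresses = [
    (0, 0x0a, 0x00, 0),
    (0, 0x00, 0x19, 0),
    (0, 0x00, 0x02, 1),
    (0, 0x00, 0x1f, 3),
    (0, 0x02, 0x00, 0),
]

by_name = sorted(pci_name(*addr) for addr in addresses)
by_address = [pci_name(*addr) for addr in sorted(addresses)]

# Plain string sorting yields the same order as numeric address sorting.
assert by_name == by_address
```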

  When a SCSI adapter is assigned to a virtual machine, the previous
  node devices scsi_host and scsi_target become stale on the host machine.
  Unfortunately, libvirt only removes the invalid scsi_host device, but
  not scsi_target. When Kimchi looks up the parent scsi_host of a
  scsi_target, the scsi_host device no longer exists, and the lookup
  explodes.

  This patch catches this error and ignores devices without a valid
  parent device.

v12:
  Coding style improvements. Use "None" instead of "-1".

Signed-off-by: Zhou Zheng Sheng <zhshzhou at linux.vnet.ibm.com>
---
 src/kimchi/i18n.py          |   1 +
 src/kimchi/model/host.py    |  33 ++++++++++--
 src/kimchi/model/hostdev.py | 121 ++++++++++++++++++++++++++++++++++++++++++--
 3 files changed, 147 insertions(+), 8 deletions(-)

diff --git a/src/kimchi/i18n.py b/src/kimchi/i18n.py
index 1b543ce..98adc46 100644
--- a/src/kimchi/i18n.py
+++ b/src/kimchi/i18n.py
@@ -233,6 +233,7 @@ messages = {
     "KCHHOST0001E": _("Unable to shutdown host machine as there are running virtual machines"),
     "KCHHOST0002E": _("Unable to reboot host machine as there are running virtual machines"),
     "KCHHOST0003E": _("Node device '%(name)s' not found"),
+    "KCHHOST0004E": _("Conflicting flag filters specified."),
 
     "KCHPKGUPD0001E": _("No packages marked for update"),
     "KCHPKGUPD0002E": _("Package %(name)s is not marked to be updated."),
diff --git a/src/kimchi/model/host.py b/src/kimchi/model/host.py
index 7d7cd66..5d31809 100644
--- a/src/kimchi/model/host.py
+++ b/src/kimchi/model/host.py
@@ -32,8 +32,9 @@ from kimchi import disks
 from kimchi import netinfo
 from kimchi import xmlutils
 from kimchi.basemodel import Singleton
-from kimchi.exception import InvalidOperation, NotFoundError, OperationFailed
 from kimchi.model import hostdev
+from kimchi.exception import InvalidOperation, InvalidParameter
+from kimchi.exception import NotFoundError, OperationFailed
 from kimchi.model.config import CapabilitiesModel
 from kimchi.model.tasks import TaskModel
 from kimchi.model.vms import DOM_STATE_MAP
@@ -299,10 +300,28 @@ class DevicesModel(object):
         except AttributeError:
             self.cap_map['fc_host'] = None
 
-    def get_list(self, _cap=None):
+    def get_list(self, _cap=None, _passthrough=None,
+                 _passthrough_affected_by=None):
+        if _passthrough_affected_by is not None:
+            # _passthrough_affected_by conflicts with _cap and _passthrough
+            if (_cap, _passthrough) != (None, None):
+                raise InvalidParameter("KCHHOST0004E")
+            return sorted(
+                self._get_passthrough_affected_devs(_passthrough_affected_by))
+
         if _cap == 'fc_host':
-            return self._get_devices_fc_host()
-        return self._get_devices_with_capability(_cap)
+            dev_names = self._get_devices_fc_host()
+        else:
+            dev_names = self._get_devices_with_capability(_cap)
+
+        if _passthrough is not None and _passthrough.lower() == 'true':
+            conn = self.conn.get()
+            passthrough_names = [
+                dev['name'] for dev in hostdev.get_passthrough_dev_infos(conn)]
+            dev_names = list(set(dev_names) & set(passthrough_names))
+
+        dev_names.sort()
+        return dev_names
 
     def _get_devices_with_capability(self, cap):
         conn = self.conn.get()
@@ -314,6 +333,12 @@ class DevicesModel(object):
                 return []
         return [name.name() for name in conn.listAllDevices(cap_flag)]
 
+    def _get_passthrough_affected_devs(self, dev_name):
+        conn = self.conn.get()
+        info = DeviceModel(conn=self.conn).lookup(dev_name)
+        affected = hostdev.get_affected_passthrough_devices(conn, info)
+        return [dev_info['name'] for dev_info in affected]
+
     def _get_devices_fc_host(self):
         conn = self.conn.get()
         # Libvirt < 1.0.5 does not support fc_host capability
diff --git a/src/kimchi/model/hostdev.py b/src/kimchi/model/hostdev.py
index 103c1e7..63cdb21 100644
--- a/src/kimchi/model/hostdev.py
+++ b/src/kimchi/model/hostdev.py
@@ -17,7 +17,9 @@
 # License along with this library; if not, write to the Free Software
 # Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301 USA
 
+import os
 from pprint import pformat
+from pprint import pprint
 
 from kimchi.model.libvirtconnection import LibvirtConnection
 from kimchi.utils import kimchi_log
@@ -36,7 +38,13 @@ def _get_dev_info_tree(dev_infos):
         if dev_info['parent'] is None:
             root = dev_info
             continue
-        parent = devs[dev_info['parent']]
+
+        try:
+            parent = devs[dev_info['parent']]
+        except KeyError:
+            kimchi_log.error('Parent %s of device %s does not exist.',
+                             dev_info['parent'], dev_info['name'])
+            continue
 
         try:
             children = parent['children']
@@ -47,6 +55,109 @@ def _get_dev_info_tree(dev_infos):
     return root
 
 
+def _is_pci_qualified(pci_dev):
+    # PCI bridges are not suitable for passthrough
+    # KVM does not support graphics card passthrough for now
+    blacklist_classes = (0x030000, 0x060000)
+
+    with open(os.path.join(pci_dev['path'], 'class')) as f:
+        pci_class = int(f.readline().strip(), 16)
+
+    if pci_class & 0xff0000 in blacklist_classes:
+        return False
+
+    return True
+
+
+def get_passthrough_dev_infos(libvirt_conn):
+    ''' Get devices eligible to be passed through to VM. '''
+
+    def is_eligible(dev):
+        return dev['device_type'] in ('usb_device', 'scsi') or \
+            (dev['device_type'] == 'pci' and _is_pci_qualified(dev))
+
+    dev_infos = _get_all_host_dev_infos(libvirt_conn)
+
+    return [dev_info for dev_info in dev_infos if is_eligible(dev_info)]
+
+
+def _get_same_iommugroup_devices(dev_infos, device_info):
+    dev_dict = dict([(dev_info['name'], dev_info) for dev_info in dev_infos])
+
+    def get_iommu_group(dev_info):
+        # Find out the iommu group of a given device.
+        # Child device belongs to the same iommu group as the parent device.
+        try:
+            return dev_info['iommuGroup']
+        except KeyError:
+            pass
+
+        parent = dev_info['parent']
+        while parent is not None:
+            try:
+                parent_info = dev_dict[parent]
+            except KeyError:
+                kimchi_log.error("Parent %s of device %s does not exist",
+                                 parent, dev_info['name'])
+                break
+
+            try:
+                iommuGroup = parent_info['iommuGroup']
+            except KeyError:
+                pass
+            else:
+                return iommuGroup
+
+            parent = parent_info['parent']
+
+        return None
+
+    iommu_group = get_iommu_group(device_info)
+
+    if iommu_group is None:
+        return []
+
+    return [dev_info for dev_info in dev_infos
+            if dev_info['name'] != device_info['name'] and
+            get_iommu_group(dev_info) == iommu_group]
+
+
+def _get_children_devices(dev_infos, device_info):
+    def get_children_recursive(parent):
+        try:
+            children = parent['children']
+        except KeyError:
+            return []
+
+        result = []
+        for child in children:
+            result.append(child)
+            result.extend(get_children_recursive(child))
+
+        return result
+
+    # Annotate every dev_info element with children information
+    _get_dev_info_tree(dev_infos)
+
+    for dev_info in dev_infos:
+        if dev_info['name'] == device_info['name']:
+            return get_children_recursive(dev_info)
+
+    return []
+
+
+def get_affected_passthrough_devices(libvirt_conn, passthrough_dev):
+    dev_infos = _get_all_host_dev_infos(libvirt_conn)
+
+    group_devices = _get_same_iommugroup_devices(dev_infos, passthrough_dev)
+    if not group_devices:
+        # On host without iommu group support, the affected devices should
+        # at least include all children devices
+        group_devices.extend(_get_children_devices(dev_infos, passthrough_dev))
+
+    return group_devices
+
+
 def get_dev_info(node_dev):
     ''' Parse the node device XML string into dict according to
     http://libvirt.org/formatnode.html.
@@ -168,8 +279,7 @@ def _get_usb_device_dev_info(info):
 
 
 # For test and debug
-def _print_host_dev_tree():
-    libvirt_conn = LibvirtConnection('qemu:///system').get()
+def _print_host_dev_tree(libvirt_conn):
     dev_infos = _get_all_host_dev_infos(libvirt_conn)
     root = _get_dev_info_tree(dev_infos)
     if root is None:
@@ -207,4 +317,7 @@ def _format_dev_node(node):
 
 
 if __name__ == '__main__':
-    _print_host_dev_tree()
+    libvirt_conn = LibvirtConnection('qemu:///system').get()
+    _print_host_dev_tree(libvirt_conn)
+    print 'Eligible passthrough devices:'
+    pprint(get_passthrough_dev_infos(libvirt_conn))
-- 
1.9.3



