[Kimchi-devel] [PATCH v10 2/5] Host device passthrough: List eligible device to passthrough

Zhou Zheng Sheng zhshzhou at linux.vnet.ibm.com
Fri Aug 1 03:19:48 UTC 2014


This patch adds a '_passthrough=true' filter to /host/devices, so that it
lists only the devices eligible for passthrough to a guest.
In theory, all PCI, USB and SCSI devices can be assigned to a guest
directly. However, host devices usually form a tree: if we assign a
PCI port/SCSI controller/USB controller to a guest, all devices/disks
under the controller are assigned as well. In this patch we only present
the "leaf" host devices to the user as potential passthrough devices.
In other words, the possible devices are wireless network interfaces, SD
card readers, cameras, SCSI units (disk or CD), and so on.
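
To illustrate the "leaf" idea, here is a minimal sketch, not the code in this
patch. It assumes each device is a dict with 'name' and 'parent' keys, as the
dev_info dicts in hostdev.py have, and the sample device names are made up.
The real _strip_parents() additionally keeps the USB device rather than the
SCSI unit for USB mass storage, which this sketch leaves out.

```python
def leaf_devices(dev_infos):
    """Return the sorted names of devices that are nobody's parent."""
    parents = set(dev['parent'] for dev in dev_infos
                  if dev['parent'] is not None)
    return sorted(dev['name'] for dev in dev_infos
                  if dev['name'] not in parents)


# Hypothetical host device tree, for illustration only.
SAMPLE_DEVICES = [
    {'name': 'computer', 'parent': None},
    {'name': 'pci_0000_00_1c_0', 'parent': 'computer'},          # PCI port
    {'name': 'pci_0000_03_00_0', 'parent': 'pci_0000_00_1c_0'},  # wireless
    {'name': 'pci_0000_00_1f_2', 'parent': 'computer'},          # controller
    {'name': 'scsi_host0', 'parent': 'pci_0000_00_1f_2'},
    {'name': 'scsi_0_0_0_0', 'parent': 'scsi_host0'},            # disk
]

if __name__ == '__main__':
    print(leaf_devices(SAMPLE_DEVICES))
```

Only the wireless device and the SCSI disk remain, because every other node
is the parent of something else.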

The Linux kernel is able to recognize the host IOMMU group layout. If two
PCI devices are in the same IOMMU group, there are possible
interconnections between the devices, and they can talk to each
other bypassing the IOMMU. This implies isolation is not perfect between
those devices, so all devices in an IOMMU group must be assigned to the
guest together. On a host that recognizes IOMMU groups, querying
/host/devices?_passthrough_group_by=deviceX returns a list of the
devices in the same IOMMU group as deviceX.
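
The group lookup can be sketched roughly as follows. This is a simplified
illustration under assumptions, not the patch's code: devices are dicts keyed
by name, a device without its own 'iommuGroup' inherits the group of its
nearest ancestor that has one, and the sample names and group numbers are
invented. The patch also restricts the result to eligible passthrough
devices, which is omitted here.

```python
def iommu_group_of(devs, name):
    """Walk up the parent chain until a device reports an IOMMU group."""
    while name is not None:
        dev = devs[name]
        if 'iommuGroup' in dev:
            return dev['iommuGroup']
        name = dev['parent']
    return None


def same_group_devices(devs, name):
    """Return the sorted names of the other devices sharing name's group."""
    group = iommu_group_of(devs, name)
    if group is None:
        # Host does not expose IOMMU groups for this device.
        return []
    return sorted(other for other in devs
                  if other != name and iommu_group_of(devs, other) == group)


# Hypothetical devices, for illustration only.
SAMPLE_DEVS = {
    'computer': {'parent': None},
    'pci_0000_00_19_0': {'parent': 'computer', 'iommuGroup': 5},
    'pci_0000_00_1a_0': {'parent': 'computer', 'iommuGroup': 5},
    'usb_1_1': {'parent': 'pci_0000_00_1a_0'},
    'pci_0000_00_1b_0': {'parent': 'computer', 'iommuGroup': 6},
}

if __name__ == '__main__':
    print(same_group_devices(SAMPLE_DEVS, 'pci_0000_00_19_0'))
```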

How to test:

List all types of devices to passthrough
  curl -k -u root -H "Content-Type: application/json" \
    -H "Accept: application/json" \
    'https://127.0.0.1:8001/host/devices?_passthrough=true'

List all eligible PCI devices to passthrough
  /host/devices?_passthrough=true&_cap=pci

List all USB devices to passthrough
  /host/devices?_passthrough=true&_cap=usb_device

List all SCSI devices to passthrough
  /host/devices?_passthrough=true&_cap=scsi

List devices in the same IOMMU group as pci_0000_00_19_0
  /host/devices?_passthrough_group_by=pci_0000_00_19_0
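
For scripted testing, the query strings above could be built with a small
helper like the one below. It is only a sketch: devices_url() and its
argument names are hypothetical, the base URL is taken from the curl example
above, and the conflict check mirrors the rule in this patch that
_passthrough_group_by cannot be combined with _cap or _passthrough.

```python
from urllib.parse import urlencode

# Base URL from the curl example; adjust for your own server.
BASE_URL = 'https://127.0.0.1:8001/host/devices'


def devices_url(cap=None, passthrough=False, group_by=None):
    """Build a /host/devices URL for the given flag filters."""
    if group_by is not None:
        if cap is not None or passthrough:
            # The server rejects this combination with KCHHOST0004E.
            raise ValueError('group_by conflicts with cap and passthrough')
        return BASE_URL + '?' + urlencode({'_passthrough_group_by': group_by})

    params = []
    if passthrough:
        params.append(('_passthrough', 'true'))
    if cap is not None:
        params.append(('_cap', cap))
    if not params:
        return BASE_URL
    return BASE_URL + '?' + urlencode(params)


if __name__ == '__main__':
    print(devices_url(passthrough=True, cap='pci'))
    print(devices_url(group_by='pci_0000_00_19_0'))
```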

v1:
  v1 series does not contain this patch.

v2:
  Deal with calculating "leaf" devices and "affected" devices.

v3 v4:
  No change.

v5:
  Change _passthrough=1 to _passthrough=true in the URI scheme. Filter
PCI devices according to the PCI class.

v6:
  Don't pass through PCI devices of class code 07. On modern
x86 machines, it's possible that
"6 Series/C200 Series Chipset Family MEI Controller" and
"6 Series/C200 Series Chipset Family KT Controller"
are of this class code. These two devices are not suitable for
passthrough to a guest. We don't have a simple and reliable way to
distinguish a normal serial controller from a host chipset XXX controller.
This type of PCI device also includes various serial, parallel, modem and
communication controllers. Serial and parallel controllers can be
redirected from ttyS0 to QEMU's pty using socat, and there is little
performance benefit in assigning them directly to a guest. So it's OK not
to pass through PCI devices of class code 07.
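
As a rough illustration of what "class code 07" means in terms of the value
read from a PCI device's sysfs 'class' file: the base class is the top byte
of the 24-bit class code. The helper names below are hypothetical and the
sample values are only examples of the format, not taken from a real host.

```python
def pci_base_class(class_text):
    """Return the base class (top byte) of a sysfs 'class' string."""
    class_code = int(class_text.strip(), 16)  # e.g. '0x070002'
    return (class_code >> 16) & 0xff


def is_class_07(class_text):
    """True if the device belongs to PCI base class 07."""
    return pci_base_class(class_text) == 0x07


if __name__ == '__main__':
    for text in ('0x070002\n', '0x020000\n'):
        print(text.strip(), is_class_07(text))
```

The patch itself does the equivalent masking with 'pci_class & 0xff0000' and
looks the result up in a whitelist, so class 07 is excluded simply by not
being listed.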

v8:
  Use a new flag filter "_passthrough_group_by"
    /host/devices?_passthrough_group_by=pci_XXX
  instead of using sub-collection
    /host/devices/pci_XXX/passthrough_affected_devices

v9:
  Use the same LibvirtConnection object as the Model, so as to avoid
  exhausting connections.

v10:
  Adapt to RHEL 6. RHEL 6 does not provide IOMMU group information in
  sysfs. For now we just ignore this error and live with it. Device
  passthrough for PCI devices will not work, but the basic device
  information is still provided to the user. In the future we'll develop
  code to gather IOMMU group information.

Signed-off-by: Zhou Zheng Sheng <zhshzhou at linux.vnet.ibm.com>
---
 src/kimchi/i18n.py          |   1 +
 src/kimchi/model/host.py    |  24 ++++++++-
 src/kimchi/model/hostdev.py | 115 ++++++++++++++++++++++++++++++++++++++++++--
 3 files changed, 135 insertions(+), 5 deletions(-)

diff --git a/src/kimchi/i18n.py b/src/kimchi/i18n.py
index a34ab21..eea7deb 100644
--- a/src/kimchi/i18n.py
+++ b/src/kimchi/i18n.py
@@ -221,6 +221,7 @@ messages = {
     "KCHHOST0001E": _("Unable to shutdown host machine as there are running virtual machines"),
     "KCHHOST0002E": _("Unable to reboot host machine as there are running virtual machines"),
     "KCHHOST0003E": _("Node device '%(name)s' not found"),
+    "KCHHOST0004E": _("Conflicting flag filters specified."),
 
     "KCHPKGUPD0001E": _("No packages marked for update"),
     "KCHPKGUPD0002E": _("Package %(name)s is not marked to be updated."),
diff --git a/src/kimchi/model/host.py b/src/kimchi/model/host.py
index 97adeed..3035e00 100644
--- a/src/kimchi/model/host.py
+++ b/src/kimchi/model/host.py
@@ -31,8 +31,9 @@ from kimchi import disks
 from kimchi import netinfo
 from kimchi import xmlutils
 from kimchi.basemodel import Singleton
-from kimchi.exception import InvalidOperation, NotFoundError, OperationFailed
 from kimchi.model import hostdev
+from kimchi.exception import InvalidOperation, InvalidParameter
+from kimchi.exception import NotFoundError, OperationFailed
 from kimchi.model.config import CapabilitiesModel
 from kimchi.model.tasks import TaskModel
 from kimchi.model.vms import DOM_STATE_MAP
@@ -279,8 +280,16 @@ class DevicesModel(object):
     def __init__(self, **kargs):
         self.conn = kargs['conn']
 
-    def get_list(self, _cap=None):
+    def get_list(self, _cap=None, _passthrough=None,
+                 _passthrough_group_by=None):
         conn = self.conn.get()
+
+        if _passthrough_group_by is not None:
+            # _passthrough_group_by conflicts with _cap and _passthrough
+            if (_cap, _passthrough) != (None, None):
+                raise InvalidParameter("KCHHOST0004E")
+            return self._get_passthrough_affected_devs(_passthrough_group_by)
+
         if _cap is None:
             dev_names = [name.name() for name in conn.listAllDevices(0)]
         elif _cap == 'fc_host':
@@ -288,8 +297,19 @@ class DevicesModel(object):
         else:
             # Get devices with required capability
             dev_names = conn.listDevices(_cap, 0)
+
+        if _passthrough is not None and _passthrough.lower() == 'true':
+            passthrough_names = [
+                dev['name'] for dev in hostdev.get_passthrough_dev_infos(conn)]
+            dev_names = list(set(dev_names) & set(passthrough_names))
         return dev_names
 
+    def _get_passthrough_affected_devs(self, dev_name):
+        conn = self.conn.get()
+        info = DeviceModel(conn=self.conn).lookup(dev_name)
+        affected = hostdev.get_affected_passthrough_devices(conn, info)
+        return [dev_info['name'] for dev_info in affected]
+
     def _get_devices_fc_host(self):
         conn = self.conn.get()
         # Libvirt < 1.0.5 does not support fc_host capability
diff --git a/src/kimchi/model/hostdev.py b/src/kimchi/model/hostdev.py
index 002b16c..a8d1b9a 100644
--- a/src/kimchi/model/hostdev.py
+++ b/src/kimchi/model/hostdev.py
@@ -17,7 +17,9 @@
 # License along with this library; if not, write to the Free Software
 # Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301 USA
 
+import os
 from pprint import pformat
+from pprint import pprint
 
 from kimchi.model.libvirtconnection import LibvirtConnection
 from kimchi.utils import kimchi_log
@@ -47,6 +49,111 @@ def _get_dev_info_tree(dev_infos):
     return root
 
 
+def _strip_parents(devs, dev):
+    parent = dev['parent']
+    while parent is not None:
+        try:
+            parent_dev = devs.pop(parent)
+        except KeyError:
+            break
+
+        if (parent_dev['device_type'],
+                dev['device_type']) == ('usb_device', 'scsi'):
+            # For USB device containing mass storage, passthrough the
+            # USB device itself, not the SCSI unit.
+            devs.pop(dev['name'])
+            break
+
+        parent = parent_dev['parent']
+
+
+def _is_pci_qualified(pci_dev):
+    # PCI class such as bridge and storage controller are not suitable to
+    # passthrough to VM, so we make a whitelist and only passthrough PCI
+    # class in the list.
+
+    whitelist_pci_classes = {
+        # Refer to Linux Kernel code include/linux/pci_ids.h
+        0x000000: {  # Old PCI devices
+            0x000100: None},  # Old VGA devices
+        0x020000: None,  # Network controller
+        0x030000: None,  # Display controller
+        0x040000: None,  # Multimedia device
+        0x080000: {  # System Peripheral
+            0x088000: None},  # Misc Peripheral, such as SDXC/MMC Controller
+        0x090000: None,  # Input device
+        0x0d0000: None,  # Wireless controller
+        0x0f0000: None,  # Satellite communication controller
+        0x100000: None,  # Encryption controller
+        0x110000: None,  # Signal Processing controller
+        }
+
+    with open(os.path.join(pci_dev['path'], 'class')) as f:
+        pci_class = int(f.readline().strip(), 16)
+
+    try:
+        subclasses = whitelist_pci_classes[pci_class & 0xff0000]
+    except KeyError:
+        return False
+
+    if subclasses is None:
+        return True
+
+    if pci_class & 0xffff00 in subclasses:
+        return True
+
+    return False
+
+
+def get_passthrough_dev_infos(libvirt_conn):
+    ''' Get devices eligible to be passed through to VM. '''
+
+    dev_infos = _get_all_host_dev_infos(libvirt_conn)
+    devs = dict([(dev_info['name'], dev_info) for dev_info in dev_infos])
+
+    for dev in dev_infos:
+        if dev['device_type'] in ('pci', 'usb_device', 'scsi'):
+            _strip_parents(devs, dev)
+
+    def is_eligible(dev):
+        return dev['device_type'] in ('usb_device', 'scsi') or \
+            (dev['device_type'] == 'pci' and _is_pci_qualified(dev))
+
+    return [dev for dev in devs.itervalues() if is_eligible(dev)]
+
+
+def get_affected_passthrough_devices(libvirt_conn, passthrough_dev):
+    devs = dict([(dev['name'], dev) for dev in
+                 _get_all_host_dev_infos(libvirt_conn)])
+
+    def get_iommu_group(dev_info):
+        try:
+            return dev_info['iommuGroup']
+        except KeyError:
+            pass
+
+        parent = dev_info['parent']
+        while parent is not None:
+            try:
+                iommuGroup = devs[parent]['iommuGroup']
+            except KeyError:
+                pass
+            else:
+                return iommuGroup
+            parent = devs[parent]['parent']
+
+        return -1
+
+    iommu_group = get_iommu_group(passthrough_dev)
+
+    if iommu_group == -1:
+        return []
+
+    return [dev for dev in get_passthrough_dev_infos(libvirt_conn)
+            if dev['name'] != passthrough_dev['name'] and
+            get_iommu_group(dev) == iommu_group]
+
+
 def get_dev_info(node_dev):
     ''' Parse the node device XML string into dict according to
     http://libvirt.org/formatnode.html. '''
@@ -187,8 +294,7 @@ def _get_usb_device_dev_info(info):
 
 
 # For test and debug
-def _print_host_dev_tree():
-    libvirt_conn = LibvirtConnection('qemu:///system').get()
+def _print_host_dev_tree(libvirt_conn):
     dev_infos = _get_all_host_dev_infos(libvirt_conn)
     root = _get_dev_info_tree(dev_infos)
     if root is None:
@@ -226,4 +332,7 @@ def _format_dev_node(node):
 
 
 if __name__ == '__main__':
-    _print_host_dev_tree()
+    libvirt_conn = LibvirtConnection('qemu:///system').get()
+    _print_host_dev_tree(libvirt_conn)
+    print 'Eligible passthrough devices:'
+    pprint(get_passthrough_dev_infos(libvirt_conn))
-- 
1.9.3



