[Kimchi-devel] [PATCH v11 2/6] Host device passthrough: List eligible device to passthrough

Aline Manera alinefm at linux.vnet.ibm.com
Fri Oct 3 16:53:35 UTC 2014


On 09/30/2014 07:00 AM, Zhou Zheng Sheng wrote:
> This patch adds a '_passthrough=true' filter to /host/devices, so that it
> lists only the devices eligible for passthrough to a guest.
> Theoretically, all PCI, USB and SCSI devices can be assigned to a guest
> directly.
>
> The Linux kernel is able to recognize the host IOMMU group layout. If two
> PCI devices are in the same IOMMU group, there may be interconnections
> between them, and the devices can talk to each other bypassing the IOMMU.
> This implies that isolation between those devices is not perfect, so all
> devices in an IOMMU group must be assigned to the guest together. On a
> host that recognizes IOMMU groups, accessing the URI
> /host/devices?_passthrough_affected_by=DEVICE_NAME returns a list
> containing the devices in the same IOMMU group as DEVICE_NAME, and all
> of their child devices. The front-end can then show all the affected
> devices to the user, which helps the user decide which host devices to
> assign to the guest.
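
For anyone who wants to check the grouping by hand while testing, here is a
minimal sketch that reads the kernel's IOMMU group layout from sysfs. It
assumes the usual /sys/kernel/iommu_groups/<N>/devices/ layout; the function
names are made up for illustration and are not part of this patch. On a host
without that directory (the RHEL 6 case from v10) it simply returns an empty
mapping.

```python
import os


def list_iommu_groups(root='/sys/kernel/iommu_groups'):
    """Map IOMMU group number -> sorted list of PCI addresses in that group.

    Returns an empty dict when the host does not expose IOMMU groups.
    """
    groups = {}
    if not os.path.isdir(root):
        return groups
    for entry in os.listdir(root):
        dev_dir = os.path.join(root, entry, 'devices')
        if entry.isdigit() and os.path.isdir(dev_dir):
            groups[int(entry)] = sorted(os.listdir(dev_dir))
    return groups


def group_peers(groups, pci_addr):
    """Return the other devices sharing an IOMMU group with pci_addr."""
    for members in groups.values():
        if pci_addr in members:
            return [m for m in members if m != pci_addr]
    return []
```

Calling group_peers(list_iommu_groups(), '0000:00:19.0') should then give the
PCI addresses that have to be assigned together with that device.
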
>
> How to test:
>
> List all types of devices to passthrough
>    curl -k -u root -H "Content-Type: application/json" \
>      -H "Accept: application/json" \
>      'https://127.0.0.1:8001/host/devices?_passthrough=true'
>
> List all eligible PCI devices to passthrough
>    /host/devices?_passthrough=true&_cap=pci
>
> List all USB devices to passthrough
>    /host/devices?_passthrough=true&_cap=usb_device
>
> List all SCSI devices to passthrough
>    /host/devices?_passthrough=true&_cap=scsi
>
> List devices in the same IOMMU group as pci_0000_00_19_0
>    /host/devices?_passthrough_affected_by=pci_0000_00_19_0
>
> v1:
>    v1 series does not contain this patch.
>
> v2:
>    Handle the calculation of "leaf" devices and "affected" devices.
>
> v5:
>    Change _passthrough=1 to _passthrough=true in the URI scheme. Filter
> PCI devices according to the PCI class.
>
> v6:
>    Don't pass through PCI devices of class code 07. On a modern
> x86 machine, it's possible that
> "6 Series/C200 Series Chipset Family MEI Controller" and
> "6 Series/C200 Series Chipset Family KT Controller"
> are of this class code. These two devices are not suitable to
> pass through to a guest. We don't have a simple and reliable way to
> distinguish a normal serial controller from a host chipset XXX controller.
> This class of PCI devices also includes various serial, parallel, modem
> and communication controllers. Serial and parallel controllers can be
> redirected from ttyS0 to QEMU's pty using socat, and there is little
> performance benefit in assigning them directly to a guest. So it's OK not
> to pass through PCI devices of class code 07.
>
> v8:
>    Use a new flag filter "_passthrough_group_by"
>      /host/devices?_passthrough_group_by=pci_XXX
>    instead of using sub-collection
>      /host/devices/pci_XXX/passthrough_affected_devices
>
> v9:
>    Use the same LibvirtConnection object as the Model, so as to avoid
>    exhausting connections.
>
> v10:
>    Adapt to RHEL 6. RHEL 6 does not provide iommu group information in
>    sysfs. For now we just ignore this error and live with it. Passthrough
>    of PCI devices will not work, but the basic device information is
>    still provided to the user. In the future we'll develop code to gather
>    iommu group information.
>
> v11:
>    In previous commits, we did not allow passing through a device with
>    children, and only passed through the children. This proved inflexible
>    and less useful. The PCI class code white list also filtered out too
>    many types of devices, and there is no way to cleanly differentiate
>    devices suitable for passthrough using the class code.
>
>    In this patch, we allow Kimchi to pass through a parent PCI device. The
>    front-end can use the existing "_passthrough_group_by" filter to list
>    the affected child devices and other PCI devices in the same group,
>    as well as the children of those devices. We also drop the class code
>    white list, and filter out only PCI bridges and video cards.
>
>    libvirt uses domain, bus, slot and function to encode the device name,
>    so sorting by device name has the same effect as sorting by
>    domain:bus:slot:function. This patch sorts all the devices by name.
>
>    When a SCSI adapter is assigned to a virtual machine, the previous
>    node devices scsi_host and scsi_target become stale on the host machine.
>    Unfortunately, libvirt only removes the invalid scsi_host device, but
>    not scsi_target. When Kimchi looks up the parent scsi_host of a
>    scsi_target, the scsi_host device no longer exists, and the lookup
>    fails with a KeyError.
>
>    This patch catches such errors and ignores devices without a valid
>    parent device.
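
On the sorting point in the v11 notes: the claim holds as long as every
field in the name is zero-padded to a fixed width, as in pci_0000_00_19_0.
Below is a small sketch to convince ourselves. The helper names are
hypothetical and it only understands the pci_DOMAIN_BUS_SLOT_FUNCTION
pattern shown in the commit message, with all fields in hex.

```python
def pci_sort_key(name):
    """Turn 'pci_0000_00_19_0' into the tuple (0x0000, 0x00, 0x19, 0x0)."""
    fields = name.split('_')[1:]
    return tuple(int(field, 16) for field in fields)


def name_order_matches_address_order(names):
    """True if plain string sorting agrees with numeric address sorting."""
    return sorted(names) == sorted(names, key=pci_sort_key)
```

With fixed-width names the two orders agree. They would diverge for names
whose fields are not padded, so the plain sort is best read as "stable and
predictable" rather than strictly numeric for every device type.
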
>
> Signed-off-by: Zhou Zheng Sheng <zhshzhou at linux.vnet.ibm.com>
> ---
>   src/kimchi/i18n.py          |   1 +
>   src/kimchi/model/host.py    |  33 ++++++++++--
>   src/kimchi/model/hostdev.py | 121 ++++++++++++++++++++++++++++++++++++++++++--
>   3 files changed, 147 insertions(+), 8 deletions(-)
>
> diff --git a/src/kimchi/i18n.py b/src/kimchi/i18n.py
> index 1b543ce..98adc46 100644
> --- a/src/kimchi/i18n.py
> +++ b/src/kimchi/i18n.py
> @@ -233,6 +233,7 @@ messages = {
>       "KCHHOST0001E": _("Unable to shutdown host machine as there are running virtual machines"),
>       "KCHHOST0002E": _("Unable to reboot host machine as there are running virtual machines"),
>       "KCHHOST0003E": _("Node device '%(name)s' not found"),
> +    "KCHHOST0004E": _("Conflicting flag filters specified."),
>
>       "KCHPKGUPD0001E": _("No packages marked for update"),
>       "KCHPKGUPD0002E": _("Package %(name)s is not marked to be updated."),
> diff --git a/src/kimchi/model/host.py b/src/kimchi/model/host.py
> index 7d7cd66..5d31809 100644
> --- a/src/kimchi/model/host.py
> +++ b/src/kimchi/model/host.py
> @@ -32,8 +32,9 @@ from kimchi import disks
>   from kimchi import netinfo
>   from kimchi import xmlutils
>   from kimchi.basemodel import Singleton
> -from kimchi.exception import InvalidOperation, NotFoundError, OperationFailed
>   from kimchi.model import hostdev
> +from kimchi.exception import InvalidOperation, InvalidParameter
> +from kimchi.exception import NotFoundError, OperationFailed
>   from kimchi.model.config import CapabilitiesModel
>   from kimchi.model.tasks import TaskModel
>   from kimchi.model.vms import DOM_STATE_MAP
> @@ -299,10 +300,28 @@ class DevicesModel(object):
>           except AttributeError:
>               self.cap_map['fc_host'] = None
>
> -    def get_list(self, _cap=None):
> +    def get_list(self, _cap=None, _passthrough=None,
> +                 _passthrough_affected_by=None):
> +        if _passthrough_affected_by is not None:
> +            # _passthrough_affected_by conflicts with _cap and _passthrough
> +            if (_cap, _passthrough) != (None, None):
> +                raise InvalidParameter("KCHHOST0004E")
> +            return sorted(
> +                self._get_passthrough_affected_devs(_passthrough_affected_by))
> +
>           if _cap == 'fc_host':
> -            return self._get_devices_fc_host()
> -        return self._get_devices_with_capability(_cap)
> +            dev_names = self._get_devices_fc_host()
> +        else:
> +            dev_names = self._get_devices_with_capability(_cap)
> +
> +        if _passthrough is not None and _passthrough.lower() == 'true':
> +            conn = self.conn.get()
> +            passthrough_names = [
> +                dev['name'] for dev in hostdev.get_passthrough_dev_infos(conn)]
> +            dev_names = list(set(dev_names) & set(passthrough_names))
> +
> +        dev_names.sort()
> +        return dev_names
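
One observation on the flag handling above: any value other than 'true'
(for example '1' or a typo) silently behaves as if the filter was not given.
That may well be intended. If we ever want it stricter, something along the
lines of this sketch could work. parse_bool_flag is a hypothetical helper,
not existing Kimchi code, and the real thing would raise InvalidParameter
with a proper message code instead of ValueError.

```python
def parse_bool_flag(value):
    """Return True/False for 'true'/'false' (any case), None if unset.

    Raises ValueError for anything else, so typos are not silently ignored.
    """
    if value is None:
        return None
    lowered = value.strip().lower()
    if lowered in ('true', 'false'):
        return lowered == 'true'
    raise ValueError('expected "true" or "false", got %r' % (value,))
```
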
>
>       def _get_devices_with_capability(self, cap):
>           conn = self.conn.get()
> @@ -314,6 +333,12 @@ class DevicesModel(object):
>                   return []
>           return [name.name() for name in conn.listAllDevices(cap_flag)]
>
> +    def _get_passthrough_affected_devs(self, dev_name):
> +        conn = self.conn.get()
> +        info = DeviceModel(conn=self.conn).lookup(dev_name)
> +        affected = hostdev.get_affected_passthrough_devices(conn, info)
> +        return [dev_info['name'] for dev_info in affected]
> +
>       def _get_devices_fc_host(self):
>           conn = self.conn.get()
>           # Libvirt < 1.0.5 does not support fc_host capability
> diff --git a/src/kimchi/model/hostdev.py b/src/kimchi/model/hostdev.py
> index 1d660d8..bf95678 100644
> --- a/src/kimchi/model/hostdev.py
> +++ b/src/kimchi/model/hostdev.py
> @@ -17,7 +17,9 @@
>   # License along with this library; if not, write to the Free Software
>   # Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301 USA
>
> +import os
>   from pprint import pformat
> +from pprint import pprint
>
>   from kimchi.model.libvirtconnection import LibvirtConnection
>   from kimchi.utils import kimchi_log
> @@ -36,7 +38,13 @@ def _get_dev_info_tree(dev_infos):
>           if dev_info['parent'] is None:
>               root = dev_info
>               continue
> -        parent = devs[dev_info['parent']]
> +
> +        try:
> +            parent = devs[dev_info['parent']]
> +        except KeyError:
> +            kimchi_log.error('Parent %s of device %s does not exist.',
> +                             dev_info['parent'], dev_info['name'])
> +            continue
>
>           try:
>               children = parent['children']
> @@ -47,6 +55,109 @@ def _get_dev_info_tree(dev_infos):
>       return root
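
To illustrate the stale-parent case this hunk guards against, here is a
simplified, standalone version of the tree building. It is only a sketch:
build_tree and the sample device names in the usage note are invented, and
it returns the orphans instead of logging them so the behaviour is easy to
inspect.

```python
def build_tree(dev_infos):
    """Attach each device to its parent's 'children' list.

    Returns (root, orphans), where orphans lists the names of devices
    whose parent is missing from dev_infos (e.g. a stale scsi_target).
    """
    devs = dict((dev['name'], dev) for dev in dev_infos)
    root = None
    orphans = []
    for dev in dev_infos:
        parent_name = dev['parent']
        if parent_name is None:
            root = dev
            continue
        parent = devs.get(parent_name)
        if parent is None:
            # Parent vanished (stale node device): skip instead of failing.
            orphans.append(dev['name'])
            continue
        parent.setdefault('children', []).append(dev)
    return root, orphans
```

Feeding it a scsi_target whose scsi_host is no longer listed leaves the rest
of the tree intact and reports only that one device as an orphan.
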
>
>
> +def _is_pci_qualified(pci_dev):
> +    # PCI bridges are not suitable for passthrough
> +    # KVM does not support passing through graphics cards for now
> +    blacklist_classes = (0x030000, 0x060000)
> +
> +    with open(os.path.join(pci_dev['path'], 'class')) as f:
> +        pci_class = int(f.readline().strip(), 16)
> +
> +    if pci_class & 0xff0000 in blacklist_classes:
> +        return False
> +
> +    return True
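
The check above is correct as written, since & binds tighter than "in", but
it took me a second look. A possible alternative, just as a sketch, is to
compare the base class byte (bits 23-16 of the class code) directly. The
names below are made up, and splitting the file read from the decision is
only my suggestion to make it easy to test without real sysfs entries.

```python
import os

# PCI base classes: 0x03 = display controller, 0x06 = bridge
BLACKLIST_BASE_CLASSES = (0x03, 0x06)


def read_pci_class(sysfs_path):
    """Read the 24-bit class code, e.g. '0x060400', from a sysfs device dir."""
    with open(os.path.join(sysfs_path, 'class')) as f:
        return int(f.readline().strip(), 16)


def is_pci_class_qualified(pci_class):
    """False for bridges and display controllers, True otherwise."""
    base_class = (pci_class >> 16) & 0xff
    return base_class not in BLACKLIST_BASE_CLASSES
```
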
> +
> +
> +def get_passthrough_dev_infos(libvirt_conn):
> +    ''' Get devices eligible to be passed through to VM. '''
> +
> +    def is_eligible(dev):
> +        return dev['device_type'] in ('usb_device', 'scsi') or \
> +            (dev['device_type'] == 'pci' and _is_pci_qualified(dev))
> +
> +    dev_infos = _get_all_host_dev_infos(libvirt_conn)
> +
> +    return [dev_info for dev_info in dev_infos if is_eligible(dev_info)]
> +
> +
> +def _get_same_iommugroup_devices(dev_infos, device_info):
> +    dev_dict = dict([(dev_info['name'], dev_info) for dev_info in dev_infos])
> +
> +    def get_iommu_group(dev_info):
> +        # Find out the iommu group of a given device.
> +        # Child device belongs to the same iommu group as the parent device.
> +        try:
> +            return dev_info['iommuGroup']
> +        except KeyError:
> +            pass
> +
> +        parent = dev_info['parent']
> +        while parent is not None:
> +            try:
> +                parent_info = dev_dict[parent]
> +            except KeyError:
> +                kimchi_log.error("Parent %s of device %s does not exist",
> +                                 parent, dev_info['name'])
> +                break
> +
> +            try:
> +                iommuGroup = parent_info['iommuGroup']
> +            except KeyError:
> +                pass
> +            else:
> +                return iommuGroup
> +
> +            parent = parent_info['parent']
> +
> +        return -1
> +

Minor comment: returning None here (rather than the -1 sentinel) would be
more Pythonic.
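
Roughly what I have in mind, as a sketch only. It is written as a
standalone function taking dev_dict explicitly so it can be run on its own,
and it keeps the same dict keys ('parent', 'iommuGroup') as the patch. The
logging of a missing parent is left out for brevity.

```python
def get_iommu_group(dev_info, dev_dict):
    """Return the device's IOMMU group, inherited from the nearest
    ancestor that has one, or None when no group can be determined."""
    current = dev_info
    while current is not None:
        if 'iommuGroup' in current:
            return current['iommuGroup']
        parent = current['parent']
        # A missing parent (stale device) ends the walk with None.
        current = dev_dict.get(parent) if parent is not None else None
    return None
```

The caller would then test "if iommu_group is None". Note that a plain
truthiness check would be wrong, because group 0 is a valid group.
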

> +    iommu_group = get_iommu_group(device_info)
> +
> +    if iommu_group == -1:
> +        return []
> +
> +    return [dev_info for dev_info in dev_infos
> +            if dev_info['name'] != device_info['name'] and
> +            get_iommu_group(dev_info) == iommu_group]
> +
> +
> +def _get_children_devices(dev_infos, device_info):
> +    def get_children_recursive(parent):
> +        try:
> +            children = parent['children']
> +        except KeyError:
> +            return []
> +
> +        result = []
> +        for child in children:
> +            result.append(child)
> +            result.extend(get_children_recursive(child))
> +
> +        return result
> +
> +    # Annotate every dev_info element with children information
> +    _get_dev_info_tree(dev_infos)
> +
> +    for dev_info in dev_infos:
> +        if dev_info['name'] == device_info['name']:
> +            return get_children_recursive(dev_info)
> +
> +    return []
> +
> +
> +def get_affected_passthrough_devices(libvirt_conn, passthrough_dev):
> +    dev_infos = _get_all_host_dev_infos(libvirt_conn)
> +
> +    group_devices = _get_same_iommugroup_devices(dev_infos, passthrough_dev)
> +    if not group_devices:
> +        # On host without iommu group support, the affected devices should
> +        # at least include all children devices
> +        group_devices.extend(_get_children_devices(dev_infos, passthrough_dev))
> +
> +    return group_devices
> +
> +
>   def get_dev_info(node_dev):
>       ''' Parse the node device XML string into dict according to
>       http://libvirt.org/formatnode.html. '''
> @@ -196,8 +307,7 @@ def _get_usb_device_dev_info(info):
>
>
>   # For test and debug
> -def _print_host_dev_tree():
> -    libvirt_conn = LibvirtConnection('qemu:///system').get()
> +def _print_host_dev_tree(libvirt_conn):
>       dev_infos = _get_all_host_dev_infos(libvirt_conn)
>       root = _get_dev_info_tree(dev_infos)
>       if root is None:
> @@ -235,4 +345,7 @@ def _format_dev_node(node):
>
>
>   if __name__ == '__main__':
> -    _print_host_dev_tree()
> +    libvirt_conn = LibvirtConnection('qemu:///system').get()
> +    _print_host_dev_tree(libvirt_conn)
> +    print 'Eligible passthrough devices:'
> +    pprint(get_passthrough_dev_infos(libvirt_conn))