[Kimchi-devel] [PATCH v6 2/4] Host device passthrough: List eligible device to passthrough

Zhou Zheng Sheng zhshzhou at linux.vnet.ibm.com
Mon Jun 23 03:17:05 UTC 2014


on 2014/06/16 16:36, Mark Wu wrote:
> On 06/09/2014 05:28 PM, Zhou Zheng Sheng wrote:
>> This patch adds a '_passthrough=true' filter to /host/devices, so it can
>> filter and show all devices eligible for passthrough to a guest.
>> Theoretically, all PCI, USB and SCSI devices can be assigned to guest
>> directly. However usually all host devices form a tree, if we assign a
>> PCI port/SCSI controller/USB controller to guest, all devices/disks
>> under the controller are assigned as well. In this patch we only present
>> the "leaf" host devices to the user as potential passthrough devices.
>> In other words, the possible devices are wireless network interfaces, SD
>> card readers, cameras, SCSI units (disk or CD), and so on.
>>
>> Linux kernel is able to recognize the host IOMMU group layout. If two
>> PCI devices are in the same IOMMU group, it means there are possible
>> interconnections between the devices, and the devices can talk to each
>> other bypassing IOMMU. This implies isolation is not perfect between those
>> devices, so all devices in an IOMMU group must be assigned to guest
>> together. On host that recognizes IOMMU groups, by accessing the URI
>> /host/devices/deviceX/passthrough_affected_devices, it returns a list
>> containing the devices in the same IOMMU group as deviceX.
>>
>> How to test:
>>
>> List all types of devices to passthrough
>>    curl -k -u root -H "Content-Type: application/json" \
>>      -H "Accept: application/json" \
>>      'https://127.0.0.1:8001/host/devices?_passthrough=true'
>>
>> List all eligible PCI devices to passthrough
>>    /host/devices?_passthrough=true&_cap=pci
>>
>> List all USB devices to passthrough
>>    /host/devices?_passthrough=true&_cap=usb_device
>>
>> List all SCSI devices to passthrough
>>    /host/devices?_passthrough=true&_cap=scsi
>>
>> List devices in the same IOMMU group as pci_0000_00_19_0
>>    /host/devices/pci_0000_00_19_0/passthrough_affected_devices
>>
>> v1:
>>    v1 series does not contain this patch.
>>
>> v2:
>>    Deal with calculating "leaf" devices and "affected" devices.
>>
>> v3 v4:
>>    No change.
>>
>> v5:
>>    Change _passthrough=1 to _passthrough=true in the URI scheme. Filter
>> PCI devices according to the PCI class.
>>
>> v6:
>>    Don't passthrough PCI device of class code 07. In modern
>> x86 machine, it's possible that
>> "6 Series/C200 Series Chipset Family MEI Controller" and
>> "6 Series/C200 Series Chipset Family KT Controller"
>> are of this class code. These two devices are not suitable to
>> pass through to a guest. We don't have a simple and reliable way to
>> distinguish a normal serial controller from a host chipset XXX controller.
>> This type of PCI device also includes various serial, parallel, modem,
>> and communication controllers. Serial and parallel controllers can be
>> redirected from ttyS0 to QEMU's pty using socat, and there is little
>> performance benefit in assigning them directly to a guest. So it's OK not to
>> pass through PCI devices of class code 07.
>>
>> Signed-off-by: Zhou Zheng Sheng <zhshzhou at linux.vnet.ibm.com>
>> ---
>>   src/kimchi/control/host.py |   9 ++++
>>   src/kimchi/hostdev.py      | 107 +++++++++++++++++++++++++++++++++++++++++++++
>>   src/kimchi/model/host.py   |  17 ++++++-
>>   3 files changed, 132 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/kimchi/control/host.py b/src/kimchi/control/host.py
>> index ebf1bed..15f2343 100644
>> --- a/src/kimchi/control/host.py
>> +++ b/src/kimchi/control/host.py
>> @@ -103,9 +103,18 @@ class Devices(Collection):
>>           self.resource = Device
>>
>>
>> +class PassthroughAffectedDevices(Collection):
>> +    def __init__(self, model, device_id):
>> +        super(PassthroughAffectedDevices, self).__init__(model)
>> +        self.resource = Device
>> +        self.model_args = (device_id, )
>> +
>> +
>>   class Device(Resource):
>>       def __init__(self, model, id):
>>           super(Device, self).__init__(model, id)
>> +        self.passthrough_affected_devices = \
>> +            PassthroughAffectedDevices(self.model, id)
>>
>>       @property
>>       def data(self):
>> diff --git a/src/kimchi/hostdev.py b/src/kimchi/hostdev.py
>> index d4c142d..f70154f 100644
>> --- a/src/kimchi/hostdev.py
>> +++ b/src/kimchi/hostdev.py
>> @@ -17,6 +17,8 @@
>>   # License along with this library; if not, write to the Free Software
>>   # Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
>>
>> +import os
>> +
>>   from kimchi.model.libvirtconnection import LibvirtConnection
>>   from kimchi.utils import kimchi_log
>>   from kimchi.xmlutils import dictize
>> @@ -46,6 +48,108 @@ def _get_dev_info_tree(dev_infos):
>>       return root
>>
>>
>> +def _strip_parents(devs, dev):
>> +    parent = dev['parent']
>> +    while parent is not None:
>> +        try:
>> +            parent_dev = devs.pop(parent)
>> +        except KeyError:
>> +            break
>> +
>> +        if (parent_dev['device_type'],
>> +                dev['device_type']) == ('usb_device', 'scsi'):
>> +            # For USB device containing mass storage, passthrough the
>> +            # USB device itself, not the SCSI unit.
>> +            devs.pop(dev['name'])
>> +            break
>> +
>> +        parent = parent_dev['parent']
>> +
>> +
>> +def _is_pci_qualified(pci_dev):
>> +    # PCI classes such as bridge and storage controller are not suitable
>> +    # to pass through to a VM, so we make a whitelist and only pass
>> +    # through PCI classes in the list.
>> +
>> +    whitelist_pci_classes = {
>> +        # Refer to Linux Kernel code include/linux/pci_ids.h
>> +        0x000000: {  # Old PCI devices
>> +            0x000100: None},  # Old VGA devices
>> +        0x020000: None,  # Network controller
>> +        0x030000: None,  # Display controller
>> +        0x040000: None,  # Multimedia device
>> +        0x090000: None,  # Input device
>> +        0x0d0000: None,  # Wireless controller
>> +        0x0f0000: None,  # Satellite communication controller
>> +        0x100000: None,  # Encryption controller
>> +        0x110000: None,  # Signal Processing controller
>> +        }
>> +
>> +    with open(os.path.join(pci_dev['path'], 'class')) as f:
>> +        pci_class = int(f.read(), 16)
> Better to use readline() and strip the '\n', even though it will not break.

Good suggestion.
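
A minimal sketch of that change, assuming the sysfs 'class' file holds a
single line such as 0x020000. The helper name read_pci_class and the
temporary directory are only for illustration and are not part of the patch.

```python
import os
import tempfile


def read_pci_class(dev_path):
    # Hypothetical helper: read the first line of <dev_path>/class,
    # strip the trailing newline and parse it as a hexadecimal integer.
    with open(os.path.join(dev_path, 'class')) as f:
        return int(f.readline().strip(), 16)


# A temporary directory stands in for a real sysfs device path here.
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, 'class'), 'w') as f:
    f.write('0x020000\n')

pci_class = read_pci_class(demo_dir)
```

int() with base 16 accepts the "0x" prefix, and it also tolerates
surrounding whitespace, which is why the original code does not break.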

>> +
>> +    try:
>> +        subclass = whitelist_pci_classes[pci_class & 0xff0000]
> I would like to suggest you separate the old pci device from the list. 
> You can define two lists:
> 
> whitelist_pci_classes = (0x020000, 0x030000, 0x040000, ...)
> whitelist_old_pci_classes = (0x000100,)
> 
> if pci_class & 0xff0000 != 0:
>    use whitelist_pci_classes
> else:
>    whitelist_old_pci_classes
> 
> 

That would work today, because currently only one sub-class of the old
PCI devices is eligible. For the other classes, as far as I have
investigated, the whole class can be passed through.

However, I think the situation may change once this is used and tested
more. We may discover that some sub-classes should be filtered out in the
future. The current data structure and filtering algorithm are more
robust against such modifications.
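
As a rough sketch of the point (not the patch code itself), the nested
dict lets a whole class be narrowed to a few sub-classes by editing data
only. The function name is_class_whitelisted and the display controller
restriction below are hypothetical, chosen just to show the shape.

```python
def is_class_whitelisted(pci_class, whitelist):
    # whitelist maps a base class (pci_class & 0xff0000) either to None,
    # meaning the whole class is eligible, or to a container of eligible
    # class+sub-class values (pci_class & 0xffff00).
    try:
        subclasses = whitelist[pci_class & 0xff0000]
    except KeyError:
        return False
    if subclasses is None:
        return True
    return (pci_class & 0xffff00) in subclasses


whitelist = {
    0x000000: {0x000100: None},  # Old VGA devices only, as in the patch
    0x020000: None,              # Network controller, whole class
    # Hypothetical future restriction, for illustration only: allow just
    # the VGA-compatible sub-class of display controllers.
    0x030000: {0x030000: None},
}

results = {
    'network': is_class_whitelisted(0x020000, whitelist),
    'old_vga': is_class_whitelisted(0x000100, whitelist),
    'old_other': is_class_whitelisted(0x000000, whitelist),
    'display_vga': is_class_whitelisted(0x030000, whitelist),
    'display_3d': is_class_whitelisted(0x030200, whitelist),
    'bridge': is_class_whitelisted(0x060400, whitelist),
}
```

With two flat tuples, the same restriction would need a new tuple and a
new branch for each class that gets narrowed.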

>> +    except KeyError:
>> +        return False
>> +
>> +    if subclass is None:
>> +        return True
>> +
>> +    if pci_class & 0xffff00 in subclass:
>> +        return True
>> +
>> +    return False
>> +
>> +
>> +def get_passthrough_dev_infos():
>> +    ''' Get devices eligible to be passed through to VM. '''
>> +
>> +    dev_infos = _get_all_host_dev_infos()
>> +    devs = dict([(dev_info['name'], dev_info) for dev_info in dev_infos])
>> +
>> +    for dev in dev_infos:
>> +        if dev['device_type'] in ('pci', 'usb_device', 'scsi') and dev in devs:
> 
> "and dev in devs"? Is this to skip devices already removed from devs?
> 

This is a bit weird: I checked my local workspace, and there is no "and
dev in devs" in this patch there. Let me send a new version. Thanks for
catching this.

>> +            _strip_parents(devs, dev)
>> +
>> +    def is_eligible(dev):
>> +        if dev['device_type'] not in ('pci', 'usb_device', 'scsi'):
>> +            return False
>> +        if dev['device_type'] == 'pci':
>> +            return _is_pci_qualified(dev)
>> +        return True
>                 return dev['device_type'] in ('usb_device', 'scsi') or
> (dev['device_type'] == 'pci' and _is_pci_qualified(dev))
> ?

Agree.

>> +
>> +    return [dev for dev in devs.itervalues() if is_eligible(dev)]
>> +
>> +
>> +def get_affected_passthrough_devices(passthrough_dev):
>> +    devs = dict([(dev['name'], dev) for dev in _get_all_host_dev_infos()])
>> +
>> +    def get_iommu_group(dev_info):
>> +        try:
>> +            return int(dev_info['iommuGroup'])
>> +        except KeyError:
>> +            pass
>> +
>> +        parent = dev_info['parent']
>> +        while parent is not None:
>> +            try:
>> +                iommuGroup = int(devs[parent]['iommuGroup'])
>> +            except KeyError:
>> +                pass
>> +            else:
>> +                return iommuGroup
>> +            parent = devs[parent]['parent']
>> +
>> +        return -1
>> +
>> +    iommu_group = get_iommu_group(passthrough_dev)
>> +
>> +    return [dev for dev in get_passthrough_dev_infos()
>> +            if dev['name'] != passthrough_dev['name'] and
>> +            get_iommu_group(dev) == iommu_group]
>> +
>> +
>>   def get_dev_info(node_dev):
>>       ''' Parse the node device XML string into dict according to
>>       http://libvirt.org/formatnode.html. '''
>> @@ -206,4 +310,7 @@ def _format_dev_node(node):
>>
>>
>>   if __name__ == '__main__':
>> +    from pprint import pprint
>>       _print_host_dev_tree()
>> +    print 'Eligible passthrough devices:'
>> +    pprint(get_passthrough_dev_infos())
>> diff --git a/src/kimchi/model/host.py b/src/kimchi/model/host.py
>> index 1ea97f0..280aa53 100644
>> --- a/src/kimchi/model/host.py
>> +++ b/src/kimchi/model/host.py
>> @@ -247,7 +247,7 @@ class DevicesModel(object):
>>       def __init__(self, **kargs):
>>           self.conn = kargs['conn']
>>
>> -    def get_list(self, _cap=None):
>> +    def get_list(self, _cap=None, _passthrough=None):
>>           conn = self.conn.get()
>>           if _cap is None:
>>               dev_names = [name.name() for name in conn.listAllDevices(0)]
>> @@ -256,6 +256,11 @@ class DevicesModel(object):
>>           else:
>>               # Get devices with required capability
>>               dev_names = conn.listDevices(_cap, 0)
>> +
>> +        if _passthrough is not None and _passthrough.lower() == 'true':
>> +            passthrough_names = [
>> +                dev['name'] for dev in hostdev.get_passthrough_dev_infos()]
>> +            dev_names = list(set(dev_names) & set(passthrough_names))
> Isn't passthrough_names a subset of dev_names?

No. When both _cap and _passthrough are given, dev_names has already been
filtered by _cap and holds only a subset of the devices. So we have to
intersect it with the passthrough names.
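
A small sketch of the case, with made-up name lists (only pci_0000_00_19_0
comes from the examples in the commit message). filter_names is a
hypothetical stand-alone version of the logic in get_list, and it sorts
the result only to make the output deterministic.

```python
def filter_names(cap_names, passthrough_names, passthrough=None):
    # cap_names: device names already narrowed by the _cap filter.
    # passthrough_names: names of all eligible passthrough devices,
    # regardless of capability.
    names = list(cap_names)
    if passthrough is not None and passthrough.lower() == 'true':
        names = sorted(set(names) & set(passthrough_names))
    return names


# Suppose _cap=pci leaves these two PCI devices...
cap_names = ['pci_0000_00_19_0', 'pci_0000_00_1c_0']
# ...while the eligible passthrough devices are one PCI and one USB device.
passthrough_names = ['pci_0000_00_19_0', 'usb_1_1']

both = filter_names(cap_names, passthrough_names, 'true')
cap_only = filter_names(cap_names, passthrough_names)
```

Here usb_1_1 is in passthrough_names but not in dev_names, so neither
list is a subset of the other.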

>>           return dev_names
>>
>>       def _get_devices_fc_host(self):
>> @@ -273,6 +278,16 @@ class DevicesModel(object):
>>           return conn.listDevices('fc_host', 0)
>>
>>
>> +class PassthroughAffectedDevicesModel(object):
>> +    def __init__(self, **kargs):
>> +        self.conn = kargs['conn']
>> +
>> +    def get_list(self, device_id):
>> +        dev_info = DeviceModel(conn=self.conn).lookup(device_id)
>> +        affected = hostdev.get_affected_passthrough_devices(dev_info)
>> +        return [dev['name'] for dev in affected]
>> +
> Not sure if the client side can tell what the device is just from
> the name.

I think we already provide a /host/devices/<name> URI, so the front-end
can fetch the details from that URI if necessary.

>> +
>>   class DeviceModel(object):
>>       def __init__(self, **kargs):
>>           self.conn = kargs['conn']
> Besides the minor issues in the comments, it looks good to me.


-- 
Zhou Zheng Sheng / 周征晟
E-mail: zhshzhou at linux.vnet.ibm.com
Telephone: 86-10-82454397



