This patch adds a '_passthrough=true' filter to /host/devices, so that it
lists only the devices eligible to be passed through to a guest.
Theoretically, all PCI, USB and SCSI devices can be assigned to a guest
directly.
The Linux kernel is able to recognize the host IOMMU group layout. If two
PCI devices are in the same IOMMU group, there may be interconnections
between them that let the devices talk to each other while bypassing the
IOMMU. Isolation between those devices is therefore not perfect, so all
devices in an IOMMU group must be assigned to the guest together. On a
host that recognizes IOMMU groups, the URI
/host/devices?_passthrough_affected_by=DEVICE_NAME returns a list
containing the devices in the same IOMMU group as DEVICE_NAME and all
of their child devices. The front-end can then show all the affected
devices to the user, which helps the user decide which host devices to
assign to the guest.
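As a rough illustration of the group layout the kernel exposes, the
sketch below reads the /sys/kernel/iommu_groups directory and looks up
the devices that share a group with a given PCI address. It is not part
of this patch, which takes the group from the 'iommuGroup' key of each
device info; the function names here are hypothetical and the code is
written for Python 3.

```python
import os


def read_iommu_groups(root="/sys/kernel/iommu_groups"):
    """Map each IOMMU group number to the sorted device addresses in it.

    Returns an empty dict when the directory is absent, for example on a
    host whose kernel does not expose IOMMU groups in sysfs.
    """
    groups = {}
    if not os.path.isdir(root):
        return groups
    for group in os.listdir(root):
        dev_dir = os.path.join(root, group, "devices")
        if group.isdigit() and os.path.isdir(dev_dir):
            groups[int(group)] = sorted(os.listdir(dev_dir))
    return groups


def same_group_devices(groups, address):
    """Return the other devices sharing an IOMMU group with `address`."""
    for members in groups.values():
        if address in members:
            return [dev for dev in members if dev != address]
    return []
```

An empty result from same_group_devices() means either that the device
is alone in its group or that no group information is available.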
How to test:
List all types of devices to passthrough
curl -k -u root -H "Content-Type: application/json" \
-H "Accept: application/json" \
'https://127.0.0.1:8001/host/devices?_passthrough=true'
List all eligible PCI devices to passthrough
/host/devices?_passthrough=true&_cap=pci
List all USB devices to passthrough
/host/devices?_passthrough=true&_cap=usb_device
List all SCSI devices to passthrough
/host/devices?_passthrough=true&_cap=scsi
List devices in the same IOMMU group as pci_0000_00_19_0
/host/devices?_passthrough_affected_by=pci_0000_00_19_0
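The queries above can also be assembled programmatically. The helper
below is a minimal sketch in Python 3 that only builds the URL strings;
devices_url() is a hypothetical name, not something provided by Kimchi.
It rejects the same filter combination that get_list() in this patch
reports as KCHHOST0004E.

```python
from urllib.parse import urlencode


def devices_url(base, passthrough=False, cap=None, affected_by=None):
    """Build a /host/devices query URL using the filters from this patch."""
    if affected_by is not None:
        # Mirrors the conflict check in get_list(): this filter cannot be
        # combined with _cap or _passthrough.
        if passthrough or cap is not None:
            raise ValueError("conflicting flag filters")
        params = [("_passthrough_affected_by", affected_by)]
    else:
        params = []
        if passthrough:
            params.append(("_passthrough", "true"))
        if cap is not None:
            params.append(("_cap", cap))
    query = urlencode(params)
    return base + "/host/devices" + ("?" + query if query else "")


print(devices_url("https://127.0.0.1:8001", passthrough=True, cap="pci"))
```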
v1:
v1 series does not contain this patch.
v2:
Handle the calculation of "leaf" devices and "affected" devices.
v5:
Change _passthrough=1 to _passthrough=true in the URI scheme. Filter
PCI devices according to the PCI class.
v6:
Don't pass through PCI devices of class code 07. On modern x86
machines, it's possible that
"6 Series/C200 Series Chipset Family MEI Controller" and
"6 Series/C200 Series Chipset Family KT Controller"
are of this class code. These two devices are not suitable to pass
through to a guest. We don't have a simple and reliable way to
distinguish a normal serial controller from a host chipset XXX controller.
This type of PCI device also includes various serial, parallel, modem
and communication controllers. Serial and parallel controllers can be
redirected from ttyS0 to QEMU's pty using socat, and there is little
performance benefit in assigning them directly to a guest. So it's OK
not to pass through PCI devices of class code 07.
v8:
Use a new flag filter "_passthrough_group_by"
/host/devices?_passthrough_group_by=pci_XXX
instead of using sub-collection
/host/devices/pci_XXX/passthrough_affected_devices
v9:
Use the same LibvirtConnection object as the Model, so as to avoid
exhausting connections.
v10:
Adapt to RHEL 6. RHEL 6 does not provide IOMMU group information in
sysfs. For now we just ignore this error and live with it. Passthrough
of PCI devices will not work, but the basic device information is still
provided to the user. In the future we'll develop code to gather IOMMU
group information.
v11:
In previous commits, we did not allow passing through a device with
children, and only passed through the children. This proved inflexible
and less useful. The PCI class code white list also filtered out too many
types of devices, and there is no way to cleanly differentiate devices
suitable for passthrough using the class code.
In this patch, we allow Kimchi to pass through a parent PCI device. The
front-end can use the existing "_passthrough_group_by" filter to list
the affected child devices and other PCI devices in the same group,
as well as the children of those devices. We also drop the class code
white list, and filter out only PCI bridges and video cards.
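To make the effect of the reduced filter concrete, here is a small
sketch in Python 3 that applies the same base-class test to a few sample
values in the format of the sysfs 'class' attribute. The sample values
and helper names are illustrative assumptions; the patch itself reads
the value from the device's sysfs path in _is_pci_qualified().

```python
# Base classes still filtered out: 0x03 (display controller) and
# 0x06 (bridge).
BLACKLISTED_BASE_CLASSES = (0x03, 0x06)


def base_class(class_text):
    """Extract the base class byte from a sysfs 'class' value."""
    return (int(class_text.strip(), 16) >> 16) & 0xFF


def is_pci_eligible(class_text):
    """Return True unless the device is a display controller or bridge."""
    return base_class(class_text) not in BLACKLISTED_BASE_CLASSES


# Illustrative class values, not taken from a real host.
samples = {
    "0x020000": "Ethernet controller",
    "0x030000": "VGA compatible controller",
    "0x060400": "PCI bridge",
    "0x078000": "communication controller",
}
for value, label in sorted(samples.items()):
    print(value, label, is_pci_eligible(value))
```

Note that a class 07 device is reported as eligible here, since the
white list that excluded it in v6 has been dropped.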
libvirt uses domain, bus, slot and function to encode the device name,
so sorting by device name has the same effect as sorting by
domain:bus:slot:function. This patch sorts all the devices by name.
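The equivalence can be checked with a short Python 3 sketch. It assumes
every name follows the fixed-width, lower-case hexadecimal layout seen
in pci_0000_00_19_0; the sample names and the pci_address() helper are
hypothetical.

```python
def pci_address(name):
    """Parse a libvirt PCI node device name into numeric address fields."""
    prefix, domain, bus, slot, function = name.split("_")
    if prefix != "pci":
        raise ValueError("not a PCI node device name: %s" % name)
    return tuple(int(field, 16) for field in (domain, bus, slot, function))


names = [
    "pci_0000_02_00_0",
    "pci_0000_00_1f_2",
    "pci_0000_00_19_0",
    "pci_0000_00_02_0",
]

by_name = sorted(names)                      # plain string sort
by_address = sorted(names, key=pci_address)  # numeric address sort
print(by_name == by_address)
```

The two orders agree only because each field has a fixed width; names
with variable-width fields would need the numeric key.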
When a SCSI adapter is assigned to a virtual machine, the previous
node devices scsi_host and scsi_target become stale on the host machine.
Unfortunately, libvirt only removes the invalid scsi_host device, but
not scsi_target. When Kimchi looks up the parent scsi_host of a
scsi_target, the scsi_host device no longer exists, and the lookup
fails.
This patch catches such errors and ignores devices without a valid
parent device.
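The idea can be sketched on in-memory data as follows (Python 3). The
device names are made-up samples and link_children() is a hypothetical
helper; the patch applies the equivalent check inside
_get_dev_info_tree() and logs the missing parent.

```python
def link_children(dev_infos):
    """Attach each device to its parent's 'children' list.

    Returns the names of devices whose parent is missing, instead of
    failing on the lookup.
    """
    by_name = {dev["name"]: dev for dev in dev_infos}
    orphans = []
    for dev in dev_infos:
        parent_name = dev["parent"]
        if parent_name is None:
            continue  # root of the tree
        parent = by_name.get(parent_name)
        if parent is None:
            orphans.append(dev["name"])  # stale device, skip it
            continue
        parent.setdefault("children", []).append(dev)
    return orphans


devices = [
    {"name": "computer", "parent": None},
    {"name": "pci_0000_00_1f_2", "parent": "computer"},
    # The parent scsi_host0 is no longer listed.
    {"name": "scsi_target0_0_0", "parent": "scsi_host0"},
]
orphans = link_children(devices)
print(orphans)
```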
Signed-off-by: Zhou Zheng Sheng <zhshzhou(a)linux.vnet.ibm.com>
---
src/kimchi/i18n.py | 1 +
src/kimchi/model/host.py | 33 ++++++++++--
src/kimchi/model/hostdev.py | 121 ++++++++++++++++++++++++++++++++++++++++++--
3 files changed, 147 insertions(+), 8 deletions(-)
diff --git a/src/kimchi/i18n.py b/src/kimchi/i18n.py
index 1b543ce..98adc46 100644
--- a/src/kimchi/i18n.py
+++ b/src/kimchi/i18n.py
@@ -233,6 +233,7 @@ messages = {
"KCHHOST0001E": _("Unable to shutdown host machine as there are
running virtual machines"),
"KCHHOST0002E": _("Unable to reboot host machine as there are running
virtual machines"),
"KCHHOST0003E": _("Node device '%(name)s' not found"),
+ "KCHHOST0004E": _("Conflicting flag filters specified."),
"KCHPKGUPD0001E": _("No packages marked for update"),
"KCHPKGUPD0002E": _("Package %(name)s is not marked to be
updated."),
diff --git a/src/kimchi/model/host.py b/src/kimchi/model/host.py
index 7d7cd66..5d31809 100644
--- a/src/kimchi/model/host.py
+++ b/src/kimchi/model/host.py
@@ -32,8 +32,9 @@ from kimchi import disks
from kimchi import netinfo
from kimchi import xmlutils
from kimchi.basemodel import Singleton
-from kimchi.exception import InvalidOperation, NotFoundError, OperationFailed
from kimchi.model import hostdev
+from kimchi.exception import InvalidOperation, InvalidParameter
+from kimchi.exception import NotFoundError, OperationFailed
from kimchi.model.config import CapabilitiesModel
from kimchi.model.tasks import TaskModel
from kimchi.model.vms import DOM_STATE_MAP
@@ -299,10 +300,28 @@ class DevicesModel(object):
except AttributeError:
self.cap_map['fc_host'] = None
- def get_list(self, _cap=None):
+ def get_list(self, _cap=None, _passthrough=None,
+ _passthrough_affected_by=None):
+ if _passthrough_affected_by is not None:
+ # _passthrough_affected_by conflicts with _cap and _passthrough
+ if (_cap, _passthrough) != (None, None):
+ raise InvalidParameter("KCHHOST0004E")
+ return sorted(
+ self._get_passthrough_affected_devs(_passthrough_affected_by))
+
if _cap == 'fc_host':
- return self._get_devices_fc_host()
- return self._get_devices_with_capability(_cap)
+ dev_names = self._get_devices_fc_host()
+ else:
+ dev_names = self._get_devices_with_capability(_cap)
+
+ if _passthrough is not None and _passthrough.lower() == 'true':
+ conn = self.conn.get()
+ passthrough_names = [
+ dev['name'] for dev in hostdev.get_passthrough_dev_infos(conn)]
+ dev_names = list(set(dev_names) & set(passthrough_names))
+
+ dev_names.sort()
+ return dev_names
def _get_devices_with_capability(self, cap):
conn = self.conn.get()
@@ -314,6 +333,12 @@ class DevicesModel(object):
return []
return [name.name() for name in conn.listAllDevices(cap_flag)]
+ def _get_passthrough_affected_devs(self, dev_name):
+ conn = self.conn.get()
+ info = DeviceModel(conn=self.conn).lookup(dev_name)
+ affected = hostdev.get_affected_passthrough_devices(conn, info)
+ return [dev_info['name'] for dev_info in affected]
+
def _get_devices_fc_host(self):
conn = self.conn.get()
# Libvirt < 1.0.5 does not support fc_host capability
diff --git a/src/kimchi/model/hostdev.py b/src/kimchi/model/hostdev.py
index 1d660d8..bf95678 100644
--- a/src/kimchi/model/hostdev.py
+++ b/src/kimchi/model/hostdev.py
@@ -17,7 +17,9 @@
# License along with this library; if not, write to the Free Software
# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+import os
from pprint import pformat
+from pprint import pprint
from kimchi.model.libvirtconnection import LibvirtConnection
from kimchi.utils import kimchi_log
@@ -36,7 +38,13 @@ def _get_dev_info_tree(dev_infos):
if dev_info['parent'] is None:
root = dev_info
continue
- parent = devs[dev_info['parent']]
+
+ try:
+ parent = devs[dev_info['parent']]
+ except KeyError:
+ kimchi_log.error('Parent %s of device %s does not exist.',
+ dev_info['parent'], dev_info['name'])
+ continue
try:
children = parent['children']
@@ -47,6 +55,109 @@ def _get_dev_info_tree(dev_infos):
return root
+def _is_pci_qualified(pci_dev):
+ # PCI bridges are not suitable for passthrough.
+ # KVM does not currently support passing through graphics cards.
+ blacklist_classes = (0x030000, 0x060000)
+
+ with open(os.path.join(pci_dev['path'], 'class')) as f:
+ pci_class = int(f.readline().strip(), 16)
+
+ if pci_class & 0xff0000 in blacklist_classes:
+ return False
+
+ return True
+
+
+def get_passthrough_dev_infos(libvirt_conn):
+ ''' Get devices eligible to be passed through to VM. '''
+
+ def is_eligible(dev):
+ return dev['device_type'] in ('usb_device', 'scsi') or \
+ (dev['device_type'] == 'pci' and _is_pci_qualified(dev))
+
+ dev_infos = _get_all_host_dev_infos(libvirt_conn)
+
+ return [dev_info for dev_info in dev_infos if is_eligible(dev_info)]
+
+
+def _get_same_iommugroup_devices(dev_infos, device_info):
+ dev_dict = dict([(dev_info['name'], dev_info) for dev_info in dev_infos])
+
+ def get_iommu_group(dev_info):
+ # Find out the iommu group of a given device.
+ # Child device belongs to the same iommu group as the parent device.
+ try:
+ return dev_info['iommuGroup']
+ except KeyError:
+ pass
+
+ parent = dev_info['parent']
+ while parent is not None:
+ try:
+ parent_info = dev_dict[parent]
+ except KeyError:
+ kimchi_log.error("Parent %s of device %s does not exist",
+ parent, dev_info['name'])
+ break
+
+ try:
+ iommuGroup = parent_info['iommuGroup']
+ except KeyError:
+ pass
+ else:
+ return iommuGroup
+
+ parent = parent_info['parent']
+
+ return -1
+
+ iommu_group = get_iommu_group(device_info)
+
+ if iommu_group == -1:
+ return []
+
+ return [dev_info for dev_info in dev_infos
+ if dev_info['name'] != device_info['name'] and
+ get_iommu_group(dev_info) == iommu_group]
+
+
+def _get_children_devices(dev_infos, device_info):
+ def get_children_recursive(parent):
+ try:
+ children = parent['children']
+ except KeyError:
+ return []
+
+ result = []
+ for child in children:
+ result.append(child)
+ result.extend(get_children_recursive(child))
+
+ return result
+
+ # Annotate every dev_info element with children information
+ _get_dev_info_tree(dev_infos)
+
+ for dev_info in dev_infos:
+ if dev_info['name'] == device_info['name']:
+ return get_children_recursive(dev_info)
+
+ return []
+
+
+def get_affected_passthrough_devices(libvirt_conn, passthrough_dev):
+ dev_infos = _get_all_host_dev_infos(libvirt_conn)
+
+ group_devices = _get_same_iommugroup_devices(dev_infos, passthrough_dev)
+ if not group_devices:
+ # On host without iommu group support, the affected devices should
+ # at least include all children devices
+ group_devices.extend(_get_children_devices(dev_infos, passthrough_dev))
+
+ return group_devices
+
+
def get_dev_info(node_dev):
''' Parse the node device XML string into dict according to
http://libvirt.org/formatnode.html. '''
@@ -196,8 +307,7 @@ def _get_usb_device_dev_info(info):
# For test and debug
-def _print_host_dev_tree():
- libvirt_conn = LibvirtConnection('qemu:///system').get()
+def _print_host_dev_tree(libvirt_conn):
dev_infos = _get_all_host_dev_infos(libvirt_conn)
root = _get_dev_info_tree(dev_infos)
if root is None:
@@ -235,4 +345,7 @@ def _format_dev_node(node):
if __name__ == '__main__':
- _print_host_dev_tree()
+ libvirt_conn = LibvirtConnection('qemu:///system').get()
+ _print_host_dev_tree(libvirt_conn)
+ print 'Eligible passthrough devices:'
+ pprint(get_passthrough_dev_infos(libvirt_conn))
--
1.9.3