On Tue, Jul 14, 2020 at 10:16:16AM -0600, Alex Williamson wrote:
On Tue, 14 Jul 2020 11:21:29 +0100
Daniel P. Berrangé <berrange(a)redhat.com> wrote:
> On Tue, Jul 14, 2020 at 07:29:57AM +0800, Yan Zhao wrote:
> > The string read from migration_version attribute is defined by device vendor
> > driver and is completely opaque to the userspace.
> > for an Intel vGPU, string format can be defined like
> > "parent device PCI ID" + "version of gvt driver" +
> > "mdev type" + "aggregator count".
> > for an NVMe VF connecting to remote storage, it could be
> > "PCI ID" + "driver version" + "configured remote
> > for a QAT VF, it may be
> > "PCI ID" + "driver version" + "supported encryption
> > (to avoid namespace conflicts between vendors, we may prefix a driver name
> > to each migration_version string. e.g. i915-v1-8086-591d-i915-GVTg_V5_8-1)
It's very strange to define it as opaque and then proceed to describe
the contents of that opaque string. The point is that its contents
are defined by the vendor driver to describe the device, driver version,
and possibly metadata about the configuration of the device. One
instance of a device might generate a different string from another.
The string that a device produces is not necessarily the only string
the vendor driver will accept, for example the driver might support
backwards compatible migrations.
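To illustrate that last point, here is a rough sketch in Python of a check that accepts more strings than the device itself produces. The field layout is only a guess based on the i915 example quoted above, and `MigrationVersion`, `parse` and `accepts` are made-up names, not an existing kernel or userspace interface.

```python
from typing import NamedTuple


class MigrationVersion(NamedTuple):
    driver: str
    version: int
    vendor_id: str
    device_id: str
    mdev_type: str
    aggregator: int


def parse(s: str) -> MigrationVersion:
    # Assumed layout (hypothetical):
    # <driver>-v<N>-<vendor id>-<device id>-<mdev type>-<aggregator count>
    driver, ver, vendor, device, rest = s.split("-", 4)
    # The mdev type may itself contain '-', so split the count off the end.
    mdev_type, aggregator = rest.rsplit("-", 1)
    if not ver.startswith("v"):
        raise ValueError(f"unexpected version field: {ver!r}")
    return MigrationVersion(driver, int(ver[1:]), vendor, device,
                            mdev_type, int(aggregator))


def accepts(target: MigrationVersion, incoming: str) -> bool:
    # Assumed rule: identical device and configuration, and the source
    # driver version is not newer than the target's.
    src = parse(incoming)
    same_device = src._replace(version=0) == target._replace(version=0)
    return same_device and src.version <= target.version
```

Under these assumptions a target that reports a v2 string would also accept the v1 string from an otherwise identical source, which is exactly the case a single reported string cannot express.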
> IMHO there needs to be a mechanism for the kernel to report via
> what versions are supported on a given device. This puts the job of
> reporting compatible versions directly under the responsibility of the
> vendor who writes the kernel driver for it. They are the ones with the
> best knowledge of the hardware they've built and the rules around its
The version string discussed previously is the version string that
represents a given device, possibly including driver information,
configuration, etc. I think what you're asking for here is an
enumeration of every possible version string that a given device could
accept as an incoming migration stream. If we consider the string as
opaque, that means the vendor driver needs to generate a separate
string for every possible version it could accept, for every possible
configuration option. That potentially becomes an excessive amount of
data to either generate or manage.
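A back-of-the-envelope sketch of how quickly that grows, in Python. The option lists below are invented for illustration and do not come from any real driver. The string layout just reuses the i915 example quoted earlier.

```python
from itertools import product

# Hypothetical option space for one parent device (illustrative values only).
ACCEPTED_DRIVER_VERSIONS = [1, 2, 3]
MDEV_TYPES = ["i915-GVTg_V5_1", "i915-GVTg_V5_2",
              "i915-GVTg_V5_4", "i915-GVTg_V5_8"]
AGGREGATOR_COUNTS = range(1, 9)


def enumerate_accepted(versions, mdev_types, aggregator_counts):
    """List every opaque string the device would have to advertise."""
    return [
        f"i915-v{v}-8086-591d-{t}-{a}"
        for v, t, a in product(versions, mdev_types, aggregator_counts)
    ]


strings = enumerate_accepted(ACCEPTED_DRIVER_VERSIONS, MDEV_TYPES,
                             AGGREGATOR_COUNTS)
print(len(strings))  # 3 * 4 * 8 = 96
```

Three small option lists already yield 96 strings for a single parent device, and each additional configuration knob multiplies that again.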
Am I overestimating how vendors intend to use the version string?
If I'm interpreting your reply & the quoted text correctly, the version
string isn't really a version string in any normal sense of the word.
Instead it sounds like a string encoding a set of features in some arbitrary
vendor specific format, which they parse and do compatibility checks on
individual pieces? One or more parts may contain a version number, but
it's much more than just a version.
If that's correct, then I'd prefer we didn't call it a version string,
instead call it a "capability string" to make it clear it is expressing
a much more general concept, but...
We'd also need to consider devices that we could create, for
providing the same interface enumeration prior to creating an mdev
device to have a confidence level that the new device would be a valid
We defined the string as opaque to allow vendor flexibility and because
defining a common format is hard. Do we need to revisit this part of
the discussion to define the version string as non-opaque with parsing
rules, probably with separate incoming vs outgoing interfaces? Thanks,
...even if the huge amount of flexibility is technically relevant from the
POV of the hardware/drivers, we should consider whether management apps
actually want, or can use, that level of flexibility.
The task of picking which host to place a VM on has a lot of factors to
consider, and when there are a large number of hosts, the total amount
of information to check gets correspondingly large. The placement
process is also fairly performance critical.
Running complex algorithmic logic to check compatibility of devices
based on an arbitrary set of rules is likely to be a performance
challenge. A flat list of supported strings is a much simpler
thing to check as it reduces down to a simple set membership test.
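As a minimal sketch of what I mean, in Python: assume each host reports a flat set of the strings it can accept. The function name, the host names and the second string are all hypothetical.

```python
def candidate_hosts(source_string, supported_by_host):
    """Return hosts whose reported set contains the source device's string.

    supported_by_host maps a host name to the set of opaque strings that
    host reports as acceptable (an assumed reporting format).
    """
    return sorted(
        host
        for host, supported in supported_by_host.items()
        if source_string in supported  # plain set membership, no parsing
    )


hosts = {
    "host-a": {"i915-v1-8086-591d-i915-GVTg_V5_8-1"},
    "host-b": {"i915-v1-8086-591d-i915-GVTg_V5_8-1",
               "i915-v2-8086-591d-i915-GVTg_V5_8-1"},
    "host-c": set(),
}
print(candidate_hosts("i915-v2-8086-591d-i915-GVTg_V5_8-1", hosts))
```

The scheduler never needs to understand the string here. It only compares it for equality, which is cheap even across a large number of hosts.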
IOW, even if there's some complex set of device type / vendor specific
rules to check for compatibility, I fear apps will ignore them and
just define a very simplified list of compatible strings, and ignore
all the extra flexibility.
I'm sure OpenStack maintainers can speak to this more, as they've put
a lot of work into their scheduling engine to optimize the way it places
VMs largely driven from simple structured data reported from hosts.