[ovirt-devel] [vdsm] Infrastructure design for node (host) devices

Tue Jul 1 06:36:11 UTC 2014

On Jun 29, 2014, at 16:55 , Saggi Mizrahi <smizrahi at redhat.com> wrote:

> 
> 
> ----- Original Message -----
>> From: "Martin Polednik" <mpoledni at redhat.com>
>> To: devel at ovirt.org
>> Sent: Tuesday, June 24, 2014 1:26:17 PM
>> Subject: [ovirt-devel] [vdsm] Infrastructure design for node (host) devices
>> 
>> Hello,
>> 
>> I'm actively working on getting host device passthrough (pci, usb and scsi)
>> exposed in VDSM, but I've encountered growing complexity of this feature.
>> 
>> The devices are currently created in the same manner as virtual devices and
>> their reporting is done via hostDevices list in getCaps. As I implemented
>> usb and scsi devices, the size of this list grew almost twice - and that is
>> on a laptop.
> There should be a separate verb with ability to filter by type.

+1

>> 
>> Similar problem is with the devices themselves, they are closely tied to host
>> and currently, engine would have to keep their mapping to VMs, reattach back
>> loose devices and handle all of this in case of migration.
> Migration sound very complicated, especially at the phase where the VM actually
> starts running on the target host. The hardware state is completely different
> but the guest OS wouldn't have any idea that happened.
> So detaching before migration and than reattaching on the destination is a must
> but that could cause issues in the guest. I'd imaging that this would be an issue
> when hibernating on one host and waking up on another.

If qemu actually supports this at all it would need to be very specific for each device, restoring/setting a concrete HW state is a challenging task.
I would also see it as pin to host and then on specific cases detach&attach (or that sr-iov's fancy temporary emulated device)

>> 
>> I would like to hear your opinion on building something like host device pool
>> in VDSM. The pool would be populated and periodically updated (to handle
>> hot(un)plugs) and VMs/engine could query it for free/assigned/possibly
>> problematic
>> devices (which could be reattached by the pool). This has added benefit of
>> requiring fewer libvirt calls, but a bit more complexity and possibly one
>> thread.
>> The persistence of the pool on VDSM restart could be kept in config or
>> constructed
>> from XML.
> I'd much rather VDSM not cache state unless this is absolutely necessary.
> This sounds like something that doesn't need to be queried every 3 seconds
> so it's best if we just get to ask libvirt.

well, unless we try to persist it a cache doesn't hurt
I don't see a particular problem in reconstructing the structures on startup

> 
> I do wonder how that kind of thing can be configured in the VM creation
> phase as you would sometimes want to just specify a type of device and
> sometimes specify a specific one. Also, I'd assume there will be a
> fallback policy stating if the VM should run if said resource is unavailable.
>> 
>> I'd need new API verbs to allow engine to communicate with the pool,
>> possibly leaving caps as they are and engine could detect the presence of
>> newer
>> vdsm by presence of these API verbs.
> Again, I think that getting a list of devices filterable by kind\type might
> be best than a real pool. We might want to return if a device is in use
> (could also be in use by the host operating system and not just VMs)
>> The vmCreate call would remain almost
>> the
>> same, only with the addition of new device for VMs (where the detach and
>> tracking
>> routine would be communicated with the pool).
>> _______________________________________________
>> Devel mailing list
>> Devel at ovirt.org
>> http://lists.ovirt.org/mailman/listinfo/devel
>> 
> _______________________________________________
> Devel mailing list
> Devel at ovirt.org
> http://lists.ovirt.org/mailman/listinfo/devel