
That's exactly the direction I originally understood oVirt would go, with the ability to run VMs and containers side by side on bare metal, or nested with containers inside VMs for stronger resource or security isolation and network virtualization. To me it sounded especially attractive with an HCI underpinning, so you could also deploy it in the field with small 3-node clusters.

But combining all those features evidently comes at too high a cost for all the integration, and the customer base is either too small or too poor: the cloud players are all out on making sure you no longer run any hardware, and then it's really just about pushing your applications there, as cloud-native or "IaaS"-compatible as needed. E.g. I don't see PCI pass-through coming to KubeVirt to enable GPU use, because it ties the machine to a specific host and goes against the grain of K8s as I understand it.

Memory overcommit is quite funny, really, because it's the same issue as the original virtual memory: essentially you lie to your consumer about the resources available and then swap pages back and forth in an attempt to make all your consumers happy. It was processes for virtual memory, it's VMs now for the hypervisor, and in both cases it's about the consumer and the provider not continuously negotiating for the resources they need and the price they are willing to pay. That negotiation is always better at the highest level of abstraction, the application itself, which is why implementing it at the lower levels (e.g. VMs) becomes less useful and less needed.

And then there is technology like CXL, which essentially turns RAM into a fabric: your local CPU will just get RAM from another piece of hardware when your application needs more RAM and is willing to pay the premium someone will charge for it. With that type of hardware much of what hypervisors used to do goes into DPUs/IPUs, and CPUs are just running applications making hypercalls. The kernel is just there to bootstrap. Not sure we'll see that type of hardware at home or at the edge, though...

On Mon, Feb 21, 2022 at 12:27 PM Thomas Hoberg <thomas@hoberg.net> wrote:
That's exactly the direction I originally understood oVirt would go, with the ability to run VMs and containers side by side on bare metal, or nested with containers inside VMs for stronger resource or security isolation and network virtualization. To me it sounded especially attractive with an HCI underpinning, so you could also deploy it in the field with small 3-node clusters.
But combining all those features evidently comes at too high a cost for all the integration, and the customer base is either too small or too poor: the cloud players are all out on making sure you no longer run any hardware, and then it's really just about pushing your applications there, as cloud-native or "IaaS"-compatible as needed.
E.g. I don't see PCI pass-through coming to KubeVirt to enable GPU use, because it ties the machine to a specific host and goes against the grain of K8s as I understand it.
technically it's already there: https://kubevirt.io/user-guide/virtual_machines/host-devices/
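For illustration only, a minimal sketch (not from this thread) of the cluster-side step that page describes: allow-listing a passthrough PCI device by patching the KubeVirt custom resource, here via the Python kubernetes client. The vendor:device ID and resource name are placeholder assumptions.

```python
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside a pod
custom_api = client.CustomObjectsApi()

# Placeholder values: a device identified by its PCI vendor:device ID,
# exposed to the scheduler under an extended resource name.
patch = {
    "spec": {
        "configuration": {
            "permittedHostDevices": {
                "pciHostDevices": [
                    {
                        "pciVendorSelector": "10de:1eb8",
                        "resourceName": "nvidia.com/TU104GL_Tesla_T4",
                    }
                ]
            }
        }
    }
}

# The KubeVirt CR is conventionally named "kubevirt" in the "kubevirt" namespace.
custom_api.patch_namespaced_custom_object(
    group="kubevirt.io",
    version="v1",
    namespace="kubevirt",
    plural="kubevirts",
    name="kubevirt",
    body=patch,
)
```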
Memory overcommit is quite funny, really, because it's the same issue as the original virtual memory: essentially you lie to your consumer about the resources available and then swap pages back and forth in an attempt to make all your consumers happy. It was processes for virtual memory, it's VMs now for the hypervisor, and in both cases it's about the consumer and the provider not continuously negotiating for the resources they need and the price they are willing to pay.
That negotiation is always better at the highest level of abstraction, the application itself, which is why implementing it at the lower levels (e.g. VMs) becomes less useful and less needed.
And then there is technology like CXL, which essentially turns RAM into a fabric: your local CPU will just get RAM from another piece of hardware when your application needs more RAM and is willing to pay the premium someone will charge for it.
With that type of hardware much of what hypervisors used to do goes into DPUs/IPUs and CPUs are just running applications making hypercalls. The kernel is just there to bootstrap.
Not sure we'll see that type of hardware at home or at the edge, though...

On Tue, Feb 22, 2022 at 9:48 AM Simone Tiraboschi <stirabos@redhat.com> wrote:
On Mon, Feb 21, 2022 at 12:27 PM Thomas Hoberg <thomas@hoberg.net> wrote:
That's exactly the direction I originally understood oVirt would go, with the ability to run VMs and containers side by side on bare metal, or nested with containers inside VMs for stronger resource or security isolation and network virtualization. To me it sounded especially attractive with an HCI underpinning, so you could also deploy it in the field with small 3-node clusters.
But combining all those features evidently comes at too high a cost for all the integration, and the customer base is either too small or too poor: the cloud players are all out on making sure you no longer run any hardware, and then it's really just about pushing your applications there, as cloud-native or "IaaS"-compatible as needed.
E.g. I don't see PCI pass-through coming to KubeVirt to enable GPU use, because it ties the machine to a specific host and goes against the grain of K8s as I understand it.
technically it's already there: https://kubevirt.io/user-guide/virtual_machines/host-devices/
Just to clarify the state of things a little: it is not only technically there. KubeVirt supports PCI passthrough, GPU passthrough and SR-IOV (including live migration for SR-IOV). I can't say whether the OpenShift UI can compete with oVirt at this stage. Best regards, Roman
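As a concrete sketch of the VM-side half (my assumption of the usual shape, with placeholder image and resource names, not something quoted in this thread), a VirtualMachine can request a passthrough GPU or a generic host PCI device in its devices section:

```python
from kubernetes import client, config

config.load_kube_config()
custom_api = client.CustomObjectsApi()

# Placeholder names throughout; deviceName values must match resources the
# cluster administrator has permitted in the KubeVirt configuration.
vm = {
    "apiVersion": "kubevirt.io/v1",
    "kind": "VirtualMachine",
    "metadata": {"name": "gpu-vm"},
    "spec": {
        "running": True,
        "template": {
            "spec": {
                "domain": {
                    "devices": {
                        "disks": [{"name": "rootdisk", "disk": {"bus": "virtio"}}],
                        "gpus": [
                            {"name": "gpu1", "deviceName": "nvidia.com/TU104GL_Tesla_T4"}
                        ],
                        "hostDevices": [
                            {"name": "hostdev1", "deviceName": "vendor.example/some_pci_device"}
                        ],
                    },
                    "resources": {"requests": {"memory": "4Gi"}},
                },
                "volumes": [
                    {
                        "name": "rootdisk",
                        "containerDisk": {"image": "quay.io/kubevirt/cirros-container-disk-demo"},
                    }
                ],
            }
        },
    },
}

custom_api.create_namespaced_custom_object(
    group="kubevirt.io",
    version="v1",
    namespace="default",
    plural="virtualmachines",
    body=vm,
)
```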
Memory overcommit is quite funny, really, because it's the same issue as the original virtual memory: essentially you lie to your consumer about the resources available and then swap pages back and forth in an attempt to make all your consumers happy. It was processes for virtual memory, it's VMs now for the hypervisor, and in both cases it's about the consumer and the provider not continuously negotiating for the resources they need and the price they are willing to pay.
That negotiation is always better at the highest level of abstraction, the application itself, which is why implementing it at the lower levels (e.g. VMs) becomes less useful and less needed.
And then there is technology like CXL, which essentially turns RAM into a fabric: your local CPU will just get RAM from another piece of hardware when your application needs more RAM and is willing to pay the premium someone will charge for it.
With that type of hardware much of what hypervisors used to do goes into DPUs/IPUs and CPUs are just running applications making hypercalls. The kernel is just there to bootstrap.
Not sure we'll see that type of hardware at home or at the edge, though...

On Tue, Feb 22, 2022 at 9:48 AM Simone Tiraboschi <stirabos@redhat.com> wrote:
Just to clarify the state of things a little: it is not only technically there. KubeVirt supports PCI passthrough, GPU passthrough and SR-IOV (including live migration for SR-IOV). I can't say whether the OpenShift UI can compete with oVirt at this stage.
Best regards, Roman
Well, I guess it's there mostly because they didn't have to do anything new: it's part of KVM/libvirt and more inherited than added. The main reason I "don't see it coming" is that it may create more problems than it solves.

To my understanding K8s is all about truly elastic workloads, including "mobility" to avoid constraints (including memory overcommit). Mobility in quotes, because I don't even know if it migrates containers or just shuts instances down in one place and launches them in another: migration itself has a significant cost, after all. But if it were to migrate them (e.g. via CRIU for containers and "vMotion" for VMs), it would then also have to understand (via KubeVirt) which devices are tied, because they use a device that has too big a state (e.g. a multi-gig CUDA workload), a hard physical dependence (e.g. USB with connected devices), or something that could move with the VM (e.g. SR-IOV FC/NIC/INF with a fabric that can be re-configured to match or is also virtualized).

A proper negotiation between the not-so-dynamic physically available assets of the DC and the much more dynamic resources required by the application is the full scope of a virt-stack/K8s hybrid, encompassing DC/Cloud-OS (infrastructure) and K8s (platform) aspects. While I'd love to have that, I can see how that won't be maintained by anyone as a full free-to-use open-source turn-key solution.

On Tue, Feb 22, 2022 at 1:25 PM Thomas Hoberg <thomas@hoberg.net> wrote:
On Tue, Feb 22, 2022 at 9:48 AM Simone Tiraboschi <stirabos@redhat.com> wrote:
Just to clarify the state of things a little: it is not only technically there. KubeVirt supports PCI passthrough, GPU passthrough and SR-IOV (including live migration for SR-IOV). I can't say whether the OpenShift UI can compete with oVirt at this stage.
Best regards, Roman
Well, I guess it's there mostly because they didn't have to do anything new: it's part of KVM/libvirt and more inherited than added.
The main reason I "don't see it coming" is that it may create more problems than it solves.
To my understanding K8s is all about truly elastic workloads, including "mobility" to avoid constraints (including memory overcommit). Mobility in quotes, because I don't even know if it migrates containers or just shuts instances down in one place and launches them in another: migration itself has a significant cost, after all.
We implemented live migration for VMs quite some time ago. In practice that means we are migrating qemu processes between pods on different nodes. k8s does not dictate anything regarding the workload; there is just a scheduler which may or may not schedule your workload onto nodes.
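For reference, such a migration is requested declaratively; the sketch below (my assumption of the usual flow, with a made-up VMI name) creates a VirtualMachineInstanceMigration object, which KubeVirt acts on by starting a new virt-launcher pod on another node and handing the qemu process over to it.

```python
from kubernetes import client, config

config.load_kube_config()
custom_api = client.CustomObjectsApi()

# Ask KubeVirt to live-migrate the running VMI named "my-vm" (placeholder name).
migration = {
    "apiVersion": "kubevirt.io/v1",
    "kind": "VirtualMachineInstanceMigration",
    "metadata": {"generateName": "my-vm-migration-"},
    "spec": {"vmiName": "my-vm"},
}

custom_api.create_namespaced_custom_object(
    group="kubevirt.io",
    version="v1",
    namespace="default",
    plural="virtualmachineinstancemigrations",
    body=migration,
)
```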
But if it were to migrate them (e.g. via CRIU for containers and "vMotion" for VMs), it would then also have to understand (via KubeVirt) which devices are tied,
As far as I know, PCI passthrough and live migration do not mix well in general, because neither oVirt nor OpenStack nor other platforms can migrate the PCI device state, since it is not in a place where it can be copied. Only SR-IOV allows that, via explicit unplug and re-plug.
because they use a device that has too big a state (e.g. a multi-gig CUDA workload), a hard physical dependence (e.g. USB with connected devices), or something that could move with the VM (e.g. SR-IOV FC/NIC/INF with a fabric that can be re-configured to match or is also virtualized).
A proper negotiation between the not-so-dynamic physically available assets of the DC and the much more dynamic resources required by the application is the full scope of a virt-stack/K8s hybrid, encompassing DC/Cloud-OS (infrastructure) and K8s (platform) aspects.
While KubeVirt does not offer everything which oVirt has at the moment, like Sandro indicated, the cases you mentioned are mostly solved and considered stable.
While I'd love to have that, I can see how that won't be maintained by anyone as a full free-to-use open-source turn-key solution.
There are nice projects to install k8s easily; to install KubeVirt with its operator you just apply a few manifests on the (bare-metal) cluster and you can start right away. I can understand that a new system like k8s may look intimidating. Best regards, Roman
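As a rough sketch of what "apply a few manifests" amounts to (the version and URLs below follow the usual KubeVirt release layout and are my assumptions, not values from this thread):

```python
import subprocess

VERSION = "v1.0.0"  # assumed version; pick a current release in practice
BASE = f"https://github.com/kubevirt/kubevirt/releases/download/{VERSION}"

# The operator manifest installs the KubeVirt controllers; the custom resource
# then tells the operator to roll out the virtualization components.
for manifest in ("kubevirt-operator.yaml", "kubevirt-cr.yaml"):
    subprocess.run(["kubectl", "apply", "-f", f"{BASE}/{manifest}"], check=True)
```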

On Tue, Feb 22, 2022 at 1:25 PM Thomas Hoberg <thomas@hoberg.net> wrote:
k8s does not dictate anything regarding the workload; there is just a scheduler which may or may not schedule your workload onto nodes.
One of these days I'll have to dig deep and see what it does. "Scheduling" can encompass quite a few activities and I don't know which of them K8s covers.

Batch scheduling (also Singularity/HPC-style scheduling) also involves the creation (and thus consumption) of RAM/storage/GPUs, so real instances/reservations are created, which in the case of pass-through would include some hard dependencies. Normal OS scheduling is mostly about CPU, while processes and storage are already there and occupy resources; its equivalent would be traffic steering, where the number of nodes that receive traffic is expanded or reduced. K8s, to my understanding, would do the traffic steering as a minimum and then have actions for instance creation and deletion.

But given a host with hybrid loads, some with tied resources, others generic: to manage the decisions/allocations properly you need to come up with a plan that includes

1. Given a capacity bottleneck on a host, do I ask the lower layer (DC-OS) to create additional containers elsewhere and shut down the local ones, or do I migrate the running ones to a new host?
2. Given capacity underutilization on a host, how do I best go about shutting down hosts that aren't going to be needed for the next hours, in a way where the migration costs do not exceed the power savings?

To my naive current understanding, virt-stacks won't create and shut down VMs; their typical (or only?) load-management instrument is VM migration. Kubernetes (and Docker Swarm etc.) won't migrate node instances (nor VMs), but create and destroy them to manage load. At large scales (scale out) this swarm approach is obviously better; migration creates too much overhead. In the home domain of the virt-stacks (scale in), live migration is perhaps necessary, because the application stacks aren't ready to deal with instance destruction without service disruption, or just because it is rare enough to be cheaper than instance re-creation.

In the past it was more clear-cut, because there was no live migration support for containers. But with CRIU (and its predecessors in OpenVZ), that could be done just as seamlessly as with VMs. And instance creations/deletions were more like fail-over scenarios, where (rare) service disruptions were accepted. Today the two approaches can be mingled more easily, but they don't mix smoothly yet, and the negotiating element between them is missing.
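To sketch the "real reservations" point in k8s terms (my illustration, not something stated above): the scheduler only places a pod on a node whose remaining allocatable resources cover the pod's requests, a passthrough device advertised by a device plugin is counted the same way as an extended resource, and memory can still be overcommitted per container by setting a limit above the request.

```python
from kubernetes import client

# A pod whose requests the scheduler must reserve on some node before placing it.
# "example.com/passthrough-gpu" is a placeholder extended resource name; requesting
# it pins the pod to nodes that actually expose that hardware.
pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="reservation-demo"),
    spec=client.V1PodSpec(
        containers=[
            client.V1Container(
                name="app",
                image="registry.example.com/app:latest",  # placeholder image
                resources=client.V1ResourceRequirements(
                    requests={
                        "cpu": "2",
                        "memory": "2Gi",
                        "example.com/passthrough-gpu": "1",
                    },
                    limits={
                        "memory": "4Gi",  # limit above request = node-level overcommit headroom
                        "example.com/passthrough-gpu": "1",
                    },
                ),
            )
        ]
    ),
)

print(pod.spec.containers[0].resources.requests)
# client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod) would submit it
```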
I can understand that a new system like k8s may look intimidating.
Just understanding the two approaches and how they mix is already filling the brain capacity I can give them. Operating that mix is currently quite beyond me, given that it's only a small part of my job.
Best regards, Roman
Likewise!

On 22. 2. 2022, at 14:16, Roman Mohr <rmohr@redhat.com> wrote:
On Tue, Feb 22, 2022 at 1:25 PM Thomas Hoberg <thomas@hoberg.net> wrote:
On Tue, Feb 22, 2022 at 9:48 AM Simone Tiraboschi <stirabos@redhat.com> wrote:
Just to clarify the state of things a little: it is not only technically there. KubeVirt supports PCI passthrough, GPU passthrough and SR-IOV (including live migration for SR-IOV). I can't say whether the OpenShift UI can compete with oVirt at this stage.
Best regards, Roman
Well, I guess it's there mostly because they didn't have to do anything new: it's part of KVM/libvirt and more inherited than added.
That you can say about oVirt and OpenStack and basically anyone else as well. The foundation for basically every virtualization feature is always in qemu-kvm.
The main reason I "don't see it coming" is that it may create more problems than it solves.
To my understanding K8s is all about truly elastic workloads, including "mobility" to avoid constraints (including memory overcommit). Mobility in quotes, because I don't even know if it migrates containers or just shuts instances down in one place and launches them in another: migration itself has a significant cost, after all.
We implemented live migration for VMs quite some time ago. In practice that means we are migrating qemu processes between pods on different nodes. k8s does not dictate anything regarding the workload; there is just a scheduler which may or may not schedule your workload onto nodes.
But if it were to migrate them (e.g. via CRIU for containers and "vMotion" for VMs), it would then also have to understand (via KubeVirt) which devices are tied,
As far as I know, PCI passthrough and live migration do not mix well in general, because neither oVirt nor OpenStack nor other platforms can migrate the PCI device state, since it is not in a place where it can be copied. Only SR-IOV allows that, via explicit unplug and re-plug.
Slowly but surely it is coming for other devices as well; it has been in development for a couple of years now and has been a topic at every KVM Forum for the past ~5 years.
because they use a device that has too big a state (e.g. a multi-gig CUDA workload), a hard physical dependence (e.g. USB with connected devices), or something that could move with the VM (e.g. SR-IOV FC/NIC/INF with a fabric that can be re-configured to match or is also virtualized).
A proper negotiation between the not-so-dynamic physically available assets of the DC and the much more dynamic resources required by the application is the full scope of a virt-stack/K8s hybrid, encompassing DC/Cloud-OS (infrastructure) and K8s (platform) aspects.
While KubeVirt does not offer everything which oVirt has at the moment, like Sandro indicated, the cases you mentioned are mostly solved and considered stable.
Indeed! When speaking of KubeVirt itself, I really think it's "just" the lack of a virt-specific UI that makes it look like too low-level a tool compared to the oVirt experience. The OpenShift/OKD UI is closing this gap... and there's always a long way to go towards more maturity and more niche features and use cases to add, sure, but it is getting better and better every day. Thanks, michal
While I'd love to have that, I can see how that won't be maintained by anyone as a full free-to-use open-source turn-key solution.
There are nice projects to install k8s easily; to install KubeVirt with its operator you just apply a few manifests on the (bare-metal) cluster and you can start right away.
I can understand that a new system like k8s may look intimidating.
Best regards, Roman

On Mon, Feb 21, 2022 at 12:27 PM Thomas Hoberg <thomas@hoberg.net> wrote:
That's exactly the direction I originally understood oVirt would go, with the ability to run VMs and containers side by side on bare metal, or nested with containers inside VMs for stronger resource or security isolation and network virtualization. To me it sounded especially attractive with an HCI underpinning, so you could also deploy it in the field with small 3-node clusters.
I think in general a big part of the industry is going down the path of moving most things behind the k8s API/resource model. This means different things for different companies. For instance, VMware keeps its traditional virt-stack and adds additional k8s APIs in front of it, crossing the bridge to k8s clusters behind the scenes to get a unified view, while others choose k8s (be it vanilla k8s, OpenShift, Harvester, ...) and then take, for instance, KubeVirt to deploy additional k8s clusters on top of it, unifying the stack this way.

It is definitely true that k8s works significantly differently from other solutions like oVirt or OpenStack, but once you get into it, I think one would be surprised how simple the architecture of k8s actually is, and also how few resources core k8s actually takes.

Having said that, as an ex-oVirt engineer I would be glad to see oVirt continue to thrive. The simplicity of oVirt was always appealing to me. Best regards, Roman
But combining all those features evidently comes at too high a cost for all the integration, and the customer base is either too small or too poor: the cloud players are all out on making sure you no longer run any hardware, and then it's really just about pushing your applications there, as cloud-native or "IaaS"-compatible as needed.
E.g. I don't see PCI pass-through coming to KubeVirt to enable GPU use, because it ties the machine to a specific host and goes against the grain of K8s as I understand it.
Memory overcommit is quite funny, really, because it's the same issue as the original virtual memory: essentially you lie to your consumer about the resources available and then swap pages back and forth in an attempt to make all your consumers happy. It was processes for virtual memory, it's VMs now for the hypervisor, and in both cases it's about the consumer and the provider not continuously negotiating for the resources they need and the price they are willing to pay.
That negotiation is always better at the highest level of abstraction, the application itself, which is why implementing it at the lower levels (e.g. VMs) becomes less useful and less needed.
And then there is technology like CXL, which essentially turns RAM into a fabric: your local CPU will just get RAM from another piece of hardware when your application needs more RAM and is willing to pay the premium someone will charge for it.
With that type of hardware much of what hypervisors used to do goes into DPUs/IPUs and CPUs are just running applications making hypercalls. The kernel is just there to bootstrap.
Not sure we'll see that type of hardware at home or at the edge, though...
participants (4):
- Michal Skrivanek
- Roman Mohr
- Simone Tiraboschi
- Thomas Hoberg