On Tue, Feb 22, 2022 at 1:25 PM Thomas Hoberg
<thomas(a)hoberg.net> wrote:
k8s does not dictate anything regarding the workload. There is just a
scheduler which may or may not schedule your workload onto nodes.
One of these days I'll have to dig deep and see what it does.
"Scheduling" can encompass quite a few activities and I don't know which of
them K8 covers.
Batch scheduling (as in Singularity/HPC-style scheduling) also involves the allocation (and
thus consumption) of RAM/storage/GPUs, so real instances/reservations are created, which
in the case of device pass-through would include some hard host dependencies.
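To illustrate what such a reservation looks like on the k8s side, here is a rough sketch of my
own using the Python client (the pod name, image and GPU resource name are assumptions, the
latter presuming the NVIDIA device plugin is installed):

    # Rough sketch: a pod that reserves CPU, RAM and a pass-through GPU. The
    # scheduler will only place it on a node that can actually provide these,
    # and the GPU is a hard, non-migratable dependency on that node.
    from kubernetes import client, config

    config.load_kube_config()  # or load_incluster_config() inside the cluster

    pod = client.V1Pod(
        metadata=client.V1ObjectMeta(name="cuda-job"),  # hypothetical name
        spec=client.V1PodSpec(
            restart_policy="Never",
            containers=[
                client.V1Container(
                    name="worker",
                    image="nvidia/cuda:12.2.0-base-ubuntu22.04",
                    resources=client.V1ResourceRequirements(
                        requests={"cpu": "4", "memory": "8Gi"},
                        limits={"nvidia.com/gpu": "1"},  # pass-through device
                    ),
                )
            ],
        ),
    )
    client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)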
Normal OS scheduling is mostly about CPU, while processes and storage are already there and
occupy resources; its equivalent here would be traffic steering, where the number of
nodes that receive traffic is expanded or reduced. K8s, to my understanding, would do the
traffic steering as a minimum and then have actions for instance creation and deletion.
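To make that concrete (again just a sketch of mine with the Python client; the Deployment
name "web" is hypothetical): "instance creation and deletion" in k8s mostly boils down to
changing a replica count, while the Service in front only routes traffic to replicas that
currently exist and are ready:

    # Sketch: grow or shrink the set of instances behind a Deployment; k8s creates
    # or destroys pods accordingly, and the Service steers traffic only to Ready pods.
    from kubernetes import client, config

    config.load_kube_config()
    apps = client.AppsV1Api()

    apps.patch_namespaced_deployment_scale(
        name="web",               # hypothetical Deployment
        namespace="default",
        body={"spec": {"replicas": 5}},
    )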
But given a host with hybrid loads, some tied to specific resources, others generic: to
manage the decisions/allocations properly you need to come up with a plan that covers
1. Given a capacity bottleneck on a host, do I ask the lower layer (DC-OS) to create
additional containers elsewhere and shut down the local ones, or do I migrate the running
one to a new host?
2. Given capacity underutilization on a host, how do I best go about shutting down hosts
that aren't going to be needed for the next hours, in a way where the migration costs do
not exceed the power savings? (A toy sketch of that trade-off follows below.)
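Purely to make the second question concrete, a toy heuristic (entirely my own sketch; every
number and name is a hypothetical placeholder, not something oVirt or k8s provide):

    # Toy sketch of the consolidation decision in question 2: only drain and power
    # down a host if the expected power savings over the idle window outweigh the
    # estimated cost of migrating (or re-creating) its workloads.
    def should_power_down(idle_hours: float,
                          host_idle_power_kw: float,
                          price_per_kwh: float,
                          migration_cost_per_vm: float,
                          vm_count: int) -> bool:
        power_savings = idle_hours * host_idle_power_kw * price_per_kwh
        migration_cost = vm_count * migration_cost_per_vm
        return power_savings > migration_cost

    # Example: 8 idle hours, 0.2 kW idle draw, 0.30/kWh, 0.05 per VM migration, 6 VMs
    print(should_power_down(8, 0.2, 0.30, 0.05, 6))  # True -> worth draining the host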
To my naive current understanding, virt-stacks won't create and shut down VMs; their
typical (or only?) load-management instrument is VM migration.
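For comparison, that single instrument looks roughly like this with libvirt's Python bindings
(my own sketch; the VM name "guest01" and the target host are placeholders, and it assumes
shared storage plus SSH access between the hosts):

    # Sketch: live-migrate a running VM to another host; memory is copied over
    # while the guest keeps running, which is exactly the overhead that
    # k8s-style instance re-creation avoids.
    import libvirt

    src = libvirt.open("qemu:///system")
    dst = libvirt.open("qemu+ssh://other-host/system")

    dom = src.lookupByName("guest01")
    dom.migrate(dst, libvirt.VIR_MIGRATE_LIVE, None, None, 0)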
Kubernetes (and Docker Swarm etc.) won't migrate instances between nodes (nor VMs), but will
create and destroy them to manage load.
At large scale (scale-out) this swarm approach is obviously better; migration creates too
much overhead.
In the home domain of the virt-stacks (scale-in), live migration is perhaps necessary,
because the application stacks aren't ready to deal with instance destruction without
service disruption, or simply because it is rare enough to be cheaper than instance
re-creation.
In the past it was more clear-cut, because there was no live-migration support for
containers, and instance creation/deletion was more of a fail-over scenario where (rare)
service disruptions were accepted. But with CRIU (and its predecessors in OpenVZ),
container migration can be done just as seamlessly as with VMs.
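To make the CRIU point concrete, the basic checkpoint/restore round trip looks roughly like
this (my own sketch; the PID and paths are placeholders, and in practice a runtime such as
runc or Podman wraps this mechanism for you):

    # Sketch: checkpoint a running process tree with CRIU and restore it later
    # (or on another host after copying the image directory).
    import os
    import subprocess

    pid = 12345                      # hypothetical PID of the container's init process
    images = "/tmp/criu-images"
    os.makedirs(images, exist_ok=True)

    subprocess.run(["criu", "dump", "-t", str(pid), "-D", images,
                    "--shell-job", "--leave-running"], check=True)

    # ... copy the image directory to the target host, then:
    subprocess.run(["criu", "restore", "-D", images, "--shell-job"], check=True)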
Today the two approaches could be combined more easily, but they don't really mix yet, and
the negotiating element between them is missing.
I can understand that a new system like k8s may look intimidating.
Just understanding the two approaches and how they mix is already filling the brain
capacity I can give them. Operating that mix is currently quite beyond me, given that it's
only a small part of my job.
Best regards,
Roman
Likewise!