Our STDCI V2 system makes extensive use of so-called 'loader' PODs.
Except for very special cases, every job we have allocates and uses such a
POD at least once. Those PODs are used for, among other things:
1. Loading Jenkins pipeline groovy code
2. Loading the scripts from our `jenkins` repo
3. Running the STDCI V2 logic to parse the YAML files and figure out
what to run on which resources
4. Rendering the STDCI V2 graphical report
The PODs are configured to require 500MiB of RAM and to run on the
zone:ci, type:vm hosts. This means they end up running on one of the
following VMs:
name                     memory
shift-n04.phx.ovirt.org  16264540Ki
shift-n05.phx.ovirt.org  16264540Ki
shift-n06.phx.ovirt.org  16264528Ki
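(If anyone wants to double-check those figures, something like the
following `oc` queries should do it - assuming zone/type are plain node
labels, and with <some-loader-pod> standing in for an actual loader POD
name:)

    # list memory capacity of the zone:ci, type:vm nodes
    oc get nodes -l zone=ci,type=vm \
        -o custom-columns=NAME:.metadata.name,MEMORY:.status.capacity.memory

    # show the node selector and memory request of one loader POD
    # (<some-loader-pod> is a placeholder, not a real POD name)
    oc get pod <some-loader-pod> \
        -o jsonpath='{.spec.nodeSelector}{"\n"}{.spec.containers[*].resources.requests.memory}{"\n"}'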
So if we make the simple calculation of how many such PODs can fit on
these ~16GiB VMs (roughly 16384MiB / 500MiB ≈ 32 PODs per VM), we come up
with a theoretical total of about 96 across the three VMs. But running a
query like the following on one of those hosts reveals that we share
those hosts with many other containers:
    oc get pods --all-namespaces \
        --field-selector=spec.nodeName=shift-n04.phx.ovirt.org,status.phase==Running
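Another quick way to see how committed a node already is (in terms of
requested, rather than actually used, memory) is the "Allocated resources"
section of the node description, e.g. something like:

    oc describe node shift-n04.phx.ovirt.org | grep -A 6 'Allocated resources'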
I suspect allocation of the loader containers is starting to become a
bottleneck. I think we might have to either increase the amount of RAM
the VMs have, or make the loader containers require less RAM. But to make
that decision we need to be able to measure a few things better. Do we
have ongoing metrics for:
- What the RAM utilization on the relevant VMs looks like
- How much RAM is actually used inside the loader containers (one way to
  get a quick snapshot of both is sketched below)
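If cluster metrics collection is enabled, the `oc adm top` commands
should at least give point-in-time numbers for both (with
<some-namespace> standing in for wherever the loader PODs actually run):

    # memory/CPU usage of the VMs themselves
    oc adm top nodes

    # memory/CPU usage of individual PODs, including the loaders
    # (<some-namespace> is a placeholder namespace)
    oc adm top pods -n <some-namespace>

For ongoing trends we would still need these numbers scraped into some
monitoring/graphing system.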
WDYT?
--
Barak Korren
RHV DevOps team, RHCE, RHCi
Red Hat EMEA
redhat.com | TRIED. TESTED. TRUSTED. |
redhat.com/trusted