Hi,

Recently I've been working on a PoC for OST that replaces
lago templates with pre-built, layered VM images packed in RPMs [2][7].


What's the motivation?

There are two big pains around OST: the first is that it's slow,
and the second is that it uses lago, which is unmaintained.


How does OST work currently?

Lago launches VMs based on templates. It has its own mechanism for VM
templating - you can find the templates that we currently use here [1]. How are these
templates created? There is a multi-page doc somewhere that describes the process,
but few people are familiar with it. These templates are nothing special really - just an
xz-compressed qcow with some metadata attached. The proposal here is to replace those
templates with RPMs with qcows inside. The RPMs themselves would be built by a CI pipeline.
An example of a pipeline like this can be found here [2].


Why RPMs?

It ticks all the boxes really. RPMs provide:
- tried and well-known mechanisms for packaging, versioning, and distribution instead
  of lago's custom ones
- dependencies, which let us layer the VM images in a controlled way
- easy adoption - we already install RPMs when running OST, so using the new ones is
  a matter of adding some dependencies


How does the image-building pipeline work? [3]

- we download a dvd iso for installation of the distro
- we use 'virt-install' with the dvd iso + kickstart file to build a 'base' layer
  qcow image
- we create another qcow image that has the 'base' image as the backing one. In this
  image we use 'virt-customize' to run 'dnf upgrade'.  This is our 'upgrade' layer.
- we create two more qcow images that have the 'upgrade' image as the backing one. On one
  of them we install the 'ovirt-host' package and on the other the 'ovirt-engine'. These are
  our 'host-installed' and 'engine-installed' layers.
- we create 4 RPMs for these qcows:
  * ost-images-base
  * ost-images-upgrade
  * ost-images-host-installed
  * ost-images-engine-installed
- we publish the RPMs to the templates.ovirt.org/yum/ DNF repository (not implemented yet)
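
To make the layering steps above a bit more concrete, here is a minimal Python sketch
that only assembles the 'qemu-img' and 'virt-customize' command lines for the non-base
layers. It is not taken from the PoC's Makefile [3] - the file names, the 'Layer' class
and the exact 'virt-customize' arguments are assumptions made for illustration.

```python
import shlex
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Layer:
    name: str                  # e.g. "upgrade"
    backing: Optional[str]     # name of the layer below, None for 'base'
    customize_args: List[str]  # extra arguments passed to virt-customize


def image_path(name: str) -> str:
    # Hypothetical file naming; the real pipeline may name images differently.
    return f"{name}.qcow2"


def build_commands(layer: Layer) -> List[List[str]]:
    """Return the commands that would create one layer on top of its backing image."""
    if layer.backing is None:
        raise ValueError("the 'base' layer is built with virt-install, not here")
    # Create an empty qcow2 overlay that uses the lower layer as its backing file.
    create = [
        "qemu-img", "create", "-f", "qcow2",
        "-b", image_path(layer.backing), "-F", "qcow2",
        image_path(layer.name),
    ]
    # Modify only the overlay; the backing image stays untouched.
    customize = ["virt-customize", "-a", image_path(layer.name)] + layer.customize_args
    return [create, customize]


# Assumed customization steps, mirroring the layers described above.
LAYERS = [
    Layer("upgrade", "base", ["--run-command", "dnf upgrade -y"]),
    Layer("host-installed", "upgrade", ["--install", "ovirt-host"]),
    Layer("engine-installed", "upgrade", ["--install", "ovirt-engine"]),
]

if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch is safe to try.
    for layer in LAYERS:
        for cmd in build_commands(layer):
            print(shlex.join(cmd))
```

Both '*-installed' overlays point at the same 'upgrade' image, which is what lets the
'upgrade' layer be built once and shared.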

Each of those RPMs holds its respective qcow image. They also have proper dependencies
set up - since the 'upgrade' layer requires the 'base' layer to be functional, it has an RPM
requirement on that package. The same goes for the '*-installed' packages, which depend on
the 'upgrade' package.
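
As a rough illustration of how these requirements play out, the sketch below models
the four packages as a small dependency map and lists what installing one of them would
pull in, bottom layer first. It is only a toy model of what DNF's resolver does for us;
the 'REQUIRES' map and the 'install_closure' helper are made up for this example.

```python
from typing import Dict, List

# Package -> packages it requires, as described in the paragraph above.
REQUIRES: Dict[str, List[str]] = {
    "ost-images-base": [],
    "ost-images-upgrade": ["ost-images-base"],
    "ost-images-host-installed": ["ost-images-upgrade"],
    "ost-images-engine-installed": ["ost-images-upgrade"],
}


def install_closure(package: str) -> List[str]:
    """List the package and everything it requires, lowest layer first."""
    ordered: List[str] = []

    def visit(name: str) -> None:
        # Visit requirements first so backing layers come before their overlays.
        for dep in REQUIRES[name]:
            visit(dep)
        if name not in ordered:
            ordered.append(name)

    visit(package)
    return ordered


print(install_closure("ost-images-engine-installed"))
# ['ost-images-base', 'ost-images-upgrade', 'ost-images-engine-installed']
```

So asking for a single '*-installed' package is enough to get the whole backing chain
onto the machine.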

Since this is only a PoC, there's still a lot of room for improvement around the pipeline.
The 'base' RPM would actually be built very rarely, since it's a bare distro, while the
'upgrade' and '*-installed' RPMs would be built nightly. This would allow us to simply
type 'dnf upgrade' on any machine and have a fresh set of VMs ready to be used with OST.


Advantages:

- we have CI for building OST images instead of the current, obscure template-creation process
- we get rid of lots of unnecessary preparations that are done during each OST run
  by moving stuff from 'deploy scripts' [4] to image-building pipeline - this should
  speed up the runs a lot
- if the nightly pipeline for building images is not successful, the RPMs won't be
  published - OST will use the older ones. This makes a nice "early error detection"
  mechanism and can partially mitigate situations where everything is blocked because
  of, e.g., dependency issues.
- it's another step towards removing responsibilities from lago
- the pre-built VM images can be used for much more than OST - functional testing of
  vdsm/engine on a VM? We have an image for that
- we can build images for multiple distros, both u/s and d/s, easily


Caveats:

- we have to download the RPMs before running OST and that takes time, since they're big.
  This can be handled by having them cached on the CI slaves though.
- current limitations of CI and lago force us to make a copy of the images after
  installation so they can be seen both by the processes in the chroot and by libvirt, which
  is running outside of the chroot. Right now they're placed in '/dev/shm' (which would
  actually make some sense if they could be shared among all OST runs on the slave, but
  that's another story). There are some possible workarounds for that problem too (like
  running pipelines on bare metal machines with libvirt running inside the chroot)
- multiple qcow layers can slow down the runs because there's a lot of jumping around
  between the images in the backing chain. This can be handled by, e.g., introducing a
  meta package that squashes all the layers into one.
- we need a way to run OST with custom-built artifacts. There are multiple ways we can
  approach it:
  * use 'upgrade' layer and not '*-installed' one
  * first build your artifacts, then build VM image RPMs that have your artifacts
    installed and pass those RPMs to OST run
  * add 'ci build vms' that will do both ^^^ steps for you
  Even here we can still benefit from using pre-built images - we can create
  a 'deps-installed' layer that sits between 'upgrade' and '*-installed' and contains
  all vdsm's/engine's dependencies.
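
Regarding the caveat about multiple qcow layers: one possible way to squash a chain is
'qemu-img convert', which by default writes a standalone image without a backing file.
The sketch below only builds the command lines; the file names are hypothetical and how
a meta package would actually invoke this is not decided anywhere in the PoC.

```python
from typing import List


def squash_command(top_image: str, flat_image: str) -> List[str]:
    """Command that copies a layered image into a single standalone qcow2."""
    # 'qemu-img convert' reads through the backing chain of top_image and,
    # unless told otherwise, produces an output image with no backing file.
    return ["qemu-img", "convert", "-O", "qcow2", top_image, flat_image]


def chain_command(image: str) -> List[str]:
    """Command that prints every image in the backing chain, for inspection."""
    return ["qemu-img", "info", "--backing-chain", image]


if __name__ == "__main__":
    # Hypothetical file names, used only for illustration.
    print(" ".join(chain_command("engine-installed.qcow2")))
    print(" ".join(squash_command("engine-installed.qcow2", "engine-flat.qcow2")))
```

The trade-off is size: a squashed image carries its own copy of the 'base' and 'upgrade'
data, so we'd be trading disk space and download time for simpler I/O.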


Some numbers

So let's take a look at two OST runs - one that uses the old way of working [5]
and one that uses the new pre-built VM images [6]. The hacky change that allows us to
use the pre-built images is here [7]. Here are some running times:

- chroot init: 00:34 for the old way vs 14:03 for pre-built images

This happens because the slave didn't have the new RPMs and chroot cached, so it took a lot
of time to download them - the RPMs are ~2GB currently. Once they are available
in the cache, this should get much closer to the old-way timing.

- deployment times:
  * engine 08:09 for the old way vs 03:31 for pre-built images
  * host-1 05:05 for the old way vs 02:00 for pre-built images

Here we can clearly see the benefits. This is without any special fine tuning really -
even when using pre-built images there's still some deployment done, which can be moved
to the image-building pipeline.
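
For anyone who prefers seconds over mm:ss, here is a tiny Python snippet that converts
the timings quoted above and prints the difference per step. The step labels are mine;
the numbers are the ones from the two runs [5][6].

```python
def to_seconds(mmss: str) -> int:
    """Convert a 'MM:SS' string to a number of seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


# step -> (old way, pre-built images), as reported above
TIMINGS = {
    "chroot init": ("00:34", "14:03"),
    "engine deploy": ("08:09", "03:31"),
    "host-1 deploy": ("05:05", "02:00"),
}

for step, (old, new) in TIMINGS.items():
    old_s, new_s = to_seconds(old), to_seconds(new)
    print(f"{step}: {old_s}s -> {new_s}s ({new_s - old_s:+d}s)")

# chroot init: 34s -> 843s (+809s)
# engine deploy: 489s -> 211s (-278s)
# host-1 deploy: 305s -> 120s (-185s)
```

Both deployment steps take well under half the time they used to, while the uncached
chroot init costs an extra 809 seconds, which is why caching the RPMs matters so much.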


Further improvements

We could probably get rid of all the funny custom repository stuff that we're
doing right now, because we can put everything that's necessary into the pre-built VM images.

We can ship the images with an ssh key already injected - currently lago injects an ssh
key for the root user in each run, which requires selinux relabeling, which takes a lot
of time.

We can try creating 'ovirt-deployed' images, where the whole ovirt solution would
already be deployed for some tests.

WDYT?

Regards, Marcin

[1] https://templates.ovirt.org/repo/
[2] https://gerrit.ovirt.org/#/c/108430/
[3] https://gerrit.ovirt.org/#/c/108430/6/ost-images/Makefile.am
[4] https://github.com/oVirt/ovirt-system-tests/tree/master/common/deploy-scripts
[5] https://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-tests_manual/6793/consoleFull
[6] https://jenkins.ovirt.org/job/ovirt-system-tests_standard-check-patch/9027/consoleFull
[7] https://gerrit.ovirt.org/#/c/108610/