On Mon, Apr 27, 2020, 17:15 Marcin Sobczyk <msobczyk(a)redhat.com> wrote:
>
> Hi,
>
> recently I've been working on a PoC for OST that replaces the usage
> of lago templates with pre-built, layered VM images packed in RPMs [2][7].
>
>
> What's the motivation?
>
> There are two big pains around OST - the first is that it's slow
> and the second is that it uses lago, which is unmaintained.
>
>
> How does OST currently work?
>
> Lago launches VMs based on templates. It actually has its own mechanism for VM
> templating - you can find the ones that we currently use here [1]. How are these
> templates created? There is a multi-page doc somewhere that describes the process,
> but few are familiar with it. These templates are nothing special really - just an
> xz-compressed qcow with some metadata attached. The proposal here is to replace
> those templates with RPMs that have qcows inside. The RPMs themselves would be
> built by a CI pipeline. An example of such a pipeline can be found here [2].
>
>
> Why RPMs?
>
> It ticks all the boxes really. RPMs provide:
> - tried and well-known mechanisms for packaging, versioning, and distribution
> instead of lago's custom ones
> - dependencies, which let us layer the VM images in a controllable way
> - we already install RPMs when running OST, so using the new ones is a matter of
> adding some dependencies
>
>
> How does the image-building pipeline work? [3]
>
> - we download a DVD ISO for installing the distro
> - we use 'virt-install' with the DVD ISO + a kickstart file to build a 'base'
> layer qcow image
> - we create another qcow image that has the 'base' image as its backing file.
> In this image we use 'virt-customize' to run 'dnf upgrade'. This is our
> 'upgrade' layer.
> - we create two more qcow images that have the 'upgrade' image as their backing
> file. On one of them we install the 'ovirt-host' package and on the other
> 'ovirt-engine'. These are our 'host-installed' and 'engine-installed' layers.
> - we create 4 RPMs for these qcows:
> * ost-images-base
> * ost-images-upgrade
> * ost-images-host-installed
> * ost-images-engine-installed
> - we publish the RPMs to the templates.ovirt.org/yum/ DNF repository (not
> implemented yet)
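>
> To make the flow more concrete, here's a rough sketch of the commands involved
> (names, sizes, the ISO and the kickstart file are just placeholders, not the
> actual pipeline code):
>
>   # 'base' layer: unattended install from the DVD ISO using a kickstart file
>   virt-install --name ost-base --memory 2048 --vcpus 2 \
>     --disk path=base.qcow2,size=10,format=qcow2 \
>     --location CentOS-8-dvd1.iso \
>     --initrd-inject ks.cfg --extra-args "inst.ks=file:/ks.cfg console=ttyS0" \
>     --graphics none --noreboot
>
>   # 'upgrade' layer: a thin qcow backed by 'base', updated with dnf
>   qemu-img create -f qcow2 -b base.qcow2 -F qcow2 upgrade.qcow2
>   virt-customize -a upgrade.qcow2 --run-command 'dnf upgrade -y'
>
>   # '*-installed' layers: thin qcows backed by 'upgrade'
>   qemu-img create -f qcow2 -b upgrade.qcow2 -F qcow2 host-installed.qcow2
>   virt-customize -a host-installed.qcow2 --install ovirt-host
>   qemu-img create -f qcow2 -b upgrade.qcow2 -F qcow2 engine-installed.qcow2
>   virt-customize -a engine-installed.qcow2 --install ovirt-engine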
>
> Each of those RPMs holds its respective qcow image. They also have proper
> dependencies set up - since the 'upgrade' layer requires the 'base' layer to be
> functional, it has an RPM requirement on that package. The same goes for the
> '*-installed' packages, which depend on the 'upgrade' package.
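>
> For illustration, the dependency chain in the spec could look roughly like this
> (a simplified sketch assuming one spec with subpackages, not the actual spec
> file):
>
>   # the 'upgrade' image is useless without its backing 'base' image
>   %package upgrade
>   Requires: ost-images-base
>
>   # and the '*-installed' images are backed by the 'upgrade' one
>   %package host-installed
>   Requires: ost-images-upgrade
>
>   %package engine-installed
>   Requires: ost-images-upgrade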
>
> Since this is only a PoC there's still a lot of room for improvement around the
> pipeline. The 'base' RPM would actually be built very rarely, since it's a bare
> distro, while the 'upgrade' and '*-installed' RPMs would be built nightly. This
> would allow us to simply type 'dnf upgrade' on any machine and have a fresh set
> of VMs ready to be used with OST.
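>
> Once that repository is in place, refreshing the images on a slave or a dev
> machine would boil down to something like:
>
>   dnf install ost-images-host-installed ost-images-engine-installed
>   # ...and later, to pick up the nightly builds:
>   dnf upgrade 'ost-images-*'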
>
>
> Advantages:
>
> - we have CI for building OST images instead of the current, obscure template
> creation process
> - we get rid of lots of unnecessary preparations that are done during each OST run
> by moving stuff from the 'deploy scripts' [4] to the image-building pipeline -
> this should speed up the runs a lot
> - if the nightly pipeline for building images is not successful, the RPMs won't be
> published and OST will keep using the older ones. This makes a nice "early error
> detection" mechanism and can partially mitigate situations where everything is
> blocked because of e.g. dependency issues
> - it's another step towards removing responsibilities from lago
> - the pre-built VM images can be used for much more than OST - functional testing of
> vdsm/engine on a VM? We have an image for that
> - we can build images for multiple distros, both u/s and d/s, easily
>
>
> Caveats:
>
> - we have to download the RPMs before running OST and that takes time, since
> they're big. This can be handled by having them cached on the CI slaves though.
> - current limitations of CI and lago force us to make a copy of the images after
> installation so they can be seen both by the processes in the chroot and by
> libvirt, which runs outside of the chroot. Right now they're placed in '/dev/shm'
> (which would actually make some sense if they could be shared among all OST runs
> on the slave, but that's another story). There are some possible workarounds for
> that problem too (like running pipelines on bare metal machines with libvirt
> running inside the chroot)
> - multiple qcow layers can slow down the runs because there's a lot of jumping
> around. This can be handled by e.g. introducing a meta package that squashes all
> the layers into one (see the sketch after this list).
> - we need a way to run OST with custom-built artifacts. There are multiple ways we
> can approach it:
> * use the 'upgrade' layer and not the '*-installed' one
> * first build your artifacts, then build VM image RPMs that have your artifacts
> installed and pass those RPMs to the OST run
> * add 'ci build vms' that will do both ^^^ steps for you
> Even here we can still benefit from using pre-built images - we can create
> a 'deps-installed' layer that sits between 'upgrade' and '*-installed' and
> contains all vdsm's/engine's dependencies.
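>
> As for squashing the layers, the meta package could e.g. ship a flattened image
> produced with something along these lines (illustrative only):
>
>   # collapse the whole backing chain into one standalone qcow
>   qemu-img convert -O qcow2 engine-installed.qcow2 engine-installed-flat.qcow2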
>
>
> Some numbers
>
> So let's take a look at two OST runs - one that uses the old way of working [5]
> and one that uses the new pre-built VM images [6]. The hacky change that allows
> us to use the pre-built images is here [7]. Here are some running times:
>
> - chroot init: 00:34 for the old way vs 14:03 for pre-built images
>
> This happens because the slave didn't have the new RPMs and chroot cached, so it
> took a lot of time to download them - the RPMs are ~2GB currently. Once they are
> available in the cache, this will get much closer to the old-way timing.
>
> - deployment times:
> * engine 08:09 for the old way vs 03:31 for pre-built images
> * host-1 05:05 for the old way vs 02:00 for pre-built images
>
> Here we can clearly see the benefits. This is without any special fine-tuning
> really - even when using pre-built images there's still some deployment being
> done, which can be moved to the image-creating pipeline.
>
>
> Further improvements
>
> We could probably get rid of all the funny custom repository stuff that we're
> doing right now, because we can put everything that's necessary into the
> pre-built VM images.
>
> We can ship the images with an ssh key already injected - currently lago injects
> an ssh key for the root user on each run, which requires SELinux relabeling,
> which takes a lot of time.
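>
> Baking the key in could be as simple as something like this in the image-building
> pipeline (the key path is just an example), so the relabeling cost is paid once at
> build time instead of on every run:
>
>   virt-customize -a base.qcow2 \
>     --ssh-inject root:file:ost_id_rsa.pub \
>     --selinux-relabel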
>
> We can try creating 'ovirt-deployed' images, where the whole oVirt solution would
> already be deployed for some tests.
>
> WDYT?
We should not reinvent packer.io. It's bad enough we're reinventing Vagrant with
Lago.