ngn build jobs take more than twice (2x) as long as in the last few days

Eyal Edri eedri at redhat.com
Wed May 25 15:21:13 UTC 2016


On Wed, May 25, 2016 at 6:17 PM, David Caro <dcaro at redhat.com> wrote:

> On 05/25 17:06, David Caro wrote:
> > On 05/25 16:09, Barak Korren wrote:
> > > On 25 May 2016 at 14:52, David Caro <dcaro at redhat.com> wrote:
> > > > On 05/25 14:42, Barak Korren wrote:
> > > >> On 25 May 2016 at 12:44, Eyal Edri <eedri at redhat.com> wrote:
> > > >> > OK,
> > > >> > I suggest testing with a VM on a local disk (preferably on a
> > > >> > host with an SSD configured); if it works, let's expedite
> > > >> > moving all VMs, or at least a large number of them, to it
> > > >> > until we see the network load reduced.
> > > >> >
> > > >>
> > > >> This is not that easy: oVirt doesn't support mixing local disks and
> > > >> shared storage in the same cluster, so we would need to move hosts
> > > >> to a new cluster for this. We would also lose the ability to use
> > > >> templates, or else have to create the templates on each and every
> > > >> local disk.
> > > >>
> > > >> The scratch disk is a good solution for this, where you can have the
> > > >> OS image on the central storage and the ephemeral data on the local
> > > >> disk.
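Just to make the scratch-disk idea concrete, a minimal sketch of what it could look like inside a slave - the device name and mount point below are made-up placeholders:

    # hypothetical second, locally-backed disk exposed to the VM as /dev/vdb
    mkfs.ext4 /dev/vdb
    mkdir -p /home/jenkins/workspace
    mount /dev/vdb /home/jenkins/workspace   # ephemeral build data lands here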
> > > >>
> > > >> WRT the storage architecture - a single huge (10.9T) ext4
> > > >> filesystem is used on top of the DRBD device. This is probably not
> > > >> the most efficient thing one can do (XFS would probably have been
> > > >> better, and RAW over iSCSI even better).
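For illustration only, reformatting the DRBD-backed filesystem with XFS would be something along these lines - /dev/drbd0 is an assumed device name, and the data would of course have to be migrated off first:

    umount /srv/ovirt_storage
    mkfs.xfs -f /dev/drbd0                                 # assumed backing device
    mount -o inode64,noatime /dev/drbd0 /srv/ovirt_storage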
> > > >
> > > > That was done >3 years ago; XFS was not as stable, widely used or
> > > > well supported back then.
> > > >
> > > AFAIK it pre-dates EXT4
> >
> > It does, but on el6 it performed much worse and had more bugs (according
> > to the reviews at the time).
> >
> > > In any case, this does not detract from the fact that the current
> > > configuration is not as efficient as we could make it.
> > >
> >
> > It does not. I agree we should focus on what we can do from now on, not
> > on what should have been done back then.
> >
> > >
> > > >>
> > > >> I'm guessing that those 10.9TB do not come from a single disk but
> > > >> from a hardware RAID of some sort. In that case, deactivating the
> > > >> hardware RAID and re-exposing the disks as multiple separate iSCSI
> > > >> LUNs (which are then re-joined into a single storage domain in
> > > >> oVirt) would let different VMs work concurrently on different
> > > >> disks. This should lower the per-VM storage latency.
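To illustrate the idea, a rough sketch of exporting individual disks as separate LUNs with tgtd - the IQN and the /dev/sdX names here are made up, and oVirt would then discover the target and join the LUNs into one storage domain:

    tgtadm --lld iscsi --op new --mode target --tid 1 \
           -T iqn.2016-05.org.ovirt.infra:jenkins-storage
    tgtadm --lld iscsi --op new --mode logicalunit --tid 1 --lun 1 -b /dev/sdb
    tgtadm --lld iscsi --op new --mode logicalunit --tid 1 --lun 2 -b /dev/sdc
    tgtadm --lld iscsi --op bind --mode target --tid 1 -I ALL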
> > > >
> > > > That would get rid of the DRBD too; it's a totally different setup,
> > > > built from scratch (no NFS either).
> > >
> > > We can and should still use DRBD, just set up a device for each disk.
> > > But yeah, NFS should probably go away.
> > > (We are seeing dramatically better performance with iSCSI on
> > > integration-engine.)
> >
> > Then I don't understand what you said about splitting the hardware
> > RAIDs; do you mean to set up one DRBD device on top of each hard drive
> > instead?
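If that is the idea, it would presumably look roughly like one DRBD resource per physical disk; a minimal sketch, where the hostnames, addresses and device names are placeholders:

    # /etc/drbd.d/jenkins-disk1.res (one such resource per physical disk)
    resource jenkins-disk1 {
        device    /dev/drbd1;
        disk      /dev/sdb;
        meta-disk internal;
        on storage-a { address 192.0.2.1:7789; }
        on storage-b { address 192.0.2.2:7789; }
    }

    # then: drbdadm create-md jenkins-disk1 && drbdadm up jenkins-disk1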
>
>
> Though I really think we should move to Gluster/Ceph instead for the
> Jenkins VMs - does anyone know the current status of the hyperconverged
> setup?
>
>
I don't think either Gluster or hyperconverged is stable enough yet to move
all of our production infra onto.
Hyperconverged is also not supported in oVirt yet (it might be a 4.x
feature).


> That would give us better, more scalable distributed storage and make
> proper use of the hosts' local disks (we currently have more space on the
> combined hosts than on the storage servers).
>

I agree a stable distributed storage solution is the way to go if we can
find one :)


>
> >
> >
> > BTW, I think the NFS is also used for more than just the engine storage
> > domain (just keep in mind that this has to be checked if we are going to
> > get rid of it).
> >
> > >
> > > >
> > > >>
> > > >> Looking at the storage machine, I see strong indications that it is
> > > >> IO-bound - the load average is ~12 while there are just 1-5 running
> > > >> processes, and the CPU is ~80% idle with the rest spent in IO wait.
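For anyone who wants to double-check, these are the usual tools for confirming that kind of IO-bound picture (nothing oVirt-specific, just standard utilities run on the storage machine):

    uptime          # load average
    vmstat 5        # the 'wa' column is time spent waiting on IO
    iostat -dxm 5   # per-device utilization and await times (sysstat package)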
> > > >>
> > > >> Running 'du *' at:
> > > >>
> /srv/ovirt_storage/jenkins-dc/658e5b87-1207-4226-9fcc-4e5fa02b86b4/images
> > > >> one can see that most images are ~40G in size (that is a _real_
> > > >> 40G, not sparse!). This means that despite most VMs being created
> > > >> from templates, they are full template copies rather than COW
> > > >> clones.
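A quick way to tell a full copy from a COW clone, in case anyone wants to re-check (the image path is a placeholder):

    du -sh --apparent-size <image>   # logical size
    du -sh <image>                   # space actually allocated on disk
    qemu-img info <image>            # a COW clone shows a "backing file:" line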
> > > >
> > > > It should not be like that; maybe the templates are wrongly
> > > > configured? Or the Foreman images?
> > >
> > > This is the expected behaviour when creating a VM from template in the
> > > oVirt admin UI. I thought Foreman might behave differently, but it
> > > seems it does not.
> > >
> > > This behaviour is determined by the parameters you pass to the engine
> > > API when instantiating a VM, so it most probably doesn't have anything
> > > to do with the template configuration.
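If I remember the REST API correctly, POSTing to /ovirt-engine/api/vms with just a template reference gives you a thin (COW) VM, and it is <disks><clone>true</clone></disks> that forces a full copy - roughly like this, where the VM/cluster/template names and the engine URL are made-up examples:

    curl -k -u admin@internal:PASSWORD -H 'Content-Type: application/xml' \
         -d '<vm>
               <name>jenkins-slave-01</name>
               <cluster><name>production</name></cluster>
               <template><name>el7-jenkins-slave</name></template>
             </vm>' \
         https://engine.example.com/ovirt-engine/api/vms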
> >
> > So maybe a misconfiguration in foreman?
> >
> > >
> > > >
> > > >> What this means is that using pools (where all VMs are COW copies of
> > > >> the single pool template) is expected to significantly reduce the
> > > >> storage utilization and therefore the IO load on it (the less you
> > > >> store, the less you need to read back).
> > > >
> > > > That should happen without pools too, with normal qcow templates.
> > >
> > > Not unless you create all the VMs via the API and pass the right
> > > parameters. Pools are the easiest way to ensure you never mess that
> > > up...
> >
> > That was the idea
> >
> > >
> > > > And in any case, that will not lower the normal IO when we are not
> > > > actually creating VMs, as any read and write will still hit the disk
> > > > anyhow; it only alleviates the IO when creating new VMs.
> > >
> > > Since you are reading the same bits over and over (for different VMs),
> > > you let the various buffer caches along the way (in the storage
> > > machines and in the hypervisors) do what they are supposed to do.
> >
> >
> > Once the VM is started, most of what it needs is in RAM, so there are
> > not that many reads from disk unless you start writing to it - and
> > that's mostly what we are hitting: lots of writes.
> >
> > >
> > > > The local disk (scratch disk) is the best option
> > > > imo, now and for the foreseeable future.
> > >
> > > This is not an either/or thing, IMO we need to do both.
> >
> > I think it's far more useful, because it will solve our current issues
> > faster and for longer, so IMO it should get more attention sooner.
> >
> > Any improvement that does not remove the current bottleneck is not
> > really adding any value to the overall infra (even if it might become
> > valuable later).
> >
> > >
> > > --
> > > Barak Korren
> > > bkorren at redhat.com
> > > RHEV-CI Team
> >
>
>
>
> --
> David Caro
>
> Red Hat S.L.
> Continuous Integration Engineer - EMEA ENG Virtualization R&D
>
> Tel.: +420 532 294 605
> Email: dcaro at redhat.com
> IRC: dcaro|dcaroest@{freenode|oftc|redhat}
> Web: www.redhat.com
> RHT Global #: 82-62605
>



-- 
Eyal Edri
Associate Manager
RHEV DevOps
EMEA ENG Virtualization R&D
Red Hat Israel

phone: +972-9-7692018
irc: eedri (on #tlv #rhev-dev #rhev-integ)