Infra issu retrospective
eedri at redhat.com
Wed Jan 22 16:05:29 UTC 2014
----- Original Message -----
> From: "R P Herrold" <herrold at owlriver.com>
> To: "oVirt infrastructure ML" <infra at ovirt.org>
> Sent: Wednesday, January 22, 2014 5:35:31 PM
> Subject: Infra issu retrospective
> for the weekly sync, I see the following matters
> I was absent Monday for an appt, and do not see an email with
i was absent as well, but i think a meeting was held;
maybe the summary wasn't sent. kiril/dcaro?
> Prior week was skipped because of member availability issues
> as well.
> So this is a summary from the list traffic for the last few
> In no particular order:
> - Kimchi asks for jenkins coverage #105
you mixed up 2 requests:
#105 Jenkins server for oVirt Kimchi incubator project
personally, i'm not familiar with Kimchi, but our resources on ovirt are very limited right now
(both physical resources like servers/storage/etc. and especially human resources, which right now means
mostly dcaro handling multiple jenkins failures caused by infra issues).
if they are willing to pitch in with resources such as hosts and people to support jenkins failures,
we can consider integrating them into jenkins.ovirt.org; otherwise i think we can mostly give support
#107 Enable coverage report during vdsm unit and functional tests,
again, this will be handled after most of the infra issues are resolved,
unless one of the vdsm team's power users is ready to take this on.
> - ditto standing up an Ubuntu test instance was requested
at first we thought of adding a minidell for this, but due to the lack of resources
for running findbugs/other per-patch tests it was decided to allocate it to fedora/centos for now.
we can reinstall one of the rackspace vms for that.
> - Disk space issues on lists were hit on a transient basis
this is a well-known, painful issue, and i think we should address it ASAP,
either by purchasing a storage server running SSDs from softlayer or by expanding the 50GB disk we have now
on linode (how much does it cost to add a 100-200 GB disk there?)
> - I have observed wink outages on gerrit, and lists of less
> than an hour's duration
gerrit is a major issue that we suffer from on an almost daily basis. i'm not sure if it's caused by the slow tlv network
or if the VM itself needs to migrate to a much stronger infra with high-availability features.
> - Linode PTR and, it turns out, A and AAAA records have not
> proceeded, as the request was being 'sat on'
> This is really needed to solve an email filtering issue at
> Comcast, and one assumes other ISPs. They also examine this data,
> along with _SPF TXT records.
> DNS management is weak as responsibility and capability to
> solve are not unified here
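for whoever ends up owning the DNS request, here is a minimal sketch of the kind of forward-confirmed reverse DNS check that mail receivers are commonly understood to apply: the PTR record for the sending IP should name a host whose A/AAAA records point back to that same IP. this is an illustration under assumptions, not what Comcast actually runs. the function names and the example hostname are hypothetical, and SPF TXT records are not covered because the Python standard library has no TXT lookup.

```python
import ipaddress
import socket


def reverse_name(ip):
    """Return the PTR query name for an IPv4 or IPv6 address."""
    return ipaddress.ip_address(ip).reverse_pointer


def _system_ptr(ip):
    # Ask the system resolver for the PTR record of the address.
    return socket.gethostbyaddr(ip)[0]


def _system_addrs(host):
    # Collect every A/AAAA address the system resolver returns for the host.
    return {info[4][0] for info in socket.getaddrinfo(host, None)}


def forward_confirmed(ip, ptr_lookup=_system_ptr, addr_lookup=_system_addrs):
    """True if the PTR name of `ip` resolves back to `ip`.

    The lookup functions are injectable so the logic can be checked
    without touching the network.
    """
    target = ipaddress.ip_address(ip)
    try:
        host = ptr_lookup(ip)
        addrs = addr_lookup(host)
    except OSError:
        # socket.herror and socket.gaierror are OSError subclasses:
        # a missing PTR or missing A/AAAA record counts as a failure.
        return False
    # Drop any IPv6 zone suffix ("%eth0") before comparing addresses.
    return any(ipaddress.ip_address(a.split("%")[0]) == target for a in addrs)
```

calling `forward_confirmed("<list server IP>")` with the defaults does live lookups through the local resolver, so the result depends on where it is run; passing stub lookup functions makes it possible to check the logic offline.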
> - the gerrit is sluggish for unknown reasons doing version
> control CO's (several reports)
> ... possible BW limitations on some link paths?
possibly since the upgrade to 2.8?
> - jenkins got a 'just in case' reboot last week, but no root
> cause analysis was performed
i believe i know the reason for the reboot: jobs were getting stuck and running for hours.
dcaro solved this by finding the root cause: the findbugs job was running for more than one hour
due to an option enabled on the job that compares against older build results. disabling it reduced the run time to 15 min.
> Personally I am building a 'knock-off' iscsi and NFS unit,
> based on the QNAP doco and git content, for oVirt testing
> locally, ... particularly performance timing trials
all in all, we're suffering from limited jenkins infra, due to (i believe) a slow network connection to the tlv office (mini dells)
and high load on the rackspace servers (which we are scheduled to migrate from).
the decision on migrating to softlayer is on hold due to limited budget and open questions about the optimal layout.
i'll try to bring up a suggestion at the next meeting.
> .-- -... ---.. ... -.- -.--
> Copyright (C) 2014 R P Herrold
> herrold at owlriver.com
> My words are not deathless prose,
> but they are mine.
> Infra mailing list
> Infra at ovirt.org