This is IMO misdiagnosing the issue - the problem is not a failed
mirror - the problem is a 404 when fetching a metadata file from a
mirror that was updated, because you have a stale repomd.xml file in
your local cache. Another mirror will not help there because it would
probably be updated as well.
I do not think there is anything wrong with mirrors occasionally failing.
Nobody promised that they will always be working. That is exactly why
all repos provide a large network of distributed mirrors, and why yum
has this failover functionality, so it is able to find a working mirror
in the list. If we do have a stale repomd.xml somewhere, then that
should not happen and it is a bug in our system. We need to invalidate
it.
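To make the failover point concrete, here is a minimal sketch of the idea of walking a mirror list until one mirror answers. It is not yum's actual code: the function name, the exception class, the `fetch` callable and the example.* hosts are all hypothetical and only there for illustration.

```python
from typing import Callable, Iterable


class AllMirrorsFailed(Exception):
    """Raised when no mirror in the list could serve the requested path."""


def fetch_with_failover(
    mirrors: Iterable[str],
    path: str,
    fetch: Callable[[str], bytes],
) -> bytes:
    """Try each mirror in order and return the first successful response.

    `fetch` is a hypothetical callable that downloads one URL and raises
    OSError on failure (urllib.error.URLError is a subclass of OSError).
    """
    errors = []
    for base in mirrors:
        url = base.rstrip("/") + "/" + path.lstrip("/")
        try:
            return fetch(url)
        except OSError as exc:
            # Remember why this mirror failed, then move on to the next one.
            errors.append((url, exc))
    raise AllMirrorsFailed(errors)


if __name__ == "__main__":
    def fake_fetch(url: str) -> bytes:
        # Stand-in for a real HTTP client: only the second mirror works.
        if url.startswith("http://mirror-b.example"):
            return b"<repomd/>"
        raise OSError("404 Not Found: " + url)

    mirrors = ["http://mirror-a.example/repo", "http://mirror-b.example/repo"]
    print(fetch_with_failover(mirrors, "repodata/repomd.xml", fake_fetch))
```

In this sketch a single broken mirror costs one failed request and nothing more; only when every mirror in the list fails does the caller see an error.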
Also Fedora and most other mirrors use pull (the only exceptions I am
aware of are the Debian and Ubuntu security mirrors, which use push for
speed reasons, but they are atomic), so they are not updated all at
once. If needed, we can open this thread on some relevant mailing list
to confirm or deny my assumptions.
You could also solve it by running 'yum clean' all the time, but that
would severely slow things down.
I am not very good at yum/rpm stuff, but how did it work on my normal
Fedora system without me executing any yum cleans?
The best solution is IMO to have our own "stable" mirror that _never_
changes while jobs are running.
Why would our mirror be any more stable? We already have over 100 mirrors
on the internet, so why do we think that our mirror will be any more
stable than those? We already developed repoproxy, which is that "stable"
mirror you are looking for, and as we see it occasionally fails. And that
is normal. Every single point that you use will fail. The only way to
provide resiliency is to be able to get rid of a central point.
--
Anton Marchukov
Senior Software Engineer - RHEV CI - Red Hat