ovirt tests failing on missing libxml2-python

2016-06-24 17:28:17,843 DEBUG [org.ovirt.otopi.dialog.MachineDialogParser] (VdsDeploy) [796140d9] Got: ***L:ERROR Failed to execute stage 'Package installation': [u'libxml2-python-2.9.1-6.el7_2.3.x86_64 requires libxml2 = 2.9.1-6.el7_2.3']
2016-06-24 17:28:17,844 DEBUG [org.ovirt.otopi.dialog.MachineDialogParser] (VdsDeploy) [796140d9] nextEvent: Log ERROR Failed to execute stage 'Package installation': [u'libxml2-python-2.9.1-6.el7_2.3.x86_64 requires libxml2 = 2.9.1-6.el7_2.3']
2016-06-24 17:28:17,860 ERROR

This is failing the 3.6, 4.0 and master tests. Does anyone know of a recent dependency change in vdsm or another host-level package?

--
Eyal Edri
Associate Manager, RHEV DevOps
EMEA ENG Virtualization R&D
Red Hat Israel
phone: +972-9-7692018
irc: eedri (on #tlv #rhev-dev #rhev-integ)
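The failing requirement is an exact-version lock: libxml2-python requires the libxml2 base package at precisely the same version-release, so a synced repo that carries one package without the other can never satisfy the transaction. A minimal sketch of a completeness check over a synced repo directory (the function name and glob patterns are illustrative, not part of any oVirt tooling):

```shell
# check_pair REPO_DIR - report whether a synced repo directory carries
# both libxml2 and its python sub-package (an exact-version pair).
check_pair() {
    local dir=$1 base py
    # libxml2-[0-9]* matches the base package but not libxml2-python-*
    base=$(ls "$dir"/libxml2-[0-9]*.rpm 2>/dev/null | head -n 1)
    py=$(ls "$dir"/libxml2-python-*.rpm 2>/dev/null | head -n 1)
    if [ -n "$base" ] && [ -n "$py" ]; then
        echo "ok"
    else
        echo "incomplete: base=${base:-MISSING} python=${py:-MISSING}"
    fi
}
```

Run against the repo store that fed the failing run above, this would presumably have reported the python sub-package as MISSING.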

On 24 Jun 2016 at 11:44 PM, "Eyal Edri" <eedri@redhat.com> wrote:
> 2016-06-24 17:28:17,843 DEBUG [org.ovirt.otopi.dialog.MachineDialogParser] (VdsDeploy) [796140d9] Got: ***L:ERROR Failed to execute stage 'Package installation': [u'libxml2-python-2.9.1-6.el7_2.3.x86_64 requires libxml2 = 2.9.1-6.el7_2.3']

This is a repository refresh issue. Maybe a corrupted mirror.

Barak fixed it with [1]. It seems we were missing libxml2-python from reposync. Which brings us to the thought of maybe dropping the reposync from the Jenkins jobs, both to avoid such errors and to speed up the jobs. If we see a lot of errors coming from the change we can consider reverting, but I think adding this as an option should be good.

A Lago issue will be opened on it (I think the logic is inside the lago-ovirt plugin).

[1] https://gerrit.ovirt.org/#/c/59771/
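For context, a reposync-driven CI repo of this kind typically pins an explicit `includepkgs` whitelist in its .repo file, which is presumably the kind of list the missing package had to be added to by hand; the actual change is in [1]. A purely illustrative fragment, not the real oVirt CI config:

```shell
# Hypothetical reposync config - the real oVirt CI config differs.
cat > /tmp/ci-reposync.repo <<'EOF'
[centos-updates-el7]
name=CentOS 7 updates (illustrative CI mirror)
baseurl=http://mirror.centos.org/centos/7/updates/x86_64/
enabled=1
# The whitelist that had to gain libxml2-python:
includepkgs=libxml2 libxml2-python
EOF
# Sync only the whitelisted packages into the local repo store
# (reposync is part of yum-utils; commented out since it hits the network):
# reposync -c /tmp/ci-reposync.repo -r centos-updates-el7 -p /var/lib/reposync -n
```

The downside of a whitelist like this is exactly what the thread shows: any newly introduced dependency fails the deploy until someone adds it manually.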

On Sun, Jun 26, 2016 at 4:29 PM, Eyal Edri <eedri@redhat.com> wrote:
> Barak fixed it with [1]. It seems we were missing libxml2-python from reposync. Which brings us to the thought of maybe dropping the reposync from the Jenkins jobs, both to avoid such errors and to speed up the jobs.

It means that packages will be fetched EVERY time from outside, which may be slow(er).
Y.

> It means that packages will be fetched EVERY time from outside, which may be slow(er). Y.

We can (and mostly already have) set up simple caches to prevent that. AFAIK the CI slaves are cleaned every time anyway, so in practice there wouldn't be much difference, except that we would have less hard-coding and perhaps be more efficient (are we certain we only download what we need atm?). The existing solution looks more like premature optimization gone bad, IMO.

--
Barak Korren
bkorren@redhat.com
RHEV-CI Team

On Mon, Jun 27, 2016 at 9:45 AM, Barak Korren <bkorren@redhat.com> wrote:
>> It means that packages will be fetched EVERY time from outside, which may be slow(er). Y.
>
> We can (and mostly already have) set up simple caches to prevent that.

How do you set up a cache on a developer's laptop?

> AFAIK the CI slaves are cleaned every time anyway, so in practice there wouldn't be much difference, except that we would have less hard-coding and perhaps be more efficient (are we certain we only download what we need atm?)

The repo directory does not need to be cleaned every time. It can also be resync'ed from a central repo, which is still going to be faster than any other kind of fetching (hopefully sync'ed into the slave's /dev/shm, btw).

> The existing solution looks more like premature optimization gone bad, IMO.

Try running ovirt-system-tests, then clean the repo and re-run - it's at least 20-30 minutes longer, which is far more than what it takes to run the whole test suite. I completely agree the manual maintenance is an annoyance; I wish we had something in between.
Y.

On 29 June 2016 at 21:45, Yaniv Kaul <ykaul@redhat.com> wrote:
>> We can (and mostly already have) set up simple caches to prevent that.
>
> How do you set up a cache on a developer's laptop?

We may have been unclear about our intentions: we want to make the pre-syncing optional, not remove it completely. It does make sense on the laptop (sometimes), but not so much in the CI env.

> The repo directory does not need to be cleaned every time.

This is an assumption that may break if we end up having any corrupt or failing packages in the cache. It also makes it hard to "go back in time" if we want to test without some update. (Cleaning corrupt caches and re-running is easy in a local setting; in CI you end up dealing with angry devs getting false '-1's.)

> It can also be resync'ed from a central repo, which is still going to be faster than any other kind of fetching (hopefully sync'ed into the slave's /dev/shm, btw).

It could be faster, but it could also be slower if you end up fetching more than you have to (if engine setup fails on a missing dependency, you just spent needless time fetching the VDSM deps). Also, fetching by itself may not be the bottleneck in all cases: it is surely slow when fetching from PHX to TLV, but when fetching from the Squid proxy's RAM inside PHX it can actually end up being faster than copying from the local disk.

> Try running ovirt-system-tests, then clean the repo and re-run - it's at least 20-30 minutes longer, which is far more than what it takes to run the whole test suite.

I wonder how many of those minutes are spent on fetching things we actually need, and how much is spent on overhead. I suspect that without a local cache the test run will be longer, but not as long as the pre-fetching plus the tests takes currently. More importantly, this may allow the CI to fail faster. I think we should at least test that.

> I completely agree the manual maintenance is an annoyance; I wish we had something in between.

Maybe we can take a middle ground: pre-fetch, but also enable the external repos in CI (perhaps with some way to log and find out what was not pre-fetched).

On Wed, Jun 29, 2016 at 11:15 PM, Barak Korren <bkorren@redhat.com> wrote:
> This is an assumption that may break if we end up having any corrupt or failing packages in the cache. It also makes it hard to "go back in time" if we want to test without some update. (Cleaning corrupt caches and re-running is easy in a local setting; in CI you end up dealing with angry devs getting false '-1's.)

True, and we don't want that. Developers have to trust the CI system. This is an important point.

> It could be faster, but it could also be slower if you end up fetching more than you have to. Also, fetching by itself may not be the bottleneck in all cases: it is surely slow when fetching from PHX to TLV, but when fetching from the Squid proxy's RAM inside PHX it can actually end up being faster than copying from the local disk.

I always fetch and store on /dev/shm/repostore - it's faster than anything else. I did copy its content once to disk, so when the host reboots it rsync's this back to /dev/shm/repostore, and then the tests begin. That is perhaps indeed not very needed in CI.
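That reboot-recovery flow can be sketched roughly as follows (the helper name and paths are made up for illustration; the setup described uses rsync -a, while plain cp -a is used here just to keep the sketch dependency-free):

```shell
# restore_repostore DISK_COPY SHM_REPO
# Re-populate the tmpfs repo store from its on-disk copy after a reboot,
# so the subsequent reposync only has to fetch what changed.
restore_repostore() {
    local disk=$1 shm=$2
    mkdir -p "$shm"
    if [ -d "$disk" ]; then
        # equivalent to: rsync -a "$disk"/ "$shm"/
        cp -a "$disk"/. "$shm"/
    fi
}

# Typical use on a test host (illustrative paths):
# restore_repostore /var/cache/repostore /dev/shm/repostore
```

Since /dev/shm is tmpfs, every read during the test run is served from RAM; the disk copy only pays off across reboots.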
> Maybe we can take a middle ground: pre-fetch, but also enable the external repos in CI (perhaps with some way to log and find out what was not pre-fetched).

This is what the code is supposed to do, I suspect. reposync syncs between what you already have and what you fetch, no?
Y.

> Maybe we can take a middle ground: pre-fetch, but also enable the external repos in CI (perhaps with some way to log and find out what was not pre-fetched).
>
> This is what the code is supposed to do, I suspect. reposync syncs between what you already have and what you fetch, no? Y.

I was referring to the deployment/test code. AFAIK right now the external repos are disabled before the test starts.
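On the reposync half of the question: yes, it is incremental - a package already present locally is skipped on the next run rather than re-downloaded. A toy model of that skip decision (the real logic lives in yum-utils' reposync; comparing only file size against the repo metadata is a simplification):

```shell
# need_fetch LOCAL_FILE EXPECTED_SIZE
# Succeeds (returns 0) when the package must be (re-)downloaded:
# either it is missing locally, or its size does not match the metadata.
need_fetch() {
    [ ! -f "$1" ] || [ "$(wc -c < "$1")" -ne "$2" ]
}
```

Note that in the failure which started this thread the problem was not a stale cache but a whitelist that never contained the package at all - something incremental syncing cannot catch.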

On Sun, Jul 3, 2016 at 3:31 PM, Barak Korren <bkorren@redhat.com> wrote:
> I was referring to the deployment/test code. AFAIK right now the external repos are disabled before the test starts.

Indeed, to save the time of fetching their metadata, and to ensure we get the correct deps. I guess in some cases we can enable them - we need to see how much time it wastes.
Y.
participants (4)
- Barak Korren
- Eyal Edri
- Sandro Bonazzola
- Yaniv Kaul