Barak Korren created OVIRT-1984:

   Summary: Create "out-of-band" slave cleanup and setup jobs
       Key: OVIRT-1984
       URL: https://ovirt-jira.atlassian.net/browse/OVIRT-1984
   Project: oVirt - virtualization made easy
Issue Type: New Feature
Components: Jenkins Slaves
  Reporter: Barak Korren
  Assignee: infra

Right now, we run slave cleanup and setup steps as part of every single job we run. This has several shortcomings:
# It takes a long time from the point a user submits a patch to the point their actual test or build code runs.
# If slave setup or cleanup steps fail, they fail the whole job for the user.
# If slave setup or cleanup steps fail, they can keep failing for many jobs until the CI team intervenes manually.
# There is a “chicken and egg” issue where some parts of the CI code have to run before the slave has been properly cleaned up and configured. This makes it harder to add new slaves to the system.

Here is a suggested scheme to fix all this:
# Label all slaves that should be cleaned up automatically as “cleanable”. This is mostly to prevent the jobs described here from operating on the master node.
# Have a “cleanup scheduler” job that finds all slaves labelled as “cleanable” but not as “dirty” or “clean”, labels them as “dirty” and runs a cleanup job on them.
# Have a “cleanup” job that is triggered on particular slaves by the “cleanup scheduler” job, runs cleanup and setup steps on them, and then labels them as “clean” and removes the “dirty” label.
# Have all other CI jobs only use slaves with the “clean” label.
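
To make the label transitions in the scheme above concrete, here is a minimal in-memory sketch in Python. It does not talk to Jenkins at all: the Slave class, the function names and the run_steps callback are hypothetical stand-ins chosen for illustration, and a real implementation would read and write node labels through Jenkins itself.

```python
from dataclasses import dataclass, field

CLEANABLE, DIRTY, CLEAN = "cleanable", "dirty", "clean"


@dataclass
class Slave:
    """Hypothetical stand-in for a Jenkins slave and its set of labels."""
    name: str
    labels: set = field(default_factory=set)


def needs_cleanup(slave):
    """True for slaves labelled 'cleanable' but neither 'dirty' nor 'clean'."""
    return CLEANABLE in slave.labels and not ({DIRTY, CLEAN} & slave.labels)


def cleanup_scheduler(slaves):
    """Label every eligible slave 'dirty' and return the slaves to clean."""
    picked = [s for s in slaves if needs_cleanup(s)]
    for s in picked:
        s.labels.add(DIRTY)  # prevents a second trigger on the same slave
    return picked


def cleanup_job(slave, run_steps):
    """Run cleanup and setup steps, then swap 'dirty' for 'clean'.

    run_steps is assumed to raise on failure; in that case the labels are
    left untouched, so the slave never becomes 'clean'.
    """
    run_steps(slave)
    slave.labels.discard(DIRTY)
    slave.labels.add(CLEAN)


def usable_by_ci(slaves):
    """Names of slaves that regular CI jobs would be allowed to use."""
    return [s.name for s in slaves if CLEAN in s.labels]
```

In this sketch a node without the “cleanable” label (such as the master) is never picked by the scheduler, and a second scheduler run right after the first one picks nothing, because the slaves it already found now carry the “dirty” label.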

Notes:
# The “dirty” label is there to keep the “cleanup scheduler” job from triggering twice on the same slave before the “cleanup” job has started cleaning it up.
# Since all slaves used by the real jobs will always be clean, there will no longer be a need to run cleanup steps in the real jobs, thus saving time.
# If cleanup steps fail, the cleanup job will fail and the slave will not be marked as “clean”, so real jobs will never try to use it.
# To solve the “chicken and egg” issue, the cleanup job probably must be a FreeStyle job, and all the cleanup and setup code must be embedded into it by JJB. This will probably require a newer version of JJB than what we have, so setting OVIRT-1983 as a blocker.
# There is an issue of how to do CI for this: if cleanup and setup steps are removed from the normal STDCI jobs, then they will not be checked by the “check-patch” job of the “jenkins” repo. Here is a suggested scheme to solve this:
## Have a way to “loan” slaves from the production Jenkins to other Jenkins instances. This could be done by having a job that starts up the Jenkins JNLP client and tells it to connect to another Jenkins master.
## As part of the “check-patch” job for the “jenkins” repo, start a Jenkins master in a container, attach some production slaves to it, and have it run cleanup and setup steps on them.
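
As a rough illustration of the “loan” idea in the last note, the Python sketch below only builds the command line that such a job might run to start the JNLP client against another master. Everything specific here is an assumption: the function name, the example host and node name, the agent.jar file name and the slave-agent.jnlp URL path (which depends on the Jenkins version) would all need checking against the actual setup, and the secret would have to come from the target master.

```python
import shlex


def loan_agent_command(master_url, node_name, secret, agent_jar="agent.jar"):
    """Build a JNLP client command that connects to another Jenkins master.

    Assumptions: the target master already has a node called node_name
    defined, and it serves the JNLP file under the path used below.
    """
    jnlp_url = "{}/computer/{}/slave-agent.jnlp".format(
        master_url.rstrip("/"), node_name
    )
    return ["java", "-jar", agent_jar, "-jnlpUrl", jnlp_url, "-secret", secret]


# Hypothetical values, for illustration only.
cmd = loan_agent_command(
    "http://test-master.example:8080/", "loaned-slave-01", "SECRET"
)
print(shlex.join(cmd))
```

The command is built as a list rather than a single string so that it can be passed to subprocess.run without any shell quoting concerns.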

— This message was sent by Atlassian Jira (v1001.0.0-SNAPSHOT#100083)