[JIRA] (OVIRT-2922) Jobs failing in collecting artifacts
by Roberto Ciatti (oVirt JIRA)
[ https://ovirt-jira.atlassian.net/browse/OVIRT-2922?page=com.atlassian.jir... ]
Roberto Ciatti commented on OVIRT-2922:
---------------------------------------
Hi,
Yes, it was aborted, though not by me, because I don't have the rights. But please note that the job was already blocked before it was aborted, and it was launched right after one that worked correctly.
The same thing happened this morning:
* [https://jenkins.ovirt.org/job/ovirt-engine-sdk-ruby_standard-on-merge/19/] SUCCESS
* [https://jenkins.ovirt.org/job/ovirt-engine-sdk-ruby_standard-on-merge/20/] BLOCKED
And there is no reason for it to block, because the patch phase works correctly and the merge phase does the same thing (only pom.xml version changes or small changes to doc files).
Something is blocking a merge job that runs after a successful one.
I opened an issue for that but please keep me updated.
thanks and regards
Roberto
> Jobs failing in collecting artifacts
> ------------------------------------
>
> Key: OVIRT-2922
> URL: https://ovirt-jira.atlassian.net/browse/OVIRT-2922
> Project: oVirt - virtualization made easy
> Issue Type: Outage
> Components: Jenkins Master, Jenkins Slaves
> Reporter: Roberto Ciatti
> Assignee: infra
> Priority: High
>
> Hi,
> I have some merge jobs failing collecting artifacts
> For example the [https://jenkins.ovirt.org/job/ovirt-engine-sdk-ruby_standard-on-merge/17] is blocked in
> [Pipeline] writeFile
> [Pipeline] archiveArtifacts
> 20:37:00 Archiving artifacts
> [Pipeline] }
> [Pipeline] // node
> [Pipeline] }
> The thread dump shows
> Thread #108
> at DSL.sh(awaiting process completion in /home/jenkins/workspace/ovirt-engine-sdk-ruby_standard-on-merge/ovirt-engine-sdk-ruby@tmp/durable-4c24beab on vm0045.workers-phx.ovirt.org; recurrence period: 15000ms; check task scheduled; cancelled? false done? false)
> at Script4.mock_runner(Script4.groovy:450)
> at Script4.run_std_ci_in_mock(Script4.groovy:408)
> at Script7.withHook(Script7.groovy:20)
> at Script4.run_std_ci_in_mock(Script4.groovy:404)
> at DSL.dir(Native Method)
> at Script4.run_std_ci_in_mock(Script4.groovy:397)
> at Script4.run_std_ci_on_node(Script4.groovy:367)
> at Script4.mk_mock_std_ci_runner(Script4.groovy:90)
> at DSL.node(running on vm0045.workers-phx.ovirt.org)
> at Script4.mk_mock_std_ci_runner(Script4.groovy:89)
> at DSL.parallel(Native Method)
> at Script4.run_std_ci_jobs(Script4.groovy:64)
> at Script1.main(Script1.groovy:130)
> at DSL.stage(Native Method)
> at Script1.main(Script1.groovy:129)
> at WorkflowScript.main(WorkflowScript:81)
> at DSL.withEnv(Native Method)
> at WorkflowScript.main(WorkflowScript:80)
> at WorkflowScript.run(WorkflowScript:17)
> at DSL.timestamps(Native Method)
> at WorkflowScript.run(WorkflowScript:17)
> The same happened to
> [https://jenkins.ovirt.org/job/ovirt-engine-sdk-ruby_standard-on-merge/16/]
> But before another job finished correctly
> [https://jenkins.ovirt.org/job/ovirt-engine-sdk-ruby_standard-on-merge/15/]
--
This message was sent by Atlassian Jira
(v1001.0.0-SNAPSHOT#100125)
4 years, 7 months
[JIRA] (OVIRT-2922) Jobs failing in collecting artifacts
by Ehud Yonasi (oVirt JIRA)
[ https://ovirt-jira.atlassian.net/browse/OVIRT-2922?page=com.atlassian.jir... ]
Ehud Yonasi commented on OVIRT-2922:
------------------------------------
Hey,
I can see that run #16 was aborted and that the slave was vm0045. Because of the abort, the cleanup scripts were not run, so the slave needs to be fixed.
I will take the slave offline and ask for it to be cleaned up.
Thanks for reporting.
[JIRA] (OVIRT-2922) Jobs failing in collecting artifacts
by Roberto Ciatti (oVirt JIRA)
[ https://ovirt-jira.atlassian.net/browse/OVIRT-2922?page=com.atlassian.jir... ]
Roberto Ciatti commented on OVIRT-2922:
---------------------------------------
Hi all,
the same thing happened this morning as yesterday evening.
I relaunched (via 'ci re-merge please') [https://jenkins.ovirt.org/job/ovirt-engine-sdk-ruby_standard-on-merge/16], which was aborted yesterday, and this time it went fine.
But when I relaunched another job that failed yesterday (again with 'ci re-merge please'), it hung just like yesterday: [https://jenkins.ovirt.org/job/ovirt-engine-sdk-ruby_standard-on-merge/20].
Could something remain dirty in the CI environment after the first job execution, until the job receives a SIGKILL (I guess from a timeout check, after 3 hours)?
Thanks for the help
Kind regards
Roberto
[JIRA] (OVIRT-2922) Jobs failing in collecting artifacts
by Roberto Ciatti (oVirt JIRA)
[ https://ovirt-jira.atlassian.net/browse/OVIRT-2922?page=com.atlassian.jir... ]
Roberto Ciatti updated OVIRT-2922:
----------------------------------
Priority: High (was: Medium)
[JIRA] (OVIRT-2921) Permissions to cancel jobs
by Ehud Yonasi (oVirt JIRA)
[ https://ovirt-jira.atlassian.net/browse/OVIRT-2921?page=com.atlassian.jir... ]
Ehud Yonasi commented on OVIRT-2921:
------------------------------------
Hi [~accountid:5e298ffca7b9540e76f55c51] ,
We don't want jobs to be aborted manually on vm / bm slaves, because each job sets up the slave for its execution and cleans it up afterward. If we abort a job during execution, leftovers will remain and interfere with future jobs running on that slave.
Regards,
Ehud.
> Permissions to cancel jobs
> --------------------------
>
> Key: OVIRT-2921
> URL: https://ovirt-jira.atlassian.net/browse/OVIRT-2921
> Project: oVirt - virtualization made easy
> Issue Type: Task
> Components: Jenkins Master, Jenkins Slaves
> Reporter: Roberto Ciatti
> Assignee: infra
>
> Hi,
> I can log into jenkins (user: rciatti), I can see the job progress bar but I can't see any command to cancel or abort the job.
> Am I missing something?
> Thanks for the help.