[JIRA] (OVIRT-2287) ovirt-system-tests_he-node-ng-suite-master is
failing on not enough memory to run VMs
by Dafna Ron (oVirt JIRA)
[ https://ovirt-jira.atlassian.net/browse/OVIRT-2287?page=com.atlassian.jir... ]
Dafna Ron commented on OVIRT-2287:
----------------------------------
Host memory is 4720 for both hosts. Since one of them is running the engine,
should we increase it?
{%- for i in range(hostCount) %}
  lago-{{ env.suite_name }}-host-{{ i }}:
    vm-type: ovirt-host
    distro: el7
    service_provider: systemd
    memory: 4720
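
As a rough way to reason about how much to raise the value, the sketch below computes the shortfall between the available memory the scheduler reported (656 MB, from the error quoted below) and the memory of the VM being started, then adds it to the current 4720. The VM size used here (1024 MB) is a hypothetical placeholder, not a value taken from the suite, and the function names are made up for illustration. The real scheduler also accounts for overhead and overcommit, so treat the result as a lower bound only.

```python
def memory_shortfall_mb(available_mb, vm_memory_mb):
    """Return how many MB are missing for the VM to fit (0 if it already fits)."""
    return max(0, vm_memory_mb - available_mb)


def suggested_host_memory_mb(current_mb, available_mb, vm_memory_mb):
    """Naive lower bound: the current host memory plus the shortfall."""
    return current_mb + memory_shortfall_mb(available_mb, vm_memory_mb)


if __name__ == "__main__":
    current_mb = 4720      # "memory:" value from the LagoInitFile fragment
    available_mb = 656     # available memory reported by the Memory filter
    vm_memory_mb = 1024    # ASSUMPTION: hypothetical size of the test VM
    print(suggested_host_memory_mb(current_mb, available_mb, vm_memory_mb))
```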
On Sun, Jul 8, 2018 at 7:38 AM, Barak Korren <bkorren(a)redhat.com> wrote:
>
>
> On 6 July 2018 at 11:57, Sandro Bonazzola <sbonazzo(a)redhat.com> wrote:
>
>>
>> https://jenkins.ovirt.org/job/ovirt-system-tests_he-node-ng-suite-master/165/testReport/(root)/004_basic_sanity/vm_run/
>>
>> Cannot run VM. There is no host that satisfies current scheduling
>> constraints. See below for details:, The host lago-he-node-ng-suite-master-host-0
>> did not satisfy internal filter Memory because its available memory is too
>> low (656 MB) to run the VM.
>>
>>
>
> This sounds like something that needs to be fixed in the suite's
> LagoInitFile.
>
>
>
>> --
>>
>> SANDRO BONAZZOLA
>>
>> MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV
>>
>> Red Hat EMEA <https://www.redhat.com/>
>>
>> sbonazzo(a)redhat.com
>> <https://red.ht/sig>
>>
>> _______________________________________________
>> Devel mailing list -- devel(a)ovirt.org
>> To unsubscribe send an email to devel-leave(a)ovirt.org
>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>> oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
>> List Archives: https://lists.ovirt.org/archives/list/devel@ovirt.org/message/TDKLML6YDBATFHS232GFJF7QVRTWUH74/
>>
>>
>
>
> --
> Barak Korren
> RHV DevOps team , RHCE, RHCi
> Red Hat EMEA
> redhat.com | TRIED. TESTED. TRUSTED. | redhat.com/trusted
>
>
>
> ovirt-system-tests_he-node-ng-suite-master is failing on not enough memory to run VMs
> -------------------------------------------------------------------------------------
>
> Key: OVIRT-2287
> URL: https://ovirt-jira.atlassian.net/browse/OVIRT-2287
> Project: oVirt - virtualization made easy
> Issue Type: Bug
> Reporter: sbonazzo
> Assignee: infra
> Priority: Highest
> Labels: ost_failures, ost_lago
> Attachments: srv.log
>
>
> https://jenkins.ovirt.org/job/ovirt-system-tests_he-node-ng-suite-master/...
> Cannot run VM. There is no host that satisfies current scheduling
> constraints. See below for details:, The host
> lago-he-node-ng-suite-master-host-0 did not satisfy internal filter Memory
> because its available memory is too low (656 MB) to run the VM.
--
This message was sent by Atlassian Jira
(v1001.0.0-SNAPSHOT#100088)
[JIRA] (OVIRT-2302) improve cleanup of stuck livemedia-creator
processes
by Evgheni Dereveanchin (oVirt JIRA)
[ https://ovirt-jira.atlassian.net/browse/OVIRT-2302?page=com.atlassian.jir... ]
Evgheni Dereveanchin edited comment on OVIRT-2302 at 7/10/18 2:41 PM:
----------------------------------------------------------------------
Job log from OVIRT-2289:
13:39:34 [check-patch.fc28.x86_64] Cancelling nested steps due to timeout
13:39:34 [check-patch.fc28.x86_64] Sending interrupt signal to process
13:39:34 [check-patch.fc28.x86_64] sh: line 1: 50927 Terminated JENKINS_SERVER_COOKIE=$jsc '/home/jenkins/workspace/ovirt-node-ng_standard-check-patch/ovirt-node-ng@tmp/durable-2af3df32/script.sh' > '/home/jenkins/workspace/ovirt-node-ng_standard-check-patch/ovirt-node-ng@tmp/durable-2af3df32/jenkins-log.txt' 2>&1
13:39:44 [check-patch.fc28.x86_64] After 10s process did not stop
...
leftover process #1:
/usr/bin/python3 /usr/sbin/livemedia-creator --make-pxe-live --iso boot.iso --ks data/ovirt-node-ng-image.ks --resultdir build --tmp /var/tmp
leftover process #2 (child of #1):
qemu-system-x86_64 -nodefconfig -m 1024 --machine accel=kvm -kernel /var/tmp/lorax.imgutils._0vdnawj/isolinux/vmlinuz -initrd /var/tmp/lmc-initrd-t7xhq9lb.img -drive file=/home/jenkins/workspace/ovirt-node-ng_standard-check-patch/ovirt-node-ng/build/lmc-disk-57z7t8fn.img,cache=unsafe,discard=unmap,format=raw -drive file=/home/jenkins/workspace/ovirt-node-ng_standard-check-patch/ovirt-node-ng/boot.iso,media=cdrom,readonly=on -append ks=file:/ovirt-node-ng-image.ks inst.stage2=hd:LABEL=Fedora-S-dvd-x86_64-28 inst.text inst.cmdline -nographic -display vnc=127.0.0.1:0 -device virtio-serial-pci,id=virtio-serial0 -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.fedoraproject.anaconda.log.0 -chardev socket,id=charchannel0,host=127.0.0.1,port=37582 -object rng-random,id=virtio-rng0,filename=/dev/random -device virtio-rng-pci,rng=virtio-rng0,id=rng0,bus=pci.0,addr=0x9
full process tree:
mock - sh - sudo - make - livemedia-creat - qemu-system-x86
was (Author: ederevea):
Job log from OVIRT-2289:
13:39:34 [check-patch.fc28.x86_64] Cancelling nested steps due to timeout
13:39:34 [check-patch.fc28.x86_64] Sending interrupt signal to process
13:39:34 [check-patch.fc28.x86_64] sh: line 1: 50927 Terminated JENKINS_SERVER_COOKIE=$jsc '/home/jenkins/workspace/ovirt-node-ng_standard-check-patch/ovirt-node-ng@tmp/durable-2af3df32/script.sh' > '/home/jenkins/workspace/ovirt-node-ng_standard-check-patch/ovirt-node-ng@tmp/durable-2af3df32/jenkins-log.txt' 2>&1
13:39:44 [check-patch.fc28.x86_64] After 10s process did not stop
...
leftover process #1:
/usr/bin/python3 /usr/sbin/livemedia-creator --make-pxe-live --iso boot.iso --ks data/ovirt-node-ng-image.ks --resultdir build --tmp /var/tmp
leftover process #2 (child of #1):
qemu-system-x86_64 -nodefconfig -m 1024 --machine accel=kvm -kernel /var/tmp/lorax.imgutils._0vdnawj/isolinux/vmlinuz -initrd /var/tmp/lmc-initrd-t7xhq9lb.img -drive file=/home/jenkins/workspace/ovirt-node-ng_standard-check-patch/ovirt-node-ng/build/lmc-disk-57z7t8fn.img,cache=unsafe,discard=unmap,format=raw -drive file=/home/jenkins/workspace/ovirt-node-ng_standard-check-patch/ovirt-node-ng/boot.iso,media=cdrom,readonly=on -append ks=file:/ovirt-node-ng-image.ks inst.stage2=hd:LABEL=Fedora-S-dvd-x86_64-28 inst.text inst.cmdline -nographic -display vnc=127.0.0.1:0 -device virtio-serial-pci,id=virtio-serial0 -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.fedoraproject.anaconda.log.0 -chardev socket,id=charchannel0,host=127.0.0.1,port=37582 -object rng-random,id=virtio-rng0,filename=/dev/random -device virtio-rng-pci,rng=virtio-rng0,id=rng0,bus=pci.0,addr=0x9
full process tree:
mock---sh-+-sh-+-sudo---make---livemedia-creat-+-qemu-system-x86---3*[{qemu-system-x86}]
> improve cleanup of stuck livemedia-creator processes
> ----------------------------------------------------
>
> Key: OVIRT-2302
> URL: https://ovirt-jira.atlassian.net/browse/OVIRT-2302
> Project: oVirt - virtualization made easy
> Issue Type: Improvement
> Reporter: Evgheni Dereveanchin
> Assignee: infra
>
> The following job timed out yet the cleanup wasn't performed properly for it:
> https://jenkins.ovirt.org/job/ovirt-node-ng_standard-check-patch/5/consol...
> After the cleanup, livemedia-creator and a qemu process were still running and consuming memory.
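
One common way to avoid leftovers like these is to start the job in its own process group and signal the whole group on timeout, instead of only the top-level shell. The sketch below illustrates that idea; it is not how Jenkins or the standard-CI scripts currently work, and `run_with_group_cleanup` and its parameters are hypothetical names. It assumes a POSIX system, and it will not reach children that left the group or that run with higher privileges (for example via sudo, as in the process tree above).

```python
import os
import signal
import subprocess


def run_with_group_cleanup(cmd, timeout, grace=10.0):
    """Run cmd in a new session; on timeout, terminate its whole process group.

    Returns the exit status of the top-level process (negative if it was
    killed by a signal, as reported by subprocess).
    """
    # start_new_session=True makes the child a session and group leader,
    # so its process group ID equals its PID.
    proc = subprocess.Popen(cmd, start_new_session=True)
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        pgid = proc.pid
        # Ask every process in the group to stop.
        os.killpg(pgid, signal.SIGTERM)
        try:
            proc.wait(timeout=grace)
        except subprocess.TimeoutExpired:
            pass
        # Force-kill whatever is still left in the group, if anything.
        try:
            os.killpg(pgid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # the group is already empty
        return proc.wait()
```

A caller would wrap the build step, for example `run_with_group_cleanup(["make", "check"], timeout=3600)`, where the command and timeout are placeholders.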
[JIRA] (OVIRT-2301) cq job stuck for 6 days
by Barak Korren (oVirt JIRA)
[ https://ovirt-jira.atlassian.net/browse/OVIRT-2301?page=com.atlassian.jir... ]
Barak Korren commented on OVIRT-2301:
-------------------------------------
The full job log tells you the right story:
{code}
Resuming build at Wed Jul 04 18:30:12 UTC 2018 after Jenkins restart
Waiting to resume part of ovirt-4.2_change-queue-tester #2570: ???
[Pipeline] End of Pipeline
java.lang.IllegalStateException: JENKINS-37121: something already locked /home/jenkins/workspace/ovirt-4.2_change-queue-tester
    at org.jenkinsci.plugins.workflow.support.pickles.WorkspaceListLeasePickle$1.tryResolve(WorkspaceListLeasePickle.java:75)
    at org.jenkinsci.plugins.workflow.support.pickles.WorkspaceListLeasePickle$1.tryResolve(WorkspaceListLeasePickle.java:51)
    at org.jenkinsci.plugins.workflow.support.pickles.TryRepeatedly$1.run(TryRepeatedly.java:92)
    at jenkins.security.ImpersonatingScheduledExecutorService$1.run(ImpersonatingScheduledExecutorService.java:58)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
Caused: java.io.IOException: Failed to load build state
    at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution$3.onSuccess(CpsFlowExecution.java:854)
    at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution$3.onSuccess(CpsFlowExecution.java:852)
    at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution$4$1.run(CpsFlowExecution.java:906)
    at org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$1.run(CpsVmExecutorService.java:35)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:131)
    at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
    at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:59)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Finished: FAILURE
{code}
> cq job stuck for 6 days
> -----------------------
>
> Key: OVIRT-2301
> URL: https://ovirt-jira.atlassian.net/browse/OVIRT-2301
> Project: oVirt - virtualization made easy
> Issue Type: Bug
> Reporter: Dafna Ron
> Assignee: infra
> Labels: ost_failures, ost_infra
>
> [~ederevea] noticed that this job has been stuck for 6 days: http://jenkins.ovirt.org/job/ovirt-4.2_change-queue-tester/2570/
> The job got stuck while waiting for build-artifacts:
> http://jenkins.ovirt.org/blue/organizations/jenkins/ovirt-4.2_change-queu...
> Before he restarts Jenkins, I thought [~bkorren(a)redhat.com] or [~dbelenky(a)redhat.com] would like to take a look.
> The job is still reported as active.
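
A build that stays "active" for days could be flagged automatically rather than noticed by hand. The sketch below shows one possible check. It assumes the build is described by a dict with a boolean `building` field and a `timestamp` field holding the start time in milliseconds since the epoch, which is how I understand the Jenkins remote API to report builds. The 24-hour threshold and the function names are hypothetical, and fetching the data from Jenkins is left out.

```python
from datetime import datetime, timedelta, timezone


def build_age(build, now=None):
    """Return how long a build has been running, or None if it has finished."""
    if not build.get("building"):
        return None
    now = now or datetime.now(timezone.utc)
    # "timestamp" is assumed to be the start time in milliseconds since epoch.
    started = datetime.fromtimestamp(build["timestamp"] / 1000, tz=timezone.utc)
    return now - started


def is_stuck(build, max_age=timedelta(hours=24), now=None):
    """Flag a build that is still running after max_age (hypothetical limit)."""
    age = build_age(build, now)
    return age is not None and age > max_age
```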