[JIRA] (OVIRT-2703) oVirt Node build fails due to CPU stuck
by Evgheni Dereveanchin (oVirt JIRA)
[ https://ovirt-jira.atlassian.net/browse/OVIRT-2703?page=com.atlassian.jir... ]
Evgheni Dereveanchin reassigned OVIRT-2703:
-------------------------------------------
Assignee: Evgheni Dereveanchin (was: infra)
> oVirt Node build fails due to CPU stuck
> ---------------------------------------
>
> Key: OVIRT-2703
> URL: https://ovirt-jira.atlassian.net/browse/OVIRT-2703
> Project: oVirt - virtualization made easy
> Issue Type: By-EMAIL
> Reporter: sbonazzo
> Assignee: Evgheni Dereveanchin
>
> *CPU is getting stuck for the VM running on the slave.*
> *Error is:*
> *https://jenkins.ovirt.org/job/ovirt-node-ng-image_master_build-artifacts-fc28-x86_64/240/console
> <https://jenkins.ovirt.org/job/ovirt-node-ng-image_master_build-artifacts-...>*
> *10:44:14* 09:44:13,825 WARNING kernel:ata2: lost interrupt (Status 0x58)
> *10:44:14* 09:44:13,834 DEBUG kernel:ata2: drained 65536 bytes to clear DRQ
> *10:44:14* 09:44:13,835 EMERG kernel:watchdog: BUG: soft lockup - CPU#0 stuck for 32s! [scsi_eh_1:85]
> *10:44:14* 09:44:13,835 WARNING kernel:Modules linked in: xfs fcoe libfcoe libfc scsi_transport_fc zram scsi_dh_rdac scsi_dh_emc scsi_dh_alua parport_pc i2c_piix4 parport joydev loop nls_utf8 isofs 8021q garp mrp stp llc virtio_console serio_raw qemu_fw_cfg virtio_pci e1000 bochs_drm drm_kms_helper ttm drm ata_generic pata_acpi sunrpc mcryptd sha256_ssse3 dm_crypt dm_round_robin dm_multipath linear raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 iscsi_ibft iscsi_boot_sysfs floppy iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi squashfs zstd_decompress xxhash cramfs edd virtio_rng virtio_ring virtio
> *10:44:14* 09:44:13,844 WARNING kernel:CPU: 0 PID: 85 Comm: scsi_eh_1 Not tainted 4.16.3-301.fc28.x86_64 #1
> *10:44:14* 09:44:13,844 WARNING kernel:Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20180531_142017-buildhw-08.phx2.fedoraproject.org-1.fc28 04/01/2014
> *10:44:21* 09:44:13,845 WARNING kernel:RIP: 0010:_raw_spin_unlock_irqrestore+0xd/0x20
> *10:44:21* 09:44:13,846 WARNING kernel:RSP: 0018:ffffa31e804cfdf0 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff12
> *10:44:21* 09:44:13,855 WARNING kernel:RAX: 0000000000000000 RBX: ffff9654e8c5c000 RCX: 0000000000000000
> *10:44:21* 09:44:13,856 WARNING kernel:RDX: 0000000000000000 RSI: 0000000000000202 RDI: 0000000000000202
> *10:44:21* 09:44:13,856 WARNING kernel:RBP: ffffffffbd60bd20 R08: 0000000000000038 R09: 00000000000002a4
> *10:44:21* 09:44:13,857 WARNING kernel:R10: 0000000000000000 R11: 0000000000000001 R12: ffffffffbd60b050
> *10:44:21* 09:44:13,857 WARNING kernel:R13: ffff9654e8c5c130 R14: 0000000000000202 R15: 0000000000000000
> *10:44:21* 09:44:13,858 WARNING kernel:FS: 0000000000000000(0000) GS:ffff9654fbc00000(0000) knlGS:0000000000000000
> *10:44:21* 09:44:13,865 WARNING kernel:CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> *10:44:21* 09:44:14,008 WARNING kernel:CR2: 00007fece8177000 CR3: 0000000069c18000 CR4: 00000000000006f0
> *10:44:21* 09:44:14,008 WARNING kernel:Call Trace:
> *10:44:21* 09:44:14,008 WARNING kernel: ata_sff_error_handler+0x83/0xe0
> *10:44:21* 09:44:14,009 WARNING kernel: ata_scsi_port_error_handler+0x354/0x770
> *10:44:21* 09:44:14,009 WARNING kernel: ? scsi_try_target_reset+0x90/0x90
> *10:44:21* 09:44:14,009 WARNING kernel: ? scsi_eh_get_sense+0x220/0x220
> *10:44:21* 09:44:14,010 WARNING kernel: ata_scsi_error+0x91/0xc0
> *10:44:21* 09:44:14,010 WARNING kernel: scsi_error_handler+0xd0/0x5b0
> *10:44:21* 09:44:14,010 WARNING kernel: ? scsi_eh_get_sense+0x220/0x220
> *10:44:21* 09:44:14,010 WARNING kernel: kthread+0x112/0x130
> *10:44:21* 09:44:14,011 WARNING kernel: ? kthread_create_worker_on_cpu+0x70/0x70
> *10:44:21* 09:44:14,026 WARNING kernel: ? kthread_create_worker_on_cpu+0x70/0x70
> *10:44:21* 09:44:14,026 WARNING kernel: ret_from_fork+0x35/0x40
> *10:44:21* 09:44:14,026 WARNING kernel:Code: a8 08 74 0b 65 81 25 6f 2c 76 42 ff ff ff 7f 89 d0 c3 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 c6 07 00 48 89 f7 57 9d <0f> 1f 44 00 00 c3 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f
> *10:44:21* 09:44:14,027 ERR kernel:ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
> *10:44:21* 09:44:14,027 ERR kernel:ata2.00: cmd a0/00:00:00:08:00/00:00:00:00:00/a0 tag 0 pio 16392 in#012 Get event status notification 4a 01 00 00 10 00 00 00 08 00res 40/00:02:00:08:00/00:00:00:00:00/a0 Emask 0x4 (timeout)
> *10:44:21* 09:44:14,028 ERR kernel:ata2.00: status: { DRDY }
> *10:44:21* 09:44:14,028 INFO kernel:ata2: soft resetting link
> *10:44:21* 09:44:19,296 WARNING kernel:ata2.00: qc timeout (cmd 0xa1)
> *10:44:21* 09:44:19,305 WARNING kernel:ata2.00: failed to IDENTIFY (I/O error, err_mask=0x4)
> *10:44:21* 09:44:19,305 ERR kernel:ata2.00: revalidation failed (errno=-5)
> *10:44:21* 09:44:19,305 INFO kernel:ata2: soft resetting link
> *10:44:21* 09:44:21,510 INFO kernel:ata2.00: configured for MWDMA2
> *10:44:21* 09:44:21,558 INFO kernel:ata2: EH complete
> The slave is vm0034.workers-phx.ovirt.org
> <https://jenkins.ovirt.org/computer/vm0034.workers-phx.ovirt.org>
> Looking at the slave, it looks like several updates are available including
> a kernel update.
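For triage, a minimal sketch of how the watchdog message in the log above could be picked out of a console log automatically. This is not part of the job or of the Jenkins tooling; the function name `find_soft_lockups` is hypothetical, and it assumes the console text has been unwrapped so that each kernel message sits on a single line.

```python
import re

# Matches the kernel watchdog report quoted in this issue, e.g.
# "watchdog: BUG: soft lockup - CPU#0 stuck for 32s! [scsi_eh_1:85]"
SOFT_LOCKUP = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^:\]]+):(?P<pid>\d+)\]"
)


def find_soft_lockups(console_text):
    """Return one dict per soft-lockup report found in console_text.

    Hypothetical helper: it only recognises the message format shown in
    this issue and ignores everything else in the log.
    """
    return [
        {
            "cpu": int(match.group("cpu")),
            "seconds": int(match.group("secs")),
            "comm": match.group("comm"),
            "pid": int(match.group("pid")),
        }
        for match in SOFT_LOCKUP.finditer(console_text)
    ]


if __name__ == "__main__":
    sample = (
        "09:44:13,835 EMERG kernel:watchdog: BUG: soft lockup - "
        "CPU#0 stuck for 32s! [scsi_eh_1:85]"
    )
    for report in find_soft_lockups(sample):
        print(report)
```

Counting such reports per worker across builds would be one way to check whether the failures cluster on particular VMs before deciding where the job should run.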
--
This message was sent by Atlassian Jira
(v1001.0.0-SNAPSHOT#100099)
5 years, 8 months
Re: INVALID_SERVICE: ovirtlago
by Barak Korren
That is not the real issue; the real issue seems to be this:
+ sudo -n systemctl start docker
Job for docker.service failed because the control process exited with error code. See "systemctl status docker.service" and "journalctl -xe" for details.
+ sudo -n systemctl status docker
● docker.service - Docker Application Container Engine
   Loaded: loaded (/usr/lib/systemd/system/docker.service; enabled; vendor preset: disabled)
   Active: activating (auto-restart) (Result: exit-code) since Mon 2019-02-25 04:03:52 UTC; 45ms ago
     Docs: https://docs.docker.com
  Process: 15496 ExecStart=/usr/bin/dockerd -H fd:// (code=exited, status=1/FAILURE)
 Main PID: 15496 (code=exited, status=1/FAILURE)
Feb 25 04:03:52 openshift-integ-tests-container-6bmr3 systemd[1]: Failed to start Docker Application Container Engine.
Feb 25 04:03:52 openshift-integ-tests-container-6bmr3 systemd[1]: Unit docker.service entered failed state.
Feb 25 04:03:52 openshift-integ-tests-container-6bmr3 systemd[1]: docker.service failed.
+ :
+ log ERROR 'Failed to start docker service'
+ local level=ERROR
So docker is failing to start in the integ-test container. Here is the
podspec that was used:
---
apiVersion: v1
kind: Pod
metadata:
  generateName: jenkins-slave
  labels:
    integ-tests-container: ""
  namespace: jenkins-ovirt-org
spec:
  containers:
    - env:
        - name: JENKINS_AGENT_WORKDIR
          value: /home/jenkins
        - name: CI_RUNTIME_UNAME
          value: jenkins
        - name: STDCI_SLAVE_CONTAINER_NAME
          value: im_a_container
        - name: CONTAINER_SLOTS
          value: /var/lib/stdci
      image: docker.io/ovirtinfra/el7-runner-node:12c9f471a6e9eccd6d5052c6c4964fff3b6670c9
      command: ['/usr/sbin/init']
      livenessProbe:
        exec:
          command: ['systemctl', 'status', 'multi-user.target']
        initialDelaySeconds: 360
        periodSeconds: 7200
      name: jnlp
      resources:
        limits:
          memory: 32Gi
        requests:
          memory: 32Gi
      securityContext:
        privileged: true
      volumeMounts:
        - mountPath: /var/lib/stdci
          name: slave-cache
        - mountPath: /dev/shm
          name: dshm
      workingDir: /home/jenkins
      tty: true
  nodeSelector:
    model: r620
  serviceAccount: jenkins-slave
  volumes:
    - hostPath:
        path: /var/lib/stdci
        type: DirectoryOrCreate
      name: slave-cache
    - emptyDir:
        medium: Memory
      name: dshm
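To make failures like the docker one above easier to spot in job output, here is a rough sketch that summarizes captured `systemctl status` text. It is only an illustration, not something the CI scripts do today; `summarize_unit_status` and the dictionary keys are hypothetical names, and the regular expressions are written against the output shown in this thread only.

```python
import re

# "Active: activating (auto-restart) ..." -> state and optional sub-state
ACTIVE_RE = re.compile(
    r"^\s*Active:\s*(?P<state>\S+)(?:\s+\((?P<sub>[^)]*)\))?", re.MULTILINE
)
# "(code=exited, status=1/FAILURE)"; \s+ tolerates a line wrap after the comma
EXIT_RE = re.compile(r"\(code=(?P<code>\w+),\s+status=(?P<status>[^)]+)\)")


def summarize_unit_status(status_text):
    """Extract the unit state and exit status from `systemctl status` text.

    Hypothetical helper: fields that cannot be found are returned as None.
    """
    active = ACTIVE_RE.search(status_text)
    exit_info = EXIT_RE.search(status_text)
    exit_status = exit_info.group("status") if exit_info else None
    return {
        "state": active.group("state") if active else None,
        "sub_state": active.group("sub") if active else None,
        "exit_code": exit_info.group("code") if exit_info else None,
        "exit_status": exit_status,
        # systemd prints status=0/SUCCESS for a clean exit
        "failed": exit_status is not None and not exit_status.startswith("0/"),
    }


if __name__ == "__main__":
    sample = (
        "Active: activating (auto-restart) (Result: exit-code) since Mon "
        "2019-02-25 04:03:52 UTC; 45ms ago\n"
        "Main PID: 15496 (code=exited, status=1/FAILURE)\n"
    )
    print(summarize_unit_status(sample))
```

The summary only says that dockerd exited with a failure; the reason would still have to come from `journalctl -xe` inside the container, as the systemd message suggests.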
Adding Gal and infra list.
On Mon, 25 Feb 2019 at 08:45, Eitan Raviv <eraviv(a)redhat.com> wrote:
> Hi,
> I have some OST patches failing on:
>
> *04:03:53* Error: INVALID_SERVICE: ovirtlago
>
> e.g. https://jenkins.ovirt.org/job/ovirt-system-tests_standard-check-patch/344...
>
> I am fully rebased on ost master.
>
> Can you have a look?
>
> Thank you
>
>
> ---------- Forwarded message ---------
> From: Galit Rosenthal <grosenth(a)redhat.com>
> Date: Mon, Feb 25, 2019 at 8:35 AM
> Subject: Re: INVALID_SERVICE: ovirtlago
> To: Eitan Raviv <eraviv(a)redhat.com>
>
>
> I think you should consult Barak
>
> On Sun, Feb 24, 2019 at 8:26 PM Eitan Raviv <eraviv(a)redhat.com> wrote:
>
>> *13:58:57* ++ sudo -n firewall-cmd --query-service=ovirtlago
>> *13:58:58* Error: INVALID_SERVICE: ovirtlago
>>
>> https://jenkins.ovirt.org/job/ovirt-system-tests_standard-check-patch/343...
>>
>>
>
> --
>
> GALIT ROSENTHAL
>
> SOFTWARE ENGINEER
>
> Red Hat
>
> <https://www.redhat.com/>
>
> galit(a)gmail.com T: 972-9-7692230
> <https://red.ht/sig>
>
--
Barak Korren
RHV DevOps team , RHCE, RHCi
Red Hat EMEA
redhat.com | TRIED. TESTED. TRUSTED. | redhat.com/trusted
5 years, 8 months
[JIRA] (OVIRT-2703) oVirt Node build fails due to CPU stuck
by Eyal Edri (oVirt JIRA)
[ https://ovirt-jira.atlassian.net/browse/OVIRT-2703?page=com.atlassian.jir... ]
Eyal Edri commented on OVIRT-2703:
----------------------------------
If it's a nightly job and often fails on VMs, I think we should consider running it on bare metal (BM).
[~amarchuk][~ederevea] thoughts?
> oVirt Node build fails due to CPU stuck
> ---------------------------------------
>
> Key: OVIRT-2703
> URL: https://ovirt-jira.atlassian.net/browse/OVIRT-2703
> Project: oVirt - virtualization made easy
> Issue Type: By-EMAIL
> Reporter: sbonazzo
> Assignee: infra
>
5 years, 8 months
[JIRA] (OVIRT-2703) oVirt Node build fails due to CPU stuck
by Yuval Turgeman (oVirt JIRA)
[ https://ovirt-jira.atlassian.net/browse/OVIRT-2703?page=com.atlassian.jir... ]
Yuval Turgeman commented on OVIRT-2703:
---------------------------------------
If we have BMs, that would be best...
> oVirt Node build fails due to CPU stuck
> ---------------------------------------
>
> Key: OVIRT-2703
> URL: https://ovirt-jira.atlassian.net/browse/OVIRT-2703
> Project: oVirt - virtualization made easy
> Issue Type: By-EMAIL
> Reporter: sbonazzo
> Assignee: infra
>
5 years, 8 months
[JIRA] (OVIRT-2703) oVirt Node build fails due to CPU stuck
by sbonazzo (oVirt JIRA)
[ https://ovirt-jira.atlassian.net/browse/OVIRT-2703?page=com.atlassian.jir... ]
sbonazzo commented on OVIRT-2703:
---------------------------------
[~yturgema(a)redhat.com] what do you think?
> oVirt Node build fails due to CPU stuck
> ---------------------------------------
>
> Key: OVIRT-2703
> URL: https://ovirt-jira.atlassian.net/browse/OVIRT-2703
> Project: oVirt - virtualization made easy
> Issue Type: By-EMAIL
> Reporter: sbonazzo
> Assignee: infra
>
5 years, 8 months