Eyal Edri commented on OVIRT-2703:
----------------------------------
If it's a nightly job and it often fails on VMs, I think we should consider running it on bare metal (BM).
[~amarchuk] [~ederevea], thoughts?
oVirt Node build fails due to CPU stuck
---------------------------------------
Key: OVIRT-2703
URL: https://ovirt-jira.atlassian.net/browse/OVIRT-2703
Project: oVirt - virtualization made easy
Issue Type: By-EMAIL
Reporter: sbonazzo
Assignee: infra
*CPU is getting stuck for the VM running on the slave.*
*Error is:*
https://jenkins.ovirt.org/job/ovirt-node-ng-image_master_build-artifacts-fc28-x86_64/240/console
*10:44:14* 09:44:13,825 WARNING kernel:ata2: lost interrupt (Status 0x58)
*10:44:14* 09:44:13,834 DEBUG kernel:ata2: drained 65536 bytes to clear DRQ
*10:44:14* 09:44:13,835 EMERG kernel:watchdog: BUG: soft lockup - CPU#0 stuck for 32s! [scsi_eh_1:85]
*10:44:14* 09:44:13,835 WARNING kernel:Modules linked in: xfs fcoe libfcoe libfc scsi_transport_fc zram scsi_dh_rdac scsi_dh_emc scsi_dh_alua parport_pc i2c_piix4 parport joydev loop nls_utf8 isofs 8021q garp mrp stp llc virtio_console serio_raw qemu_fw_cfg virtio_pci e1000 bochs_drm drm_kms_helper ttm drm ata_generic pata_acpi sunrpc mcryptd sha256_ssse3 dm_crypt dm_round_robin dm_multipath linear raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 iscsi_ibft iscsi_boot_sysfs floppy iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi squashfs zstd_decompress xxhash cramfs edd virtio_rng virtio_ring virtio
*10:44:14* 09:44:13,844 WARNING kernel:CPU: 0 PID: 85 Comm: scsi_eh_1 Not tainted 4.16.3-301.fc28.x86_64 #1
*10:44:14* 09:44:13,844 WARNING kernel:Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20180531_142017-buildhw-08.phx2.fedoraproject.org-1.fc28 04/01/2014
*10:44:21* 09:44:13,845 WARNING kernel:RIP: 0010:_raw_spin_unlock_irqrestore+0xd/0x20
*10:44:21* 09:44:13,846 WARNING kernel:RSP: 0018:ffffa31e804cfdf0 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff12
*10:44:21* 09:44:13,855 WARNING kernel:RAX: 0000000000000000 RBX: ffff9654e8c5c000 RCX: 0000000000000000
*10:44:21* 09:44:13,856 WARNING kernel:RDX: 0000000000000000 RSI: 0000000000000202 RDI: 0000000000000202
*10:44:21* 09:44:13,856 WARNING kernel:RBP: ffffffffbd60bd20 R08: 0000000000000038 R09: 00000000000002a4
*10:44:21* 09:44:13,857 WARNING kernel:R10: 0000000000000000 R11: 0000000000000001 R12: ffffffffbd60b050
*10:44:21* 09:44:13,857 WARNING kernel:R13: ffff9654e8c5c130 R14: 0000000000000202 R15: 0000000000000000
*10:44:21* 09:44:13,858 WARNING kernel:FS: 0000000000000000(0000) GS:ffff9654fbc00000(0000) knlGS:0000000000000000
*10:44:21* 09:44:13,865 WARNING kernel:CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
*10:44:21* 09:44:14,008 WARNING kernel:CR2: 00007fece8177000 CR3: 0000000069c18000 CR4: 00000000000006f0
*10:44:21* 09:44:14,008 WARNING kernel:Call Trace:
*10:44:21* 09:44:14,008 WARNING kernel: ata_sff_error_handler+0x83/0xe0
*10:44:21* 09:44:14,009 WARNING kernel: ata_scsi_port_error_handler+0x354/0x770
*10:44:21* 09:44:14,009 WARNING kernel: ? scsi_try_target_reset+0x90/0x90
*10:44:21* 09:44:14,009 WARNING kernel: ? scsi_eh_get_sense+0x220/0x220
*10:44:21* 09:44:14,010 WARNING kernel: ata_scsi_error+0x91/0xc0
*10:44:21* 09:44:14,010 WARNING kernel: scsi_error_handler+0xd0/0x5b0
*10:44:21* 09:44:14,010 WARNING kernel: ? scsi_eh_get_sense+0x220/0x220
*10:44:21* 09:44:14,010 WARNING kernel: kthread+0x112/0x130
*10:44:21* 09:44:14,011 WARNING kernel: ? kthread_create_worker_on_cpu+0x70/0x70
*10:44:21* 09:44:14,026 WARNING kernel: ? kthread_create_worker_on_cpu+0x70/0x70
*10:44:21* 09:44:14,026 WARNING kernel: ret_from_fork+0x35/0x40
*10:44:21* 09:44:14,026 WARNING kernel:Code: a8 08 74 0b 65 81 25 6f 2c 76 42 ff ff ff 7f 89 d0 c3 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 c6 07 00 48 89 f7 57 9d <0f> 1f 44 00 00 c3 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f
*10:44:21* 09:44:14,027 ERR kernel:ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
*10:44:21* 09:44:14,027 ERR kernel:ata2.00: cmd a0/00:00:00:08:00/00:00:00:00:00/a0 tag 0 pio 16392 in#012 Get event status notification 4a 01 00 00 10 00 00 00 08 00res 40/00:02:00:08:00/00:00:00:00:00/a0 Emask 0x4 (timeout)
*10:44:21* 09:44:14,028 ERR kernel:ata2.00: status: { DRDY }
*10:44:21* 09:44:14,028 INFO kernel:ata2: soft resetting link
*10:44:21* 09:44:19,296 WARNING kernel:ata2.00: qc timeout (cmd 0xa1)
*10:44:21* 09:44:19,305 WARNING kernel:ata2.00: failed to IDENTIFY (I/O error, err_mask=0x4)
*10:44:21* 09:44:19,305 ERR kernel:ata2.00: revalidation failed (errno=-5)
*10:44:21* 09:44:19,305 INFO kernel:ata2: soft resetting link
*10:44:21* 09:44:21,510 INFO kernel:ata2.00: configured for MWDMA2
*10:44:21* 09:44:21,558 INFO kernel:ata2: EH complete
The slave is vm0034.workers-phx.ovirt.org
<https://jenkins.ovirt.org/computer/vm0034.workers-phx.ovirt.org>
Several updates are available on the slave, including a kernel update.
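
To check whether other builds of this job hit the same problem, the watchdog message in the console output above could be searched for in saved console logs. The following is only a minimal sketch: the names `SoftLockup` and `find_soft_lockups` are hypothetical and not part of Jenkins or any oVirt tooling. It assumes the console text has already been downloaded to a string and that the message keeps the format seen in this build.

```python
import re
from typing import List, NamedTuple


class SoftLockup(NamedTuple):
    """One 'soft lockup' report found in a console log (hypothetical helper type)."""
    cpu: int
    seconds: int
    task: str
    pid: int


# Matches e.g. "soft lockup - CPU#0 stuck for 32s! [scsi_eh_1:85]"
_SOFT_LOCKUP = re.compile(
    r"soft lockup - CPU#(\d+) stuck for (\d+)s! \[([^\]:]+):(\d+)\]"
)


def find_soft_lockups(console_text: str) -> List[SoftLockup]:
    """Return every soft lockup report in the given console text."""
    # Collapse all whitespace first, because mail clients and log viewers
    # may wrap the message in the middle ("BUG: soft\nlockup - ...").
    flat = " ".join(console_text.split())
    return [
        SoftLockup(int(cpu), int(secs), task, int(pid))
        for cpu, secs, task, pid in _SOFT_LOCKUP.findall(flat)
    ]


# Sample taken from the console output quoted in this issue.
sample = (
    "09:44:13,835 EMERG kernel:watchdog: BUG: soft\n"
    "lockup - CPU#0 stuck for 32s! [scsi_eh_1:85]"
)
for hit in find_soft_lockups(sample):
    print(hit)
```

Running this over the console logs of recent nightly builds would show how often the lockup occurs and whether it is limited to particular slaves.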