[JIRA] (OVIRT-1015) kernel panic in nested VM
by Evgheni Dereveanchin (oVirt JIRA)
[ https://ovirt-jira.atlassian.net/browse/OVIRT-1015?page=com.atlassian.jir... ]
Evgheni Dereveanchin commented on OVIRT-1015:
---------------------------------------------
The host shows nothing unusual in dmesg and runs a fairly recent kernel.
The node is installed as follows:
15:58:16 virt-install \
15:58:16 --name node-2017-01-10-1558 \
15:58:16 --boot menu=off \
15:58:16 --network none \
15:58:16 --memory 4096 \
15:58:16 --vcpus 4 \
15:58:16 --os-variant rhel7 \
15:58:16 --rng random \
15:58:16 --noreboot \
15:58:16 --location boot.iso \
15:58:16 --extra-args "inst.ks=file:///ci-image-install.ks console=ttyS0" \
15:58:16 --initrd-inject data/ci-image-install.ks \
15:58:16 --check disk_size=off,path_in_use=off \
15:58:16 --graphics none \
15:58:16 --wait 60 \
15:58:16 --disk path=ovirt-node-ng-image.installed.qcow2,bus=virtio,cache=unsafe,discard=unmap,format=qcow2 \
15:58:16 --disk path=ovirt-node-ng-image.squashfs.img,readonly=on,device=disk,bus=virtio,serial=livesrc
and here's where it crashes:
15:59:54 Running pre-installation scripts
15:59:54 .
15:59:54 Installing software 100%
16:04:04 [ 330.219115] BUG: unable to handle kernel paging request at 0000000000172001
16:04:04 [ 330.230847] IP: [<ffffffff813250c7>] clear_page_c+0x7/0x10
16:04:04 [ 330.230847] PGD 0
16:04:04 [ 330.230847] Oops: 0000 [#1] SMP
16:04:04 [ 330.230847] Modules linked in: dm_thin_pool dm_persistent_data dm_bio_prison xfs fcoe libfcoe libfc scsi_transport_fc scsi_tgt ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables parport_pc sg pcspkr virtio_console i2c_piix4 parport virtio_rng virtio_balloon i2c_core ext4 mbcache jbd2 loop nls_utf8 isofs sr_mod cdrom ata_generic virtio_blk pata_acpi crct10dif_pclmul crct10dif_common crc32_pclmul crc32c_intel ghash_clmulni_intel virtio_pci aesni_intel glue_helper ata_piix ablk_helper virtio_ring serio_raw libata cryptd virtio sunrpc xts lrw gf128mul dm_crypt dm_round_robin dm_multipath dm_snapshot dm_bufio dm_mirror dm_region_hash dm_log dm_zero dm_mod linear raid10 raid456 async_raid6_recov async_memcpy async_pq raid6_pq libcrc32c async_xor xor async_tx raid1 raid0 iscsi_ibft iscsi_boot_sysfs floppy iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi squashfs cramfs edd
16:04:04 [ 330.230847] CPU: 1 PID: 1625 Comm: rsync Not tainted 3.10.0-514.el7.x86_64 #1
16:04:04 [ 330.230847] Hardware name: Red Hat KVM, BIOS 1.9.1-5.el7 04/01/2014
16:04:04 [ 330.230847] task: ffff88007e04af10 ti: ffff8800a3454000 task.ti: ffff8800a3454000
16:04:04 [ 330.230847] RIP: 0010:[<ffffffff813250c7>] [<ffffffff813250c7>] clear_page_c+0x7/0x10
16:04:04 [ 330.230847] RSP: 0000:ffff8800a3457bd8 EFLAGS: 00010246
16:04:04 [ 330.230847] RAX: 0000000000000000 RBX: 00000000043b4140 RCX: 0000000000000200
16:04:04 [ 330.230847] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88010ed05000
16:04:04 [ 330.230847] RBP: ffff8800a3457ce0 R08: ffffffff818df727 R09: ffffea00043b4180
16:04:04 [ 330.230847] R10: 0000000000001403 R11: 0000000000000000 R12: ffff8800a3457fd8
16:04:04 [ 330.230847] R13: 00000000043b4180 R14: ffffea00043b4140 R15: ffff8800a3454000
16:04:04 [ 330.230847] FS: 00007fb3d2247740(0000) GS:ffff88013fc80000(0000) knlGS:0000000000000000
16:04:04 [ 330.230847] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
16:04:04 [ 330.230847] CR2: 0000000000172001 CR3: 000000013034c000 CR4: 00000000000006e0
16:04:04 [ 330.230847] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
16:04:04 [ 330.230847] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
16:04:04 [ 330.230847] Stack:
16:04:04 [ 330.230847] ffffffff8118a67a 0000000000000001 ffff88013ffd8008 000000007fffffff
16:04:04 [ 330.230847] 0000000000000002 00000000d34672a7 ffff88013fc9a098 ffff88013fc9a0c8
16:04:04 [ 330.230847] ffff88013ffd7068 0000000000000000 0000000300000001 ffff88013ffd8000
16:04:04 [ 330.230847] Call Trace:
16:04:04 [ 330.230847] [<ffffffff8118a67a>] ? get_page_from_freelist+0x51a/0x9f0
16:04:04 [ 330.230847] [<ffffffff8118acc6>] __alloc_pages_nodemask+0x176/0x420
16:04:05 [ 330.230847] [<ffffffff811d20ba>] alloc_pages_vma+0x9a/0x150
16:04:05 [ 330.230847] [<ffffffff811b137f>] handle_mm_fault+0xc6f/0xfe0
16:04:05 [ 330.230847] [<ffffffff811b76d5>] ? do_mmap_pgoff+0x305/0x3c0
16:04:05 [ 330.230847] [<ffffffff81691a94>] __do_page_fault+0x154/0x450
16:04:05 [ 330.230847] [<ffffffff81691e76>] trace_do_page_fault+0x56/0x150
16:04:05 [ 330.230847] [<ffffffff8169151b>] do_async_page_fault+0x1b/0xd0
16:04:05 [ 330.230847] [<ffffffff8168e0b8>] async_page_fault+0x28/0x30
16:04:05 [ 330.230847] Code: 4c 29 ea 39 da 89 d1 7f c4 85 d2 7f 9d 89 d0 eb bc 0f 1f 00 e8 0b 05 d6 ff 90 90 90 90 90 90 90 90 90 90 90 b9 00 02 00 00 31 c0 <f3> 48 ab c3 0f 1f 44 00 00 b9 00 10 00 00 31 c0 f3 aa c3 66 0f
16:04:05 [ 330.230847] RIP [<ffffffff813250c7>] clear_page_c+0x7/0x10
16:04:05 [ 330.230847] RSP <ffff8800a3457bd8>
16:04:05 [ 330.230847] CR2: 0000000000172001
16:04:05 [ 330.230847] ---[ end trace 67ec205c6ac0a24f ]---
16:04:05 [ 330.230847] Kernel panic - not syncing: Fatal exception
16:04:05 [ 330.959703] ------------[ cut here ]------------
16:04:05 [ 330.960692] WARNING: at arch/x86/kernel/smp.c:125 native_smp_send_reschedule+0x5f/0x70()
16:04:05 [ 330.960692] Modules linked in: dm_thin_pool dm_persistent_data dm_bio_prison xfs fcoe libfcoe libfc scsi_transport_fc scsi_tgt ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables parport_pc sg pcspkr virtio_console i2c_piix4 parport virtio_rng virtio_balloon i2c_core ext4 mbcache jbd2 loop nls_utf8 isofs sr_mod cdrom ata_generic virtio_blk pata_acpi crct10dif_pclmul crct10dif_common crc32_pclmul crc32c_intel ghash_clmulni_intel virtio_pci aesni_intel glue_helper ata_piix ablk_helper virtio_ring serio_raw libata cryptd virtio sunrpc xts lrw gf128mul dm_crypt dm_round_robin dm_multipath dm_snapshot dm_bufio dm_mirror dm_region_hash dm_log dm_zero dm_mod linear raid10 raid456 async_raid6_recov async_memcpy async_pq raid6_pq libcrc32c async_xor xor async_tx raid1 raid0 iscsi_ibft iscsi_boot_sysfs floppy iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi squashfs cramfs edd
16:04:05 [ 330.960692] CPU: 1 PID: 1625 Comm: rsync Tainted: G D ------------ 3.10.0-514.el7.x86_64 #1
16:04:05 [ 330.960692] Hardware name: Red Hat KVM, BIOS 1.9.1-5.el7 04/01/2014
16:04:05 [ 330.960692] 0000000000000000 00000000d34672a7 ffff88013fc83d98 ffffffff81685fac
16:04:05 [ 330.960692] ffff88013fc83dd0 ffffffff81085820 0000000000000000 ffff88013fc96c40
16:04:05 [ 330.960692] 00000001000078ef ffff88013fc16c40 0000000000000001 ffff88013fc83de0
16:04:05 [ 330.960692] Call Trace:
16:04:05 [ 330.960692] <IRQ> [<ffffffff81685fac>] dump_stack+0x19/0x1b
16:04:05 [ 330.960692] [<ffffffff81085820>] warn_slowpath_common+0x70/0xb0
16:04:05 [ 330.960692] [<ffffffff8108596a>] warn_slowpath_null+0x1a/0x20
16:04:05 [ 330.960692] [<ffffffff8104e18f>] native_smp_send_reschedule+0x5f/0x70
16:04:05 [ 330.960692] [<ffffffff810d339d>] trigger_load_balance+0x16d/0x200
16:04:05 [ 330.960692] [<ffffffff810c3503>] scheduler_tick+0x103/0x150
16:04:05 [ 330.960692] [<ffffffff810f2f80>] ? tick_sched_handle.isra.13+0x60/0x60
16:04:05 [ 330.960692] [<ffffffff81099196>] update_process_times+0x66/0x80
16:04:05 [ 330.960692] [<ffffffff810f2f45>] tick_sched_handle.isra.13+0x25/0x60
16:04:05 [ 330.960692] [<ffffffff810f2fc1>] tick_sched_timer+0x41/0x70
16:04:05 [ 330.960692] [<ffffffff810b4862>] __hrtimer_run_queues+0xd2/0x260
16:04:05 [ 330.960692] [<ffffffff810b4e00>] hrtimer_interrupt+0xb0/0x1e0
16:04:05 [ 330.960692] [<ffffffff810510d7>] local_apic_timer_interrupt+0x37/0x60
16:04:05 [ 330.960692] [<ffffffff81698ccf>] smp_apic_timer_interrupt+0x3f/0x60
16:04:05 [ 330.960692] [<ffffffff8169721d>] apic_timer_interrupt+0x6d/0x80
16:04:05 [ 330.960692] <EOI> [<ffffffff8167f47e>] ? panic+0x1ae/0x1f2
16:04:05 [ 330.960692] [<ffffffff8168ee9b>] oops_end+0x12b/0x150
16:04:05 [ 330.960692] [<ffffffff8167ea93>] no_context+0x280/0x2a3
16:04:05 [ 330.960692] [<ffffffff8167eb29>] __bad_area_nosemaphore+0x73/0x1ca
16:04:05 [ 330.960692] [<ffffffff8167ec93>] bad_area_nosemaphore+0x13/0x15
16:04:05 [ 330.960692] [<ffffffff81691c1e>] __do_page_fault+0x2de/0x450
16:04:05 [ 330.960692] [<ffffffff81691e76>] trace_do_page_fault+0x56/0x150
16:04:05 [ 330.960692] [<ffffffff8169151b>] do_async_page_fault+0x1b/0xd0
16:04:05 [ 330.960692] [<ffffffff8168e0b8>] async_page_fault+0x28/0x30
16:04:05 [ 330.960692] [<ffffffff813250c7>] ? clear_page_c+0x7/0x10
16:04:05 [ 330.960692] [<ffffffff8118a67a>] ? get_page_from_freelist+0x51a/0x9f0
16:04:05 [ 330.960692] [<ffffffff8118acc6>] __alloc_pages_nodemask+0x176/0x420
16:04:05 [ 330.960692] [<ffffffff811d20ba>] alloc_pages_vma+0x9a/0x150
16:04:05 [ 330.960692] [<ffffffff811b137f>] handle_mm_fault+0xc6f/0xfe0
16:04:05 [ 330.960692] [<ffffffff811b76d5>] ? do_mmap_pgoff+0x305/0x3c0
16:04:05 [ 330.960692] [<ffffffff81691a94>] __do_page_fault+0x154/0x450
16:04:05 [ 330.960692] [<ffffffff81691e76>] trace_do_page_fault+0x56/0x150
16:04:05 [ 330.960692] [<ffffffff8169151b>] do_async_page_fault+0x1b/0xd0
16:04:05 [ 330.960692] [<ffffffff8168e0b8>] async_page_fault+0x28/0x30
16:04:05 [ 330.960692] ---[ end trace 67ec205c6ac0a250 ]---
So the crash happens during an rsync process, and the guest kernel is fairly old:
16:04:04 [ 330.230847] CPU: 1 PID: 1625 Comm: rsync Not tainted 3.10.0-514.el7.x86_64 #1
[~sbonazzo(a)redhat.com] thanks for reporting this. [~fdeutsch] is it possible to use a newer kernel for the node ISO? I think that would be one of the steps to ensure this isn't a kernel bug that has already been fixed upstream.
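For triage, a minimal sketch of the checks this implies (stock EL7 commands; nothing here is oVirt-specific):

    # on the bare-metal host: is nested virt enabled? (Intel shown; use kvm_amd on AMD)
    cat /sys/module/kvm_intel/parameters/nested

    # inside the installed node image: which kernel does the ISO ship...
    rpm -q kernel

    # ...and do the configured repos already offer a newer build?
    yum list available kernel --showduplicates | tail -n 5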
> kernel panic in nested VM
> -------------------------
>
> Key: OVIRT-1015
> URL: https://ovirt-jira.atlassian.net/browse/OVIRT-1015
> Project: oVirt - virtualization made easy
> Issue Type: Outage
> Reporter: Evgheni Dereveanchin
> Assignee: infra
>
> The following job failed due to a kernel panic inside a nested VM:
> http://jenkins.ovirt.org/job/ovirt-node-ng_ovirt-4.0_build-artifacts-el7-...
> The VM that the job was running on is:
> vm0085.workers-phx.ovirt.org (kernel-3.10.0-514.2.2.el7.x86_64)
(no subject)
by Daniel Belenky
The last failure was in the engine upgrade phase:
[ ERROR ] Yum Cannot queue package iproute: Cannot find a valid baseurl for repo: epel/x86_64
I've re-triggered the build, and it passed.
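For the record, a rough sketch of how this class of failure can be triaged on an EL7 slave (assumes yum-utils is installed; the repo and package names are taken from the errors in this thread):

    # does the epel repo resolve to a usable baseurl/mirrorlist?
    yum repolist enabled | grep -i epel
    yum-config-manager epel | grep -E '^(baseurl|metalink|mirrorlist)'

    # one-off workaround while the repo is broken:
    yum install --disablerepo=epel iproute

    # for the dependency error quoted below: which repo carries the build
    # that requires ovirt-engine-webadmin-portal >= 4.2?
    repoquery --requires ovirt-engine-dashboard
    repoquery --whatprovides 'ovirt-engine-webadmin-portal >= 4.2'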
On Tue, Jan 10, 2017 at 5:41 PM, Greg Sheremeta <gshereme(a)redhat.com> wrote:
> Seems to be still not working.
>
> Greg Sheremeta, MBA
> Red Hat, Inc.
> gshereme(a)redhat.com
>
> On Jan 10, 2017 10:31 AM, "Daniel Belenky" <dbelenky(a)redhat.com> wrote:
>
>> Hi Greg,
>>
>> The error you see was caused by wrong RPMs in the snapshot repo. We are
>> currently working on new upgrade jobs that will replace the old ones.
>> I've changed the configuration in Jenkins to take RPMs from the
>> latest.tested repo and re-triggered the build.
>> I assume this should resolve the issue. Jenkins console:
>> <http://jenkins.ovirt.org/job/ovirt-engine_4.1_upgrade-from-3.6_el7_merged...>
>>
>>
>> On Tue, Jan 10, 2017 at 4:13 PM, Greg Sheremeta <gshereme(a)redhat.com>
>> wrote:
>>
>>> Hi,
>>>
>>> What do I need to do to fix this?
>>>
>>> "Error: Package: ovirt-engine-dashboard-1.2.0-0
>>> .1.20170105gitcb8e435.el7.centos.noarch (upgrade_to_0)
>>>
>>> Requires: ovirt-engine-webadmin-portal >= 4.2
>>>
>>> "
>>>
>>> http://jenkins.ovirt.org/job/ovirt-engine_4.1_upgrade-from-3.6_el7_merged/129/console
>>>
>>>
>>>
>>> --
>>> Greg Sheremeta, MBA
>>> Red Hat, Inc.
>>> Sr. Software Engineer
>>> gshereme(a)redhat.com
>>>
>>> _______________________________________________
>>> Infra mailing list
>>> Infra(a)ovirt.org
>>> http://lists.ovirt.org/mailman/listinfo/infra
>>>
>>>
>>
>>
>> --
>>
>> *Daniel Belenky*
>>
>> *RHV DevOps*
>>
>> *Red Hat Israel*
>>
>
--
*Daniel Belenky*
*RHV DevOps*
*Red Hat Israel*
[JIRA] (OVIRT-1012) ovirt-engine-sdk is built on el6 in the following branches: master, sdk_4.1 and sdk_4.0
by Ondra Machacek (oVirt JIRA)
[ https://ovirt-jira.atlassian.net/browse/OVIRT-1012?page=com.atlassian.jir... ]
Ondra Machacek updated OVIRT-1012:
----------------------------------
Description:
Hello,
The ovirt-engine-sdk is built on el6 in the following branches: master, sdk_4.1 and sdk_4.0.
This isn't correct. In the specified branches the project should be built only for el7, fc24 and fc25.
Thank you.
CC [~jhernand(a)redhat.com]
was:
Hello,
The ovirt-engine-sdk is build on el6 in following branches master, sdk_4.1 and sdk_4.0.
This isn't correct, it specified branches the project should be build only for el7, fc24 and fc25.
Thank you.
CC [~jhernand(a)redhat.com]
> ovirt-engine-sdk is built on el6 in the following branches: master, sdk_4.1 and sdk_4.0
> ----------------------------------------------------------------------------------------
>
> Key: OVIRT-1012
> URL: https://ovirt-jira.atlassian.net/browse/OVIRT-1012
> Project: oVirt - virtualization made easy
> Issue Type: Bug
> Components: Jenkins
> Reporter: Ondra Machacek
> Assignee: infra
>
> Hello,
> The ovirt-engine-sdk is built on el6 in the following branches: master, sdk_4.1 and sdk_4.0.
> This isn't correct. In the specified branches the project should be built only for el7, fc24 and fc25.
> Thank you.
> CC [~jhernand(a)redhat.com]
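A minimal sketch of where to start looking — the jenkins repo URL is real, the paths under jobs/ are an assumption:

    git clone https://gerrit.ovirt.org/jenkins
    cd jenkins
    # find el6 references in the sdk job definitions (path assumed)
    grep -rn el6 jobs/confs/projects/ | grep ovirt-engine-sdk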
[JIRA] (OVIRT-1014) avoid repetition in automation/*packages
by Barak Korren (oVirt JIRA)
[ https://ovirt-jira.atlassian.net/browse/OVIRT-1014?page=com.atlassian.jir... ]
Barak Korren commented on OVIRT-1014:
-------------------------------------
{quote}
The translation of *.packages to a yum command may include more than `cat`. Each line may look like:
python-pthreading >= 0.1.3-3
{quote}
I don't think yum supports something like {{yum install python-pthreading >= 0.1.3-3}}, so I can't really do anything useful with that syntax at the moment. I could try to simply ignore the version specification when setting up mock, but that may lead to unexpected results.
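A minimal sketch of the "ignore the version specification" option — the sed expression is mine, not anything mock_runner does today:

    # drop everything from the first comparison operator onward, then install
    sed 's/\s*\(>=\|<=\|=\|>\|<\).*$//' automation/check-patch.packages \
        | xargs yum install -y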
> avoid repetition in automation/*packages
> ----------------------------------------
>
> Key: OVIRT-1014
> URL: https://ovirt-jira.atlassian.net/browse/OVIRT-1014
> Project: oVirt - virtualization made easy
> Issue Type: New Feature
> Reporter: danken
> Assignee: infra
>
> Currently we have many automation/*packages* files, mostly repeating each other.
> Most of the information there is then repeated in vdsm.spec as well.
> It would be nice to have a hierarchical way to define packages, e.g. having most packages in automation/build-artifacts-manual.packages, and adding el7-specific dependencies in automation/build-artifacts-manual.packages.el7.
> It would be even better to have a single place (yaml file?) to declare each required package and its version, for each platform/architecture. We can then use it to generate the spec file.
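A sketch of what the hierarchical lookup proposed above could boil down to (the file names follow the existing automation/ convention; the merge logic itself is hypothetical):

    # merge the common list with the distro-specific additions,
    # dropping version constraints as discussed in the comment above
    cat automation/build-artifacts-manual.packages \
        automation/build-artifacts-manual.packages.el7 2>/dev/null \
        | sed 's/\s*\(>=\|<=\|=\|>\|<\).*$//' | sort -u | xargs yum install -y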
[JIRA] (OVIRT-1014) avoid repetition in automation/*packages
by danken (oVirt JIRA)
[ https://ovirt-jira.atlassian.net/browse/OVIRT-1014?page=com.atlassian.jir... ]
danken commented on OVIRT-1014:
-------------------------------
The translation of *.packages to a yum command may include more than `cat`. Each line may look like:
python-pthreading >= 0.1.3-3
> avoid repetition in automation/*packages
> ----------------------------------------
>
> Key: OVIRT-1014
> URL: https://ovirt-jira.atlassian.net/browse/OVIRT-1014
> Project: oVirt - virtualization made easy
> Issue Type: New Feature
> Reporter: danken
> Assignee: infra
>
> Currently we have many automation/*packages* files, mostly repeating each other.
> Most of the information there is then repeated in vdsm.spec as well.
> It would be nice to have a hierarchical way to define packages, e.g. having most packages in automation/build-artifacts-manual.packages, and adding el7-specific dependencies in automation/build-artifacts-manual.packages.el7.
> It would be even better to have a single place (yaml file?) to declare each required package and its version, for each platform/architecture. We can then use it to generate the spec file.