
oVirt 4.1 ran into the same issue, so the problem was not oVirt related. I tried the ELRepo kernel and that solved my issue. Kernel 4.9.11-1.el7.elrepo.x86_64 seems to be working without issues.

Conclusion: we triggered some bug in the 3.10.0 kernel, but I was unable to find out which one. We have 3 identical hypervisors, but the problem was only on the third one.

Peter

On 16/02/2017 15:35, Peter Hudec wrote:
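[Editor's note: the ELRepo switch described above can be sketched as follows. This is a hedged sketch, not what was literally run: the elrepo-release RPM version in the URL is illustrative and may have been superseded, so check elrepo.org for the current one before using it.]

```shell
#!/bin/sh
# Sketch: move a CentOS 7 host from the stock 3.10.0 kernel to the
# ELRepo mainline kernel (kernel-ml). The release RPM version below
# is an assumption; verify it against elrepo.org first.

# is_elrepo_kernel KERNEL_RELEASE -> success if it looks like an ELRepo build
is_elrepo_kernel() {
    case "$1" in
        *.elrepo.*) return 0 ;;
        *)          return 1 ;;
    esac
}

if is_elrepo_kernel "$(uname -r)"; then
    echo "already running an ELRepo kernel: $(uname -r)"
else
    # Print the documented procedure rather than running it blindly.
    cat <<'EOF'
Run these as root to switch the host to the ELRepo mainline kernel:
  rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
  yum install https://www.elrepo.org/elrepo-release-7.0-3.el7.elrepo.noarch.rpm
  yum --enablerepo=elrepo-kernel install kernel-ml
  grub2-set-default 0
  grub2-mkconfig -o /boot/grub2/grub.cfg
  reboot
EOF
fi
```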
memtest ran without issues. I'm thinking about installing the latest oVirt on this host and trying to run the VM there. That may resolve the issue if it's oVirt related and not hw/KVM.
The latest stable version is 4.1.
Peter
I'm thinking of trying the latest oVirt.

On 15/02/2017 21:46, Peter Hudec wrote:
On 15/02/2017 21:20, Nir Soffer wrote:
On Wed, Feb 15, 2017 at 10:05 PM, Peter Hudec <phudec@cnc.sk> wrote:
Hi,
so the problem is a little bit different. When I wait long enough, the VM eventually boots ;(
Is this an issue only with old VMs imported from the old setup, or also with new VMs?

I do not have new VMs, so only the old ones. But I did not import them from the old setup. The host OS upgrade was done per our docs: create a new cluster, upgrade the hosts, and migrate the VMs. There was no outage until now.
I tried to install a new VM, but the installer hangs on that host.
But ... /see the log/. I'm investigating the reason. The difference between dipovirt0{1,2} and dipovirt03 is the installation time: the first two were migrated last week, the last one yesterday. There are some newer packages, but nothing related to KVM.
[ 292.429622] INFO: rcu_sched self-detected stall on CPU { 0} (t=72280 jiffies g=393 c=392 q=35)
[ 292.430294] sending NMI to all CPUs:
[ 292.430305] NMI backtrace for cpu 0
[ 292.430309] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.16.0-4-amd64 #1 Debian 3.16.39-1
[ 292.430311] Hardware name: oVirt oVirt Node, BIOS 0.5.1 01/01/2011
[ 292.430313] task: ffffffff8181a460 ti: ffffffff81800000 task.ti: ffffffff81800000
[ 292.430315] RIP: 0010:[<ffffffff81052ae6>] [<ffffffff81052ae6>] native_write_msr_safe+0x6/0x10
[ 292.430323] RSP: 0018:ffff88001fc03e08 EFLAGS: 00000046
[ 292.430325] RAX: 0000000000000400 RBX: 0000000000000000 RCX: 0000000000000830
[ 292.430326] RDX: 0000000000000000 RSI: 0000000000000400 RDI: 0000000000000830
[ 292.430327] RBP: ffffffff818e2a80 R08: ffffffff818e2a80 R09: 00000000000001e8
[ 292.430329] R10: 0000000000000000 R11: ffff88001fc03b96 R12: 0000000000000000
[ 292.430330] R13: 000000000000a0ea R14: 0000000000000002 R15: 0000000000080000
[ 292.430335] FS: 0000000000000000(0000) GS:ffff88001fc00000(0000) knlGS:0000000000000000
[ 292.430337] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 292.430339] CR2: 0000000001801000 CR3: 000000001c6de000 CR4: 00000000000006f0
[ 292.430343] Stack:
[ 292.430344] ffffffff8104b30d 0000000000000002 0000000000000082 ffff88001fc0d6a0
[ 292.430347] ffffffff81853800 0000000000000000 ffffffff818e2fe0 0000000000000023
[ 292.430349] ffffffff81853800 ffffffff81047d63 ffff88001fc0d6a0 ffffffff810c73fa
[ 292.430352] Call Trace:
[ 292.430354] <IRQ>
[ 292.430360] [<ffffffff8104b30d>] ? __x2apic_send_IPI_mask+0xad/0xe0
[ 292.430365] [<ffffffff81047d63>] ? arch_trigger_all_cpu_backtrace+0xc3/0x140
[ 292.430369] [<ffffffff810c73fa>] ? rcu_check_callbacks+0x42a/0x670
[ 292.430373] [<ffffffff8109bb1e>] ? account_process_tick+0xde/0x180
[ 292.430376] [<ffffffff810d1e00>] ? tick_sched_handle.isra.16+0x60/0x60
[ 292.430381] [<ffffffff81075fc0>] ? update_process_times+0x40/0x70
[ 292.430404] [<ffffffff810d1dc0>] ? tick_sched_handle.isra.16+0x20/0x60
[ 292.430407] [<ffffffff810d1e3c>] ? tick_sched_timer+0x3c/0x60
[ 292.430410] [<ffffffff8108c6a7>] ? __run_hrtimer+0x67/0x210
[ 292.430412] [<ffffffff8108caa9>] ? hrtimer_interrupt+0xe9/0x220
[ 292.430416] [<ffffffff8151dcab>] ? smp_apic_timer_interrupt+0x3b/0x50
[ 292.430420] [<ffffffff8151bd3d>] ? apic_timer_interrupt+0x6d/0x80
[ 292.430422] <EOI>
[ 292.430425] [<ffffffff8109b2e5>] ? sched_clock_local+0x15/0x80
[ 292.430428] [<ffffffff8101da50>] ? mwait_idle+0xa0/0xa0
[ 292.430431] [<ffffffff81052c22>] ? native_safe_halt+0x2/0x10
[ 292.430434] [<ffffffff8101da69>] ? default_idle+0x19/0xd0
[ 292.430437] [<ffffffff810a9b74>] ? cpu_startup_entry+0x374/0x470
[ 292.430440] [<ffffffff81903076>] ? start_kernel+0x497/0x4a2
[ 292.430442] [<ffffffff81902a04>] ? set_init_arg+0x4e/0x4e
[ 292.430445] [<ffffffff81902120>] ? early_idt_handler_array+0x120/0x120
[ 292.430447] [<ffffffff8190271f>] ? x86_64_start_kernel+0x14d/0x15c
[ 292.430448] Code: c2 48 89 d0 c3 89 f9 0f 32 31 c9 48 c1 e2 20 89 c0 89 0e 48 09 c2 48 89 d0 c3 66 66 2e 0f 1f 84 00 00 00 00 00 89 f0 89 f9 0f 30 <31> c0 c3 0f 1f 80 00 00 00 00 89 f9 0f 33 48 c1 e2 20 89 c0 48
[ 292.430579] Clocksource tsc unstable (delta = -289118137838 ns)
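[Editor's note: the closing "Clocksource tsc unstable" line is worth checking from inside the guest. A minimal sketch using the standard sysfs clocksource interface; the preference order (kvm-clock first) is an editorial assumption, not something established in this thread:]

```shell
#!/bin/sh
# Inspect the guest's clocksource and suggest one. Paths are the
# standard sysfs clocksource interface.
CS=/sys/devices/system/clocksource/clocksource0

# pick_clocksource "AVAILABLE LIST" -> preferred clocksource name.
# Assumed preference: kvm-clock (paravirtual, stable under KVM),
# then hpet, else tsc.
pick_clocksource() {
    case " $1 " in
        *" kvm-clock "*) echo kvm-clock ;;
        *" hpet "*)      echo hpet ;;
        *)               echo tsc ;;
    esac
}

if [ -r "$CS/available_clocksource" ]; then
    avail=$(cat "$CS/available_clocksource")
    echo "available: $avail"
    echo "current:   $(cat "$CS/current_clocksource")"
    want=$(pick_clocksource "$avail")
    echo "suggested: $want  (as root: echo $want > $CS/current_clocksource)"
fi
```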
On 15/02/2017 20:39, Peter Hudec wrote:
Hi,
I did already, but did not find anything suspicious; see the attached logs and the SPICE screenshot.
Actually the VM is booting, but it gets stuck in some bad state. When migrating, the migration is successful, but the VM is not accessible /even on the network/.
Right now I found one VM, which is working well.
In the logs, look for diplci01 at 2017-02-15 20:23:00,420; the VM ID is 7ddf349b-fb9a-44f4-9e88-73e84625a44e.
thanks Peter
On 15/02/2017 19:40, Nir Soffer wrote:
On Wed, Feb 15, 2017 at 8:11 PM, Peter Hudec <phudec@cnc.sk> wrote:
> Hi,
>
> I'm preparing to migrate from 3.5 to 3.6.
> The first step is the CentOS 6 -> CentOS 7 upgrade for the hosts.
>
> Setup:
> - 3x hosts /dipovirt01, dipovirt02, dipovirt03/
> - 1x hosted engine /on all 3 hosts/
>
> The upgrade of the first 2 hosts was OK; all VMs are running fine.
> When I upgraded the 3rd host /dipovirt03/, some VMs are not able to run
> or boot on this host. I tried a full reinstall of the host, but
> with the same result.
>
> In the migration case, the VM stops running after a while.
> In the boot case, the VM will not boot; I only see 'Loading kernel ...'
>
> Almost all VMs are Debian 8 with guest tools, some CentOS 6/7.
>
> The hosts were OK with CentOS 6.
>
> Where should I start to investigate?
Sharing vdsm logs showing the failed attempts to run or migrate a VM would be a good start.
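[Editor's note: a minimal sketch for collecting that. /var/log/vdsm/vdsm.log is vdsm's default log path on a host; the VM ID is the one quoted later in the thread, and filtering on it keeps the shared log small:]

```shell
#!/bin/sh
# Extract one VM's entries from a host's vdsm log before sharing it.

# vm_log_lines VM_ID LOGFILE -> print only the lines mentioning that VM
vm_log_lines() {
    grep -F "$1" "$2"
}

LOG=/var/log/vdsm/vdsm.log
if [ -r "$LOG" ]; then
    vm_log_lines "7ddf349b-fb9a-44f4-9e88-73e84625a44e" "$LOG"
else
    echo "cannot read $LOG (run on the host, as root)" >&2
fi
```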
> best regards
> Peter
--
*Peter Hudec*
Infraštruktúrny architekt
phudec@cnc.sk <mailto:phudec@cnc.sk>

*CNC, a.s.*
Borská 6, 841 04 Bratislava
Recepcia: +421 2 35 000 100

Mobil: +421 905 997 203
*www.cnc.sk* <http://www.cnc.sk>