[ovirt-users] VMs are not running/booting on one host

Thu Feb 16 14:35:48 UTC 2017

memtest ran without issues.
I'm thinking about installing the latest oVirt on this host and trying to run the VM there.
That may solve the issue if it is oVirt related and not hw/KVM.

The latest stable version is 4.1.
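
For anyone comparing guest logs, here is a minimal sketch that picks the
RCU stall and clocksource messages out of a saved kernel log like the one
quoted below. It is not part of oVirt or vdsm; the function name, the
patterns and the sample text are only assumptions based on that log.

```python
import re

# Patterns for the two kernel messages that appear in the quoted guest log.
# search() is used, so a leading quote prefix such as ">>> " does not matter.
STALL_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\] INFO: rcu_sched self-detected stall on CPU"
)
CLOCK_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\] Clocksource (?P<name>\w+) unstable "
    r"\(delta = (?P<delta>-?\d+) ns\)"
)


def scan_kernel_log(text):
    """Return (timestamp, kind, detail) tuples for stall/clocksource lines."""
    findings = []
    for line in text.splitlines():
        m = STALL_RE.search(line)
        if m:
            findings.append((float(m.group("ts")), "rcu_stall", ""))
            continue
        m = CLOCK_RE.search(line)
        if m:
            # The kernel reports the delta in nanoseconds; convert to seconds.
            delta_s = int(m.group("delta")) / 1e9
            detail = f"{m.group('name')} delta={delta_s:.1f}s"
            findings.append((float(m.group("ts")), "clocksource_unstable", detail))
    return findings


# Sample lines taken from the quoted log (the wrapped line is joined here).
SAMPLE = """\
[  292.429622] INFO: rcu_sched self-detected stall on CPU { 0}  (t=72280 jiffies g=393 c=392 q=35)
[  292.430294] sending NMI to all CPUs:
[  292.430579] Clocksource tsc unstable (delta = -289118137838 ns)
"""

if __name__ == "__main__":
    for ts, kind, detail in scan_kernel_log(SAMPLE):
        print(f"{ts:10.6f} {kind} {detail}")
```

Running it over the logs of guests on dipovirt03 and on one of the working
hosts would show whether the stall and the large negative tsc delta only
appear on the affected host.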

	Peter

On 15/02/2017 21:46, Peter Hudec wrote:
> On 15/02/2017 21:20, Nir Soffer wrote:
>> On Wed, Feb 15, 2017 at 10:05 PM, Peter Hudec <phudec at cnc.sk> wrote:
>>> Hi,
>>>
>>> so the problem is a little bit different. When I wait for a long time, the
>>> VM boots ;(
>>
>> Is this an issue only with old vms imported from the old setup, or
>> also with new vms?
> I do not have new VMs, so only with the old ones. But I did not import them
> from the old setup.
> I did the host OS upgrade following our docs: creating a new cluster, host
> upgrade and VM migrations. There was no outage until now.
> 
> I tried to install a new VM, but the installer hangs on that host.
> 
> 
>>
>>>
>>> But ... /see the log/. I'm investigating the reason.
>>> The difference between dipovirt0{1,2} and dipovirt03 is the
>>> installation time. The first 2 were migrated last week, the last one
>>> yesterday. There are some newer packages, but nothing related to KVM.
>>>
>>> [  292.429622] INFO: rcu_sched self-detected stall on CPU { 0}  (t=72280
>>> jiffies g=393 c=392 q=35)
>>> [  292.430294] sending NMI to all CPUs:
>>> [  292.430305] NMI backtrace for cpu 0
>>> [  292.430309] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.16.0-4-amd64
>>> #1 Debian 3.16.39-1
>>> [  292.430311] Hardware name: oVirt oVirt Node, BIOS 0.5.1 01/01/2011
>>> [  292.430313] task: ffffffff8181a460 ti: ffffffff81800000 task.ti:
>>> ffffffff81800000
>>> [  292.430315] RIP: 0010:[<ffffffff81052ae6>]  [<ffffffff81052ae6>]
>>> native_write_msr_safe+0x6/0x10
>>> [  292.430323] RSP: 0018:ffff88001fc03e08  EFLAGS: 00000046
>>> [  292.430325] RAX: 0000000000000400 RBX: 0000000000000000 RCX:
>>> 0000000000000830
>>> [  292.430326] RDX: 0000000000000000 RSI: 0000000000000400 RDI:
>>> 0000000000000830
>>> [  292.430327] RBP: ffffffff818e2a80 R08: ffffffff818e2a80 R09:
>>> 00000000000001e8
>>> [  292.430329] R10: 0000000000000000 R11: ffff88001fc03b96 R12:
>>> 0000000000000000
>>> [  292.430330] R13: 000000000000a0ea R14: 0000000000000002 R15:
>>> 0000000000080000
>>> [  292.430335] FS:  0000000000000000(0000) GS:ffff88001fc00000(0000)
>>> knlGS:0000000000000000
>>> [  292.430337] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>>> [  292.430339] CR2: 0000000001801000 CR3: 000000001c6de000 CR4:
>>> 00000000000006f0
>>> [  292.430343] Stack:
>>> [  292.430344]  ffffffff8104b30d 0000000000000002 0000000000000082
>>> ffff88001fc0d6a0
>>> [  292.430347]  ffffffff81853800 0000000000000000 ffffffff818e2fe0
>>> 0000000000000023
>>> [  292.430349]  ffffffff81853800 ffffffff81047d63 ffff88001fc0d6a0
>>> ffffffff810c73fa
>>> [  292.430352] Call Trace:
>>> [  292.430354]  <IRQ>
>>>
>>> [  292.430360]  [<ffffffff8104b30d>] ? __x2apic_send_IPI_mask+0xad/0xe0
>>> [  292.430365]  [<ffffffff81047d63>] ?
>>> arch_trigger_all_cpu_backtrace+0xc3/0x140
>>> [  292.430369]  [<ffffffff810c73fa>] ? rcu_check_callbacks+0x42a/0x670
>>> [  292.430373]  [<ffffffff8109bb1e>] ? account_process_tick+0xde/0x180
>>> [  292.430376]  [<ffffffff810d1e00>] ? tick_sched_handle.isra.16+0x60/0x60
>>> [  292.430381]  [<ffffffff81075fc0>] ? update_process_times+0x40/0x70
>>> [  292.430404]  [<ffffffff810d1dc0>] ? tick_sched_handle.isra.16+0x20/0x60
>>> [  292.430407]  [<ffffffff810d1e3c>] ? tick_sched_timer+0x3c/0x60
>>> [  292.430410]  [<ffffffff8108c6a7>] ? __run_hrtimer+0x67/0x210
>>> [  292.430412]  [<ffffffff8108caa9>] ? hrtimer_interrupt+0xe9/0x220
>>> [  292.430416]  [<ffffffff8151dcab>] ? smp_apic_timer_interrupt+0x3b/0x50
>>> [  292.430420]  [<ffffffff8151bd3d>] ? apic_timer_interrupt+0x6d/0x80
>>> [  292.430422]  <EOI>
>>>
>>> [  292.430425]  [<ffffffff8109b2e5>] ? sched_clock_local+0x15/0x80
>>> [  292.430428]  [<ffffffff8101da50>] ? mwait_idle+0xa0/0xa0
>>> [  292.430431]  [<ffffffff81052c22>] ? native_safe_halt+0x2/0x10
>>> [  292.430434]  [<ffffffff8101da69>] ? default_idle+0x19/0xd0
>>> [  292.430437]  [<ffffffff810a9b74>] ? cpu_startup_entry+0x374/0x470
>>> [  292.430440]  [<ffffffff81903076>] ? start_kernel+0x497/0x4a2
>>> [  292.430442]  [<ffffffff81902a04>] ? set_init_arg+0x4e/0x4e
>>> [  292.430445]  [<ffffffff81902120>] ? early_idt_handler_array+0x120/0x120
>>> [  292.430447]  [<ffffffff8190271f>] ? x86_64_start_kernel+0x14d/0x15c
>>> [  292.430448] Code: c2 48 89 d0 c3 89 f9 0f 32 31 c9 48 c1 e2 20 89 c0
>>> 89 0e 48 09 c2 48 89 d0 c3 66 66 2e 0f 1f 84 00 00 00 00 00 89 f0 89 f9
>>> 0f 30 <31> c0 c3 0f 1f 80 00 00 00 00 89 f9 0f 33 48 c1 e2 20 89 c0 48
>>> [  292.430579] Clocksource tsc unstable (delta = -289118137838 ns)
>>>
>>>
>>> On 15/02/2017 20:39, Peter Hudec wrote:
>>>> Hi,
>>>>
>>>> I did already, but did not find anything suspicious, see attached logs
>>>> and the spice screenshot.
>>>>
>>>> Actually the VM is booting, but it is stuck in some bad state.
>>>> When migrating, the migration is successful, but the VM is not accessible
>>>> /even on network/
>>>>
>>>> Right now I found one VM, which is working well.
>>>>
>>>> In logs look for diplci01 at 2017-02-15 20:23:00,420, the VM ID is
>>>> 7ddf349b-fb9a-44f4-9e88-73e84625a44e
>>>>
>>>>       thanks
>>>>               Peter
>>>>
>>>> On 15/02/2017 19:40, Nir Soffer wrote:
>>>>> On Wed, Feb 15, 2017 at 8:11 PM, Peter Hudec <phudec at cnc.sk> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I'm preparing to migrate from 3.5 to 3.6
>>>>>> The first step is the CentOS6 -> CentOS7 for hosts.
>>>>>>
>>>>>> setup:
>>>>>>   - 3x hosts /dipovirt01, dipovirt02, dipovirt03/
>>>>>>   - 1x hosted engine /on all 3 hosts/
>>>>>>
>>>>>> The upgrade of the first 2 hosts was OK, all VMs are running OK.
>>>>>> When I upgraded the 3rd host /dipovirt03/, some VMs were not able to run
>>>>>> or boot on this host. I tried a full reinstall of the host, but
>>>>>> with the same result.
>>>>>>
>>>>>> In case of migration the VM will stop running after a while.
>>>>>> In case of booting the VM will not boot, I only see 'Loading kernel ...'
>>>>>>
>>>>>> Almost all VMs are Debian 8 with guest tools, some CentOS 6/7
>>>>>>
>>>>>> The hosts were OK with CentOS6.
>>>>>>
>>>>>>
>>>>>> Where should I start to investigate?
>>>>>
>>>>> Sharing vdsm logs showing the failed attempts to run or migrate
>>>>> a vm would be a good start.
>>>>>
>>>>>>
>>>>>>         best regards
>>>>>>                 Peter
>>>>>>
>>>>>>
>>>>>> --
>>>>>> *Peter Hudec*
>>>>>> Infraštruktúrny architekt
>>>>>> phudec at cnc.sk <mailto:phudec at cnc.sk>
>>>>>>
>>>>>> *CNC, a.s.*
>>>>>> Borská 6, 841 04 Bratislava
>>>>>> Recepcia: +421 2 35 000 100
>>>>>>
>>>>>> Mobil:+421 905 997 203
>>>>>> *www.cnc.sk* <http:///www.cnc.sk>
>>>>>>
>>>>>> _______________________________________________
>>>>>> Users mailing list
>>>>>> Users at ovirt.org
>>>>>> http://lists.ovirt.org/mailman/listinfo/users
>>>>
>>>>
>>>
>>>
>>>
> 
> 

-- 
*Peter Hudec*
Infraštruktúrny architekt
phudec at cnc.sk <mailto:phudec at cnc.sk>

*CNC, a.s.*
Borská 6, 841 04 Bratislava
Recepcia: +421 2  35 000 100

Mobil:+421 905 997 203
*www.cnc.sk* <http:///www.cnc.sk>