On 9 Feb 2017, at 16:16, Ondrej Svoboda <osvoboda(a)redhat.com>
wrote:
Do you mean https://github.com/lago-project/lago/pull/398, which has been
merged for over a month?
The second sentence in the PR description (below) is contradicted by newer,
unrecognized CPUs such as Skylake.
"This patch fixes the problems by selecting a minimum reasonable CPU model for the
given hardware platform. Westmere is selected unless older or non-Intel hardware is
used."
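
To make the expectation concrete, here is a minimal sketch of the selection
rule that sentence describes. It is not Lago's actual code from lago/virt.py:
the function name pick_cpu_model, the set of "older" models, and the choice to
pass non-Intel models through unchanged are all assumptions for illustration.
The point is only that a model missing from the table, such as
Skylake-Client, should still end up on the Westmere fallback.

```python
# Hypothetical sketch, not Lago's implementation. The names below are
# assumptions made for illustration only.

# libvirt model names assumed here to predate Westmere.
OLDER_THAN_WESTMERE = {"Conroe", "Penryn", "Nehalem"}


def pick_cpu_model(vendor, host_model, fallback="Westmere"):
    """Pick a guest CPU model following the rule quoted from the PR."""
    if vendor != "Intel":
        # Assumption: non-Intel hosts keep their own model.
        return host_model
    if host_model in OLDER_THAN_WESTMERE:
        # Older Intel hardware cannot offer Westmere, so keep its model.
        return host_model
    # Everything else, including models the table does not know about,
    # gets the minimum reasonable model.
    return fallback
```

With this rule, pick_cpu_model("Intel", "Skylake-Client") falls through to
the fallback instead of failing on an unknown name.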
On Thu, Feb 9, 2017 at 4:07 PM, Michal Skrivanek <mskrivan(a)redhat.com> wrote:
What happened to Milan's PR from a while ago addressing this exact situation?
On 08 Feb 2017, at 16:04, Ondrej Svoboda <osvoboda(a)redhat.com> wrote:
> In my case, simply adding Skylake-Client as a supported CPU family did the trick:
> https://github.com/lago-project/lago/pull/448
>
> I wonder if Westmere is a good fallback -- it works for you on Broadwell, right?
>
> On Wed, Feb 8, 2017 at 1:58 PM, Nadav Goldin <ngoldin(a)redhat.com> wrote:
> I would first try testing it without OST, because in OST it will pick
> the CPU via the cluster family (which is controlled in virt.py). You
> can try specifying 'cpu_model' in the init file, which skips the 'cpu
> family' logic, something like:
>
> > cat LagoInitFile
> domains:
>   vm-el73:
>     memory: 2048
>     service_provider: systemd
>     cpu_model: Broadwell
>     nics:
>       - net: lago
>     disks:
>       - template_name: el7.3-base
>         type: template
>         name: root
>         dev: vda
>         format: qcow2
> nets:
>   lago:
>     type: nat
>     dhcp:
>       start: 100
>       end: 254
>     management: true
>     dns_domain_name: lago.local
>
> > lago init && lago start
>
> Then install lago again in the VM, copy the same init file, and check
> whether it works for you with different values of cpu_model; that would
> give us a hint on how to solve this. The 'cpu_model' basically translates
> to this XML definition in libvirt:
> <cpu mode='custom' match='exact'>
>   <model fallback='allow'>Broadwell</model>
>   <topology sockets='2' cores='1' threads='1'/>
>   <feature policy='optional' name='vmx'/>
>   <feature policy='optional' name='svm'/>
> </cpu>
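
For anyone who wants to try several cpu_model values quickly, a rough sketch
of generating that same element with Python's standard library follows. The
function name build_cpu_xml and its default topology are assumptions taken
from the snippet above, not from Lago's code, which may build the XML
differently.

```python
import xml.etree.ElementTree as ET


def build_cpu_xml(cpu_model, sockets=2, cores=1, threads=1):
    """Return a libvirt <cpu> element (as a string) for the given model.

    Hypothetical helper mirroring the XML quoted above.
    """
    cpu = ET.Element("cpu", mode="custom", match="exact")
    model = ET.SubElement(cpu, "model", fallback="allow")
    model.text = cpu_model
    ET.SubElement(
        cpu,
        "topology",
        sockets=str(sockets),
        cores=str(cores),
        threads=str(threads),
    )
    # Both nested-virtualization flags are optional, so the same
    # definition can be used on Intel (vmx) and AMD (svm) hosts.
    for flag in ("vmx", "svm"):
        ET.SubElement(cpu, "feature", policy="optional", name=flag)
    return ET.tostring(cpu, encoding="unicode")


if __name__ == "__main__":
    for name in ("Westmere", "Broadwell", "Skylake-Client"):
        print(build_cpu_xml(name))
```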
>
> I also tried manually editing it to host-passthrough, but it still failed
> with the same error. The thing is that the 'kvm_put_msrs: Assertion `ret
> == n' failed.' error doesn't give any indication of where it failed (or
> whether the CPU is missing a flag). Maybe there is a way to debug this at
> the qemu/kvm level? I'm not sure.
>
> On Wed, Feb 8, 2017 at 1:18 PM, Ondrej Svoboda <osvoboda(a)redhat.com> wrote:
> > It is a Skylake-H, and I can see it is not mentioned in lago/virt.py.
> >
> > I guess I'll step through the code (as well as other places discovered by
> > 'git grep cpu') and see if I could solve this by adding the Skylake family
> > to _CPU_FAMILIES.
> >
> > Do you have other pointers?
> >
> > Thanks,
> > Ondra
> >
> > On Tue, Feb 7, 2017 at 10:40 PM, Nadav Goldin <ngoldin(a)redhat.com> wrote:
> >>
> >> What is the host CPU you are using?
> >> I came across the same error a few days ago, but without running OST; I
> >> tried running with Lago:
> >> fc24 host -> el7 vm -> el7 vm.
> >>
> >> I have a slight suspicion that it is related to the CPU model we
> >> configure in libvirt. I tried a few
> >> combinations (host-passthrough, pinning down the CPU model), but it
> >> always failed with the same error:
> >> kvm_put_msrs: Assertion `ret == n' failed.
> >>
> >> My CPU is Broadwell btw.
> >>
> >>
> >> Milan, any ideas? Do you think it might be related?
> >>
> >> Nadav.
> >>
> >>
> >>
> >> On Tue, Feb 7, 2017 at 11:14 PM, Ondrej Svoboda <osvoboda(a)redhat.com>
> >> wrote:
> >> > Yes, I stated that in my message.
> >> >
> >> > root@osvoboda-t460p /home/src/ovirt-system-tests (git)-[master] # cat /sys/module/kvm_intel/parameters/nested
> >> > :(
> >> > Y
> >> >
> >> > On Tue, Feb 7, 2017 at 1:39 PM, Eyal Edri <eedri(a)redhat.com> wrote:
> >> >>
> >> >> Did you follow the instructions on [1] ?
> >> >>
> >> >> Specifically, verifying 'cat /sys/module/kvm_intel/parameters/nested'
> >> >> gives you 'Y'.
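
The same check can be scripted before starting a run. This is a minimal
sketch, assuming the kvm_intel path quoted above; the function name
nested_enabled is made up for this example, and accepting '1' in addition to
'Y' is an assumption for kernels that report the parameter numerically.

```python
def nested_enabled(path="/sys/module/kvm_intel/parameters/nested"):
    """Return True if the KVM module reports nested virtualization as on."""
    try:
        with open(path) as param:
            value = param.read().strip()
    except OSError:
        # Module not loaded, or not an Intel host: treat as disabled.
        return False
    # 'Y' is what the thread reports; '1' is accepted as an assumption.
    return value in ("Y", "1")


if __name__ == "__main__":
    print("nested virtualization enabled:", nested_enabled())
```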
> >> >>
> >> >> [1] http://ovirt-system-tests.readthedocs.io/en/latest/docs/general/installat...
> >> >>
> >> >> On Tue, Feb 7, 2017 at 2:29 PM, Ondrej Svoboda <osvoboda(a)redhat.com>
> >> >> wrote:
> >> >>>
> >> >>> Hi everyone,
> >> >>>
> >> >>> Even though I have nested virtualization enabled in my Arch Linux
> >> >>> system which I use to run OST, vm_run is the first test to fail in
> >> >>> 004_basic_sanity (followed by snapshots_merge and suspend_resume_vm).
> >> >>>
> >> >>> Can you point me to what I might be missing? I believe I get the same
> >> >>> failure even on Fedora.
> >> >>>
> >> >>> This is what host0's CPU capabilities look like (vmx is there):
> >> >>> [root@lago-basic-suite-master-host0 ~]# cat /proc/cpuinfo
> >> >>> processor : 0
> >> >>> vendor_id : GenuineIntel
> >> >>> cpu family : 6
> >> >>> model : 44
> >> >>> model name : Westmere E56xx/L56xx/X56xx (Nehalem-C)
> >> >>> stepping : 1
> >> >>> microcode : 0x1
> >> >>> cpu MHz : 2711.988
> >> >>> cache size : 16384 KB
> >> >>> physical id : 0
> >> >>> siblings : 1
> >> >>> core id : 0
> >> >>> cpu cores : 1
> >> >>> apicid : 0
> >> >>> initial apicid : 0
> >> >>> fpu : yes
> >> >>> fpu_exception : yes
> >> >>> cpuid level : 11
> >> >>> wp : yes
> >> >>> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
> >> >>> cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx lm constant_tsc
> >> >>> rep_good nopl xtopology pni pclmulqdq vmx ssse3 cx16 sse4_1 sse4_2
> >> >>> x2apic popcnt aes hypervisor lahf_lm arat tpr_shadow vnmi
> >> >>> flexpriority ept vpid
> >> >>> bogomips : 5423.97
> >> >>> clflush size : 64
> >> >>> cache_alignment : 64
> >> >>> address sizes : 40 bits physical, 48 bits virtual
> >> >>> power management:
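
Rather than eyeballing the flags line, the check for vmx can be done with a
few lines of Python. This is only a sketch: the function name cpu_flags is
hypothetical, and it reads the first "flags" entry it finds, which assumes
all CPUs in the guest report the same flags.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of CPU flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    # No flags line found.
    return set()


if __name__ == "__main__":
    with open("/proc/cpuinfo") as cpuinfo:
        flags = cpu_flags(cpuinfo.read())
    print("vmx" in flags or "svm" in flags)
```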
> >> >>>
> >> >>> journalctl -b on host0 shows that libvirt complains about NUMA
> >> >>> configuration:
> >> >>>
> >> >>> Feb 07 06:33:23 lago-basic-suite-master-host0 libvirtd[12888]: libvirt version: 2.0.0, package: 10.el7_3.4 (CentOS BuildSystem <http://bugs.centos.org>, 2017-01-17-23:37:48, c1bm.rdu2.centos.org)
> >> >>> Feb 07 06:33:23 lago-basic-suite-master-host0 kernel: ovirtmgmt: port 2(vnet0) entered disabled state
> >> >>> Feb 07 06:33:23 lago-basic-suite-master-host0 kernel: device vnet0 left promiscuous mode
> >> >>> Feb 07 06:33:23 lago-basic-suite-master-host0 kernel: ovirtmgmt: port 2(vnet0) entered disabled state
> >> >>> Feb 07 06:33:23 lago-basic-suite-master-host0 libvirtd[12888]: hostname: lago-basic-suite-master-host0.lago.local
> >> >>> Feb 07 06:33:23 lago-basic-suite-master-host0 libvirtd[12888]: Unable to read from monitor: Connection reset by peer
> >> >>> Feb 07 06:33:23 lago-basic-suite-master-host0 libvirtd[12888]: internal error: qemu unexpectedly closed the monitor: 2017-02-07T11:33:23.058571Z qemu-kvm: warning: CPU(s) not present in any NUMA nodes: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
> >> >>>
> >> >>> 2017-02-07T11:33:23.058826Z qemu-kvm: warning: All CPU(s) up to maxcpus should be described in NUMA config
> >> >>>
> >> >>> qemu-kvm: /builddir/build/BUILD/qemu-2.6.0/target-i386/kvm.c:1736: kvm_put_msrs: Assertion `ret == n' failed.
> >> >>> Feb 07 06:33:23 lago-basic-suite-master-host0 NetworkManager[657]: <info> [1486467203.1025] device (vnet0): state change: disconnected -> unmanaged (reason 'unmanaged') [30 10 3]
> >> >>> Feb 07 06:33:23 lago-basic-suite-master-host0 kvm[22059]: 0 guests now active
> >> >>> Feb 07 06:33:23 lago-basic-suite-master-host0 systemd-machined[22044]: Machine qemu-1-vm0 terminated.
> >> >>>
> >> >>> Thanks,
> >> >>> Ondra
> >> >>>
> >> >>> _______________________________________________
> >> >>> Devel mailing list
> >> >>> Devel(a)ovirt.org
> >> >>> http://lists.ovirt.org/mailman/listinfo/devel
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> --
> >> >> Eyal Edri
> >> >> Associate Manager
> >> >> RHV DevOps
> >> >> EMEA ENG Virtualization R&D
> >> >> Red Hat Israel
> >> >>
> >> >> phone: +972-9-7692018
> >> >> irc: eedri (on #tlv #rhev-dev #rhev-integ)
> >> >
> >> >
> >> >
> >
> >
>