<html><head><meta http-equiv="Content-Type" content="text/html; charset=us-ascii"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><br class=""><div><blockquote type="cite" class=""><div class="">On 9 Feb 2017, at 16:16, Ondrej Svoboda <<a href="mailto:osvoboda@redhat.com" class="">osvoboda@redhat.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" class=""><div class="">Do you mean <a href="https://github.com/lago-project/lago/pull/398" class="">https://github.com/lago-project/lago/pull/398</a>, which has been merged for over a month?<br class=""><br class=""></div>The second sentence in the PR (below) is contradicted by newer, unrecognized CPUs, such as Skylake.<br class=""></div></div></blockquote><div><br class=""></div>How/why? Westmere should have been selected in that case.</div><div><br class=""><blockquote type="cite" class=""><div class=""><div dir="ltr" class=""><div class=""><div class=""><div class=""><br class="">"This patch fixes the problems by selecting a minimum reasonable CPU model for the given hardware platform. 
Westmere is selected unless older or non-Intel hardware is used."</div></div></div></div><div class="gmail_extra"><br class=""><div class="gmail_quote">On Thu, Feb 9, 2017 at 4:07 PM, Michal Skrivanek <span dir="ltr" class=""><<a href="mailto:mskrivan@redhat.com" target="_blank" class="">mskrivan@redhat.com</a>></span> wrote:<br class=""><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="auto" class=""><div class=""></div><div class="">What happened to Milan's PR from a while ago addressing this exact situation?</div><div class=""><div class="h5"><div class=""><br class="">On 08 Feb 2017, at 16:04, Ondrej Svoboda <<a href="mailto:osvoboda@redhat.com" target="_blank" class="">osvoboda@redhat.com</a>> wrote:<br class=""><br class=""></div><blockquote type="cite" class=""><div class=""><div dir="ltr" class=""><div class=""><div class="">In my case, simply adding Skylake-Client as a supported CPU family did the trick: <a href="https://github.com/lago-project/lago/pull/448" target="_blank" class="">https://github.com/lago-<wbr class="">project/lago/pull/448</a><br class=""><br class=""></div>I wonder if Westmere is a good fallback -- it works for you on Broadwell, right?<br class=""></div></div><div class="gmail_extra"><br class=""><div class="gmail_quote">On Wed, Feb 8, 2017 at 1:58 PM, Nadav Goldin <span dir="ltr" class=""><<a href="mailto:ngoldin@redhat.com" target="_blank" class="">ngoldin@redhat.com</a>></span> wrote:<br class=""><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">I would first try testing it without OST, because in OST it will pick<br class="">
the CPU via the cluster family (which is controlled in virt.py). You<br class="">
can try specifying the 'cpu_model' in the init file, skipping the 'cpu<br class="">
family' logic, something like:<br class="">
<br class="">
> cat LagoInitFile<br class="">
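domains:<br class="">
&nbsp;&nbsp;vm-el73:<br class="">
&nbsp;&nbsp;&nbsp;&nbsp;memory: 2048<br class="">
&nbsp;&nbsp;&nbsp;&nbsp;service_provider: systemd<br class="">
&nbsp;&nbsp;&nbsp;&nbsp;cpu_model: Broadwell<br class="">
&nbsp;&nbsp;&nbsp;&nbsp;nics:<br class="">
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;- net: lago<br class="">
&nbsp;&nbsp;&nbsp;&nbsp;disks:<br class="">
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;- template_name: el7.3-base<br class="">
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;type: template<br class="">
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;name: root<br class="">
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;dev: vda<br class="">
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;format: qcow2<br class="">
nets:<br class="">
&nbsp;&nbsp;lago:<br class="">
&nbsp;&nbsp;&nbsp;&nbsp;type: nat<br class="">
&nbsp;&nbsp;&nbsp;&nbsp;dhcp:<br class="">
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;start: 100<br class="">
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;end: 254<br class="">
&nbsp;&nbsp;&nbsp;&nbsp;management: true<br class="">
&nbsp;&nbsp;&nbsp;&nbsp;dns_domain_name: lago.local<br class="">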
<br class="">
> lago init && lago start<br class="">
<br class="">
Then install Lago again inside the VM, copy the same init file, and check<br class="">
whether it works for you with different cpu_model values; that would<br class="">
give us a hint about how to solve this. The 'cpu_model' basically translates<br class="">
to this XML definition in libvirt:<br class="">
&lt;cpu mode='custom' match='exact'&gt;<br class="">
&nbsp;&nbsp;&lt;model fallback='allow'&gt;Broadwell&lt;/model&gt;<br class="">
&nbsp;&nbsp;&lt;topology sockets='2' cores='1' threads='1'/&gt;<br class="">
&nbsp;&nbsp;&lt;feature policy='optional' name='vmx'/&gt;<br class="">
&nbsp;&nbsp;&lt;feature policy='optional' name='svm'/&gt;<br class="">
&lt;/cpu&gt;<br class="">
<br class="">
I also tried manually changing it to host-passthrough, but it still failed<br class="">
with the same error. The problem is that the 'kvm_put_msrs: Assertion `ret<br class="">
== n' failed.' error gives no indication of where it failed (or whether<br class="">
the CPU is missing a flag). Maybe there is a way to debug this at the<br class="">
qemu/kvm level? I'm not sure.<br class="">
<div class="m_-909161191278510801HOEnZb"><div class="m_-909161191278510801h5"><br class="">
<br class="">
<br class="">
<br class="">
<br class="">
<br class="">
On Wed, Feb 8, 2017 at 1:18 PM, Ondrej Svoboda <<a href="mailto:osvoboda@redhat.com" target="_blank" class="">osvoboda@redhat.com</a>> wrote:<br class="">
> It is a Skylake-H, and I can see it is not mentioned in lago/virt.py.<br class="">
><br class="">
> I guess I'll step through the code (as well as other places discovered by<br class="">
> 'git grep cpu') and see if I could solve this by adding the Skylake family<br class="">
> to _CPU_FAMILIES.<br class="">
><br class="">
> Do you have other pointers?<br class="">
><br class="">
> Thanks,<br class="">
> Ondra<br class="">
><br class="">
> On Tue, Feb 7, 2017 at 10:40 PM, Nadav Goldin <<a href="mailto:ngoldin@redhat.com" target="_blank" class="">ngoldin@redhat.com</a>> wrote:<br class="">
>><br class="">
>> What is the host CPU you are using?<br class="">
>> I came across the same error a few days ago, but without running OST. I<br class="">
>> tried running with Lago:<br class="">
>> fc24 host -> el7 vm -> el7 vm.<br class="">
>><br class="">
>> I have a slight suspicion that it is related to the CPU model we<br class="">
>> configure in libvirt. I tried a few combinations<br class="">
>> (host-passthrough, pinning down the CPU model), but it<br class="">
>> always failed with the same error:<br class="">
>> kvm_put_msrs: Assertion `ret == n' failed.<br class="">
>><br class="">
>> My CPU is Broadwell btw.<br class="">
>><br class="">
>><br class="">
>> Milan, any ideas? Do you think it might be related?<br class="">
>><br class="">
>> Nadav.<br class="">
>><br class="">
>><br class="">
>><br class="">
>> On Tue, Feb 7, 2017 at 11:14 PM, Ondrej Svoboda <<a href="mailto:osvoboda@redhat.com" target="_blank" class="">osvoboda@redhat.com</a>><br class="">
>> wrote:<br class="">
>> > Yes, I stated that in my message.<br class="">
>> ><br class="">
>> > root@osvoboda-t460p /home/src/ovirt-system-tests (git)-[master] # cat<br class="">
>> > /sys/module/kvm_intel/paramete<wbr class="">rs/nested<br class="">
>> > :(<br class="">
>> > Y<br class="">
>> ><br class="">
>> > On Tue, Feb 7, 2017 at 1:39 PM, Eyal Edri <<a href="mailto:eedri@redhat.com" target="_blank" class="">eedri@redhat.com</a>> wrote:<br class="">
>> >><br class="">
>> >> Did you follow the instructions in [1]?<br class="">
>> >><br class="">
>> >> Specifically, did you verify that 'cat /sys/module/kvm_intel/parameters/nested'<br class="">
>> >> gives you 'Y'?<br class="">
>> >><br class="">
>> >> [1]<br class="">
>> >><br class="">
>> >> <a href="http://ovirt-system-tests.readthedocs.io/en/latest/docs/general/installation.html" rel="noreferrer" target="_blank" class="">http://ovirt-system-tests.read<wbr class="">thedocs.io/en/latest/docs/gene<wbr class="">ral/installation.html</a><br class="">
>> >><br class="">
>> >> On Tue, Feb 7, 2017 at 2:29 PM, Ondrej Svoboda <<a href="mailto:osvoboda@redhat.com" target="_blank" class="">osvoboda@redhat.com</a>><br class="">
>> >> wrote:<br class="">
>> >>><br class="">
>> >>> Hi everyone,<br class="">
>> >>><br class="">
>> >>> Even though I have nested virtualization enabled on the Arch Linux<br class="">
>> >>> system that I use to run OST, vm_run is the first test to fail in<br class="">
>> >>> 004_basic_sanity (followed by snapshots_merge and suspend_resume_vm).<br class="">
>> >>><br class="">
>> >>> Can you point me to what I might be missing? I believe I get the same<br class="">
>> >>> failure even on Fedora.<br class="">
>> >>><br class="">
>> >>> This is what host0's CPU capabilities look like (vmx is there):<br class="">
>> >>> [root@lago-basic-suite-master-<wbr class="">host0 ~]# cat /proc/cpuinfo<br class="">
>> >>> processor : 0<br class="">
>> >>> vendor_id : GenuineIntel<br class="">
>> >>> cpu family : 6<br class="">
>> >>> model : 44<br class="">
>> >>> model name : Westmere E56xx/L56xx/X56xx (Nehalem-C)<br class="">
>> >>> stepping : 1<br class="">
>> >>> microcode : 0x1<br class="">
>> >>> cpu MHz : 2711.988<br class="">
>> >>> cache size : 16384 KB<br class="">
>> >>> physical id : 0<br class="">
>> >>> siblings : 1<br class="">
>> >>> core id : 0<br class="">
>> >>> cpu cores : 1<br class="">
>> >>> apicid : 0<br class="">
>> >>> initial apicid : 0<br class="">
>> >>> fpu : yes<br class="">
>> >>> fpu_exception : yes<br class="">
>> >>> cpuid level : 11<br class="">
>> >>> wp : yes<br class="">
>> >>> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge<br class="">
>> >>> mca<br class="">
>> >>> cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx lm constant_tsc<br class="">
>> >>> rep_good<br class="">
>> >>> nopl xtopology pni pclmulqdq vmx ssse3 cx16 sse4_1 sse4_2 x2apic<br class="">
>> >>> popcnt aes<br class="">
>> >>> hypervisor lahf_lm arat tpr_shadow vnmi flexpriority ept vpid<br class="">
>> >>> bogomips : 5423.97<br class="">
>> >>> clflush size : 64<br class="">
>> >>> cache_alignment : 64<br class="">
>> >>> address sizes : 40 bits physical, 48 bits virtual<br class="">
>> >>> power management:<br class="">
>> >>><br class="">
>> >>> journalctl -b on host0 shows that libvirt complains about NUMA<br class="">
>> >>> configuration:<br class="">
>> >>><br class="">
>> >>> Feb 07 06:33:23 lago-basic-suite-master-host0 libvirtd[12888]: libvirt<br class="">
>> >>> version: 2.0.0, package: 10.el7_3.4 (CentOS BuildSystem<br class="">
>> >>> <<a href="http://bugs.centos.org/" rel="noreferrer" target="_blank" class="">http://bugs.centos.org</a>>, 2017-01-17-23:37:48, <a href="http://c1bm.rdu2.centos.org/" rel="noreferrer" target="_blank" class="">c1bm.rdu2.centos.org</a>)<br class="">
>> >>> Feb 07 06:33:23 lago-basic-suite-master-host0 kernel: ovirtmgmt: port<br class="">
>> >>> 2(vnet0) entered disabled state<br class="">
>> >>> Feb 07 06:33:23 lago-basic-suite-master-host0 kernel: device vnet0<br class="">
>> >>> left<br class="">
>> >>> promiscuous mode<br class="">
>> >>> Feb 07 06:33:23 lago-basic-suite-master-host0 kernel: ovirtmgmt: port<br class="">
>> >>> 2(vnet0) entered disabled state<br class="">
>> >>> Feb 07 06:33:23 lago-basic-suite-master-host0 libvirtd[12888]:<br class="">
>> >>> hostname:<br class="">
>> >>> lago-basic-suite-master-host0.<wbr class="">lago.local<br class="">
>> >>> Feb 07 06:33:23 lago-basic-suite-master-host0 libvirtd[12888]: Unable<br class="">
>> >>> to<br class="">
>> >>> read from monitor: Connection reset by peer<br class="">
>> >>> Feb 07 06:33:23 lago-basic-suite-master-host0 libvirtd[12888]:<br class="">
>> >>> internal<br class="">
>> >>> error: qemu unexpectedly closed the monitor:<br class="">
>> >>> 2017-02-07T11:33:23.058571Z<br class="">
>> >>> qemu-kvm: warning: CPU(s) not present in any NUMA nodes: 1 2 3 4 5 6 7<br class="">
>> >>> 8 9<br class="">
>> >>> 10 11 12 13 14 15<br class="">
>> >>><br class="">
>> >>> 2017-02-07T11:33:23.058826Z qemu-kvm: warning: All CPU(s) up to<br class="">
>> >>> maxcpus<br class="">
>> >>> should be described in NUMA config<br class="">
>> >>><br class="">
>> >>> qemu-kvm:<br class="">
>> >>> /builddir/build/BUILD/qemu-2.6<wbr class="">.0/target-i386/kvm.c:1736: kvm_put_msrs:<br class="">
>> >>> Assertion `ret == n' failed.<br class="">
>> >>> Feb 07 06:33:23 lago-basic-suite-master-host0 NetworkManager[657]:<br class="">
>> >>> <info><br class="">
>> >>> [1486467203.1025] device (vnet0): state change: disconnected -><br class="">
>> >>> unmanaged<br class="">
>> >>> (reason 'unmanaged') [30 10 3]<br class="">
>> >>> Feb 07 06:33:23 lago-basic-suite-master-host0 kvm[22059]: 0 guests now<br class="">
>> >>> active<br class="">
>> >>> Feb 07 06:33:23 lago-basic-suite-master-host0 systemd-machined[22044]:<br class="">
>> >>> Machine qemu-1-vm0 terminated.<br class="">
>> >>><br class="">
>> >>> Thanks,<br class="">
>> >>> Ondra<br class="">
>> >>><br class="">
>> >>> ______________________________<wbr class="">_________________<br class="">
>> >>> Devel mailing list<br class="">
>> >>> <a href="mailto:Devel@ovirt.org" target="_blank" class="">Devel@ovirt.org</a><br class="">
>> >>> <a href="http://lists.ovirt.org/mailman/listinfo/devel" rel="noreferrer" target="_blank" class="">http://lists.ovirt.org/mailman<wbr class="">/listinfo/devel</a><br class="">
>> >><br class="">
>> >><br class="">
>> >><br class="">
>> >><br class="">
>> >> --<br class="">
>> >> Eyal Edri<br class="">
>> >> Associate Manager<br class="">
>> >> RHV DevOps<br class="">
>> >> EMEA ENG Virtualization R&D<br class="">
>> >> Red Hat Israel<br class="">
>> >><br class="">
>> >> phone: <a href="tel:%2B972-9-7692018" value="+97297692018" target="_blank" class="">+972-9-7692018</a><br class="">
>> >> irc: eedri (on #tlv #rhev-dev #rhev-integ)<br class="">
>> ><br class="">
>> ><br class="">
>> ><br class="">
><br class="">
><br class="">
</div></div></blockquote></div><br class=""></div>
</div></blockquote></div></div></div>
<br class="">______________________________<wbr class="">_________________<br class="">
Devel mailing list<br class="">
<a href="mailto:Devel@ovirt.org" class="">Devel@ovirt.org</a><br class="">
<a href="http://lists.ovirt.org/mailman/listinfo/devel" rel="noreferrer" target="_blank" class="">http://lists.ovirt.org/<wbr class="">mailman/listinfo/devel</a><br class=""></blockquote></div><br class=""></div>
</div></blockquote></div><br class=""></body></html>