<div dir="ltr"><div><div>In my case, simply adding Skylake-Client as a supported CPU family did the trick: <a href="https://github.com/lago-project/lago/pull/448">https://github.com/lago-project/lago/pull/448</a><br><br></div>I wonder if Westmere is a good fallback -- it works for you on Broadwell, right?<br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Feb 8, 2017 at 1:58 PM, Nadav Goldin <span dir="ltr"><<a href="mailto:ngoldin@redhat.com" target="_blank">ngoldin@redhat.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">I would first try testing it without OST, because in OST it will pick<br>
the CPU via the cluster family (which is controlled in virt.py). You<br>
can try specifying 'cpu_model' in the init file, which skips the 'cpu<br>
family' logic, something like:<br>
<br>
> cat LagoInitFile<br>
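domains:<br>
&nbsp;&nbsp;vm-el73:<br>
&nbsp;&nbsp;&nbsp;&nbsp;memory: 2048<br>
&nbsp;&nbsp;&nbsp;&nbsp;service_provider: systemd<br>
&nbsp;&nbsp;&nbsp;&nbsp;cpu_model: Broadwell<br>
&nbsp;&nbsp;&nbsp;&nbsp;nics:<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;- net: lago<br>
&nbsp;&nbsp;&nbsp;&nbsp;disks:<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;- template_name: el7.3-base<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;type: template<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;name: root<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;dev: vda<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;format: qcow2<br>
nets:<br>
&nbsp;&nbsp;lago:<br>
&nbsp;&nbsp;&nbsp;&nbsp;type: nat<br>
&nbsp;&nbsp;&nbsp;&nbsp;dhcp:<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;start: 100<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;end: 254<br>
&nbsp;&nbsp;&nbsp;&nbsp;management: true<br>
&nbsp;&nbsp;&nbsp;&nbsp;dns_domain_name: lago.local<br>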
<br>
> lago init && lago start<br>
<br>
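To try several cpu_model values without editing the file by hand, the init file above could be rendered from a template. This is only a minimal sketch: render_init_file is a made-up helper (not part of Lago), and the indentation in the template is an assumption about how the init file is nested.<br>

```python
# Hypothetical helper, not part of Lago: render the init file shown above
# for a given cpu_model. The nesting/indentation is an assumption.
INIT_TEMPLATE = """\
domains:
  vm-el73:
    memory: 2048
    service_provider: systemd
    cpu_model: {cpu_model}
    nics:
      - net: lago
    disks:
      - template_name: el7.3-base
        type: template
        name: root
        dev: vda
        format: qcow2
nets:
  lago:
    type: nat
    dhcp:
      start: 100
      end: 254
    management: true
    dns_domain_name: lago.local
"""


def render_init_file(cpu_model):
    """Return the init file text with the given libvirt CPU model filled in."""
    return INIT_TEMPLATE.format(cpu_model=cpu_model)
```

For example, writing render_init_file("Westmere") to LagoInitFile and then running lago init && lago start would repeat the test with a different model.<br>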
Then install lago again in the VM, copy the same init file, and check<br>
whether it works for you with different cpu_model values; that would<br>
give us a hint about how to solve this. The 'cpu_model' basically translates<br>
to this XML definition in libvirt:<br>
&lt;cpu mode='custom' match='exact'&gt;<br>
&nbsp;&nbsp;&lt;model fallback='allow'&gt;Broadwell&lt;/model&gt;<br>
&nbsp;&nbsp;&lt;topology sockets='2' cores='1' threads='1'/&gt;<br>
&nbsp;&nbsp;&lt;feature policy='optional' name='vmx'/&gt;<br>
&nbsp;&nbsp;&lt;feature policy='optional' name='svm'/&gt;<br>
&lt;/cpu&gt;<br>
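To make the mapping concrete, here is a rough sketch of how a cpu_model value could be turned into that element with Python's standard xml.etree.ElementTree. It is not the actual code in virt.py; build_cpu_xml and its default topology values are assumptions taken from the XML above.<br>

```python
import xml.etree.ElementTree as ET


def build_cpu_xml(cpu_model, sockets=2, cores=1, threads=1):
    """Build a libvirt <cpu> element like the one above (illustrative only)."""
    cpu = ET.Element("cpu", mode="custom", match="exact")
    model = ET.SubElement(cpu, "model", fallback="allow")
    model.text = cpu_model
    ET.SubElement(
        cpu,
        "topology",
        sockets=str(sockets),
        cores=str(cores),
        threads=str(threads),
    )
    # vmx (Intel) and svm (AMD) are requested as optional, as in the XML above.
    for flag in ("vmx", "svm"):
        ET.SubElement(cpu, "feature", policy="optional", name=flag)
    return ET.tostring(cpu, encoding="unicode")
```

Calling build_cpu_xml("Westmere") gives the same structure with only the model text changed, which is the one thing that varies between the combinations being tested.<br>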
<br>
I also tried manually changing it to host-passthrough, but it still failed<br>
with the same error. The problem is that the 'kvm_put_msrs: Assertion `ret<br>
== n' failed.' error gives no indication of where it failed (or whether<br>
the CPU is missing a flag). Maybe there is a way to debug this at the<br>
qemu/kvm level? I'm not sure.<br>
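Since the assertion does not name a flag, one thing that can at least be ruled out at each nesting level is whether vmx/svm is exposed and whether nesting is switched on. A minimal sketch, with made-up helper names, that parses /proc/cpuinfo-style text and the value of /sys/module/kvm_intel/parameters/nested; it would not explain the kvm_put_msrs assertion by itself:<br>

```python
def cpu_flags(cpuinfo_text):
    """Return the flags of the first CPU listed in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def has_hw_virt(cpuinfo_text):
    """True if vmx (Intel) or svm (AMD) is exposed to this machine."""
    return bool(cpu_flags(cpuinfo_text) & {"vmx", "svm"})


def nested_enabled(param_text):
    """Interpret the kvm_intel 'nested' parameter ('Y' or '1' means enabled)."""
    return param_text.strip() in ("Y", "1")
```

On a host or guest, has_hw_virt(open("/proc/cpuinfo").read()) is the quick check; running it in the fc24 host, the first el7 VM, and the nested el7 VM would show at which level the flag disappears, if it does.<br>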
<div class="HOEnZb"><div class="h5"><br>
<br>
<br>
<br>
<br>
<br>
On Wed, Feb 8, 2017 at 1:18 PM, Ondrej Svoboda <<a href="mailto:osvoboda@redhat.com">osvoboda@redhat.com</a>> wrote:<br>
> It is a Skylake-H, and I can see it is not mentioned in lago/virt.py.<br>
><br>
> I guess I'll step through the code (as well as other places discovered by<br>
> 'git grep cpu') and see if I could solve this by adding the Skylake family<br>
> to _CPU_FAMILIES.<br>
><br>
> Do you have other pointers?<br>
><br>
> Thanks,<br>
> Ondra<br>
><br>
> On Tue, Feb 7, 2017 at 10:40 PM, Nadav Goldin <<a href="mailto:ngoldin@redhat.com">ngoldin@redhat.com</a>> wrote:<br>
>><br>
>> What is the host CPU you are using?<br>
>> I came across the same error a few days ago, but without running OST; I<br>
>> tried running with Lago:<br>
>> fc24 host -> el7 vm -> el7 vm.<br>
>><br>
>> I have a slight suspicion that it is related to the CPU model we<br>
>> configure in libvirt. I tried a few<br>
>> combinations (host-passthrough, pinning down the CPU model), but it<br>
>> always failed on the same error:<br>
>> kvm_put_msrs: Assertion `ret == n' failed.<br>
>><br>
>> My CPU is Broadwell btw.<br>
>><br>
>><br>
>> Milan, any ideas? Do you think it might be related?<br>
>><br>
>> Nadav.<br>
>><br>
>><br>
>><br>
>> On Tue, Feb 7, 2017 at 11:14 PM, Ondrej Svoboda <<a href="mailto:osvoboda@redhat.com">osvoboda@redhat.com</a>><br>
>> wrote:<br>
>> > Yes, I stated that in my message.<br>
>> ><br>
>> > root@osvoboda-t460p /home/src/ovirt-system-tests (git)-[master] # cat<br>
>> > /sys/module/kvm_intel/<wbr>parameters/nested<br>
>> > :(<br>
>> > Y<br>
>> ><br>
>> > On Tue, Feb 7, 2017 at 1:39 PM, Eyal Edri <<a href="mailto:eedri@redhat.com">eedri@redhat.com</a>> wrote:<br>
>> >><br>
>> >> Did you follow the instructions on [1] ?<br>
>> >><br>
>> >> Specifically, verifying ' cat /sys/module/kvm_intel/<wbr>parameters/nested<br>
>> >> '<br>
>> >> gives you 'Y'.<br>
>> >><br>
>> >> [1]<br>
>> >><br>
>> >> <a href="http://ovirt-system-tests.readthedocs.io/en/latest/docs/general/installation.html" rel="noreferrer" target="_blank">http://ovirt-system-tests.<wbr>readthedocs.io/en/latest/docs/<wbr>general/installation.html</a><br>
>> >><br>
>> >> On Tue, Feb 7, 2017 at 2:29 PM, Ondrej Svoboda <<a href="mailto:osvoboda@redhat.com">osvoboda@redhat.com</a>><br>
>> >> wrote:<br>
>> >>><br>
>> >>> Hi everyone,<br>
>> >>><br>
>> >>> Even though I have nested virtualization enabled in my Arch Linux<br>
>> >>> system<br>
>> >>> which I use to run OST, vm_run is the first test to fail in<br>
>> >>> 004_basic_sanity<br>
>> >>> (followed by snapshots_merge and suspend_resume_vm).<br>
>> >>><br>
>> >>> Can you point me to what I might be missing? I believe I get the same<br>
>> >>> failure even on Fedora.<br>
>> >>><br>
>> >>> This is what host0's CPU capabilities look like (vmx is there):<br>
>> >>> [root@lago-basic-suite-master-<wbr>host0 ~]# cat /proc/cpuinfo<br>
>> >>> processor : 0<br>
>> >>> vendor_id : GenuineIntel<br>
>> >>> cpu family : 6<br>
>> >>> model : 44<br>
>> >>> model name : Westmere E56xx/L56xx/X56xx (Nehalem-C)<br>
>> >>> stepping : 1<br>
>> >>> microcode : 0x1<br>
>> >>> cpu MHz : 2711.988<br>
>> >>> cache size : 16384 KB<br>
>> >>> physical id : 0<br>
>> >>> siblings : 1<br>
>> >>> core id : 0<br>
>> >>> cpu cores : 1<br>
>> >>> apicid : 0<br>
>> >>> initial apicid : 0<br>
>> >>> fpu : yes<br>
>> >>> fpu_exception : yes<br>
>> >>> cpuid level : 11<br>
>> >>> wp : yes<br>
>> >>> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge<br>
>> >>> mca<br>
>> >>> cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx lm constant_tsc<br>
>> >>> rep_good<br>
>> >>> nopl xtopology pni pclmulqdq vmx ssse3 cx16 sse4_1 sse4_2 x2apic<br>
>> >>> popcnt aes<br>
>> >>> hypervisor lahf_lm arat tpr_shadow vnmi flexpriority ept vpid<br>
>> >>> bogomips : 5423.97<br>
>> >>> clflush size : 64<br>
>> >>> cache_alignment : 64<br>
>> >>> address sizes : 40 bits physical, 48 bits virtual<br>
>> >>> power management:<br>
>> >>><br>
>> >>> journalctl -b on host0 shows that libvirt complains about NUMA<br>
>> >>> configuration:<br>
>> >>><br>
>> >>> Feb 07 06:33:23 lago-basic-suite-master-host0 libvirtd[12888]: libvirt<br>
>> >>> version: 2.0.0, package: 10.el7_3.4 (CentOS BuildSystem<br>
>> >>> <<a href="http://bugs.centos.org" rel="noreferrer" target="_blank">http://bugs.centos.org</a>>, 2017-01-17-23:37:48, <a href="http://c1bm.rdu2.centos.org" rel="noreferrer" target="_blank">c1bm.rdu2.centos.org</a>)<br>
>> >>> Feb 07 06:33:23 lago-basic-suite-master-host0 kernel: ovirtmgmt: port<br>
>> >>> 2(vnet0) entered disabled state<br>
>> >>> Feb 07 06:33:23 lago-basic-suite-master-host0 kernel: device vnet0<br>
>> >>> left<br>
>> >>> promiscuous mode<br>
>> >>> Feb 07 06:33:23 lago-basic-suite-master-host0 kernel: ovirtmgmt: port<br>
>> >>> 2(vnet0) entered disabled state<br>
>> >>> Feb 07 06:33:23 lago-basic-suite-master-host0 libvirtd[12888]:<br>
>> >>> hostname:<br>
>> >>> lago-basic-suite-master-host0.<wbr>lago.local<br>
>> >>> Feb 07 06:33:23 lago-basic-suite-master-host0 libvirtd[12888]: Unable<br>
>> >>> to<br>
>> >>> read from monitor: Connection reset by peer<br>
>> >>> Feb 07 06:33:23 lago-basic-suite-master-host0 libvirtd[12888]:<br>
>> >>> internal<br>
>> >>> error: qemu unexpectedly closed the monitor:<br>
>> >>> 2017-02-07T11:33:23.058571Z<br>
>> >>> qemu-kvm: warning: CPU(s) not present in any NUMA nodes: 1 2 3 4 5 6 7<br>
>> >>> 8 9<br>
>> >>> 10 11 12 13 14 15<br>
>> >>><br>
>> >>> 2017-02-07T11:33:23.058826Z qemu-kvm: warning: All CPU(s) up to<br>
>> >>> maxcpus<br>
>> >>> should be described in NUMA config<br>
>> >>><br>
>> >>> qemu-kvm:<br>
>> >>> /builddir/build/BUILD/qemu-2.<wbr>6.0/target-i386/kvm.c:1736: kvm_put_msrs:<br>
>> >>> Assertion `ret == n' failed.<br>
>> >>> Feb 07 06:33:23 lago-basic-suite-master-host0 NetworkManager[657]:<br>
>> >>> <info><br>
>> >>> [1486467203.1025] device (vnet0): state change: disconnected -><br>
>> >>> unmanaged<br>
>> >>> (reason 'unmanaged') [30 10 3]<br>
>> >>> Feb 07 06:33:23 lago-basic-suite-master-host0 kvm[22059]: 0 guests now<br>
>> >>> active<br>
>> >>> Feb 07 06:33:23 lago-basic-suite-master-host0 systemd-machined[22044]:<br>
>> >>> Machine qemu-1-vm0 terminated.<br>
>> >>><br>
>> >>> Thanks,<br>
>> >>> Ondra<br>
>> >>><br>
>> >>> ______________________________<wbr>_________________<br>
>> >>> Devel mailing list<br>
>> >>> <a href="mailto:Devel@ovirt.org">Devel@ovirt.org</a><br>
>> >>> <a href="http://lists.ovirt.org/mailman/listinfo/devel" rel="noreferrer" target="_blank">http://lists.ovirt.org/<wbr>mailman/listinfo/devel</a><br>
>> >><br>
>> >><br>
>> >><br>
>> >><br>
>> >> --<br>
>> >> Eyal Edri<br>
>> >> Associate Manager<br>
>> >> RHV DevOps<br>
>> >> EMEA ENG Virtualization R&D<br>
>> >> Red Hat Israel<br>
>> >><br>
>> >> phone: <a href="tel:%2B972-9-7692018" value="+97297692018">+972-9-7692018</a><br>
>> >> irc: eedri (on #tlv #rhev-dev #rhev-integ)<br>
>> ><br>
>> ><br>
>> ><br>
><br>
><br>
</div></div></blockquote></div><br></div>