I would first try testing it without OST, because under OST the CPU is
picked via the cluster family (which is controlled in virt.py). You
can try specifying 'cpu_model' in the init file, which skips the 'cpu
family' logic, something like:

cat LagoInitFile

domains:
  vm-el73:
    memory: 2048
    service_provider: systemd
    cpu_model: Broadwell
    nics:
      - net: lago
    disks:
      - template_name: el7.3-base
        type: template
        name: root
        dev: vda
        format: qcow2

nets:
  lago:
    type: nat
    dhcp:
      start: 100
      end: 254
    management: true
    dns_domain_name: lago.local

lago init && lago start

Then install lago again in the VM, copy the same init file, and check
whether it works for you with different cpu_model values; that would
give us a hint about how to solve this. The 'cpu_model' basically
translates to this XML definition in libvirt:

<cpu mode='custom' match='exact'>
  <model fallback='allow'>Broadwell</model>
  <topology sockets='2' cores='1' threads='1'/>
  <feature policy='optional' name='vmx'/>
  <feature policy='optional' name='svm'/>
</cpu>

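
To make the mapping concrete, here is a rough sketch of how a <cpu> element like the one above could be assembled in Python. This is not Lago's actual code: the function name build_cpu_element and its defaults are made up for illustration, and only the element and attribute names are taken from the XML above.

```python
import xml.etree.ElementTree as ET


def build_cpu_element(model, sockets=2, cores=1, threads=1):
    """Build a <cpu> element shaped like the libvirt XML shown above.

    Hypothetical helper for illustration only, not Lago's implementation.
    """
    cpu = ET.Element("cpu", mode="custom", match="exact")
    model_el = ET.SubElement(cpu, "model", fallback="allow")
    model_el.text = model
    ET.SubElement(
        cpu, "topology",
        sockets=str(sockets), cores=str(cores), threads=str(threads),
    )
    # Nested-virtualization flags are requested as optional, so the guest
    # gets vmx (Intel) or svm (AMD) only if the host CPU supports it.
    for flag in ("vmx", "svm"):
        ET.SubElement(cpu, "feature", policy="optional", name=flag)
    return cpu


if __name__ == "__main__":
    print(ET.tostring(build_cpu_element("Broadwell"), encoding="unicode"))
```

Swapping the model argument is the XML-level equivalent of changing 'cpu_model' in the init file.
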
I also tried manually editing it to host-passthrough, but it still
failed with the same error. The thing is that the 'kvm_put_msrs:
Assertion `ret == n' failed.' error doesn't give any indication of
where it failed (or whether the CPU is missing a flag). Maybe there is
a way to debug this at the qemu/kvm level? I'm not sure.
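
If you want to go through several cpu_model values quickly, something like the following untested sketch could generate one init file per model. The template is the same init file as above with only cpu_model varying; the helper name write_init_files, the output file naming, and the list of model names are my assumptions, so use model names your libvirt actually knows.

```python
import os

# Template mirrors the init file above; only cpu_model varies.
TEMPLATE = """\
domains:
  vm-el73:
    memory: 2048
    service_provider: systemd
    cpu_model: {cpu_model}
    nics:
      - net: lago
    disks:
      - template_name: el7.3-base
        type: template
        name: root
        dev: vda
        format: qcow2

nets:
  lago:
    type: nat
    dhcp:
      start: 100
      end: 254
    management: true
    dns_domain_name: lago.local
"""


def write_init_files(models, out_dir="."):
    """Write one init file per CPU model and return the paths written."""
    paths = []
    for model in models:
        # Hypothetical naming scheme: LagoInitFile.<model>
        path = os.path.join(out_dir, "LagoInitFile." + model)
        with open(path, "w") as f:
            f.write(TEMPLATE.format(cpu_model=model))
        paths.append(path)
    return paths


if __name__ == "__main__":
    # Example model names only; adjust to what your libvirt supports.
    for p in write_init_files(["Westmere", "SandyBridge", "Broadwell"]):
        print(p)
```

Copy the variant you want to test to LagoInitFile in a fresh directory before running 'lago init && lago start'.
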
On Wed, Feb 8, 2017 at 1:18 PM, Ondrej Svoboda <osvoboda(a)redhat.com> wrote:
It is a Skylake-H, and I can see it is not mentioned in
lago/virt.py.
I guess I'll step through the code (as well as other places discovered by
'git grep cpu') and see if I could solve this by adding the Skylake family
to _CPU_FAMILIES.
Do you have other pointers?
Thanks,
Ondra
On Tue, Feb 7, 2017 at 10:40 PM, Nadav Goldin <ngoldin(a)redhat.com> wrote:
>
> What is the host CPU you are using?
> I came across the same error a few days ago, but without running OST; I
> tried running with Lago:
> fc24 host -> el7 vm -> el7 vm.
>
> I have a slight suspicion that it is related to the CPU model we
> configure in libvirt. I tried a few combinations (host-passthrough,
> pinning down the CPU model), but it always failed with the same error:
> kvm_put_msrs: Assertion `ret == n' failed.
>
> My CPU is Broadwell btw.
>
>
> Milan, any ideas? Do you think it might be related?
>
> Nadav.
>
>
>
> On Tue, Feb 7, 2017 at 11:14 PM, Ondrej Svoboda <osvoboda(a)redhat.com> wrote:
> > Yes, I stated that in my message.
> >
> > root@osvoboda-t460p /home/src/ovirt-system-tests (git)-[master] # cat /sys/module/kvm_intel/parameters/nested
> > :(
> > Y
> >
> > On Tue, Feb 7, 2017 at 1:39 PM, Eyal Edri <eedri(a)redhat.com> wrote:
> >>
> >> Did you follow the instructions on [1] ?
> >>
> >> Specifically, verifying that 'cat /sys/module/kvm_intel/parameters/nested'
> >> gives you 'Y'.
> >>
> >> [1]
> >> http://ovirt-system-tests.readthedocs.io/en/latest/docs/general/installat...
> >>
> >> On Tue, Feb 7, 2017 at 2:29 PM, Ondrej Svoboda <osvoboda(a)redhat.com> wrote:
> >>>
> >>> Hi everyone,
> >>>
> >>> Even though I have nested virtualization enabled on the Arch Linux
> >>> system which I use to run OST, vm_run is the first test to fail in
> >>> 004_basic_sanity (followed by snapshots_merge and suspend_resume_vm).
> >>>
> >>> Can you point me to what I might be missing? I believe I get the same
> >>> failure even on Fedora.
> >>>
> >>> This is what host0's CPU capabilities look like (vmx is there):
> >>> [root@lago-basic-suite-master-host0 ~]# cat /proc/cpuinfo
> >>> processor : 0
> >>> vendor_id : GenuineIntel
> >>> cpu family : 6
> >>> model : 44
> >>> model name : Westmere E56xx/L56xx/X56xx (Nehalem-C)
> >>> stepping : 1
> >>> microcode : 0x1
> >>> cpu MHz : 2711.988
> >>> cache size : 16384 KB
> >>> physical id : 0
> >>> siblings : 1
> >>> core id : 0
> >>> cpu cores : 1
> >>> apicid : 0
> >>> initial apicid : 0
> >>> fpu : yes
> >>> fpu_exception : yes
> >>> cpuid level : 11
> >>> wp : yes
> >>> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx lm constant_tsc rep_good nopl xtopology pni pclmulqdq vmx ssse3 cx16 sse4_1 sse4_2 x2apic popcnt aes hypervisor lahf_lm arat tpr_shadow vnmi flexpriority ept vpid
> >>> bogomips : 5423.97
> >>> clflush size : 64
> >>> cache_alignment : 64
> >>> address sizes : 40 bits physical, 48 bits virtual
> >>> power management:
> >>>
> >>> journalctl -b on host0 shows that libvirt complains about NUMA
> >>> configuration:
> >>>
> >>> Feb 07 06:33:23 lago-basic-suite-master-host0 libvirtd[12888]: libvirt version: 2.0.0, package: 10.el7_3.4 (CentOS BuildSystem <http://bugs.centos.org>, 2017-01-17-23:37:48, c1bm.rdu2.centos.org)
> >>> Feb 07 06:33:23 lago-basic-suite-master-host0 kernel: ovirtmgmt: port 2(vnet0) entered disabled state
> >>> Feb 07 06:33:23 lago-basic-suite-master-host0 kernel: device vnet0 left promiscuous mode
> >>> Feb 07 06:33:23 lago-basic-suite-master-host0 kernel: ovirtmgmt: port 2(vnet0) entered disabled state
> >>> Feb 07 06:33:23 lago-basic-suite-master-host0 libvirtd[12888]: hostname: lago-basic-suite-master-host0.lago.local
> >>> Feb 07 06:33:23 lago-basic-suite-master-host0 libvirtd[12888]: Unable to read from monitor: Connection reset by peer
> >>> Feb 07 06:33:23 lago-basic-suite-master-host0 libvirtd[12888]: internal error: qemu unexpectedly closed the monitor: 2017-02-07T11:33:23.058571Z qemu-kvm: warning: CPU(s) not present in any NUMA nodes: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
> >>>
> >>> 2017-02-07T11:33:23.058826Z qemu-kvm: warning: All CPU(s) up to maxcpus should be described in NUMA config
> >>>
> >>> qemu-kvm: /builddir/build/BUILD/qemu-2.6.0/target-i386/kvm.c:1736: kvm_put_msrs: Assertion `ret == n' failed.
> >>> Feb 07 06:33:23 lago-basic-suite-master-host0 NetworkManager[657]: <info> [1486467203.1025] device (vnet0): state change: disconnected -> unmanaged (reason 'unmanaged') [30 10 3]
> >>> Feb 07 06:33:23 lago-basic-suite-master-host0 kvm[22059]: 0 guests now active
> >>> Feb 07 06:33:23 lago-basic-suite-master-host0 systemd-machined[22044]: Machine qemu-1-vm0 terminated.
> >>>
> >>> Thanks,
> >>> Ondra
> >>>
> >>> _______________________________________________
> >>> Devel mailing list
> >>> Devel(a)ovirt.org
> >>> http://lists.ovirt.org/mailman/listinfo/devel
> >>
> >>
> >>
> >>
> >> --
> >> Eyal Edri
> >> Associate Manager
> >> RHV DevOps
> >> EMEA ENG Virtualization R&D
> >> Red Hat Israel
> >>
> >> phone: +972-9-7692018
> >> irc: eedri (on #tlv #rhev-dev #rhev-integ)
> >
> >
> >