Dear Ryan Bullock,

Thank you for your help with this issue. I will correlate accordingly.

With Best Regards.

Steven Rosenberg.

On Mon, Feb 18, 2019 at 5:46 AM Ryan Bullock <rrb3942@gmail.com> wrote:
Hey Steven,

Including just the cpuFlags, since the output is pretty verbose. Let me know if you need anything else from the output.

Without avic=1 (Works Fine):
    "cpuFlags": "fpu,vme,de,pse,tsc,msr,pae,mce,cx8,apic,sep,mtrr,pge,mca,cmov,pat,pse36,clflush,mmx,fxsr,sse,sse2,ht,syscall,nx,mmxext,fxsr_opt,pdpe1gb,rdtscp,lm,constant_tsc,art,rep_good,nopl,nonstop_tsc,extd_apicid,amd_dcm,aperfmperf,eagerfpu,pni,pclmulqdq,monitor,ssse3,fma,cx16,sse4_1,sse4_2,movbe,popcnt,aes,xsave,avx,f16c,rdrand,lahf_lm,cmp_legacy,svm,extapic,cr8_legacy,abm,sse4a,misalignsse,3dnowprefetch,osvw,skinit,wdt,tce,topoext,perfctr_core,perfctr_nb,bpext,perfctr_l2,cpb,hw_pstate,sme,retpoline_amd,ssbd,ibpb,vmmcall,fsgsbase,bmi1,avx2,smep,bmi2,rdseed,adx,smap,clflushopt,sha_ni,xsaveopt,xsavec,xgetbv1,clzero,irperf,xsaveerptr,arat,npt,lbrv,svm_lock,nrip_save,tsc_scale,vmcb_clean,flushbyasid,decodeassists,pausefilter,pfthreshold,avic,v_vmsave_vmload,vgif,overflow_recov,succor,smca,model_Opteron_G3,model_Opteron_G2,model_kvm32,model_kvm64,model_Westmere,model_Nehalem,model_Conroe,model_EPYC-IBPB,model_Opteron_G1,model_SandyBridge,model_qemu32,model_Penryn,model_pentium2,model_486,model_qemu64,model_cpu64-rhel6,model_EPYC,model_pentium,model_pentium3"

With avic=1 (Problem Configuration):
"cpuFlags": "fpu,vme,de,pse,tsc,msr,pae,mce,cx8,apic,sep,mtrr,pge,mca,cmov,pat,pse36,clflush,mmx,fxsr,sse,sse2,ht,syscall,nx,mmxext,fxsr_opt,pdpe1gb,rdtscp,lm,constant_tsc,art,rep_good,nopl,nonstop_tsc,extd_apicid,amd_dcm,aperfmperf,eagerfpu,pni,pclmulqdq,monitor,ssse3,fma,cx16,sse4_1,sse4_2,movbe,popcnt,aes,xsave,avx,f16c,rdrand,lahf_lm,cmp_legacy,svm,extapic,cr8_legacy,abm,sse4a,misalignsse,3dnowprefetch,osvw,skinit,wdt,tce,topoext,perfctr_core,perfctr_nb,bpext,perfctr_l2,cpb,hw_pstate,sme,retpoline_amd,ssbd,ibpb,vmmcall,fsgsbase,bmi1,avx2,smep,bmi2,rdseed,adx,smap,clflushopt,sha_ni,xsaveopt,xsavec,xgetbv1,clzero,irperf,xsaveerptr,arat,npt,lbrv,svm_lock,nrip_save,tsc_scale,vmcb_clean,flushbyasid,decodeassists,pausefilter,pfthreshold,avic,v_vmsave_vmload,vgif,overflow_recov,succor,smca"

Flags stay the same, but with avic=1 no models are shown as supported.
Also, I opened this bug https://bugzilla.redhat.com/show_bug.cgi?id=1675030 regarding the avic=1 setting seemingly requiring the x2apic flag.
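
The observation above (same feature flags, but no model_* entries with avic=1) can be checked mechanically rather than by eye. This is a minimal sketch, assuming only that cpuFlags is a comma-separated string in which supported models carry a "model_" prefix, as in the two dumps above. The function names and the shortened sample strings are hypothetical; paste the full values in practice.

```python
def split_cpu_flags(cpu_flags):
    """Split a cpuFlags string into (feature flags, supported model names)."""
    entries = {e.strip() for e in cpu_flags.split(",") if e.strip()}
    prefix = "model_"
    models = {e[len(prefix):] for e in entries if e.startswith(prefix)}
    features = {e for e in entries if not e.startswith(prefix)}
    return features, models


def compare_cpu_flags(before, after):
    """Report which features and models appear in only one of the two strings."""
    f_before, m_before = split_cpu_flags(before)
    f_after, m_after = split_cpu_flags(after)
    return {
        "features_only_before": sorted(f_before - f_after),
        "features_only_after": sorted(f_after - f_before),
        "models_only_before": sorted(m_before - m_after),
        "models_only_after": sorted(m_after - m_before),
    }


if __name__ == "__main__":
    # Shortened, illustrative inputs (not the full dumps from the host).
    without_avic = "svm,nx,ibpb,avic,model_EPYC,model_EPYC-IBPB"
    with_avic = "svm,nx,ibpb,avic"
    for key, values in compare_cpu_flags(without_avic, with_avic).items():
        print(key, values)
```

If both "features_only_*" lists come back empty while "models_only_before" is populated, that matches what is described above: the features are unchanged and only the model list disappears.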

-Ryan

On Sun, Feb 17, 2019 at 5:22 AM Steven Rosenberg <srosenbe@redhat.com> wrote:
Dear Ryan Bullock,

I am currently looking at this issue:


We would like more information concerning the CPU Flags (even though you included them in your engine log dump above).

Could you run the following command on the same host, the one running AMD EPYC-IBPB:
vdsm-client Host getCapabilities
Please send me the output, especially the CPU Flags.
Thank you in advance for your help.
With Best Regards.
Steven Rosenberg.
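
Since the getCapabilities output is long, a small script can pull out just the CPU flags. This is a sketch under two assumptions: that the command's output has been saved to a file as a JSON object, and that the flags sit under a top-level "cpuFlags" key as a comma-separated string. The file name caps.json and the function name are hypothetical.

```python
import json
import sys


def extract_cpu_flags(capabilities_json):
    """Return the sorted list of entries found under the 'cpuFlags' key."""
    caps = json.loads(capabilities_json)
    # Assumption: cpuFlags is one comma-separated string, not a JSON list.
    raw = caps.get("cpuFlags", "")
    return sorted(flag.strip() for flag in raw.split(",") if flag.strip())


if __name__ == "__main__":
    # Hypothetical usage:
    #   vdsm-client Host getCapabilities > caps.json
    #   python3 show_cpu_flags.py caps.json
    if len(sys.argv) > 1:
        with open(sys.argv[1], encoding="utf-8") as handle:
            for flag in extract_cpu_flags(handle.read()):
                print(flag)
```

Printing one entry per line makes it easier to diff the output of two runs, for example before and after changing a kernel module option.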


On Thu, Feb 7, 2019 at 6:35 PM Ryan Bullock <rrb3942@gmail.com> wrote:
That would explain it.

Would removing the host and then reinstalling it under a new 4.3 cluster work without having to set the entire old cluster into maintenance to change the CPU? Then I could just restart VMs into the new cluster as we transition, to minimize downtime.

Thanks for the info!

Ryan

On Thu, Feb 7, 2019 at 7:56 AM Greg Sheremeta <gshereme@redhat.com> wrote:
AMD EPYC IBPB is deprecated in 4.3.
The deprecated CPUs (cpus variable, that entire list) are:

So, *-IBRS [IBRS-SSBD is still ok], EPYC IBPB, Conroe, Penryn, and Opteron G1-3. If you have those, you need to change it to a supported type while it's still in 4.2.

Greg

On Thu, Feb 7, 2019 at 1:11 AM Ryan Bullock <rrb3942@gmail.com> wrote:
We just updated our engine to 4.3, but when I tried to update one of our AMD EPYC hosts it could not activate with the error:

Host vmc2h2 moved to Non-Operational state as host CPU type is not supported in this cluster compatibility version or is not supported at all.

Relevant (I think) parts from the engine log:

(EE-ManagedThreadFactory-engineScheduled-Thread-82) [ee51a70] Could not find server cpu for server 'vmc2h2' (745a14c6-9d31-48a4-9566-914647d83f53), flags: 'fpu,vme,de,pse,tsc,msr,pae,mce,cx8,apic,sep,mtrr,pge,mca,cmov,pat,pse36,clflush,mmx,fxsr,sse,sse2,ht,syscall,nx,mmxext,fxsr_opt,pdpe1gb,rdtscp,lm,constant_tsc,art,rep_good,nopl,nonstop_tsc,extd_apicid,amd_dcm,aperfmperf,eagerfpu,pni,pclmulqdq,monitor,ssse3,fma,cx16,sse4_1,sse4_2,movbe,popcnt,aes,xsave,avx,f16c,rdrand,lahf_lm,cmp_legacy,svm,extapic,cr8_legacy,abm,sse4a,misalignsse,3dnowprefetch,osvw,skinit,wdt,tce,topoext,perfctr_core,perfctr_nb,bpext,perfctr_l2,cpb,hw_pstate,sme,retpoline_amd,ssbd,ibpb,vmmcall,fsgsbase,bmi1,avx2,smep,bmi2,rdseed,adx,smap,clflushopt,sha_ni,xsaveopt,xsavec,xgetbv1,clzero,irperf,xsaveerptr,arat,npt,lbrv,svm_lock,nrip_save,tsc_scale,vmcb_clean,flushbyasid,decodeassists,pausefilter,pfthreshold,avic,v_vmsave_vmload,vgif,overflow_recov,succor,smca'
2019-02-06 17:23:58,527-08 INFO  [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-82) [7f6d4f0d] START, SetVdsStatusVDSCommand(HostName = vmc2h2, SetVdsStatusVDSCommandParameters:{hostId='745a14c6-9d31-48a4-9566-914647d83f53', status='NonOperational', nonOperationalReason='CPU_TYPE_INCOMPATIBLE_WITH_CLUSTER'


From virsh -r capabilities:

    <cpu>
      <arch>x86_64</arch>
      <model>EPYC-IBPB</model>
      <vendor>AMD</vendor>
      <microcode version='134222375'/>
      <topology sockets='1' cores='32' threads='2'/>
      <feature name='ht'/>
      <feature name='osxsave'/>
      <feature name='xsaves'/>
      <feature name='cmp_legacy'/>
      <feature name='extapic'/>
      <feature name='skinit'/>
      <feature name='wdt'/>
      <feature name='tce'/>
      <feature name='topoext'/>
      <feature name='perfctr_core'/>
      <feature name='perfctr_nb'/>
      <feature name='invtsc'/>
      <pages unit='KiB' size='4'/>
      <pages unit='KiB' size='2048'/>
      <pages unit='KiB' size='1048576'/>
    </cpu>

I also tried creating a new 4.3 cluster set to AMD EPYC IBPB SSBD and moving the host into it, but the move failed with a similar error about an unsupported CPU. (For some reason it also made me clear the additional kernel options; we use 1 GB hugepages.) I have not yet tried removing the host entirely and adding it as part of creating the new cluster.

We have been using a database change to update the 4.2 cluster level to include EPYC support, with the following entries (I can post the whole query if needed):
7:AMD EPYC:svm,nx,model_EPYC:EPYC:x86_64; 8:AMD EPYC IBPB:svm,nx,ibpb,model_EPYC:EPYC-IBPB:x86_64
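
To illustrate why the engine might report "Could not find server cpu", here is a rough sketch of matching a host's flags against entries in the format above. Two things are assumptions, not confirmed from the engine source: the reading of the five colon-separated fields as level, display name, required flags, libvirt model and architecture, and the rule that a host matches an entry only if it reports every required flag. All function and class names are hypothetical.

```python
from typing import NamedTuple, Optional


class CpuEntry(NamedTuple):
    level: int
    name: str
    required_flags: frozenset
    model: str
    arch: str


def parse_cpu_list(value):
    """Parse 'level:name:flags:model:arch' entries separated by ';'."""
    entries = []
    for raw in value.split(";"):
        raw = raw.strip()
        if not raw:
            continue
        level, name, flags, model, arch = raw.split(":")
        entries.append(
            CpuEntry(int(level), name, frozenset(flags.split(",")), model, arch)
        )
    return entries


def best_match(entries, host_flags) -> Optional[CpuEntry]:
    """Return the highest-level entry whose required flags the host all has."""
    host = {f.strip() for f in host_flags.split(",") if f.strip()}
    candidates = [e for e in entries if e.required_flags <= host]
    return max(candidates, key=lambda e: e.level, default=None)


if __name__ == "__main__":
    cpu_list = (
        "7:AMD EPYC:svm,nx,model_EPYC:EPYC:x86_64; "
        "8:AMD EPYC IBPB:svm,nx,ibpb,model_EPYC:EPYC-IBPB:x86_64"
    )
    entries = parse_cpu_list(cpu_list)
    # Shortened, illustrative host flag strings.
    print(best_match(entries, "svm,nx,ibpb,model_EPYC"))
    print(best_match(entries, "svm,nx,ibpb"))
```

Under these assumptions, a host that reports svm, nx and ibpb but no model_EPYC entry matches nothing. That would be consistent with the flags in the engine log above, which contain no model_* entries at all.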

We have been running 4.2 with this for a while. We did apply the same changes after the 4.3 update, but only for the 4.2 cluster level. We only used the AMD EPYC IBPB model.

Reverting the host back to 4.2 allows it to activate and run normally.

Anyone have any ideas as to why it can't seem to find the cpu type?

Thanks,

Ryan Bullock
_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-leave@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/4Y4X7UGDEYSB5JK45TLDERNM7IMTHIYY/


--

GREG SHEREMETA

SENIOR SOFTWARE ENGINEER - TEAM LEAD - RHV UX

Red Hat NA

gshereme@redhat.com    IRC: gshereme

_______________________________________________
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/FMDXG35JGXXEA3IKDWKFRS5OICZIXQYL/