AMD EPYC 4.3 upgrade 'CPU type is not supported in this cluster compatibility version or is not supported at all'

We just updated our engine to 4.3, but when I tried to update one of our AMD EPYC hosts it could not activate with the error:

Host vmc2h2 moved to Non-Operational state as host CPU type is not supported in this cluster compatibility version or is not supported at all.

Relevant (I think) parts from the engine log:

(EE-ManagedThreadFactory-engineScheduled-Thread-82) [ee51a70] Could not find server cpu for server 'vmc2h2' (745a14c6-9d31-48a4-9566-914647d83f53), flags: 'fpu,vme,de,pse,tsc,msr,pae,mce,cx8,apic,sep,mtrr,pge,mca,cmov,pat,pse36,clflush,mmx,fxsr,sse,sse2,ht,syscall,nx,mmxext,fxsr_opt,pdpe1gb,rdtscp,lm,constant_tsc,art,rep_good,nopl,nonstop_tsc,extd_apicid,amd_dcm,aperfmperf,eagerfpu,pni,pclmulqdq,monitor,ssse3,fma,cx16,sse4_1,sse4_2,movbe,popcnt,aes,xsave,avx,f16c,rdrand,lahf_lm,cmp_legacy,svm,extapic,cr8_legacy,abm,sse4a,misalignsse,3dnowprefetch,osvw,skinit,wdt,tce,topoext,perfctr_core,perfctr_nb,bpext,perfctr_l2,cpb,hw_pstate,sme,retpoline_amd,ssbd,ibpb,vmmcall,fsgsbase,bmi1,avx2,smep,bmi2,rdseed,adx,smap,clflushopt,sha_ni,xsaveopt,xsavec,xgetbv1,clzero,irperf,xsaveerptr,arat,npt,lbrv,svm_lock,nrip_save,tsc_scale,vmcb_clean,flushbyasid,decodeassists,pausefilter,pfthreshold,avic,v_vmsave_vmload,vgif,overflow_recov,succor,smca'
2019-02-06 17:23:58,527-08 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-82) [7f6d4f0d] START, SetVdsStatusVDSCommand(HostName = vmc2h2, SetVdsStatusVDSCommandParameters:{hostId='745a14c6-9d31-48a4-9566-914647d83f53', status='NonOperational', nonOperationalReason='CPU_TYPE_INCOMPATIBLE_WITH_CLUSTER'

From virsh -r capabilities:

<cpu>
  <arch>x86_64</arch>
  <model>EPYC-IBPB</model>
  <vendor>AMD</vendor>
  <microcode version='134222375'/>
  <topology sockets='1' cores='32' threads='2'/>
  <feature name='ht'/>
  <feature name='osxsave'/>
  <feature name='xsaves'/>
  <feature name='cmp_legacy'/>
  <feature name='extapic'/>
  <feature name='skinit'/>
  <feature name='wdt'/>
  <feature name='tce'/>
  <feature name='topoext'/>
  <feature name='perfctr_core'/>
  <feature name='perfctr_nb'/>
  <feature name='invtsc'/>
  <pages unit='KiB' size='4'/>
  <pages unit='KiB' size='2048'/>
  <pages unit='KiB' size='1048576'/>
</cpu>

I also tried creating a new 4.3 cluster, set to AMD EPYC IBPB SSBD, and moving the host into it, but the move failed with a similar error about an unsupported CPU (for some reason it also made me clear the additional kernel options; we use 1 GB hugepages). I have not yet tried removing the host entirely and adding it as part of creating the new cluster.

We have been/are using a database change to update the 4.2 cluster level to include EPYC support with the following entries (can post the whole query if needed):

7:AMD EPYC:svm,nx,model_EPYC:EPYC:x86_64;
8:AMD EPYC IBPB:svm,nx,ibpb,model_EPYC:EPYC-IBPB:x86_64

We have been running 4.2 with this for a while. We did apply the same changes after the 4.3 update, but only for the 4.2 cluster level. We only used the AMD EPYC IBPB model.

Reverting the host back to 4.2 allows it to activate and run normally.

Anyone have any ideas as to why it can't seem to find the CPU type?

Thanks,
Ryan Bullock
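To make the matching problem concrete, here is a minimal Python sketch of how entries in the `level:name:flags:model:arch` format quoted above could be compared against the flags a host reports. The matching rule used here (every flag in an entry must appear in the host's flag set, highest level wins) is an assumption for illustration, not the engine's confirmed logic, and `CpuEntry`, `parse_cpu_entries` and `best_match` are hypothetical names. `LOGGED_FLAGS` is only a small subset of the flags string from the log; note that the full logged string contains no `model_*` entries at all.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class CpuEntry:
    level: int
    name: str
    flags: FrozenSet[str]
    model: str
    arch: str


def parse_cpu_entries(raw: str) -> List[CpuEntry]:
    """Parse ';'-separated 'level:name:flags:model:arch' entries."""
    entries = []
    for chunk in raw.split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue
        level, name, flags, model, arch = chunk.split(":")
        entries.append(
            CpuEntry(int(level), name, frozenset(flags.split(",")), model, arch)
        )
    return entries


def best_match(entries: List[CpuEntry], host_flags: FrozenSet[str]) -> Optional[CpuEntry]:
    """Assumed rule: an entry matches if all its flags are reported by the host."""
    candidates = [e for e in entries if e.flags <= host_flags]
    return max(candidates, key=lambda e: e.level, default=None)


# The two entries quoted in the message above.
RAW = (
    "7:AMD EPYC:svm,nx,model_EPYC:EPYC:x86_64; "
    "8:AMD EPYC IBPB:svm,nx,ibpb,model_EPYC:EPYC-IBPB:x86_64"
)

# Subset of the flags from the engine log (illustrative, not the full list).
LOGGED_FLAGS = frozenset("fpu,nx,svm,ssbd,ibpb".split(","))

entries = parse_cpu_entries(RAW)
print(best_match(entries, LOGGED_FLAGS))
print(best_match(entries, LOGGED_FLAGS | {"model_EPYC"}))
```

Under this assumed rule, neither entry matches the logged flags because `model_EPYC` is absent, while adding that one flag makes the level 8 entry match.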

AMD EPYC IBPB is deprecated in 4.3. The deprecated CPUs (cpus variable, that entire list) are:
https://gerrit.ovirt.org/#/c/95310/7/frontend/webadmin/modules/webadmin/src/...

So, *-IBRS [IBRS-SSBD is still ok], Epyc IBPB, Conroe, Penryn, and Opteron G1-3. If you have those, you need to change it to a supported type while it's in 4.2 still.

Greg

On Thu, Feb 7, 2019 at 1:11 AM Ryan Bullock <rrb3942@gmail.com> wrote:
> Anyone have any ideas as to why it can't seem to find the cpu type?
> _______________________________________________
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-leave@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
> List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/4Y4X7UGDEYSB5J...

--
GREG SHEREMETA
SENIOR SOFTWARE ENGINEER - TEAM LEAD - RHV UX
Red Hat NA <https://www.redhat.com/>
gshereme@redhat.com IRC: gshereme <https://red.ht/sig>
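The deprecation list in the message above can be turned into a quick check to run against a cluster's CPU type before raising the compatibility level. This is a rough sketch only: the regular expressions are derived from Greg's summary (*-IBRS but not IBRS-SSBD, Epyc IBPB, Conroe, Penryn, Opteron G1-3), the function name `is_deprecated_in_43` is hypothetical, and the example names other than "AMD EPYC IBPB" and "AMD EPYC IBPB SSBD" are illustrative guesses at how such types might be spelled.

```python
import re

# Patterns derived from the list in the message above (assumed spellings).
_DEPRECATED_PATTERNS = [
    # Any *-IBRS type, unless it is the IBRS-SSBD variant.
    re.compile(r"\bIBRS\b(?![\s-]*SSBD)", re.IGNORECASE),
    # EPYC IBPB, but not EPYC IBPB SSBD.
    re.compile(r"\bEPYC[\s-]IBPB\b(?![\s-]*SSBD)", re.IGNORECASE),
    re.compile(r"Conroe", re.IGNORECASE),
    re.compile(r"Penryn", re.IGNORECASE),
    # Opteron G1, G2 and G3 only.
    re.compile(r"\bOpteron[\s_-]?G[1-3]\b", re.IGNORECASE),
]


def is_deprecated_in_43(cpu_type_name: str) -> bool:
    """Return True if the name matches one of the deprecated families."""
    name = cpu_type_name.strip()
    return any(p.search(name) for p in _DEPRECATED_PATTERNS)


if __name__ == "__main__":
    for name in ("AMD EPYC IBPB", "AMD EPYC IBPB SSBD", "AMD EPYC"):
        print(name, "->", "deprecated" if is_deprecated_in_43(name) else "ok")
```

The authoritative list is the one in the linked gerrit change; treat this only as a convenience filter.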

That would explain it.

Would removing the host and then reinstalling it under a new 4.3 cluster work without having to set the entire old cluster into maintenance to change the CPU? Then I could just restart VMs into the new cluster as we transition to minimize downtime.

Thanks for the info!

Ryan

On Thu, Feb 7, 2019 at 7:56 AM Greg Sheremeta <gshereme@redhat.com> wrote:
> AMD EPYC IBPB is deprecated in 4.3.

On Thu, Feb 7, 2019 at 11:31 AM Ryan Bullock <rrb3942@gmail.com> wrote:
> Would removing the host and then reinstalling it under a new 4.3 cluster work without having to set the entire old cluster into maintenance to change the cpu? Then I could just restart VM's into the new cluster as we transition to minimize downtime.

@Simone Tiraboschi <stirabos@redhat.com> or @Ryan Barry <rbarry@redhat.com> ?

> Thanks for the info!

Glad to help :)

Greg

On Thu, Feb 7, 2019 at 5:46 PM Greg Sheremeta <gshereme@redhat.com> wrote:
> @Simone Tiraboschi <stirabos@redhat.com> or @Ryan Barry <rbarry@redhat.com> ?

For a hosted-engine cluster we have a manual workaround procedure documented here: https://bugzilla.redhat.com/show_bug.cgi?id=1672859#c1

On Thu, Feb 7, 2019 at 6:52 PM Simone Tiraboschi <stirabos@redhat.com> wrote:
> For a hosted-engine cluster we have a manual workaround procedure documented here: https://bugzilla.redhat.com/show_bug.cgi?id=1672859#c1

I managed to upgrade my Epyc cluster with those steps. I made a new cluster with the Epyc CPU type, already at the 4.3 level. Starting the engine in the new cluster complained about not finding a VM with that UUID, but the engine still started fine.

When all nodes were in the new cluster I still couldn't upgrade the old cluster, because the engine complained that a couple of VMs couldn't be upgraded (something to do with custom level). I moved them to the new cluster too; I just had to change their networks to management for the move. After that I could upgrade the old cluster to Epyc and the 4.3 level.

Then I moved the VMs and nodes back (same steps but backwards). After that you can remove the extra cluster and raise the datacenter to the 4.3 level.

-Juhani

On Thu, Feb 7, 2019 at 7:15 PM Juhani Rautiainen <juhani.rautiainen@gmail.com> wrote:
> I managed to upgrade my Epyc cluster with those steps.

Thanks for the report! We definitely have to figure out a better upgrade flow when a cluster CPU change is required/advised.

This procedure worked for our HE, which is on Skylake.

I think I have a process that should work for moving our EPYC clusters to 4.3. If it works this weekend I will post it for others.

Ryan

On Thu, Feb 7, 2019 at 12:06 PM Simone Tiraboschi <stirabos@redhat.com> wrote:
> Thanks for the report! We definitely have to figure out a better upgrade flow when a cluster CPU change is required/advised.

So I tried making a new cluster with a 4.2 compatibility level and moving one of my EPYC hosts into it. I then updated the host to 4.3, switched the cluster version to 4.3, and set the cluster CPU to the new AMD EPYC IBPB SSBD (also tried plain AMD EPYC). It still fails to make the host operational, complaining that 'CPU type is not supported in this cluster compatibility version or is not supported at all'.

I tried a few iterations of updating, moving, activating, reinstalling, etc., but none of them seem to work.

The hosts are running CentOS Linux release 7.6.1810 (Core), all packages are up to date.

I checked my CPU flags, and I can't see anything missing.

cat /proc/cpuinfo | head -n 26
processor : 0
vendor_id : AuthenticAMD
cpu family : 23
model : 1
model name : AMD EPYC 7551P 32-Core Processor
stepping : 2
microcode : 0x8001227
cpu MHz : 2000.000
cache size : 512 KB
physical id : 0
siblings : 64
core id : 0
cpu cores : 32
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall *nx* mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc art rep_good nopl nonstop_tsc extd_apicid amd_dcm aperfmperf eagerfpu pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy *svm* extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_l2 cpb hw_pstate sme retpoline_amd *ssbd ibpb* vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 clzero irperf xsaveerptr arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif overflow_recov succor smca
bogomips : 3992.39
TLB size : 2560 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 43 bits physical, 48 bits virtual
power management: ts ttp tm hwpstate cpb eff_freq_ro [13] [14]

I just can't seem to figure out why 4.3 does not like these EPYC systems. We have Skylake and Sandybridge clusters that are fully on 4.3 now with the SSBD variant CPUs.

I'm at a loss as to what to try next. The only thing I can think of is to reinstall the host OS or try ovirt-node, but I would like to avoid that if I can.

Thank you for all the help so far.

Regards,
Ryan

On Fri, Feb 8, 2019 at 9:07 AM Ryan Bullock <rrb3942@gmail.com> wrote:
> I think I have a process that should work for moving our EPYC clusters to 4.3. If it works this weekend I will post it for others.
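The manual flag check described in the message above can be scripted. The following Python sketch pulls the `flags` line out of `/proc/cpuinfo`-style text and reports which of a required set are missing. The function names and the `REQUIRED` list (taken from the flags in the 4.2 database entries plus `ssbd`) are assumptions for illustration, and `SAMPLE` is a shortened stand-in for the real output.

```python
from typing import Iterable, List, Set


def cpuinfo_flags(cpuinfo_text: str) -> Set[str]:
    """Return the flag set from the first 'flags' line of cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_flags(cpuinfo_text: str, required: Iterable[str]) -> List[str]:
    """Return the required flags that the CPU does not report, sorted."""
    return sorted(set(required) - cpuinfo_flags(cpuinfo_text))


# Shortened sample; on a real host use open("/proc/cpuinfo").read() instead.
SAMPLE = """\
processor : 0
vendor_id : AuthenticAMD
model name : AMD EPYC 7551P 32-Core Processor
flags : fpu vme nx svm ssbd ibpb
"""

REQUIRED = ["svm", "nx", "ibpb", "ssbd"]

if __name__ == "__main__":
    print("missing:", missing_flags(SAMPLE, REQUIRED))
```

A check like this only covers the hardware flags the kernel reports; it says nothing about `model_*` style entries, which do not appear in `/proc/cpuinfo`.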

On Sat, Feb 9, 2019 at 7:43 PM Ryan Bullock <rrb3942@gmail.com> wrote:
> So I tried making a new cluster with a 4.2 compatibility level and moving one of my EPYC hosts into it. I then updated the host to 4.3 and switched the cluster version 4.3 + set cluster cpu to the new AMD EPYC IBPB SSBD (also tried plain AMD EPYC). It still fails to make the host operational complaining that 'CPU type is not supported in this cluster compatibility version or is not supported at all'.

When I did this with Epyc I made a new cluster with the 4.3 level and Epyc CPU, and then moved the nodes to it. Maybe try that? I also had to move a couple of VMs to the new cluster because the old cluster couldn't upgrade with those. When the nodes and the couple of problem VMs were in the new cluster I could upgrade the old cluster to the new level.

-Juhani

I tried that too, but it still complains about an unsupported CPU in the new cluster. Even if I leave the cluster level at 4.2, if I update the host to 4.3 it can't activate under a 4.2 cluster.

Makes me think something changed in how it verifies the CPU support and for some reason it is not liking my EPYC systems.

On Sat, Feb 9, 2019 at 10:18 AM Juhani Rautiainen <juhani.rautiainen@gmail.com> wrote:
> When I did this with Epyc I made a new cluster with the 4.3 level and Epyc CPU, and then moved the nodes to it. Maybe try that?

Got a host activated!
1. Update host to 4.3
2. rm /var/cache/libvirt/qemu/capabilities/*.xml
3. systemctl restart libvirtd
4. Activate host
Seems like some kind of stuck state going from 4.2 -> 4.3.
Hope this helps someone else.
(In reply to Ryan Bullock <rrb3942@gmail.com>, Sat, Feb 9, 2019 at 1:12 PM)
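The cache-clearing workaround from this message (delete libvirt's cached QEMU capability files, then restart libvirtd) could be scripted roughly as follows. This is only a sketch and was not posted in the thread: the function name `clear_qemu_capabilities_cache` and its `restart` parameter are made up for illustration, while the cache path and the `systemctl restart libvirtd` command are the ones from the steps. It assumes it runs as root on a host that is in maintenance.

```python
import subprocess
from pathlib import Path

# Cache location taken from step 2 of the workaround.
CACHE_DIR = Path("/var/cache/libvirt/qemu/capabilities")


def clear_qemu_capabilities_cache(cache_dir=CACHE_DIR, restart=True):
    """Delete cached capability XML files and optionally restart libvirtd.

    Returns the names of the files that were removed.
    """
    removed = []
    for xml_file in sorted(Path(cache_dir).glob("*.xml")):
        xml_file.unlink()
        removed.append(xml_file.name)
    if restart:
        # Step 3 of the workaround; needs root privileges.
        subprocess.run(["systemctl", "restart", "libvirtd"], check=True)
    return removed
```

Calling it with `restart=False` on a scratch directory is a harmless way to try it before pointing it at the real cache.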

Thanks, Ryan. I opened https://bugzilla.redhat.com/show_bug.cgi?id=1674265 to track this.
Greg
(In reply to Ryan Bullock <rrb3942@gmail.com>, Sat, Feb 9, 2019 at 5:50 PM)
--
GREG SHEREMETA
SENIOR SOFTWARE ENGINEER - TEAM LEAD - RHV UX
Red Hat NA <https://www.redhat.com/>
gshereme@redhat.com IRC: gshereme <https://red.ht/sig>

So this looks like a bug in qemu/libvirtd.
We had avic=1 set for kvm_amd, but with that set the qemu capabilities cache showed all EPYC/AMD variants as unusable, blocked by a missing 'x2apic'. My guess is that it should probably be looking for the avic flag instead. My other guess is that avic doesn't actually get enabled at all when it is turned on.
Even though I had disabled avic earlier in testing, libvirt did not pick up the capabilities change until I cleared its cache. I'm also assuming that either some of the verification code changed in oVirt from 4.2 to 4.3, or libvirt was updated and exposed this. Apparently qemu will just drop unsupported flags when starting a VM, which is why this was working before.
Mystery solved.
Regards,
Ryan
(In reply to Greg Sheremeta <gshereme@redhat.com>, Sun, Feb 10, 2019 at 8:54 AM)
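A related way to see which CPU models libvirt currently considers usable is the `virsh domcapabilities` output, which lists each custom-mode model with a `usable` attribute. The sketch below parses XML of that shape. The sample document is a trimmed, made-up illustration (it is not output from the host in this thread), and the helper name `models_by_usability` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed, illustrative sample in the shape of `virsh domcapabilities` output.
SAMPLE_XML = """
<domainCapabilities>
  <cpu>
    <mode name='custom' supported='yes'>
      <model usable='yes'>qemu64</model>
      <model usable='no'>EPYC</model>
      <model usable='no'>EPYC-IBPB</model>
    </mode>
  </cpu>
</domainCapabilities>
"""


def models_by_usability(xml_text):
    """Group the custom-mode CPU models by their `usable` attribute."""
    root = ET.fromstring(xml_text)
    groups = {}
    for model in root.findall("./cpu/mode[@name='custom']/model"):
        groups.setdefault(model.get("usable"), []).append(model.text)
    return groups


print(models_by_usability(SAMPLE_XML))
```

If every EPYC variant lands in the `no` group after a module option change, stale cached capabilities are one thing worth ruling out, as described in the message.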

Dear Ryan Bullock,
I am currently looking at this issue:
https://lists.ovirt.org/archives/list/users@ovirt.org/message/4Y4X7UGDEYSB5J...
We would like more information about the CPU flags (even though you included them in your engine log dump above).
Could you run the following command on the same host, the one running AMD EPYC-IBPB:
vdsm-client Host getCapabilities
Please send me the output, especially the CPU flags.
Thank you in advance for your help.
With Best Regards.
Steven Rosenberg.
(In reply to Ryan Bullock <rrb3942@gmail.com>, Thu, Feb 7, 2019 at 6:35 PM)
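To pull just the CPU flags out of the `vdsm-client Host getCapabilities` output requested here, something like the following might do. It is a sketch under the assumption that the command prints a single JSON object containing a comma-separated `cpuFlags` string; the helper names `extract_cpu_flags` and `host_cpu_flags` are hypothetical.

```python
import json
import subprocess


def extract_cpu_flags(capabilities_json):
    """Return the comma-separated `cpuFlags` value as a list of flag names."""
    capabilities = json.loads(capabilities_json)
    return capabilities["cpuFlags"].split(",")


def host_cpu_flags():
    """Run the command from the message on the host and return its CPU flags."""
    result = subprocess.run(
        ["vdsm-client", "Host", "getCapabilities"],
        check=True, capture_output=True, text=True,
    )
    return extract_cpu_flags(result.stdout)
```

`host_cpu_flags()` only works on a host where VDSM is installed; `extract_cpu_flags` can be tried anywhere on saved output.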

Hey Steven,
Including just the cpuFlags, since the output is pretty verbose. Let me know if you need anything else from the output.
Without avic=1 (Works Fine): "cpuFlags": "fpu,vme,de,pse,tsc,msr,pae,mce,cx8,apic,sep,mtrr,pge,mca,cmov,pat,pse36,clflush,mmx,fxsr,sse,sse2,ht,syscall,nx,mmxext,fxsr_opt,pdpe1gb,rdtscp,lm,constant_tsc,art,rep_good,nopl,nonstop_tsc,extd_apicid,amd_dcm,aperfmperf,eagerfpu,pni,pclmulqdq,monitor,ssse3,fma,cx16,sse4_1,sse4_2,movbe,popcnt,aes,xsave,avx,f16c,rdrand,lahf_lm,cmp_legacy,svm,extapic,cr8_legacy,abm,sse4a,misalignsse,3dnowprefetch,osvw,skinit,wdt,tce,topoext,perfctr_core,perfctr_nb,bpext,perfctr_l2,cpb,hw_pstate,sme,retpoline_amd,ssbd,ibpb,vmmcall,fsgsbase,bmi1,avx2,smep,bmi2,rdseed,adx,smap,clflushopt,sha_ni,xsaveopt,xsavec,xgetbv1,clzero,irperf,xsaveerptr,arat,npt,lbrv,svm_lock,nrip_save,tsc_scale,vmcb_clean,flushbyasid,decodeassists,pausefilter,pfthreshold,avic,v_vmsave_vmload,vgif,overflow_recov,succor,smca,model_Opteron_G3,model_Opteron_G2,model_kvm32,model_kvm64,model_Westmere,model_Nehalem,model_Conroe,model_EPYC-IBPB,model_Opteron_G1,model_SandyBridge,model_qemu32,model_Penryn,model_pentium2,model_486,model_qemu64,model_cpu64-rhel6,model_EPYC,model_pentium,model_pentium3"
With avic=1 (Problem Configuration): "cpuFlags": "fpu,vme,de,pse,tsc,msr,pae,mce,cx8,apic,sep,mtrr,pge,mca,cmov,pat,pse36,clflush,mmx,fxsr,sse,sse2,ht,syscall,nx,mmxext,fxsr_opt,pdpe1gb,rdtscp,lm,constant_tsc,art,rep_good,nopl,nonstop_tsc,extd_apicid,amd_dcm,aperfmperf,eagerfpu,pni,pclmulqdq,monitor,ssse3,fma,cx16,sse4_1,sse4_2,movbe,popcnt,aes,xsave,avx,f16c,rdrand,lahf_lm,cmp_legacy,svm,extapic,cr8_legacy,abm,sse4a,misalignsse,3dnowprefetch,osvw,skinit,wdt,tce,topoext,perfctr_core,perfctr_nb,bpext,perfctr_l2,cpb,hw_pstate,sme,retpoline_amd,ssbd,ibpb,vmmcall,fsgsbase,bmi1,avx2,smep,bmi2,rdseed,adx,smap,clflushopt,sha_ni,xsaveopt,xsavec,xgetbv1,clzero,irperf,xsaveerptr,arat,npt,lbrv,svm_lock,nrip_save,tsc_scale,vmcb_clean,flushbyasid,decodeassists,pausefilter,pfthreshold,avic,v_vmsave_vmload,vgif,overflow_recov,succor,smca"
Flags stay the same, but with avic=1 no models are shown as supported. Also, I opened this bug https://bugzilla.redhat.com/show_bug.cgi?id=1675030 regarding the avic=1 setting seemingly requiring the x2apic flag.
-Ryan
(In reply to Steven Rosenberg <srosenbe@redhat.com>, Sun, Feb 17, 2019 at 5:22 AM)
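The difference between the two cpuFlags strings in this message is easy to miss by eye. A small comparison along these lines makes it explicit; it is a sketch, with hypothetical helper names and shortened stand-in strings rather than the full lists from the host.

```python
def split_cpu_flags(cpu_flags):
    """Split a cpuFlags string into (hardware flags, supported model names)."""
    entries = [e for e in cpu_flags.split(",") if e]
    models = {e[len("model_"):] for e in entries if e.startswith("model_")}
    flags = {e for e in entries if not e.startswith("model_")}
    return flags, models


def compare_cpu_flags(working, broken):
    """Report what the working configuration has that the broken one lacks."""
    flags_ok, models_ok = split_cpu_flags(working)
    flags_bad, models_bad = split_cpu_flags(broken)
    return {
        "missing_flags": sorted(flags_ok - flags_bad),
        "missing_models": sorted(models_ok - models_bad),
    }


# Shortened stand-ins for the two cpuFlags strings.
without_avic = "svm,nx,ibpb,avic,model_EPYC,model_EPYC-IBPB"
with_avic = "svm,nx,ibpb,avic"
print(compare_cpu_flags(without_avic, with_avic))
```

For the shortened sample this reports no missing hardware flags and both EPYC models as missing, which mirrors the observation that only the `model_*` entries disappear with avic=1.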

Dear Ryan Bullock,
Thank you for your help with this issue. I will correlate accordingly.
With Best Regards.
Steven Rosenberg.
(In reply to Ryan Bullock <rrb3942@gmail.com>, Mon, Feb 18, 2019 at 5:46 AM)
participants (5)
- Greg Sheremeta
- Juhani Rautiainen
- Ryan Bullock
- Simone Tiraboschi
- Steven Rosenberg