Update to 4.4.8 leaves cluster in a circular error-state

After updating the RPM packages for a DC with a cluster containing 2 hosts (and running engine-setup on the engine machine), I now face the following issue:

One of the VMs had a couple of snapshots, and apparently this interferes with the upgrade of the cluster version, which is currently 4.4.

Compatibility version 4.4 has become incompatible with the chosen CPU architecture (Intel Westmere Family; both hosts are Dell x10-series with Xeon X56xx CPUs) and requires an upgrade to a newer compatibility version, presumably 4.6, so the DC/Cluster state is "unknown". However, I can't do this because the snapshots exist. I get "Cannot change cluster version since following VMs are previewing snapshots:" followed by a list containing the VM with snapshots. The snapshots in turn cannot be committed or undone because: "Cannot revert to Snapshot. Unknown Data Center status.".

So how do I break this circular error? The VM itself is not essential; it can be removed if that solves the case. However, this cannot be done from the UI because: "Cannot remove VM. Unknown Data Center status.".

Poltsi

Hi,

Could you please share more details about the CPU problem you're facing? There shouldn't be any breaking change in that CPU definition in 4.4+ compatibility versions.

Regards,
Lucia

On Wed, Sep 1, 2021 at 7:15 PM Paul-Erik Törrönen <poltsi@poltsi.fi> wrote:
> [...]

_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-leave@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/MVFQJJNEXVR74H...

On 9/2/21 9:16 AM, Lucia Jelinkova wrote:
> Could you please share more details about the CPU problem you're facing? There shouldn't be any breaking change in that CPU definition in 4.4+ compatibility version.

Unfortunately not. I've already made irreversible changes to the cluster, so I can no longer reproduce the error I got when I tried to activate one of the Dell hosts, which was an error about the CPU family.

After wiping out most of the configuration I still get a related error:

"The host CPU does not match the Cluster CPU Type and is running in a degraded mode. It is missing the following CPU flags: vmx, nx, model_Westmere, aes. Please update the host CPU microcode or change the Cluster CPU Type."

This error is not quite accurate, since lscpu lists all of the flags mentioned above except model_Westmere.

IIRC there were some comments on the mailing list earlier this year about this flag mismatch being related to an incompatible linux-firmware package. Currently the host that generates this error has this package installed:

Name    : linux-firmware
Version : 20210702
Release : 103.gitd79c2677.el8

I will need to dig through the mailing list archives.

Poltsi
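The manual comparison described above (flags named in the engine error versus flags the host actually reports) can be sketched in Python. This is only an illustration, not an oVirt tool: the function names are made up for this example, and it assumes that /proc/cpuinfo on the host carries the same flag list that lscpu prints. The list of flags is copied from the error message quoted in this thread.

```python
from pathlib import Path

# Flags named in the engine's error message (copied from the thread).
REPORTED_MISSING = ["vmx", "nx", "model_Westmere", "aes"]


def parse_cpu_flags(cpuinfo_text):
    """Return the flag set of the first CPU in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def split_by_presence(wanted, present):
    """Split the wanted flags into (found, not_found), keeping their order."""
    found = [flag for flag in wanted if flag in present]
    not_found = [flag for flag in wanted if flag not in present]
    return found, not_found


def main():
    # Hypothetical usage on a Linux host; falls back to empty input elsewhere.
    cpuinfo = Path("/proc/cpuinfo")
    text = cpuinfo.read_text() if cpuinfo.exists() else ""
    found, not_found = split_by_presence(REPORTED_MISSING, parse_cpu_flags(text))
    print("present on host:", found)
    print("not listed by the kernel:", not_found)


if __name__ == "__main__":
    main()
```

On a host matching the description in this message, one would expect vmx, nx and aes to show up as present and only model_Westmere to be unlisted, which is the same discrepancy observed with lscpu.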

Hi,

I think you're hitting the same issue with the edk2-ovmf package as some users before [1]. Please check the version of that package. If it is 20200602gitca407c7246bf-5.el8, it is a broken version and you should downgrade to 20200602gitca407c7246bf-4.el8_4.1.

[1] https://lists.ovirt.org/archives/list/users@ovirt.org/thread/ZTOZO4DO6F6LKHE...

Regards,
Lucia

On Fri, Sep 3, 2021 at 6:58 PM Paul-Erik Törrönen <poltsi@poltsi.fi> wrote:
> [...]
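The version check suggested above can be sketched in Python as follows. This is a rough sketch, not an official procedure: the helper names are invented for this example, the two version-release strings are the ones quoted in this thread, and it assumes the rpm command is available on the host.

```python
import subprocess

# Version-release strings quoted in the thread.
BROKEN = "20200602gitca407c7246bf-5.el8"
SUGGESTED = "20200602gitca407c7246bf-4.el8_4.1"


def classify(version_release):
    """Label a version-release string as 'broken', 'suggested' or 'other'."""
    vr = version_release.strip()
    if vr == BROKEN:
        return "broken"
    if vr == SUGGESTED:
        return "suggested"
    return "other"


def installed_edk2_ovmf():
    """Ask rpm for the installed edk2-ovmf version-release, or None."""
    try:
        result = subprocess.run(
            ["rpm", "-q", "--queryformat", "%{VERSION}-%{RELEASE}", "edk2-ovmf"],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
        )
    except FileNotFoundError:
        # rpm itself is not available on this machine.
        return None
    return result.stdout.strip() if result.returncode == 0 else None


def main():
    vr = installed_edk2_ovmf()
    if vr is None:
        print("edk2-ovmf not installed, or rpm not available")
    else:
        print(vr, "->", classify(vr))


if __name__ == "__main__":
    main()
```

If the result is "broken", the downgrade itself could presumably be done with something like `dnf downgrade edk2-ovmf-20200602gitca407c7246bf-4.el8_4.1`, assuming that build is still available in an enabled repository.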