
Hey,

I'm wondering if anyone else is experiencing freezing VMs, especially Windows servers and especially VMs with a lot of RAM.

I found the following thread: https://forum.proxmox.com/threads/vms-freeze-with-100-cpu.127459/page-11

Post #218 and the posts after it are really interesting. The conclusion there is that there might be a problem in kernels from 4.18.0-372.26.1 all the way up to the mainline 6.3 or LTS 6.1 kernels. The problem appears to be that mmu_notifier_seq is passed as an integer in the is_page_fault_stale() function, causing KVM to freeze once the counter passes the maximum signed 32-bit integer value, 2,147,483,647. Post #220 seems to state that it was fixed in kernel commit ba6e3fe25543.

The problem is that my nodes on CentOS 8 are running the latest kernel available to them, 4.18.0-408.el8.x86_64, which is affected. I could try to downgrade the kernel on those nodes, but that would lock me to the unsupported CentOS 8 oVirt nodes. I tried a new oVirt node based on CentOS 9, and it comes with 5.14.0-514.el9.x86_64, which is also affected by the looks of it.

I tried upgrading the kernel on CentOS 9 to the latest 6.1 LTS kernel or the latest 6.11 mainline kernel. While the node itself works fine, it does not work for oVirt: the node cannot be activated once the new kernel is in use. As I'm a fixer and not a developer, I think the task of fixing oVirt to make it work with 6.1/6.3 kernels might be too big for me. My last attempt is going to be backporting the fix to the 5.14 kernel supplied with the CentOS 9 based oVirt nodes.

I know... I should probably look for a new solution, but oVirt has been running our many VMs quite well, at an affordable price. Yes, we have more work fixing the various issues that pop up from time to time, but if left alone, it does work quite well and is stable.

Has anyone else encountered these issues?

//J

Hey again,

Just to be clear... this does not appear to be a problem with the software wrapped around KVM, such as Proxmox or oVirt, or with the hypervisor itself, in this case KVM. The problem appears to be in the kernel, where a function is referencing a variable as an integer. So all I really want is to upgrade the kernel, which is what Proxmox did for its users.

I'm aware we are on our own here using oVirt... but I was hoping to get some help from the community doing just that :-)

//J
participants (1)
-
change_jeeringly679@dralias.com