Hi Gianluca,
I am facing two different cases. Let's call the first case "stuck VM" and
the second "fake 100% CPU". In both cases I have verified that I have no
storage issues: the Gluster volumes are up and accessible, and other VMs
(Windows 10 and Windows Server 2016) are running normally. The "stuck VM"
case I have observed more rarely. For the "fake 100% CPU" case, I suspect
it could be something with the guest agent drivers, or something between
qemu and Windows 10, since I have never seen it with Windows Server 2016
or Linux VMs.
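A rough way to tell the two cases apart is to check on the host whether the
qemu-kvm process itself is spinning. Something like the sketch below could
do it (only a sketch: it assumes it runs as root on the host carrying the
VM, that the VM name given as the first argument appears on the qemu-kvm
command line, and a USER_HZ of 100):

#!/usr/bin/env python
# Compare host-side CPU usage of the qemu-kvm process for a VM with what
# the guest itself reports. Assumptions: run as root on the host carrying
# the VM, the VM name (argv[1]) appears on the qemu-kvm command line, and
# USER_HZ is 100 (check with `getconf CLK_TCK`).
import subprocess
import sys
import time

vm_name = sys.argv[1]

def qemu_pid(name):
    # Match the qemu-kvm process by the VM name on its command line.
    out = subprocess.check_output(["pgrep", "-f", "qemu-kvm.*" + name])
    return int(out.decode().split()[0])

def cpu_ticks(pid):
    # utime + stime of the process (fields 14 and 15 of /proc/<pid>/stat).
    with open("/proc/%d/stat" % pid) as f:
        fields = f.read().split()
    return int(fields[13]) + int(fields[14])

pid = qemu_pid(vm_name)
before = cpu_ticks(pid)
time.sleep(5)
after = cpu_ticks(pid)
hz = 100  # assumed USER_HZ
print("qemu-kvm pid %d used about %.0f%% of one core over 5 seconds"
      % (pid, 100.0 * (after - before) / (5 * hz)))

If the process really sits near 100% of a core while the guest shows itself
idle, that would point at qemu or the guest drivers rather than at the
engine's reporting.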
Alex
On Mon, Jan 1, 2018 at 9:56 PM, Gianluca Cecchi <gianluca.cecchi(a)gmail.com>
wrote:
On Mon, Jan 1, 2018 at 8:43 PM, Alex K
<rightkicktech(a)gmail.com> wrote:
> Hi all and Happy New Year!
>
> I have an oVirt 4.1.3.5 cluster (running with 3 nodes and shared Gluster
> storage).
> I have randomly observed that some Windows 10 64-bit VMs are reported by
> the engine dashboard at 100% CPU, while the CPU utilization inside the VM
> is normal.
> Sometimes, when a VM is reported at 100% CPU, I cannot get a console to it
> (the console gives a black screen) and I have to force shutdown the VM and
> start it up again. The only warning I see is in the qemu log of the guest,
> reporting that some CPUs are not present in any NUMA node.
>
> Any ideas how to tackle this?
>
> Thanx,
> Alex
>
>
Hi Alex,
I have seen something similar, but on an iSCSI storage domain rather than a
GlusterFS one, when I had problems with the storage array (in my case a
firmware update that took too long): the VMs were paused and then
reactivated again after a few seconds.
For some of them the related qemu-kvm process went to a fixed 100% CPU
usage and I was unable to open a SPICE console (black screen). But in my
case the VM itself was also stuck: I could not connect to it over the
network or even ping it.
I had to force power off the VM and power it on again. Other VMs resumed
from the paused state without any apparent problem (apart from their clocks
being out of sync).
Both the good and the bad VMs had the oVirt guest agent running: they were
CentOS 6.5 VMs.
Perhaps your situation is something in between... verify that you didn't
have any problem with your storage and that your problematic VMs were not
paused/resumed because of it.
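For a quick check on each host, something along these lines could help (a
sketch only: the volume name "data", the log path and the grep keywords are
assumptions to adapt to your setup and vdsm version):

#!/usr/bin/env python
# Look for pause/abnormal-stop traces in the vdsm log and check gluster
# health. Assumptions: run as root on each host, "data" is the gluster
# volume name, and the keywords below match your vdsm version's log wording.
import subprocess

VOLUME = "data"                      # assumed gluster volume name
VDSM_LOG = "/var/log/vdsm/vdsm.log"  # default vdsm log location

# 1. Any sign that VMs were paused or abnormally stopped (e.g. I/O errors)?
for keyword in ("abnormal vm stop", "onIOError", "paused"):
    try:
        count = subprocess.check_output(["grep", "-ci", keyword, VDSM_LOG])
        print("'%s': %s matches in %s" % (keyword, count.decode().strip(),
                                          VDSM_LOG))
    except subprocess.CalledProcessError:
        # grep exits non-zero when there are no matches
        print("'%s': 0 matches in %s" % (keyword, VDSM_LOG))

# 2. Are all bricks up and is nothing pending heal on the gluster volume?
subprocess.call(["gluster", "volume", "status", VOLUME])
subprocess.call(["gluster", "volume", "heal", VOLUME, "info"])

The engine's Events tab should also show pause/resume events for the
affected VMs if storage really was the cause.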
Gianluca