2016-09-30 15:35 GMT+02:00 Michal Skrivanek <michal.skrivanek(a)redhat.com>:
That is a very low-level error, really pointing at HW issues. It may or may
not be detected by memtest, but I would give it a try.
I left memtest86 running for 2 days and no errors were detected :(
The only difference with this host (vmhost01) is that it was the first
host installed in my self-hosted engine installation. But I have already
reinstalled it from the GUI, and meanwhile I've upgraded to 4.0.4 from 4.0.3.
Does it happen only for the big 96GB VM? The others which you said are
working, are they all small?
It might be worth trying other system stability tests, playing with
safer/slower settings in the BIOS, using a lower CPU cluster level, etc.
Yep, it happens only for the 96GB VM. Other VMs with less RAM (16GB, for
example) can be created on or migrated to that host flawlessly. I'll try to
play a little with the BIOS settings, but otherwise I'll have the HW replaced.
I was only trying to rule out possible oVirt SW problems due to that host
being the first one I deployed (from the CLI) when I installed the cluster.
Thanks!
--
Davide Ferrari
Senior Systems Engineer