Hello
just for the record: after I had that server replaced (only
motherboard+RAM+controller, same disks), everything now works OK, so it was
definitely a hardware issue.
Thanks everyone for the troubleshooting help!
2016-10-04 18:06 GMT+02:00 Michal Skrivanek <michal.skrivanek(a)redhat.com>:
On 3 Oct 2016, at 10:39, Davide Ferrari <davide(a)billymob.com> wrote:
2016-09-30 15:35 GMT+02:00 Michal Skrivanek <michal.skrivanek(a)redhat.com>:
>
>
> that is a very low level error really pointing at HW issues. It may or
> may not be detected by memtest…but I would give it a try
>
>
I left memtest86 running for 2 days and no error detected :(
> The only difference that this host (vmhost01) has is that it was the
> first host installed in my self-hosted engine installation. But I have
> already reinstalled it from the GUI and meanwhile I've upgraded to 4.0.4 from
> 4.0.3.
>
>
> does it happen only for the big 96GB VM? The others which you said are
> working, are they all small?
> Might be worth trying other system stability tests, playing with
> safer/slower settings in BIOS, use lower CPU cluster, etc
>
>
Yep, it happens only for the 96GB VM. Other VMs with less RAM (16GB for
example) can be created on or migrated to that host flawlessly. I'll try to
play a little with BIOS settings but otherwise I'll have the HW replaced. I
was only trying to rule out possible oVirt SW problems due to that host
being the first I deployed (from CLI) when I installed the cluster.
I understand. Unfortunately it really does look like some sort of
HW incompatibility rather than a SW issue :/
Thanks!
--
Davide Ferrari
Senior Systems Engineer