[ovirt-users] One of my 2 identical Nodes keeps restarting every couple of hours

Simone Tiraboschi stirabos at redhat.com
Fri Mar 31 14:45:39 UTC 2017


Hi,
I see sanlock lease refresh issues before your reboots:

Mar 30 17:21:12 hype01 sanlock[1210]: 2017-03-30 17:21:12+0300 2109 [3805]:
s1 delta_renew read timeout 10 sec offset 0
/dev/463e6284-7676-432e-9eed-384587525846/ids
Mar 30 17:21:12 hype01 sanlock[1210]: 2017-03-30 17:21:12+0300 2109 [3805]:
s1 renewal error -202 delta_length 10 last_success 2079
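
To make the gap visible, here is a minimal Python sketch that pulls the "renewal error" entries out of a log excerpt and reports how far behind the last successful renewal they are. It assumes the number right after the timestamp (2109 above) is sanlock's monotonic clock in seconds, on the same scale as last_success; the function name renewal_gaps and the regular expression are only illustrative and not part of sanlock.

```python
import re

# Pattern for the "renewal error" entries as they appear in the excerpt above.
# Assumption: the number right after the timestamp is sanlock's monotonic
# clock in seconds, on the same scale as last_success.
RENEWAL_ERROR = re.compile(
    r"(?P<mono>\d+) \[\d+\]:\s+(?P<space>\S+) renewal error (?P<code>-?\d+) "
    r"delta_length (?P<delta>\d+) last_success (?P<last>\d+)"
)


def renewal_gaps(text):
    """Return (lockspace, error code, seconds since last success) per error."""
    # Long entries get wrapped in mail, so collapse all whitespace first.
    flat = " ".join(text.split())
    return [
        (m["space"], int(m["code"]), int(m["mono"]) - int(m["last"]))
        for m in RENEWAL_ERROR.finditer(flat)
    ]


sample = """Mar 30 17:21:12 hype01 sanlock[1210]: 2017-03-30 17:21:12+0300 2109 [3805]:
s1 renewal error -202 delta_length 10 last_success 2079"""

print(renewal_gaps(sample))
```

Under that assumption, the entry above works out to a lease that was last renewed 30 seconds earlier (2109 - 2079). Running the same function over the full log would show whether the gap keeps growing before each reboot.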

Can you please also share your sanlock logs?


On Fri, Mar 31, 2017 at 2:23 PM, George Mcro <george.mcro at experia.gr> wrote:

> Hello,
>
> My infrastructure consists of 2 ovirt-nodes and one ovirt-engine. All of
> them use CentOS 7.
>
> I have already configured the ovirt-engine and both ovirt-nodes. Before
> adding the two ovirt-nodes to the ovirt-engine, I installed this repo on
> both of them:
> yum install http://resources.ovirt.org/pub/yum-repo/ovirt-release41.rpm
>
> I want to clarify that the ovirt-engine is an HP ProLiant DL380 G6 and the 2
> ovirt-nodes are HP ProLiant DL380 G7. Also, the ovirt-node servers are
> identical in hardware (same motherboard, same HP model, same NICs, etc.).
>
> Now, the issue.
>
> Ovirt-Node no2 (hype02) operates perfectly for days with 4 VMs on it. But
> when I migrate VMs from ovirt-Node no2 (hype02) to ovirt-Node no1
> (hype01) to see if it can operate like hype02, hype01 restarts after a
> couple of hours (2-4).
>
>
>
> Ovirt engine event logs report :
>
> VDSM hype01 command GetStatsVDS failed: Heartbeat exceeded (hype01).
>
> or
>
> VDSM hype01 command GetStatsVDS failed: Connection issue
> java.rmi.ConnectException: Connection timeout
>
>
>
> I have tried almost every change I could think of. I reinstalled CentOS 7 a
> couple of times and upgraded the BIOS and iLO to the latest version.
> Moreover, I replaced the hard drives (with the same HP model), the
> motherboard, and the RAM, but nothing worked.
>
>
>
> Then, I tried something else. I put the server in Maintenance mode and
> voila, it was operating for 4 days straight without restarting.
>
>
>
> So, I do not know what is wrong, and sadly the logs do not help me understand.
>
> I will post here some log files, dmesg, messages, supervdsm and vdsm.
>
> Any ideas what the issue is here? Hardware or software? Any help would be
> appreciated.
>
>
> Kind regards,
>
> George Mcro
>