<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=ISO-8859-1">
</head>
<body text="#000000" bgcolor="#FFFFFF">
I am having trouble getting fencing to work on my HP DL180 G6
servers. They have ilo100 controllers. The documentation mentions
IPMI compliance, but in practice there are problems.<br>
<br>
The ipmilan driver gets a response, but it is the wrong response. A
status request results in the NMI line being asserted. An NMI is not
a hardware reset (these servers have no reset button anyway), but the
kernel does not expect it, as the log excerpts show.<br>
<br>
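To narrow this down, the same kind of status query can be issued by
hand, outside oVirt. The following is only a minimal sketch: it
assumes <code>ipmitool</code> is installed on the proxy host, the
function names are hypothetical, and the address and credentials in
the usage comment are placeholders, not values from this setup.<br>
<pre>
```python
import subprocess


def build_status_command(host, user, password, interface="lan"):
    """Build an ipmitool command line that asks a BMC for the chassis power state.

    interface "lan" selects IPMI v1.5, "lanplus" selects IPMI v2.0 (RMCP+).
    Note: passing the password with -P exposes it in the process list.
    """
    return [
        "ipmitool",
        "-I", interface,
        "-H", host,
        "-U", user,
        "-P", password,
        "chassis", "power", "status",
    ]


def query_power_status(host, user, password, interface="lan", timeout=20):
    """Run the status query and return (exit code, text printed by ipmitool)."""
    cmd = build_status_command(host, user, password, interface)
    result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    return result.returncode, result.stdout.strip()


# Hypothetical usage (placeholder address and credentials):
# code, text = query_power_status("192.0.2.10", "admin", "secret", interface="lanplus")
```
</pre>
Running the query once with "lan" and once with "lanplus" while
watching the BMC event log would show whether a plain status request
is enough to trigger the NMI.<br>
<br>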
Here are some log excerpts:<br>
<br>
<table style="text-align: left; width: 100%;" border="1"
cellpadding="2" cellspacing="2">
<tbody>
<tr>
<td style="vertical-align: top;">16:33<br>
</td>
<td style="vertical-align: top;">Just after re-running the
re-install
from the engine, which ended with:<br>
<b>From oVirt GUI "Events" tab<br>
</b>Host s1 installed<br>
State was set to up for host s1.<br>
Host s3 from cluster Default was <b>chosen</b> as a proxy
to execute Status
command on Host s1<br>
Host s1 power management was verified successfully<br>
</td>
</tr>
<tr>
<td style="vertical-align: top;">16:34<br>
</td>
<td style="vertical-align: top;"><b>on ssh screen:</b><br>
Message from syslogd@s1 at Jan 21 16:34:14 ...<br>
kernel:Uhhuh. NMI received for unknown reason 31 on CPU 0.<br>
<br>
Message from syslogd@s1 at Jan 21 16:34:14 ...<br>
kernel:Do you have a strange power saving mode enabled?<br>
<br>
Message from syslogd@s1 at Jan 21 16:34:14 ...<br>
kernel:Dazed and confused, but trying to continue<br>
<br>
<b>from IPMI web interface event log:</b><br>
<table id="Table1" class="datatable" cellpadding="5">
<tbody>
<tr>
<td>Generic</td>
<td>01/21/2014</td>
<td>21:34:15</td>
<td>Gen ID 0x21</td>
<td>Bus Uncorrectable Error</td>
<td>Assertion</td>
</tr>
<tr>
<td>Generic</td>
<td>01/21/2014</td>
<td>21:34:15</td>
<td>IOH_NMI_DETECT</td>
<td>State Asserted</td>
<td>Assertion</td>
</tr>
</tbody>
</table>
<b><br>
From oVirt GUI "Events" tab<br>
</b>Host s1 is non responsive<br>
Host s3 from cluster Default was chosen as a proxy to
execute Restart
command on Host s1<br>
Host s3 from cluster Default was chosen as a proxy to
execute Stop
command on Host s1<br>
Host s3 from cluster Default was chosen as a proxy to
execute Status
command on Host s1<br>
Host s1 was stopped by engine<br>
Manual fence for host s1 was started<br>
Host s3 from cluster Default was chosen as a proxy to
execute Status
command on Host s1<br>
Host s3 from cluster Default was chosen as a proxy to
execute Start
command on Host s1<br>
Host s3 from cluster Default was chosen as a proxy to
execute Status
command on Host s1<br>
Host s1 was started by engine<br>
Host s1 is rebooting<br>
State was set to up for host s1.<br>
Host s3 from cluster Default was chosen as a proxy to
execute Status
command on Host s1<br>
</td>
</tr>
<tr>
<td style="vertical-align: top;">16:41<br>
</td>
<td style="vertical-align: top;">Saw kernel panic output on
the remote KVM terminal.<br>
The computer then rebooted itself.</td>
</tr>
</tbody>
</table>
<br>
I have searched for ilo100 but found nothing related to oVirt, so I
have no idea which driver is the "correct" one for this hardware.<br>
<br>
So far I have seen this mostly on server1 (s1), but that is also the
one I have cycled up and down most often.<br>
<br>
I have also seen cases where the commands are apparently issued too
quickly (these servers boot fairly slowly). For example, I found that
one server was powered down while the boot process was still at the
stage where the RAID controller screen was up, so it had not had
time to complete the boot that was already in progress.<br>
<br>
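The timing problem described above could in principle be avoided by
waiting for the host to finish booting before any further power
command is sent. A minimal sketch of that idea follows. It is not how
oVirt actually schedules fence actions; the function name, the probe
callback and the default delays are all hypothetical.<br>
<pre>
```python
import time


def wait_until_booted(is_up, attempts=60, delay_seconds=10, sleep=time.sleep):
    """Poll a caller-supplied probe until the host reports that it is up.

    is_up is any zero-argument callable returning True once the host has
    finished booting. Returns True as soon as the probe succeeds, or False
    if every attempt fails, in which case no further power command should
    be assumed safe.
    """
    for _attempt in range(attempts):
        if is_up():
            return True
        # Give a slow-booting server time before asking again.
        sleep(delay_seconds)
    return False
```
</pre>
With the defaults this allows up to ten minutes for a boot. In
practice the probe might be something like a check that SSH answers
on the host, which is again only an assumption.<br>
<br>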
Ted Miller<br>
Elkhart, IN, USA<br>
<br>
</body>
</html>