[Users] fencing: HP ilo100 status does NMI, reboots computer
Ted Miller
tmiller at hcjb.org
Wed Jan 22 15:31:22 UTC 2014
I am having trouble getting fencing to work on my HP DL180 G6 servers. They
have iLO 100 (LO100) controllers. The documentation mentions IPMI compliance,
but there are problems.
The ipmilan driver gets a response, but it is the wrong response. A status
request results in the NMI line being asserted, which on these servers has the
same effect as pressing a reset button (a button they don't even have): the
machine ends up rebooting.
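
For reference, the same kind of status query can be reproduced by hand, outside oVirt, with ipmitool. Below is a minimal sketch; the management address and credentials are placeholders, and it assumes ipmitool is installed on the machine doing the asking:

import subprocess

BMC_ADDR = "s1-lo100.example.com"   # placeholder LO100 management address
BMC_USER = "admin"                  # placeholder credentials
BMC_PASS = "password"

def ipmi(*args):
    """Run one ipmitool command against the LO100 over the lanplus interface."""
    cmd = ["ipmitool", "-I", "lanplus",
           "-H", BMC_ADDR, "-U", BMC_USER, "-P", BMC_PASS] + list(args)
    return subprocess.run(cmd, capture_output=True, text=True).stdout

# A chassis power status query is a read-only request; on a compliant BMC it
# should never touch the NMI line.
print(ipmi("chassis", "power", "status"))

# Dump the SEL afterwards to see whether the query by itself logged an NMI
# event like the ones in the IPMI event log below.
print(ipmi("sel", "list"))

If the NMI assertion shows up in the event log right after a bare status query like this, the misbehaviour is in the LO100 itself rather than in anything oVirt is sending.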
Here are some log excerpts:
16:33
Just after re-running re-install from the engine, which ended with:

From oVirt GUI "Events" tab:
Host s1 installed
State was set to up for host s1.
Host s3 from cluster Default was chosen as a proxy to execute Status command on Host s1
Host s1 power management was verified successfully
16:34
On ssh screen:
Message from syslogd at s1 at Jan 21 16:34:14 ...
kernel:Uhhuh. NMI received for unknown reason 31 on CPU 0.
Message from syslogd at s1 at Jan 21 16:34:14 ...
kernel:Do you have a strange power saving mode enabled?
Message from syslogd at s1 at Jan 21 16:34:14 ...
kernel:Dazed and confused, but trying to continue
From IPMI web interface event log:
Generic 01/21/2014 21:34:15 Gen ID 0x21 Bus Uncorrectable Error Assertion
Generic 01/21/2014 21:34:15 IOH_NMI_DETECT State Asserted Assertion
From oVirt GUI "Events" tab:
Host s1 is non responsive
Host s3 from cluster Default was chosen as a proxy to execute Restart command on Host s1
Host s3 from cluster Default was chosen as a proxy to execute Stop command on Host s1
Host s3 from cluster Default was chosen as a proxy to execute Status command on Host s1
Host s1 was stopped by engine
Manual fence for host s1 was started
Host s3 from cluster Default was chosen as a proxy to execute Status command on Host s1
Host s3 from cluster Default was chosen as a proxy to execute Start command on Host s1
Host s3 from cluster Default was chosen as a proxy to execute Status command on Host s1
Host s1 was started by engine
Host s1 is rebooting
State was set to up for host s1.
Host s3 from cluster Default was chosen as a proxy to execute Status command on Host s1
16:41
Saw kernel panic output on the remote KVM terminal; the computer then rebooted itself.
I have searched for ilo100 but found nothing related to oVirt, so I am clueless
as to what the "correct" driver for this hardware would be.
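
As far as I can tell, the ipmilan power management type maps to the generic fence_ipmilan agent from fence-agents, and that agent can be run by hand from the proxy host to see exactly what a Status command does to the LO100. Roughly (address and credentials are placeholders, and the lanplus flag is an assumption about how the engine is configured):

import subprocess

# Placeholder address and credentials for the LO100 on s1.
cmd = ["fence_ipmilan",
       "-a", "s1-lo100.example.com",  # BMC address (placeholder)
       "-l", "admin",                 # login (placeholder)
       "-p", "password",              # password (placeholder)
       "-P",                          # use lanplus (assumption)
       "-o", "status",                # same read-only operation the proxy issues
       "-v"]                          # verbose output
result = subprocess.run(cmd, capture_output=True, text=True)
print(result.stdout)
print(result.stderr)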
So far I have seen this mostly on server1 (s1), but that is also the one I
have cycled up and down most often.
I have also seen cases where the commands are apparently issued too quickly
(these servers are fairly slow to boot). For example, I found one server
powered down while the boot process was still at the RAID controller screen,
so it had not had time to complete the boot that was already in progress.
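
The kind of grace period that seems to be missing could be illustrated by waiting, after a power-on, until the host actually answers on the network before anything is allowed to issue the next power command. A purely illustrative sketch (the hostname and the ten-minute guess are placeholders; the real knob, presumably, is on the engine side):

import socket
import time

def wait_for_ssh(host, port=22, timeout=600, interval=15):
    """Poll until the host accepts a TCP connection on its SSH port."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            with socket.create_connection((host, port), timeout=5):
                return True
        except OSError:
            time.sleep(interval)
    return False

# Ten minutes is a guess at how long POST plus the RAID controller screen can
# take on one of these DL180 G6 boxes.
if not wait_for_ssh("s1.example.com", timeout=600):
    print("s1 still unreachable -- probably still working through POST/RAID init")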
Ted Miller
Elkhart, IN, USA