Good afternoon,

I'm having a problem with my hosts: at least once a week, the host running all the VMs restarts and becomes unresponsive.

After the oVirt engine sends the restart to the iLO, the host becomes unresponsive and the enclosure fans spin up like crazy.

The only way to get the blade back up is to stop it using the iLO or the Onboard Administrator, remove it from the enclosure, put it back in, and then issue the start from the oVirt GUI. A plain stop/start from the iLO or the Onboard Administrator powers the blade up, but it stays unresponsive and shows no video output or POST messages.


Has anyone else seen this problem before?

BLADE ENCLOSURE: HP BladeSystem c3000
BLADES: HP BL460c G6
OS: CentOS 6.4 (64-bit)
OVIRT: 3.2


engine.log:

2013-08-07 14:38:47,256 INFO  [org.ovirt.engine.core.bll.storage.SetStoragePoolStatusCommand] (DefaultQuartzScheduler_Worker-7) [3b761f63] Running command: SetStoragePoolStatusCommand internal: true. Entities affected :  ID: 06951dba-556b-4323-9356-819c9160fe8e Type: StoragePool
2013-08-07 14:38:47,257 ERROR [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (DefaultQuartzScheduler_Worker-8) vds::refreshVdsStats Failed getVdsStats,  vds = 44d77dcb-b775-4aef-ae59-1dea8d5c691a : blade5, error = VDSNetworkException: java.net.NoRouteToHostException: No route to host
2013-08-07 14:38:47,263 WARN  [org.ovirt.engine.core.vdsbroker.VdsManager] (DefaultQuartzScheduler_Worker-8) ResourceManager::refreshVdsRunTimeInfo::Failed to refresh VDS , vds = 44d77dcb-b775-4aef-ae59-1dea8d5c691a : blade5, VDS Network Error, continuing.
java.net.NoRouteToHostException: No route to host
2013-08-07 14:38:50,252 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-7) [3b761f63] IrsBroker::Failed::GetStoragePoolInfoVDS due to: NoRouteToHostException: No route to host
2013-08-07 14:38:50,253 WARN  [org.ovirt.engine.core.vdsbroker.VdsManager] (DefaultQuartzScheduler_Worker-10) [2b4cb7c5] ResourceManager::refreshVdsRunTimeInfo::Failed to refresh VDS , vds = 44d77dcb-b775-4aef-ae59-1dea8d5c691a : blade5, VDS Network Error, continuing.
java.net.NoRouteToHostException: No route to host
2013-08-07 14:38:53,252 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-7) [3b761f63] Irs placed on server 44d77dcb-b775-4aef-ae59-1dea8d5c691a failed. Proceed Failover
2013-08-07 14:38:53,254 ERROR [org.ovirt.engine.core.vdsbroker.VdsManager] (DefaultQuartzScheduler_Worker-4) VDS::handleNetworkException Server failed to respond,  vds_id = 44d77dcb-b775-4aef-ae59-1dea8d5c691a, vds_name = blade5, error = java.net.NoRouteToHostException: No route to host
2013-08-07 14:38:53,296 INFO  [org.ovirt.engine.core.bll.VdsEventListener] (pool-3-thread-47) ResourceManager::vdsNotResponding entered for Host 44d77dcb-b775-4aef-ae59-1dea8d5c691a, 192.168.10.25
2013-08-07 14:38:53,299 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-7) [3b761f63] hostFromVds::selectedVds - blade6, spmStatus Free, storage pool VI-DataCenter
2013-08-07 14:38:53,308 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-7) [3b761f63] SPM Init: could not find reported vds or not up - pool:VI-DataCenter vds_spm_id: 1
2013-08-07 14:38:53,346 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-7) [3b761f63] SPM selection - vds seems as spm blade5
2013-08-07 14:38:53,355 WARN  [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-7) [3b761f63] spm vds is non responsive, stopping spm selection.
2013-08-07 14:38:53,438 INFO  [org.ovirt.engine.core.bll.FenceExecutor] (pool-3-thread-47) Using Host blade6 from CLUSTER as proxy to execute Restart command on Host blade5
2013-08-07 14:38:53,438 INFO  [org.ovirt.engine.core.bll.FenceExecutor] (pool-3-thread-47) Executing <Status> Power Management command, Proxy Host:blade6, Agent:ilo, Target Host:blade5, Management IP:ilo5.vi.pt, User:Administrator, Options:secure=true
2013-08-07 14:38:53,457 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.FenceVdsVDSCommand] (pool-3-thread-47) START, FenceVdsVDSCommand(HostName = blade6, HostId = 2530f498-6029-496a-ab42-924ca2e3eb7f, targetVdsId = 44d77dcb-b775-4aef-ae59-1dea8d5c691a, action = Status, ip = ilo5.vi.pt, port = , type = ilo, user = Administrator, password = ******, options = 'secure=true'), log id: 41a729f3
2013-08-07 14:39:02,533 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.FenceVdsVDSCommand] (pool-3-thread-47) FINISH, FenceVdsVDSCommand, return: Test Succeeded, Host Status is: on, log id: 41a729f3
2013-08-07 14:39:02,541 INFO  [org.ovirt.engine.core.bll.VdsNotRespondingTreatmentCommand] (pool-3-thread-47) Running command: VdsNotRespondingTreatmentCommand internal: true. Entities affected :  ID: 44d77dcb-b775-4aef-ae59-1dea8d5c691a Type: VDS
2013-08-07 14:39:02,598 INFO  [org.ovirt.engine.core.bll.StopVdsCommand] (pool-3-thread-47) [56fa00a1] Running command: StopVdsCommand internal: true. Entities affected :  ID: 44d77dcb-b775-4aef-ae59-1dea8d5c691a Type: VDS
2013-08-07 14:39:02,619 INFO  [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (pool-3-thread-47) [56fa00a1] START, SetVdsStatusVDSCommand(HostName = blade5, HostId = 44d77dcb-b775-4aef-ae59-1dea8d5c691a, status=Reboot, nonOperationalReason=NONE), log id: 20a49440
2013-08-07 14:39:02,622 INFO  [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (pool-3-thread-47) [56fa00a1] VDS blade5 is spm and moved from up calling ResetIrs.
2013-08-07 14:39:02,622 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.ResetIrsVDSCommand] (pool-3-thread-47) [56fa00a1] START, ResetIrsVDSCommand( storagePoolId = 06951dba-556b-4323-9356-819c9160fe8e, ignoreFailoverLimit = false, compatabilityVersion = null, vdsId = 44d77dcb-b775-4aef-ae59-1dea8d5c691a, ignoreStopFailed = false), log id: 3b546d0a
2013-08-07 14:39:02,643 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (pool-3-thread-47) [56fa00a1] START, SpmStopVDSCommand(HostName = blade5, HostId = 44d77dcb-b775-4aef-ae59-1dea8d5c691a, storagePoolId = 06951dba-556b-4323-9356-819c9160fe8e), log id: 283e8812
2013-08-07 14:39:02,644 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (pool-3-thread-47) [56fa00a1] SpmStopVDSCommand:: vds blade5 is in Reboot status - not performing spm stop, pool id 06951dba-556b-4323-9356-819c9160fe8e
2013-08-07 14:39:02,644 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (pool-3-thread-47) [56fa00a1] FINISH, SpmStopVDSCommand, log id: 283e8812
2013-08-07 14:39:02,645 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.ResetIrsVDSCommand] (pool-3-thread-47) [56fa00a1] FINISH, ResetIrsVDSCommand, log id: 3b546d0a
2013-08-07 14:39:02,645 INFO  [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (pool-3-thread-47) [56fa00a1] FINISH, SetVdsStatusVDSCommand, log id: 20a49440
2013-08-07 14:39:02,699 INFO  [org.ovirt.engine.core.bll.FenceExecutor] (pool-3-thread-47) [56fa00a1] Using Host blade6 from CLUSTER as proxy to execute Stop command on Host blade5
2013-08-07 14:39:02,735 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (pool-3-thread-47) [56fa00a1] START, SpmStopVDSCommand(HostName = blade5, HostId = 44d77dcb-b775-4aef-ae59-1dea8d5c691a, storagePoolId = 06951dba-556b-4323-9356-819c9160fe8e), log id: 6bbeb9ca
2013-08-07 14:39:02,736 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (pool-3-thread-47) [56fa00a1] SpmStopVDSCommand:: vds blade5 is in Reboot status - not performing spm stop, pool id 06951dba-556b-4323-9356-819c9160fe8e
2013-08-07 14:39:02,736 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (pool-3-thread-47) [56fa00a1] FINISH, SpmStopVDSCommand, log id: 6bbeb9ca
2013-08-07 14:39:02,737 INFO  [org.ovirt.engine.core.bll.FenceExecutor] (pool-3-thread-47) [56fa00a1] Executing <Stop> Power Management command, Proxy Host:blade6, Agent:ilo, Target Host:blade5, Management IP:ilo5.vi.pt, User:Administrator, Options:secure=true
2013-08-07 14:39:02,755 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.FenceVdsVDSCommand] (pool-3-thread-47) [56fa00a1] START, FenceVdsVDSCommand(HostName = blade6, HostId = 2530f498-6029-496a-ab42-924ca2e3eb7f, targetVdsId = 44d77dcb-b775-4aef-ae59-1dea8d5c691a, action = Stop, ip = ilo5.vi.pt, port = , type = ilo, user = Administrator, password = ******, options = 'secure=true'), log id: 6f3cc543
2013-08-07 14:39:03,388 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-1) [18d8826] hostFromVds::selectedVds - blade6, spmStatus Free, storage pool VI-DataCenter
2013-08-07 14:39:03,392 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-1) [18d8826] SPM Init: could not find reported vds or not up - pool:VI-DataCenter vds_spm_id: 1
2013-08-07 14:39:03,411 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-1) [18d8826] SPM selection - vds seems as spm blade5
2013-08-07 14:39:03,430 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (DefaultQuartzScheduler_Worker-1) [18d8826] START, SpmStopVDSCommand(HostName = blade5, HostId = 44d77dcb-b775-4aef-ae59-1dea8d5c691a, storagePoolId = 06951dba-556b-4323-9356-819c9160fe8e), log id: 5ae3209c
2013-08-07 14:39:03,431 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (DefaultQuartzScheduler_Worker-1) [18d8826] SpmStopVDSCommand:: vds blade5 is in Reboot status - not performing spm stop, pool id 06951dba-556b-4323-9356-819c9160fe8e

Best regards,
Ricardo Esteves.