[Users] PROBLEM - host is restarted automatically by ovirt and becomes unresponsive

Doron Fediuck dfediuck at redhat.com
Wed Aug 7 15:28:49 UTC 2013



----- Original Message -----
| From: "Ricardo Esteves" <maverick.pt at gmail.com>
| To: Users at ovirt.org
| Sent: Wednesday, August 7, 2013 5:43:26 PM
| Subject: [Users] PROBLEM - host is restarted automatically by ovirt and becomes unresponsive
| 
| Good afternoon,
| 
| I'm having a problem with my hosts: at least once a week, the host that
| has all the VMs running restarts and becomes unresponsive.
| 
| After the oVirt engine sends the restart to the iLO, the host becomes
| unresponsive and the fans on the enclosure spin up like crazy.
| 
| The only way to get the blade back up is to stop it using the iLO or the
| Onboard Administrator, remove it from the enclosure, put it back in, and
| then issue the start from the oVirt GUI. If I only use stop/start on the
| iLO or the Onboard Administrator, the blade powers up but stays
| unresponsive: it doesn't show any image or any boot POST messages.
| 
| 
| Has anyone else seen this problem before?
| 
| BLADE ENCLOSURE: HP BladeSystem c3000
| BLADES: HP BL460c G6
| OS: CentOS 6.4 (64 bits)
| OVIRT: 3.2
| 
| 
| engine.log:
| 
| 2013-08-07 14:38:47,256 INFO [org.ovirt.engine.core.bll.storage.SetStoragePoolStatusCommand] (DefaultQuartzScheduler_Worker-7) [3b761f63] Running command: SetStoragePoolStatusCommand internal: true. Entities affected : ID: 06951dba-556b-4323-9356-819c9160fe8e Type: StoragePool
| 2013-08-07 14:38:47,257 ERROR [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (DefaultQuartzScheduler_Worker-8) vds::refreshVdsStats Failed getVdsStats, vds = 44d77dcb-b775-4aef-ae59-1dea8d5c691a : blade5, error = VDSNetworkException: java.net.NoRouteToHostException: No route to host
| 2013-08-07 14:38:47,263 WARN [org.ovirt.engine.core.vdsbroker.VdsManager] (DefaultQuartzScheduler_Worker-8) ResourceManager::refreshVdsRunTimeInfo::Failed to refresh VDS , vds = 44d77dcb-b775-4aef-ae59-1dea8d5c691a : blade5, VDS Network Error, continuing.
| java.net.NoRouteToHostException: No route to host
| 2013-08-07 14:38:50,252 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-7) [3b761f63] IrsBroker::Failed::GetStoragePoolInfoVDS due to: NoRouteToHostException: No route to host
| 2013-08-07 14:38:50,253 WARN [org.ovirt.engine.core.vdsbroker.VdsManager] (DefaultQuartzScheduler_Worker-10) [2b4cb7c5] ResourceManager::refreshVdsRunTimeInfo::Failed to refresh VDS , vds = 44d77dcb-b775-4aef-ae59-1dea8d5c691a : blade5, VDS Network Error, continuing.
| java.net.NoRouteToHostException: No route to host
| 2013-08-07 14:38:53,252 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-7) [3b761f63] Irs placed on server 44d77dcb-b775-4aef-ae59-1dea8d5c691a failed. Proceed Failover
| 2013-08-07 14:38:53,254 ERROR [org.ovirt.engine.core.vdsbroker.VdsManager] (DefaultQuartzScheduler_Worker-4) VDS::handleNetworkException Server failed to respond, vds_id = 44d77dcb-b775-4aef-ae59-1dea8d5c691a, vds_name = blade5, error = java.net.NoRouteToHostException: No route to host
| 2013-08-07 14:38:53,296 INFO [org.ovirt.engine.core.bll.VdsEventListener] (pool-3-thread-47) ResourceManager::vdsNotResponding entered for Host 44d77dcb-b775-4aef-ae59-1dea8d5c691a, 192.168.10.25
| 2013-08-07 14:38:53,299 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-7) [3b761f63] hostFromVds::selectedVds - blade6, spmStatus Free, storage pool VI-DataCenter
| 2013-08-07 14:38:53,308 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-7) [3b761f63] SPM Init: could not find reported vds or not up - pool:VI-DataCenter vds_spm_id: 1
| 2013-08-07 14:38:53,346 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-7) [3b761f63] SPM selection - vds seems as spm blade5
| 2013-08-07 14:38:53,355 WARN [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-7) [3b761f63] spm vds is non responsive, stopping spm selection.
| 2013-08-07 14:38:53,438 INFO [org.ovirt.engine.core.bll.FenceExecutor] (pool-3-thread-47) Using Host blade6 from CLUSTER as proxy to execute Restart command on Host blade5
| 2013-08-07 14:38:53,438 INFO [org.ovirt.engine.core.bll.FenceExecutor] (pool-3-thread-47) Executing <Status> Power Management command, Proxy Host:blade6, Agent:ilo, Target Host:blade5, Management IP:ilo5.vi.pt, User:Administrator, Options:secure=true
| 2013-08-07 14:38:53,457 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.FenceVdsVDSCommand] (pool-3-thread-47) START, FenceVdsVDSCommand(HostName = blade6, HostId = 2530f498-6029-496a-ab42-924ca2e3eb7f, targetVdsId = 44d77dcb-b775-4aef-ae59-1dea8d5c691a, action = Status, ip = ilo5.vi.pt, port = , type = ilo, user = Administrator, password = ******, options = 'secure=true'), log id: 41a729f3
| 2013-08-07 14:39:02,533 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.FenceVdsVDSCommand] (pool-3-thread-47) FINISH, FenceVdsVDSCommand, return: Test Succeeded, Host Status is: on, log id: 41a729f3
| 2013-08-07 14:39:02,541 INFO [org.ovirt.engine.core.bll.VdsNotRespondingTreatmentCommand] (pool-3-thread-47) Running command: VdsNotRespondingTreatmentCommand internal: true. Entities affected : ID: 44d77dcb-b775-4aef-ae59-1dea8d5c691a Type: VDS
| 2013-08-07 14:39:02,598 INFO [org.ovirt.engine.core.bll.StopVdsCommand] (pool-3-thread-47) [56fa00a1] Running command: StopVdsCommand internal: true. Entities affected : ID: 44d77dcb-b775-4aef-ae59-1dea8d5c691a Type: VDS
| 2013-08-07 14:39:02,619 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (pool-3-thread-47) [56fa00a1] START, SetVdsStatusVDSCommand(HostName = blade5, HostId = 44d77dcb-b775-4aef-ae59-1dea8d5c691a, status=Reboot, nonOperationalReason=NONE), log id: 20a49440
| 2013-08-07 14:39:02,622 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (pool-3-thread-47) [56fa00a1] VDS blade5 is spm and moved from up calling ResetIrs.
| 2013-08-07 14:39:02,622 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.ResetIrsVDSCommand] (pool-3-thread-47) [56fa00a1] START, ResetIrsVDSCommand( storagePoolId = 06951dba-556b-4323-9356-819c9160fe8e, ignoreFailoverLimit = false, compatabilityVersion = null, vdsId = 44d77dcb-b775-4aef-ae59-1dea8d5c691a, ignoreStopFailed = false), log id: 3b546d0a
| 2013-08-07 14:39:02,643 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (pool-3-thread-47) [56fa00a1] START, SpmStopVDSCommand(HostName = blade5, HostId = 44d77dcb-b775-4aef-ae59-1dea8d5c691a, storagePoolId = 06951dba-556b-4323-9356-819c9160fe8e), log id: 283e8812
| 2013-08-07 14:39:02,644 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (pool-3-thread-47) [56fa00a1] SpmStopVDSCommand:: vds blade5 is in Reboot status - not performing spm stop, pool id 06951dba-556b-4323-9356-819c9160fe8e
| 2013-08-07 14:39:02,644 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (pool-3-thread-47) [56fa00a1] FINISH, SpmStopVDSCommand, log id: 283e8812
| 2013-08-07 14:39:02,645 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.ResetIrsVDSCommand] (pool-3-thread-47) [56fa00a1] FINISH, ResetIrsVDSCommand, log id: 3b546d0a
| 2013-08-07 14:39:02,645 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (pool-3-thread-47) [56fa00a1] FINISH, SetVdsStatusVDSCommand, log id: 20a49440
| 2013-08-07 14:39:02,699 INFO [org.ovirt.engine.core.bll.FenceExecutor] (pool-3-thread-47) [56fa00a1] Using Host blade6 from CLUSTER as proxy to execute Stop command on Host blade5
| 2013-08-07 14:39:02,735 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (pool-3-thread-47) [56fa00a1] START, SpmStopVDSCommand(HostName = blade5, HostId = 44d77dcb-b775-4aef-ae59-1dea8d5c691a, storagePoolId = 06951dba-556b-4323-9356-819c9160fe8e), log id: 6bbeb9ca
| 2013-08-07 14:39:02,736 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (pool-3-thread-47) [56fa00a1] SpmStopVDSCommand:: vds blade5 is in Reboot status - not performing spm stop, pool id 06951dba-556b-4323-9356-819c9160fe8e
| 2013-08-07 14:39:02,736 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (pool-3-thread-47) [56fa00a1] FINISH, SpmStopVDSCommand, log id: 6bbeb9ca
| 2013-08-07 14:39:02,737 INFO [org.ovirt.engine.core.bll.FenceExecutor] (pool-3-thread-47) [56fa00a1] Executing <Stop> Power Management command, Proxy Host:blade6, Agent:ilo, Target Host:blade5, Management IP:ilo5.vi.pt, User:Administrator, Options:secure=true
| 2013-08-07 14:39:02,755 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.FenceVdsVDSCommand] (pool-3-thread-47) [56fa00a1] START, FenceVdsVDSCommand(HostName = blade6, HostId = 2530f498-6029-496a-ab42-924ca2e3eb7f, targetVdsId = 44d77dcb-b775-4aef-ae59-1dea8d5c691a, action = Stop, ip = ilo5.vi.pt, port = , type = ilo, user = Administrator, password = ******, options = 'secure=true'), log id: 6f3cc543
| 2013-08-07 14:39:03,388 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-1) [18d8826] hostFromVds::selectedVds - blade6, spmStatus Free, storage pool VI-DataCenter
| 2013-08-07 14:39:03,392 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-1) [18d8826] SPM Init: could not find reported vds or not up - pool:VI-DataCenter vds_spm_id: 1
| 2013-08-07 14:39:03,411 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-1) [18d8826] SPM selection - vds seems as spm blade5
| 2013-08-07 14:39:03,430 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (DefaultQuartzScheduler_Worker-1) [18d8826] START, SpmStopVDSCommand(HostName = blade5, HostId = 44d77dcb-b775-4aef-ae59-1dea8d5c691a, storagePoolId = 06951dba-556b-4323-9356-819c9160fe8e), log id: 5ae3209c
| 2013-08-07 14:39:03,431 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (DefaultQuartzScheduler_Worker-1) [18d8826] SpmStopVDSCommand:: vds blade5 is in Reboot status - not performing spm stop, pool id 06951dba-556b-4323-9356-819c9160fe8e
| 
| Best regards,
| Ricardo Esteves.
| 


Ricardo, see:

2013-08-07 14:38:47,257 ERROR [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (DefaultQuartzScheduler_Worker-8) vds::refreshVdsStats Failed getVdsStats, vds = 44d77dcb-b775-4aef-ae59-1dea8d5c691a : blade5, error = VDSNetworkException: java.net.NoRouteToHostException: No route to host

One of your hosts (blade5) is losing connectivity to the engine.
If you have HA VMs running there, the host will be fenced to allow
restarting the VMs on another host.
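
To check whether each fence is preceded by a connectivity loss, something
like the following rough sketch could be run against the engine log. It is
only an illustration, not an oVirt tool: the function name summarize() and
the event labels are made up here, and it assumes each log entry sits on a
single line, as in the log file itself (not as wrapped by a mail client).

```python
import re

# Start of an engine.log entry: timestamp, level, [logger class].
ENTRY_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+(\w+)\s+\[([\w.]+)\]"
)


def summarize(lines):
    """Return (timestamp, kind) tuples for network errors and fence actions.

    Hypothetical helper: 'network-error' and 'fence-<action>' are labels
    invented for this sketch, not values produced by oVirt.
    """
    events = []
    for line in lines:
        m = ENTRY_RE.match(line)
        if not m:
            # Continuation lines (e.g. a bare exception message) are skipped.
            continue
        ts, _level, logger = m.groups()
        if "NoRouteToHostException" in line:
            events.append((ts, "network-error"))
        elif logger.endswith("FenceExecutor") and "Executing <" in line:
            # Pull the action name out of "Executing <Stop> Power Management ..."
            action = line.split("Executing <", 1)[1].split(">", 1)[0]
            events.append((ts, "fence-" + action.lower()))
    return events
```

Calling summarize(open(path)) with the path to your engine.log (usually
/var/log/ovirt-engine/engine.log on the engine machine, but check your
installation) gives a timeline. If the network errors always come a few
seconds before the fence actions, as in the excerpt above, the thing to
investigate is the network path to the host rather than the fencing itself.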


