
On Thu, Oct 29, 2015 at 2:52 PM, Robert Story <rstory@tislabs.com> wrote:
On Thu, 29 Oct 2015 14:08:22 +0100 Simone wrote:
ST> It seems that two hosts are fighting for the same host ID:
ST>
ST> MainThread::INFO::2015-10-27 09:14:56,764::hosted_engine::562::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_sanlock) Ensuring lease for lockspace hosted-engine, host id 1 is acquired (file: /var/run/vdsm/storage/2daba0ab-2b3d-4026-bcfc-1cd071c30038/04b08c8e-657f-4bac-9ddf-c9c57373409c/2d7f5020-42c1-442d-8237-fba9d6787080)
ST> MainThread::ERROR::2015-10-27 09:14:56,766::hosted_engine::578::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_sanlock) cannot get lock on host id 1: host already holds lock on a different host id
ST> MainThread::ERROR::2015-10-27 09:14:56,767::agent::177::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Error: '(22, 'Sanlock lockspace add failure', 'Invalid argument')' - trying to restart agent
ST>
ST> Can you please share the output of: hosted-engine --vm-status
Hi Simone, thanks for taking the time to look at this. Here is the output:
# hosted-engine --vm-status
!! Cluster is in GLOBAL MAINTENANCE mode !!
--== Host 1 status ==--

Status up-to-date                  : False
Hostname                           : ares.netsec
Host ID                            : 1
Engine status                      : unknown stale-data
Score                              : 2334
Local maintenance                  : False
Host timestamp                     : 2496391
Extra metadata (valid at timestamp):
    metadata_parse_version=1
    metadata_feature_version=1
    timestamp=2496391 (Tue Oct 27 07:41:00 2015)
    host-id=1
    score=2334
    maintenance=False
    state=EngineUp

--== Host 2 status ==--

Status up-to-date                  : False
Hostname                           : hera.netsec
Host ID                            : 2
Engine status                      : unknown stale-data
Score                              : 1689
Local maintenance                  : False
Host timestamp                     : 2038037
Extra metadata (valid at timestamp):
    metadata_parse_version=1
    metadata_feature_version=1
    timestamp=2038037 (Mon Oct 26 08:50:13 2015)
    host-id=2
    score=1689
    maintenance=False
    state=EngineDown

--== Host 3 status ==--

Status up-to-date                  : False
Hostname                           : eclipse.netsec
Host ID                            : 3
Engine status                      : unknown stale-data
Score                              : 2000
Local maintenance                  : False
Host timestamp                     : 2298393
Extra metadata (valid at timestamp):
    metadata_parse_version=1
    metadata_feature_version=1
    timestamp=2298393 (Thu Oct 29 09:46:21 2015)
    host-id=3
    score=2000
    maintenance=False
    state=GlobalMaintenance

--== Host 4 status ==--

Status up-to-date                  : False
Hostname                           : poseidon.netsec
Host ID                            : 4
Engine status                      : unknown stale-data
Score                              : 2000
Local maintenance                  : False
Host timestamp                     : 123241
Extra metadata (valid at timestamp):
    metadata_parse_version=1
    metadata_feature_version=1
    timestamp=123241 (Thu Oct 29 09:46:30 2015)
    host-id=4
    score=2000
    maintenance=False
    state=GlobalMaintenance

--== Host 5 status ==--

Status up-to-date                  : False
Hostname                           : apollo.netsec
Host ID                            : 5
Engine status                      : unknown stale-data
Score                              : 2000
Local maintenance                  : False
Host timestamp                     : 2028116
Extra metadata (valid at timestamp):
    metadata_parse_version=1
    metadata_feature_version=1
    timestamp=2028116 (Mon Oct 26 04:14:46 2015)
    host-id=5
    score=2000
    maintenance=False
    state=EngineDown
Here the host IDs seem coherent. Can you please specify the names of the hosts where you took the logs in your first log archive (the complaining host and the engine host)?
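
For what it's worth, the "coherent" check can also be done mechanically on the pasted output. The following is only a rough sketch: the function names are made up for illustration, and it assumes the "Hostname : ..." / "Host ID : ..." layout shown in the output above. It only looks at what the shared status reports, not at what each host has configured locally.

```python
import re
from collections import defaultdict


def host_ids(status_text):
    """Map each reported Host ID to the list of hostnames claiming it.

    Assumes every host block contains a "Hostname : <name>" field
    immediately followed by a "Host ID : <number>" field.
    """
    pairs = re.findall(
        r"Hostname\s*:\s*(\S+)\s+Host ID\s*:\s*(\d+)", status_text
    )
    by_id = defaultdict(list)
    for hostname, host_id in pairs:
        by_id[int(host_id)].append(hostname)
    return dict(by_id)


def duplicated_ids(status_text):
    """Return only the Host IDs that more than one hostname reports."""
    return {
        host_id: names
        for host_id, names in host_ids(status_text).items()
        if len(names) > 1
    }
```

Feeding it the text of hosted-engine --vm-status and getting an empty dict back from duplicated_ids() would mean that no two hosts report the same ID in the shared metadata.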
Robert
-- Senior Software Engineer @ Parsons