On Thu, Oct 29, 2015 at 3:47 PM, Robert Story <rstory@tislabs.com> wrote:
On Thu, 29 Oct 2015 15:40:23 +0100 Simone wrote:
ST> Here the host IDs seem consistent.
ST> Can you please specify the names of the hosts where you took the logs in
ST> your first log archive (complaining host and engine host)?

Hmm.. I know the complaining host was posedion, and I'm pretty sure the
engine was running on ares.

And indeed ares was host 1, so when it failed it was correctly trying to
get the lock for host 1, but it seems that it had previously acquired a
lock as a different host.
Could you please check 
 grep host_id /etc/ovirt-hosted-engine/hosted-engine.conf
on ares and share vdsm and sanlock logs from that host?
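
If it is easier to compare several hosts at once, the same check can be
sketched in Python. This is only an illustration: it assumes the file
uses plain key=value lines (for example host_id=1), and read_host_id is
a hypothetical helper name, not part of any oVirt tool.

```python
def read_host_id(conf_text):
    """Return the host_id found in hosted-engine.conf-style text, or None.

    Assumption: the file consists of simple key=value lines, with
    optional blank lines and '#' comments.
    """
    for line in conf_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() == "host_id":
            return int(value.strip())
    return None


if __name__ == "__main__":
    # Path taken from the grep command above; run this on each host.
    with open("/etc/ovirt-hosted-engine/hosted-engine.conf") as conf:
        print(read_host_id(conf.read()))
```

Each host should report a different value; two hosts reporting the same
id would be worth looking into.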


MainThread::INFO::2015-10-27 09:14:56,764::hosted_engine::562::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_sanlock) Ensuring lease for lockspace hosted-engine, host id 1 is acquired (file: /var/run/vdsm/storage/2daba0ab-2b3d-4026-bcfc-1cd071c30038/04b08c8e-657f-4bac-9ddf-c9c57373409c/2d7f5020-42c1-442d-8237-fba9d6787080)
MainThread::ERROR::2015-10-27 09:14:56,766::hosted_engine::578::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_sanlock) cannot get lock on host id 1: host already holds lock on a different host id
MainThread::ERROR::2015-10-27 09:14:56,767::agent::177::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Error: '(22, 'Sanlock lockspace add failure', 'Invalid argument')' - trying to restart agent
MainThread::WARNING::2015-10-27 09:15:01,772::agent::180::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Restarting agent, attempt '9'
MainThread::ERROR::2015-10-27 09:15:01,772::agent::182::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Too many errors occurred, giving up. Please review the log and consider filing a bug.
MainThread::INFO::2015-10-27 09:15:01,773::agent::121::ovirt_hosted_engine_ha.agent.agent.Agent::(run) Agent shutting down
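
To see quickly which host id the agent asked for across a whole
agent.log, something like the following could help. It is a rough
sketch that only matches the two message texts shown in the excerpt
above; the function name and the regular expressions are my own and
may need adjusting for other versions of the agent.

```python
import re

# Patterns copied from the wording of the log lines quoted above.
LEASE_RE = re.compile(r"host id (\d+) is acquired")
ERROR_RE = re.compile(r"cannot get lock on host id (\d+)")


def summarize_host_ids(log_text):
    """Return (requested, failed): sets of host ids seen in the log text.

    requested: ids the agent tried to ensure a lease for.
    failed:    ids for which acquiring the lock was reported as failing.
    """
    requested, failed = set(), set()
    for line in log_text.splitlines():
        match = LEASE_RE.search(line)
        if match:
            requested.add(int(match.group(1)))
        match = ERROR_RE.search(line)
        if match:
            failed.add(int(match.group(1)))
    return requested, failed
```

If the requested set on one host ever contains more than one id, that
would fit the theory that the host previously took the lock under a
different id.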



 

Robert

--
Senior Software Engineer @ Parsons