On 04/09/2014 02:32 PM, Daniel Helgenberger wrote:
On Wed, 2014-04-09 at 09:18 +0200, Jiri Moskovcak wrote:
> On 04/08/2014 06:09 PM, Daniel Helgenberger wrote:
>> Hello,
>>
>> I have an oVirt 3.4 hosted engine lab setup which I am evaluating for
>> production use.
>>
>> I "simulated" an ungraceful shutdown of all HA nodes (power cut) while
>> the engine was running. After powering up, the system apparently did
>> not recover by itself.
>> I had to restart the ovirt-hosted-ha service (which was in a locked
>> state) and then manually run 'hosted-engine --vm-start'.
>>
>> What is the expected procedure after a shutdown (graceful / ungraceful)
>> of Hosted-Engine HA nodes? Should the engine recover by itself? Should
>> the running VMs be restarted automatically?
>
> When this happens, the agent should start the engine VM, and the engine
> should take care of restarting the VMs which were running on the
> restarted host and are marked as HA. Can you please provide the contents
> of /var/log/ovirt* from the host after the power cut, when the engine VM
> doesn't come up?
>
Hello Jirka,
I accidentally sent the previous message before pointing out the
interesting part, which is:
<<< start logging ha-agent after reboot:
/var/log/ovirt-hosted-engine-ha/agent.log:MainThread::INFO::2014-04-08
15:53:33,862::agent::52::ovirt_hosted_engine_ha.agent.agent.Agent::(run)
ovirt-hosted-engine-ha agent 1.1.2-1 started
/var/log/ovirt-hosted-engine-ha/agent.log:MainThread::INFO::2014-04-08
15:53:33,936::hosted_engine::223::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_hostname)
Found certificate common name: 192.168.50.201
/var/log/ovirt-hosted-engine-ha/agent.log:MainThread::INFO::2014-04-08
15:53:33,937::hosted_engine::363::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_broker)
Initializing ha-broker connection
/var/log/ovirt-hosted-engine-ha/agent.log:MainThread::INFO::2014-04-08
15:53:33,937::brokerlink::126::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
Starting monitor ping, options {'addr': '192.168.50.1'}
/var/log/ovirt-hosted-engine-ha/agent.log:MainThread::INFO::2014-04-08
15:53:33,939::brokerlink::137::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
Success, id 139700911299600
/var/log/ovirt-hosted-engine-ha/agent.log:MainThread::INFO::2014-04-08
15:53:33,939::brokerlink::126::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
Starting monitor mgmt-bridge, options {'use_ssl': 'true',
'bridge_name': 'ovirtmgmt', 'address': '0'}
/var/log/ovirt-hosted-engine-ha/agent.log:MainThread::INFO::2014-04-08
15:53:34,013::brokerlink::137::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
Success, id 139700911300304
/var/log/ovirt-hosted-engine-ha/agent.log:MainThread::INFO::2014-04-08
15:53:34,013::brokerlink::126::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
Starting monitor mem-free, options {'use_ssl': 'true', 'address':
'0'}
/var/log/ovirt-hosted-engine-ha/agent.log:MainThread::INFO::2014-04-08
15:53:34,015::brokerlink::137::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
Success, id 139700911300112
/var/log/ovirt-hosted-engine-ha/agent.log:MainThread::INFO::2014-04-08
15:53:34,015::brokerlink::126::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
Starting monitor cpu-load-no-engine, options {'use_ssl': 'true',
'vm_uuid': 'e68a11c8-1251-4c13-9e3b-3847bbb4fa3d', 'address':
'0'}
/var/log/ovirt-hosted-engine-ha/agent.log:MainThread::INFO::2014-04-08
15:53:34,018::brokerlink::137::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
Success, id 139700911300240
/var/log/ovirt-hosted-engine-ha/agent.log:MainThread::INFO::2014-04-08
15:53:34,018::brokerlink::126::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
Starting monitor engine-health, options {'use_ssl': 'true',
'vm_uuid': 'e68a11c8-1251-4c13-9e3b-3847bbb4fa3d', 'address':
'0'}
/var/log/ovirt-hosted-engine-ha/agent.log:MainThread::INFO::2014-04-08
15:53:34,024::brokerlink::137::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
Success, id 139700723857104
/var/log/ovirt-hosted-engine-ha/agent.log:MainThread::INFO::2014-04-08
15:53:34,024::hosted_engine::386::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_broker)
Broker initialized, all submonitors started
/var/log/ovirt-hosted-engine-ha/agent.log:MainThread::INFO::2014-04-08
15:53:34,312::hosted_engine::430::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_cond_start_service)
Starting vdsmd
/var/log/ovirt-hosted-engine-ha/agent.log:MainThread::CRITICAL::2014-04-08
15:53:34,442::agent::103::ovirt_hosted_engine_ha.agent.agent.Agent::(run) Could not start
ha-agent
(10 min nothing)
<<< here I did a 'service ovirt-hosted-ha start'
/var/log/ovirt-hosted-engine-ha/agent.log:MainThread::INFO::2014-04-08
15:59:16,698::agent::52::ovirt_hosted_engine_ha.agent.agent.Agent::(run)
ovirt-hosted-engine-ha agent 1.1.2-1 started
....
After this, things went quite smoothly.
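
For picking the failure out of a longer agent.log, a minimal Python sketch follows. It assumes each entry has been joined back onto a single line (the excerpt above was wrapped by the mail client) and that the fields are laid out as in that excerpt: thread, level, timestamp, module, line number, logger name, function, message. The names ENTRY, parse_entries and failures are illustrative only and are not part of ovirt-hosted-engine-ha.

```python
import re
from datetime import datetime

# Field layout inferred from the excerpt above (an assumption, not a
# documented format). An optional grep-style "file:" prefix is skipped
# because the pattern is applied with search() rather than match().
ENTRY = re.compile(
    r"(?P<thread>\w+)::(?P<level>[A-Z]+)::"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})::"
    r"(?P<module>\w+)::(?P<lineno>\d+)::(?P<logger>[\w.]+)::"
    r"\((?P<func>\w+)\) (?P<msg>.*)"
)


def parse_entries(lines):
    """Yield one dict per line that matches the assumed layout."""
    for line in lines:
        m = ENTRY.search(line)
        if m is None:
            continue  # not a log entry (e.g. a hand-written note)
        entry = m.groupdict()
        entry["ts"] = datetime.strptime(entry["ts"], "%Y-%m-%d %H:%M:%S,%f")
        yield entry


def failures(lines):
    """Return only the ERROR and CRITICAL entries."""
    return [e for e in parse_entries(lines) if e["level"] in ("ERROR", "CRITICAL")]


# Two entries from the excerpt above, unwrapped onto single lines.
SAMPLE = [
    "/var/log/ovirt-hosted-engine-ha/agent.log:MainThread::INFO::2014-04-08 "
    "15:53:34,312::hosted_engine::430::"
    "ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::"
    "(_cond_start_service) Starting vdsmd",
    "/var/log/ovirt-hosted-engine-ha/agent.log:MainThread::CRITICAL::2014-04-08 "
    "15:53:34,442::agent::103::ovirt_hosted_engine_ha.agent.agent.Agent::"
    "(run) Could not start ha-agent",
]

for e in failures(SAMPLE):
    print(e["ts"], e["level"], e["func"], e["msg"])
```

The timestamp of each hit can then be used to look up what other logs recorded at the same moment.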
Hi Daniel,
I noticed that in the log and was just about to ask whether that's when
you fixed it manually. Is there anything else around that time in
/var/log/messages which might be related to it?
Thanks,
Jirka
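
One possible way to narrow /var/log/messages down to the moment of the failure is sketched below in Python. It assumes the traditional syslog timestamp prefix ("Apr  8 15:53:34", 15 characters, no year) and an English month abbreviation; the year therefore has to be supplied by the caller. The function name syslog_window and the sample lines are hypothetical placeholders, not real log output from this host.

```python
from datetime import datetime, timedelta


def syslog_window(lines, center, minutes=2, year=2014):
    """Return the lines whose timestamp is within +/- `minutes` of `center`.

    Assumes each line starts with a 15-character traditional syslog
    timestamp such as "Apr  8 15:53:34"; lines without one are skipped.
    """
    delta = timedelta(minutes=minutes)
    hits = []
    for line in lines:
        try:
            # Prepend the year because the syslog prefix does not carry one.
            ts = datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # continuation line or a different timestamp format
        if abs(ts - center) <= delta:
            hits.append(line)
    return hits


# Placeholder lines for illustration only.
SYSLOG_SAMPLE = [
    "Apr  8 15:40:00 host1 example: placeholder A",
    "Apr  8 15:53:35 host1 example: placeholder B",
    "Apr  8 16:10:00 host1 example: placeholder C",
]

# Time of the CRITICAL "Could not start ha-agent" entry in agent.log.
failure_time = datetime(2014, 4, 8, 15, 53, 34)

for line in syslog_window(SYSLOG_SAMPLE, failure_time):
    print(line)
```

On a real host the lines would come from something like open("/var/log/messages"), read with sufficient privileges.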
> Thanks,
> Jirka
>
>>
>> Thanks,
>> Daniel