[ovirt-users] Self-hosted engine won't start
John Gardeniers
jgardeniers at objectmastery.com
Mon Jul 28 00:57:46 UTC 2014
Hi Jiri,
Version: ovirt-hosted-engine-ha-1.1.5-1.el6.noarch
Attached are the logs. Thanks for looking.
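In case a trimmed copy is more useful than the full archive, here is a rough sketch of how an agent log could be cut down to the entries since a given time. It assumes every entry starts with a `Thread::LEVEL::YYYY-MM-DD HH:MM:SS,mmm::` prefix, as in the excerpts quoted further down. The name `lines_since` and the cutoff value are only illustrative and are not part of ovirt-hosted-engine-ha.

```python
import re
from datetime import datetime

# Timestamp field of an agent log entry, e.g.
# "MainThread::INFO::2014-07-24 09:18:40,272::brokerlink::108::..."
# (format assumed from the excerpts in this thread)
TS_RE = re.compile(
    r"^[^:]+::[A-Z]+::(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d+::"
)


def lines_since(lines, cutoff):
    """Yield log lines whose entry is at or after `cutoff`.

    Lines without a timestamp prefix (continuations, tracebacks) are kept
    or dropped together with the entry that precedes them.
    """
    keep = False
    for line in lines:
        match = TS_RE.match(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            keep = stamp >= cutoff
        if keep:
            yield line
```

To use it, open one of the files under /var/log/ovirt-hosted-engine-ha/, pass the file object as `lines`, and pass a `datetime` for the time the services were last started as `cutoff`.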
Regards,
John
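
P.S. To make the tie-break suggestion from my original message (quoted below) more concrete, here is a minimal sketch. It is not how ovirt-hosted-engine-ha works. It assumes, purely as a guess from the log excerpts, that a host only starts the engine when its own score is strictly higher than the best remote score. The names `published_score`, `should_start_engine`, `tie_timeout` and `max_bonus` are all hypothetical.

```python
import random


def published_score(base_score, seconds_tied, rng, tie_timeout=300, max_bonus=10):
    """Return the score a host would advertise (hypothetical rule).

    Once the host has been tied with its peer for at least `tie_timeout`
    seconds, a small random bonus is added so that both hosts are unlikely
    to keep advertising identical scores.
    """
    if seconds_tied < tie_timeout:
        return base_score
    return base_score + rng.randint(1, max_bonus)


def should_start_engine(own_score, best_remote_score):
    """Assumed decision rule: start only when strictly better than the peer."""
    return own_score > best_remote_score


if __name__ == "__main__":
    # Two hosts with the same base score, both tied for ten minutes.
    host1 = published_score(2400, 600, random.Random())
    host2 = published_score(2400, 600, random.Random())
    print("host1 starts:", should_start_engine(host1, host2))
    print("host2 starts:", should_start_engine(host2, host1))
```

Both hosts can still draw the same bonus now and then, in which case they would simply draw again on the next monitoring cycle.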
On 25/07/14 17:47, Jiri Moskovcak wrote:
> On 07/24/2014 11:37 PM, John Gardeniers wrote:
>> Hi Jiri,
>>
>> Perhaps you can tell me how to determine the exact version of
>> ovirt-hosted-engine-ha.
>
> Centos/RHEL/Fedora: rpm -q ovirt-hosted-engine-ha
>
>> As for the logs, I am not going to attach 60MB
>> of logs to an email,
>
> - there are other ways to share the logs
>
>> nor can I see any imaginable reason for you wanting
>> to see them all, as the bulk is historical. I have already included the
>> *relevant* sections. However, if you think there may be some other
>> section that may help you feel free to be more explicit about what you
>> are looking for. Right now I fail to understand what you might hope to
>> see in logs from several weeks ago that you can't get from the last day
>> or so.
>>
>
> It's standard practice. People tend to think they know which part of a
> log is relevant, but in many cases they are wrong. Asking for the
> complete logs has proven faster than trying to locate the relevant
> part through the user. And you're right, I don't need the logs from
> last week, just the logs since the last start of the services when you
> observed the problem.
>
> Regards,
> Jirka
>
>> regards,
>> John
>>
>>
>> On 24/07/14 19:10, Jiri Moskovcak wrote:
>>> Hi, please provide the exact versions of ovirt-hosted-engine-ha
>>> and all logs from /var/log/ovirt-hosted-engine-ha/
>>>
>>> Thank you,
>>> Jirka
>>>
>>> On 07/24/2014 01:29 AM, John Gardeniers wrote:
>>>> Hi All,
>>>>
>>>> I have created a lab with 2 hypervisors and a self-hosted engine. Today
>>>> I followed the upgrade instructions as described in
>>>> http://www.ovirt.org/Hosted_Engine_Howto and rebooted the engine. I
>>>> didn't really do an upgrade but simply wanted to test what would happen
>>>> when the engine was rebooted.
>>>>
>>>> When the engine didn't restart I re-ran
>>>> hosted-engine --set-maintenance=none and restarted the vdsm,
>>>> ovirt-ha-agent and ovirt-ha-broker services on both nodes. 15 minutes
>>>> later it still hadn't restarted, so I then tried rebooting both
>>>> hypervisors. After an hour there was still no sign of the engine
>>>> starting. The agent logs don't help me much. The following bits are
>>>> repeated over and over.
>>>>
>>>> ovirt1 (192.168.19.20):
>>>>
>>>> MainThread::INFO::2014-07-24 09:18:40,272::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Trying: notify time=1406157520.27 type=state_transition detail=EngineDown-EngineDown hostname='ovirt1.om.net'
>>>> MainThread::INFO::2014-07-24 09:18:40,272::brokerlink::117::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Success, was notification of state_transition (EngineDown-EngineDown) sent? ignored
>>>> MainThread::INFO::2014-07-24 09:18:40,594::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state EngineDown (score: 2400)
>>>> MainThread::INFO::2014-07-24 09:18:40,594::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Best remote host 192.168.19.21 (id: 2, score: 2400)
>>>>
>>>> ovirt2 (192.168.19.21):
>>>>
>>>> MainThread::INFO::2014-07-24 09:18:04,005::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Trying: notify time=1406157484.01 type=state_transition detail=EngineDown-EngineDown hostname='ovirt2.om.net'
>>>> MainThread::INFO::2014-07-24 09:18:04,006::brokerlink::117::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Success, was notification of state_transition (EngineDown-EngineDown) sent? ignored
>>>> MainThread::INFO::2014-07-24 09:18:04,324::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state EngineDown (score: 2400)
>>>> MainThread::INFO::2014-07-24 09:18:04,324::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Best remote host 192.168.19.20 (id: 1, score: 2400)
>>>>
>>>> From the above information I decided to simply shut down one hypervisor
>>>> and see what happens. The engine did start back up again a few minutes
>>>> later.
>>>>
>>>> The interesting part is that each hypervisor seems to think the other is
>>>> a better host. The two machines are identical, so there's no reason I
>>>> can see for this odd behaviour. In a lab environment this is little more
>>>> than an annoying inconvenience. In a production environment it would be
>>>> completely unacceptable.
>>>>
>>>> May I suggest that this issue be looked into and some means found to
>>>> eliminate this kind of mutual exclusion? e.g. After a few minutes of
>>>> such an issue one hypervisor could be randomly given a slightly higher
>>>> weighting, which should result in it being chosen to start the engine.
>>>>
>>>> regards,
>>>> John
>>>> _______________________________________________
>>>> Users mailing list
>>>> Users at ovirt.org
>>>> http://lists.ovirt.org/mailman/listinfo/users
>>>>
>>>
>>>
>>> ______________________________________________________________________
>>> This email has been scanned by the Symantec Email Security.cloud
>>> service.
>>> For more information please visit http://www.symanteccloud.com
>>> ______________________________________________________________________
>>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ovirt-hosted-engine-ha.logs.tar.gz
Type: application/gzip
Size: 2857728 bytes
Desc: not available
URL: <http://lists.ovirt.org/pipermail/users/attachments/20140728/6b13a753/attachment-0001.bin>