On 04/17/2014 09:40 AM, Jiri Moskovcak wrote:
On 04/17/2014 09:34 AM, René Koch wrote:
> On 04/15/2014 04:53 PM, Jiri Moskovcak wrote:
>> On 04/14/2014 10:50 AM, René Koch wrote:
>>> Hi,
>>>
>>> I have some issues with hosted engine status.
>>>
>>> oVirt hosts think that hosted engine is down because it seems that
>>> hosts can't write to hosted-engine.lockspace due to glusterfs issues
>>> (or at least I think so).
>>>
>>> Here's the output of vm-status:
>>>
>>> # hosted-engine --vm-status
>>>
>>>
>>> --== Host 1 status ==--
>>>
>>> Status up-to-date : False
>>> Hostname : 10.0.200.102
>>> Host ID : 1
>>> Engine status : unknown stale-data
>>> Score : 2400
>>> Local maintenance : False
>>> Host timestamp : 1397035677
>>> Extra metadata (valid at timestamp):
>>> metadata_parse_version=1
>>> metadata_feature_version=1
>>> timestamp=1397035677 (Wed Apr 9 11:27:57 2014)
>>> host-id=1
>>> score=2400
>>> maintenance=False
>>> state=EngineUp
>>>
>>>
>>> --== Host 2 status ==--
>>>
>>> Status up-to-date : True
>>> Hostname : 10.0.200.101
>>> Host ID : 2
>>> Engine status : {'reason': 'vm not running on this host', 'health': 'bad', 'vm': 'down', 'detail': 'unknown'}
>>> Score : 0
>>> Local maintenance : False
>>> Host timestamp : 1397464031
>>> Extra metadata (valid at timestamp):
>>> metadata_parse_version=1
>>> metadata_feature_version=1
>>> timestamp=1397464031 (Mon Apr 14 10:27:11 2014)
>>> host-id=2
>>> score=0
>>> maintenance=False
>>> state=EngineUnexpectedlyDown
>>> timeout=Mon Apr 14 10:35:05 2014
>>>
>>> oVirt engine is sending me 2 emails every 10 minutes with the following
>>> subjects:
>>> - ovirt-hosted-engine state transition EngineDown-EngineStart
>>> - ovirt-hosted-engine state transition EngineStart-EngineUp
>>>
>>> In oVirt webadmin I can see the following message:
>>> VM HostedEngine is down. Exit message: internal error Failed to acquire
>>> lock: error -243.
>>>
>>> These messages are really annoying as oVirt isn't doing anything with
>>> hosted engine - I have an uptime of 9 days in my engine vm.
>>>
>>> So my questions are now:
>>> Is it intended to send out these messages and detect that ovirt engine
>>> is down (which is false anyway), but not to restart the vm?
>>>
>>> How can I disable notifications? I'm planning to write a Nagios plugin
>>> which parses the output of hosted-engine --vm-status and only Nagios
>>> should notify me, not hosted-engine script.
>>>
>>> Is it possible or planned to make the whole ha feature optional? I
>>> really really really hate cluster software as it causes more trouble
>>> than standalone machines, and in my case the hosted-engine ha feature
>>> really causes trouble (and I haven't had a hardware or network outage
>>> yet, only issues with hosted-engine ha agent). I don't need any ha
>>> feature for hosted engine. I just want to run engine virtualized on
>>> oVirt and if engine vm fails (e.g. because of issues with a host) I'll
>>> restart it on another node.
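
For the Nagios plugin idea mentioned above, here is a minimal sketch of how the output of hosted-engine --vm-status could be parsed. It assumes the output keeps the layout shown in this thread ("--== Host N status ==--" headers, "Key : value" lines and "key=value" metadata lines). The function names parse_vm_status and check_engine are hypothetical, and the alerting policy (CRITICAL when no up-to-date host reports EngineUp, WARNING when some host has stale data) is only an assumption, not anything hosted-engine itself defines. The return codes follow the usual Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
import re

# Matches section headers such as "--== Host 1 status ==--".
HOST_HEADER = re.compile(r"--== Host (\d+) status ==--")


def parse_vm_status(text):
    """Parse `hosted-engine --vm-status` output into {host_id: {field: value}}.

    Assumes the layout shown in this thread; other versions may differ.
    """
    hosts = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        match = HOST_HEADER.search(line)
        if match:
            current = hosts.setdefault(int(match.group(1)), {})
            continue
        if current is None or not line:
            continue
        if " : " in line:
            # Summary lines, e.g. "Score : 2400"
            key, value = line.split(" : ", 1)
            current[key.strip()] = value.strip()
        elif "=" in line:
            # Extra metadata lines, e.g. "state=EngineUp"
            key, value = line.split("=", 1)
            current[key.strip()] = value.strip()
    return hosts


def check_engine(hosts):
    """Return (nagios_exit_code, message) under an assumed alerting policy."""
    if not hosts:
        return 3, "UNKNOWN: no host sections found"
    fresh = {h: f for h, f in hosts.items()
             if f.get("Status up-to-date") == "True"}
    stale = sorted(set(hosts) - set(fresh))
    up = sorted(h for h, f in fresh.items() if f.get("state") == "EngineUp")
    if not up:
        return 2, "CRITICAL: no up-to-date host reports EngineUp"
    if stale:
        return 1, "WARNING: engine up on host(s) %s, stale data from %s" % (up, stale)
    return 0, "OK: engine up on host(s) %s" % up
```

The text to parse could come from subprocess.run(["hosted-engine", "--vm-status"], capture_output=True, text=True).stdout, and a plugin would print the message and exit with the returned code. With the output quoted above, this policy would report CRITICAL, because host 1 has stale data and host 2 reports EngineUnexpectedlyDown.
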
>>
>> Hi, you can:
>> 1. edit /etc/ovirt-hosted-engine-ha/{agent,broker}-log.conf and tweak
>> the logger as you like
>> 2. or kill ovirt-ha-broker & ovirt-ha-agent services
>
> Thanks for the information.
> So engine is able to run when ovirt-ha-broker and ovirt-ha-agent aren't
> running?
>
- yes, it might cause some problems if you set up another host for
hosted engine and run the agent on the other host, but as long as you
don't have the agent running anywhere or you don't need to migrate the
engine vm, you should be fine.
Thanks!
At the moment I have an issue with ovirt-ha-broker going crazy and
not reacting to kill -9:
# ps aux | egrep -e '%CPU|\[ovirt-ha-broker\]' | grep -v grep
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
vdsm 3059 224 0.0 0 0 ? Zl Mar03 145536:45 [ovirt-ha-broker] <defunct>
# kill -9 3059
# ps aux | egrep -e '%CPU|\[ovirt-ha-broker\]' | grep -v grep
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
vdsm 3059 224 0.0 0 0 ? Zl Mar03 145545:17 [ovirt-ha-broker] <defunct>
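
A note on the ps output above: a defunct (Z) entry belongs to a process that has already exited, so kill -9 cannot remove it. It goes away only when its parent reaps it. The "l" flag in STAT marks a multi-threaded process, so the growing CPU time suggests that other threads of the broker are still running. As a rough sketch for looking into this on Linux, the following reads State, PPid and Threads from /proc/<pid>/status. The helper names parse_proc_status and summarize are hypothetical.

```python
def parse_proc_status(text):
    """Parse the "Key:<tab>value" lines of /proc/<pid>/status into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def summarize(pid):
    """Return state, parent PID and thread count for a PID (Linux only)."""
    with open("/proc/%d/status" % pid) as handle:
        fields = parse_proc_status(handle.read())
    return {
        "state": fields.get("State"),      # e.g. "Z (zombie)"
        "ppid": fields.get("PPid"),        # the process that has to reap it
        "threads": fields.get("Threads"),  # threads left in the thread group
    }
```

Calling summarize(3059) on the affected host would show which parent still holds the entry. Restarting or stopping that parent is usually what clears a defunct process.
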
--Jirka
>
> Regards,
> René
>
>>
>> --Jirka
>>>
>>> Thanks,
>>> René
>>>
>>>
>>