[ovirt-users] hosted engine health check issues

René Koch rkoch at linuxland.at
Tue Apr 22 14:15:31 UTC 2014


On 04/22/2014 04:04 PM, Itamar Heim wrote:
> On 04/14/2014 11:50 AM, René Koch wrote:
>> Hi,
>>
>> I have some issues with hosted engine status.
>>
>> oVirt hosts think that hosted engine is down because it seems that hosts
>> can't write to hosted-engine.lockspace due to glusterfs issues (or at
>> least I think so).
>>
>> Here's the output of vm-status:
>>
>> # hosted-engine --vm-status
>>
>>
>> --== Host 1 status ==--
>>
>> Status up-to-date                  : False
>> Hostname                           : 10.0.200.102
>> Host ID                            : 1
>> Engine status                      : unknown stale-data
>> Score                              : 2400
>> Local maintenance                  : False
>> Host timestamp                     : 1397035677
>> Extra metadata (valid at timestamp):
>>      metadata_parse_version=1
>>      metadata_feature_version=1
>>      timestamp=1397035677 (Wed Apr  9 11:27:57 2014)
>>      host-id=1
>>      score=2400
>>      maintenance=False
>>      state=EngineUp
>>
>>
>> --== Host 2 status ==--
>>
>> Status up-to-date                  : True
>> Hostname                           : 10.0.200.101
>> Host ID                            : 2
>> Engine status                      : {'reason': 'vm not running on this
>> host', 'health': 'bad', 'vm': 'down', 'detail': 'unknown'}
>> Score                              : 0
>> Local maintenance                  : False
>> Host timestamp                     : 1397464031
>> Extra metadata (valid at timestamp):
>>      metadata_parse_version=1
>>      metadata_feature_version=1
>>      timestamp=1397464031 (Mon Apr 14 10:27:11 2014)
>>      host-id=2
>>      score=0
>>      maintenance=False
>>      state=EngineUnexpectedlyDown
>>      timeout=Mon Apr 14 10:35:05 2014
>>
>> oVirt engine is sending me 2 emails every 10 minutes with the following
>> subjects:
>> - ovirt-hosted-engine state transition EngineDown-EngineStart
>> - ovirt-hosted-engine state transition EngineStart-EngineUp
>>
>> In oVirt webadmin I can see the following message:
>> VM HostedEngine is down. Exit message: internal error Failed to acquire
>> lock: error -243.
>>
>> These messages are really annoying as oVirt isn't doing anything with
>> hosted engine - I have an uptime of 9 days in my engine vm.
>>
>> So my questions are now:
>> Is it intended to send out these messages and detect that ovirt engine
>> is down (which is false anyway), but not to restart the vm?
>>
>> How can I disable notifications? I'm planning to write a Nagios plugin
>> which parses the output of hosted-engine --vm-status and only Nagios
>> should notify me, not hosted-engine script.
>>
>> Is it possible or planned to make the whole ha feature optional? I
>> really really really hate cluster software as it causes more troubles
>> than standalone machines and in my case the hosted-engine ha feature
>> really causes troubles (and I haven't had a hardware or network outage
>> yet, only issues with hosted-engine ha agent). I don't need any ha
>> feature for hosted engine. I just want to run engine virtualized on
>> oVirt and if engine vm fails (e.g. because of issues with a host) I'll
>> restart it on another node.
>>
>> Thanks,
>> René
>>
>>
>
> I'm pretty sure we removed hosted-engine on gluster due to concerns
> around the locking issues.
> Is the gluster volume configured with quorum to avoid split brains?
>

At the moment there is no quorum: one online host is enough. The GlusterFS
network runs on dedicated NICs that directly connect the two hosts. I only
have 2 nodes at the moment, as I'm still waiting for additional memory and
disks for the other 2 nodes.

But GlusterFS looks fine (now). heal info shows no pending entries, and
the same goes for info heal-failed and info split-brain:

# gluster volume heal engine info
Gathering Heal info on volume engine has been successful

Brick ovirt-host01-gluster:/data/engine
Number of entries: 0

Brick ovirt-host02-gluster:/data/engine
Number of entries: 0
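
To script this check, here is a minimal Python sketch that parses heal
info output like the above and tells whether any brick still has pending
entries. The function name parse_heal_info is made up for illustration,
and the sketch assumes the output keeps the "Brick ..." / "Number of
entries: N" layout shown here:

```python
def parse_heal_info(text):
    """Map each brick to its pending heal entry count.

    Assumes every 'Brick <name>' line is followed by a
    'Number of entries: <n>' line, as in the output above.
    """
    entries = {}
    brick = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Brick "):
            brick = line[len("Brick "):]
        elif line.startswith("Number of entries:") and brick is not None:
            entries[brick] = int(line.split(":", 1)[1])
            brick = None
    return entries


# Sample copied from the output above.
sample = """\
Gathering Heal info on volume engine has been successful

Brick ovirt-host01-gluster:/data/engine
Number of entries: 0

Brick ovirt-host02-gluster:/data/engine
Number of entries: 0
"""

pending = parse_heal_info(sample)
healthy = bool(pending) and all(n == 0 for n in pending.values())
print("heal info clean" if healthy else "heal entries pending", pending)
```

The same function could be run against the heal-failed and split-brain
variants, provided their output uses the same layout.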


I can also create (touch) the lockspace file on the mounted GlusterFS
volume, so in my opinion GlusterFS isn't blocking libvirt.
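
Regarding the Nagios plugin mentioned in my first mail: a rough Python
sketch of how the hosted-engine --vm-status output could be parsed and
mapped to Nagios exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).
The function names and the alerting policy are my own assumptions, and I
am also assuming that a healthy engine reports 'vm': 'up' and 'health':
'good' in the "Engine status" field. The sample is shortened from the
output quoted above, with the Host 2 engine status joined into one line:

```python
import ast
import re

HOST_HEADER = re.compile(r"^--== Host (\d+) status ==--$")
FIELD = re.compile(r"^(\S.*?)\s+:\s+(.*)$")  # "Key     : value", unindented


def parse_vm_status(text):
    """Return {host_id: {field_name: value}} for each host section."""
    hosts = {}
    current = None
    for raw in text.splitlines():
        header = HOST_HEADER.match(raw.strip())
        if header:
            current = hosts.setdefault(int(header.group(1)), {})
            continue
        field = FIELD.match(raw.rstrip())
        if field and current is not None:
            current[field.group(1)] = field.group(2)
    return hosts


def engine_is_up(status):
    """Assumption: a healthy engine reports vm=up and health=good."""
    try:
        info = ast.literal_eval(status)
    except (ValueError, SyntaxError):
        return False  # e.g. "unknown stale-data"
    return (isinstance(info, dict)
            and info.get("vm") == "up"
            and info.get("health") == "good")


def check(hosts):
    """Hypothetical policy mapping parsed status to (exit_code, message)."""
    if not hosts:
        return 3, "UNKNOWN - no host sections found"
    fresh = {h: f for h, f in hosts.items()
             if f.get("Status up-to-date") == "True"}
    up = [h for h, f in fresh.items()
          if engine_is_up(f.get("Engine status", ""))]
    if not up:
        return 2, "CRITICAL - no host with current data reports engine up"
    stale = sorted(set(hosts) - set(fresh))
    if stale:
        return 1, "WARNING - engine up on host %d, stale data from %s" % (
            up[0], stale)
    return 0, "OK - engine up on host %d" % up[0]


SAMPLE = """\
--== Host 1 status ==--

Status up-to-date                  : False
Hostname                           : 10.0.200.102
Engine status                      : unknown stale-data
Score                              : 2400

--== Host 2 status ==--

Status up-to-date                  : True
Hostname                           : 10.0.200.101
Engine status                      : {'reason': 'vm not running on this host', 'health': 'bad', 'vm': 'down', 'detail': 'unknown'}
Score                              : 0
"""

code, message = check(parse_vm_status(SAMPLE))
print(code, message)
```

A real plugin would read the command output instead of SAMPLE and exit
with the returned code. Note that with the status shown above this policy
ends up CRITICAL, so the plugin can only be as good as the status data.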


Regards,
René


