
On 04/22/2014 04:04 PM, Itamar Heim wrote:
On 04/14/2014 11:50 AM, René Koch wrote:
Hi,
I have some issues with hosted engine status.
oVirt hosts think that hosted engine is down because it seems that hosts can't write to hosted-engine.lockspace due to glusterfs issues (or at least I think so).
Here's the output of vm-status:
# hosted-engine --vm-status
--== Host 1 status ==--
Status up-to-date : False
Hostname : 10.0.200.102
Host ID : 1
Engine status : unknown stale-data
Score : 2400
Local maintenance : False
Host timestamp : 1397035677
Extra metadata (valid at timestamp):
    metadata_parse_version=1
    metadata_feature_version=1
    timestamp=1397035677 (Wed Apr 9 11:27:57 2014)
    host-id=1
    score=2400
    maintenance=False
    state=EngineUp
--== Host 2 status ==--
Status up-to-date : True
Hostname : 10.0.200.101
Host ID : 2
Engine status : {'reason': 'vm not running on this host', 'health': 'bad', 'vm': 'down', 'detail': 'unknown'}
Score : 0
Local maintenance : False
Host timestamp : 1397464031
Extra metadata (valid at timestamp):
    metadata_parse_version=1
    metadata_feature_version=1
    timestamp=1397464031 (Mon Apr 14 10:27:11 2014)
    host-id=2
    score=0
    maintenance=False
    state=EngineUnexpectedlyDown
    timeout=Mon Apr 14 10:35:05 2014
oVirt engine is sending me 2 emails every 10 minutes with the following subjects:
- ovirt-hosted-engine state transition EngineDown-EngineStart
- ovirt-hosted-engine state transition EngineStart-EngineUp
In oVirt webadmin I can see the following message: VM HostedEngine is down. Exit message: internal error Failed to acquire lock: error -243.
These messages are really annoying as oVirt isn't doing anything with hosted engine - I have an uptime of 9 days in my engine vm.
So my questions are now: Is it intended to send out these messages and detect that ovirt engine is down (which is false anyway), but not to restart the vm?
How can I disable notifications? I'm planning to write a Nagios plugin which parses the output of hosted-engine --vm-status and only Nagios should notify me, not hosted-engine script.
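A minimal sketch of how such a Nagios check could parse the vm-status output, assuming the "Key : Value" layout shown above. The function names, the sample text, and the mapping to Nagios states are illustrative choices, not part of hosted-engine. The value 'up' for the 'vm' field is also an assumption, since the output in this message only shows 'down'.

```python
import ast
import re

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

HOST_HEADER = re.compile(r"--== Host (\d+) status ==--")

# Shortened copy of the output quoted in this message.
SAMPLE = """\
--== Host 1 status ==--
Status up-to-date : False
Hostname : 10.0.200.102
Host ID : 1
Engine status : unknown stale-data
Score : 2400
--== Host 2 status ==--
Status up-to-date : True
Hostname : 10.0.200.101
Host ID : 2
Engine status : {'reason': 'vm not running on this host', 'health': 'bad', 'vm': 'down', 'detail': 'unknown'}
Score : 0
"""


def parse_vm_status(text):
    """Split vm-status output into one dict of fields per host ID."""
    hosts = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        header = HOST_HEADER.match(line)
        if header:
            current = hosts.setdefault(int(header.group(1)), {})
            continue
        # Only "Key : Value" lines are kept; metadata lines are skipped.
        if current is None or " : " not in line:
            continue
        key, value = line.split(" : ", 1)
        current[key.strip()] = value.strip()
    return hosts


def engine_vm_state(host):
    """Return the 'vm' field of the engine status, or None if unparsable."""
    raw = host.get("Engine status", "")
    try:
        status = ast.literal_eval(raw)
    except (ValueError, SyntaxError):
        return None  # e.g. "unknown stale-data"
    return status.get("vm") if isinstance(status, dict) else None


def evaluate(hosts):
    """Map parsed host data to a (Nagios exit code, message) pair."""
    if not hosts:
        return UNKNOWN, "UNKNOWN: no host sections found"
    fresh = [hid for hid, h in hosts.items()
             if h.get("Status up-to-date") == "True"]
    stale = sorted(set(hosts) - set(fresh))
    if not fresh:
        return UNKNOWN, "UNKNOWN: all hosts report stale data"
    # Assumption: a running engine VM is reported as 'vm': 'up'.
    running = [hid for hid in fresh if engine_vm_state(hosts[hid]) == "up"]
    if not running:
        return CRITICAL, "CRITICAL: no up-to-date host reports the engine VM as up"
    if stale:
        return WARNING, "WARNING: engine VM up on host %d, stale data from host(s) %s" % (
            running[0], ", ".join(str(h) for h in stale))
    return OK, "OK: engine VM up on host %d" % running[0]


code, message = evaluate(parse_vm_status(SAMPLE))
print(code, message)
```

A real plugin would feed the output of hosted-engine --vm-status into parse_vm_status() and exit with the returned code. Treating stale data on one host as a WARNING rather than CRITICAL is a policy choice and can be adjusted.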
Is it possible or planned to make the whole ha feature optional? I really really really hate cluster software as it causes more trouble than standalone machines, and in my case the hosted-engine ha feature really causes trouble (and I haven't had a hardware or network outage yet, only issues with the hosted-engine ha agent). I don't need any ha feature for hosted engine. I just want to run engine virtualized on oVirt, and if the engine vm fails (e.g. because of issues with a host) I'll restart it on another node.
Thanks, René
I'm pretty sure we removed hosted-engine on gluster due to concerns around the locking issues. Is the gluster volume configured with quorum to avoid split-brain?
At the moment there's no quorum (1 host online is enough - but GlusterFS network is on dedicated nics which are directly connected between two hosts), as I'm waiting for additional memory and disks for the other 2 nodes (so I have only 2 nodes atm).
But GlusterFS looks fine (now) - same for info heal-failed and info split-brain:
# gluster volume heal engine info
Gathering Heal info on volume engine has been successful
Brick ovirt-host01-gluster:/data/engine
Number of entries: 0
Brick ovirt-host02-gluster:/data/engine
Number of entries: 0
I can also create (touch) the lockspace file on the mounted GlusterFS volume - so imho GlusterFS isn't blocking libvirt.
Regards,
René