[ovirt-users] Can we debug some truths/myths/facts about hosted-engine and gluster?

Andrew Lau andrew at andrewklau.com
Fri Jul 18 08:13:13 EDT 2014



On Fri, Jul 18, 2014 at 10:06 PM, Vijay Bellur <vbellur at redhat.com> wrote:

> [Adding gluster-devel]
>
>
> On 07/18/2014 05:20 PM, Andrew Lau wrote:
>
>> Hi all,
>>
>> As most of you have gathered from previous messages, hosted engine
>> won't work on gluster. A quote from BZ1097639:
>>
>> "Using hosted engine with Gluster backed storage is currently something
>> we really warn against.
>>
>>
>> I think this bug should be closed or re-targeted at documentation,
>> because there is nothing we can do here. Hosted engine assumes that all
>> writes are atomic and (immediately) available for all hosts in the cluster.
>> Gluster violates those assumptions.
>> "
>>
> I tried going through BZ1097639 but could not find much detail with
> respect to gluster there.
>
> A few questions around the problem:
>
> 1. Can somebody please explain in detail the scenario that causes the
> problem?
>
> 2. Is hosted engine performing synchronous writes to ensure that writes
> are durable?
>
> Also, if there is any documentation that details the hosted engine
> architecture that would help in enhancing our understanding of its
> interactions with gluster.
>
>
>>
>> Now my question, does this theory prevent a scenario of perhaps
>> something like a gluster replicated volume being mounted as a glusterfs
>> filesystem and then re-exported as the native kernel NFS share for the
>> hosted-engine to consume? It could then be possible to chuck ctdb in
>> there to provide a last resort failover solution. I have tried this myself
>> and suggested it to two people who are running a similar setup. They are
>> now using the native kernel NFS server for hosted-engine and haven't
>> reported as many issues. Could anyone validate my theory on this?
>>
>>
> If we obtain more details on the use case and obtain gluster logs from the
> failed scenarios, we should be able to understand the problem better. That
> could be the first step in validating your theory or evolving further
> recommendations :).
>
>
I'm not sure how useful this is, but Jiri Moskovcak tracked this down in
an off-list message.

Message quote:

==

We were able to track it down to this (thanks Andrew for providing the
testing setup):

-b686-4363-bb7e-dba99e5789b6/ha_agent service_type=hosted-engine'
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/listener.py", line 165, in handle
    response = "success " + self._dispatch(data)
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/listener.py", line 261, in _dispatch
    .get_all_stats_for_service_type(**options)
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py", line 41, in get_all_stats_for_service_type
    d = self.get_raw_stats_for_service_type(storage_dir, service_type)
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py", line 74, in get_raw_stats_for_service_type
    f = os.open(path, direct_flag | os.O_RDONLY)
OSError: [Errno 116] Stale file handle: '/rhev/data-center/mnt/localhost:_mnt_hosted-engine/c898fd2a-b686-4363-bb7e-dba99e5789b6/ha_agent/hosted-engine.metadata'

It's definitely connected to the storage, which leads us to gluster. I'm
not very familiar with gluster, so I need to check this with our gluster
gurus.

==
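
For reference, the failing call in the traceback above is an os.open() on
the metadata file, and errno 116 is ESTALE. Below is a minimal Python
sketch of how a reader could tolerate a transient stale handle by
retrying the open. The helper open_with_estale_retry and its retries,
delay and opener parameters are hypothetical names for illustration, not
part of ovirt_hosted_engine_ha, and I am assuming that direct_flag in the
traceback corresponds to os.O_DIRECT. Whether a retry helps at all depends
on why the handle went stale on the gluster/NFS side.

```python
import errno
import os
import time


def open_with_estale_retry(path, flags, retries=3, delay=0.5, opener=os.open):
    """Open path, retrying only when the call fails with ESTALE.

    Hypothetical helper for illustration; not part of ovirt_hosted_engine_ha.
    Returns whatever opener returns (a file descriptor for os.open).
    """
    attempt = 0
    while True:
        try:
            return opener(path, flags)
        except OSError as e:
            # Only a stale handle is retried. Any other error, or a stale
            # handle after the retry budget is spent, is re-raised.
            if e.errno != errno.ESTALE or attempt >= retries:
                raise
            attempt += 1
            time.sleep(delay)


# Assumption: direct_flag in the traceback means O_DIRECT. It is not
# defined on every platform, so fall back to 0 where it is missing.
direct_flag = getattr(os, "O_DIRECT", 0)
```

A caller would use it as
fd = open_with_estale_retry(path, direct_flag | os.O_RDONLY)
and close fd with os.close() afterwards. This only papers over the
symptom; it says nothing about why the handle becomes stale.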



> Thanks,
> Vijay
>