[ovirt-users] [Gluster-devel] Can we debug some truths/myths/facts about hosted-engine and gluster?

Mon Jul 21 09:09:08 UTC 2014

On 07/21/2014 02:08 PM, Jiri Moskovcak wrote:
> On 07/19/2014 08:58 AM, Pranith Kumar Karampuri wrote:
>>
>> On 07/19/2014 11:25 AM, Andrew Lau wrote:
>>>
>>>
>>> On Sat, Jul 19, 2014 at 12:03 AM, Pranith Kumar Karampuri
>>> <pkarampu at redhat.com> wrote:
>>>
>>>
>>>     On 07/18/2014 05:43 PM, Andrew Lau wrote:
>>>>      
>>>>
>>>>     On Fri, Jul 18, 2014 at 10:06 PM, Vijay Bellur
>>>>     <vbellur at redhat.com> wrote:
>>>>
>>>>         [Adding gluster-devel]
>>>>
>>>>
>>>>         On 07/18/2014 05:20 PM, Andrew Lau wrote:
>>>>
>>>>             Hi all,
>>>>
>>>>             As most of you have gathered from previous messages,
>>>>             hosted engine
>>>>             won't work on gluster. A quote from BZ1097639:
>>>>
>>>>             "Using hosted engine with Gluster backed storage is
>>>>             currently something
>>>>             we really warn against.
>>>>
>>>>
>>>>             I think this bug should be closed or re-targeted at
>>>>             documentation, because there is nothing we can do here.
>>>>             Hosted engine assumes that all writes are atomic and
>>>>             (immediately) available for all hosts in the cluster.
>>>>             Gluster violates those assumptions.
>>>>             "
>>>>
>>>>         I tried going through BZ1097639 but could not find much
>>>>         detail with respect to gluster there.
>>>>
>>>>         A few questions around the problem:
>>>>
>>>>         1. Can somebody please explain in detail the scenario that
>>>>         causes the problem?
>>>>
>>>>         2. Is hosted engine performing synchronous writes to ensure
>>>>         that writes are durable?
>>>>
>>>>         Also, if there is any documentation that details the hosted
>>>>         engine architecture that would help in enhancing our
>>>>         understanding of its interactions with gluster.
>>>>
>>>>
>>>>             
>>>>
>>>>             Now my question: does this theory rule out a scenario
>>>>             where a gluster replicated volume is mounted as a
>>>>             glusterfs filesystem and then re-exported as a native
>>>>             kernel NFS share for the hosted-engine to consume? It
>>>>             could then be possible to chuck ctdb in there to
>>>>             provide a last-resort failover solution. I have tried
>>>>             this myself and suggested it to two people who are
>>>>             running a similar setup. They are now using the native
>>>>             kernel NFS server for hosted-engine and haven't
>>>>             reported as many issues. Curious, could anyone
>>>>             validate my theory on this?
>>>>
>>>>
>>>>         If we obtain more details on the use case and obtain gluster
>>>>         logs from the failed scenarios, we should be able to
>>>>         understand the problem better. That could be the first step
>>>>         in validating your theory or evolving further
>>>>         recommendations :).
>>>>
>>>>
>>>>      I'm not sure how useful this is, but Jiri Moskovcak tracked
>>>>     this down in an off-list message.
>>>>
>>>>      Message Quote:
>>>>
>>>>      ==
>>>>
>>>>     We were able to track it down to this (thanks Andrew for
>>>>     providing the testing setup):
>>>>
>>>>     -b686-4363-bb7e-dba99e5789b6/ha_agent service_type=hosted-engine'
>>>>     Traceback (most recent call last):
>>>>       File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/listener.py", line 165, in handle
>>>>         response = "success " + self._dispatch(data)
>>>>       File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/listener.py", line 261, in _dispatch
>>>>         .get_all_stats_for_service_type(**options)
>>>>       File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py", line 41, in get_all_stats_for_service_type
>>>>         d = self.get_raw_stats_for_service_type(storage_dir, service_type)
>>>>       File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py", line 74, in get_raw_stats_for_service_type
>>>>         f = os.open(path, direct_flag | os.O_RDONLY)
>>>>     OSError: [Errno 116] Stale file handle: '/rhev/data-center/mnt/localhost:_mnt_hosted-engine/c898fd2a-b686-4363-bb7e-dba99e5789b6/ha_agent/hosted-engine.metadata'
>>>     Andrew/Jiri,
>>>             Would it be possible to post the gluster logs of both the
>>>     mount and the bricks on the bz? I can take a look at them. If I
>>>     can't gather anything from them, I will probably ask for your
>>>     help in re-creating the issue.
>>>
>>>     Pranith
>>>
>>>
>>> Unfortunately, I don't have the logs for that setup any more. I'll
>>> try to replicate it when I get a chance. If I understand the comment
>>> from the BZ, I don't think it's a gluster bug per se, more just how
>>> gluster does its replication.
>> hi Andrew,
>>           Thanks for that. I couldn't come to any conclusions because no
>> logs were available. It is unlikely that self-heal is involved because
>> there were no bricks going down/up according to the bug description.
>>
>
> Hi,
> I've never had such a setup. I guessed at a problem with gluster based on
> "OSError: [Errno 116] Stale file handle:", which happens when a file
> opened by an application on the client gets removed on the server. I'm
> pretty sure we (hosted-engine) don't remove that file, so I think it's
> some gluster magic moving the data around...
Hi,
Without bricks going up/down or new bricks being added, data is not
moved around by gluster unless a file operation comes to gluster. So I
am still not sure why this happened.
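
On the Errno 116 in the traceback: as a rough illustration only, a caller
could retry the open when it fails with ESTALE, since a fresh open repeats
the path lookup and may obtain a new handle. This is a minimal sketch, not
the hosted-engine broker code. The function name, the retry count, the delay
and the injectable opener are all hypothetical, and a retry only works
around the symptom without explaining why the handle went stale.

```python
import errno
import os
import time


def open_with_estale_retry(path, flags, retries=3, delay=0.1, opener=os.open):
    """Open path, retrying when the call fails with ESTALE.

    Hypothetical helper: retries, delay and opener are illustrative
    assumptions, not part of ovirt_hosted_engine_ha.
    """
    for attempt in range(retries + 1):
        try:
            return opener(path, flags)
        except OSError as exc:
            # Re-raise anything that is not a stale handle, and give up
            # once the retry budget is spent.
            if exc.errno != errno.ESTALE or attempt == retries:
                raise
            time.sleep(delay)
```

A caller mirroring the traceback might use something like
open_with_estale_retry(path, os.O_RDONLY | getattr(os, "O_DIRECT", 0)).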

Pranith
>
> --Jirka
>
>> Pranith
>>>
>>>
>>>>
>>>>     It's definitely connected to the storage, which leads us to
>>>>     gluster. I'm not very familiar with gluster, so I need to
>>>>     check this with our gluster gurus.
>>>>
>>>>     == 
>>>>
>>>>         Thanks,
>>>>         Vijay
>>>>
>>>>
>>>>
>>>>
>>>>     _______________________________________________
>>>>     Gluster-devel mailing list
>>>>     Gluster-devel at gluster.org
>>>> http://supercolony.gluster.org/mailman/listinfo/gluster-devel
>>>
>>>
>>
>