[ovirt-users] ovirt 3.6.6 and gluster 3.7.13

David Gossage dgossage at carouselchecks.com
Fri Jul 22 13:47:00 UTC 2016


I trimmed the logs down to roughly when I was shutting down the oVirt servers
for updates, which was 14:30 UTC on 2016-07-09.
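
For anyone wanting to trim their own logs the same way, here is a minimal
sketch in Python. It assumes each gluster log entry starts with a
"[YYYY-MM-DD HH:MM:SS.ffffff]" UTC timestamp, as in the excerpts quoted further
down. The function name trim_log and the first sample line are made up for
illustration; this is not the procedure actually used on the attached logs.

```python
import re
from datetime import datetime

# Gluster log entries begin with "[YYYY-MM-DD HH:MM:SS.ffffff]".
TS_RE = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.\d+\]")


def trim_log(lines, cutoff):
    """Yield lines belonging to entries stamped at or after `cutoff`.

    Lines without a timestamp (wrapped continuations) follow the
    decision made for the most recent timestamped line.
    """
    keep = False
    for line in lines:
        match = TS_RE.match(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            keep = stamp >= cutoff
        if keep:
            yield line


sample = [
    # Made-up entry, only here to show something being dropped.
    "[2016-07-09 14:10:00.000001] I [example] entry before the cutoff",
    "[2016-07-09 15:27:49.555466] W [MSGID: 114031]",
    "[client-rpc-fops.c:3050:client3_3_readv_cbk] 0-GLUSTER1-client-1: remote",
    "operation failed [Operation not permitted]",
]
trimmed = list(trim_log(sample, datetime(2016, 7, 9, 14, 30)))
for line in trimmed:
    print(line)
```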

Pre-update settings were

Volume Name: GLUSTER1
Type: Replicate
Volume ID: 167b8e57-28c3-447a-95cc-8410cbdf3f7f
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: ccgl1.gl.local:/gluster1/BRICK1/1
Brick2: ccgl2.gl.local:/gluster1/BRICK1/1
Brick3: ccgl3.gl.local:/gluster1/BRICK1/1
Options Reconfigured:
performance.readdir-ahead: on
storage.owner-uid: 36
storage.owner-gid: 36
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: enable
cluster.quorum-type: auto
cluster.server-quorum-type: server
server.allow-insecure: on
cluster.self-heal-window-size: 1024
cluster.background-self-heal-count: 16
performance.strict-write-ordering: off
nfs.disable: on
nfs.addr-namelookup: off
nfs.enable-ino32: off

At the time of the updates, ccgl3 was offline because of a bad NIC on the
server, but it had been that way for about a week with no issues on the volume.
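
On why the volume could stay writable with one brick down: as far as I
understand the documented behaviour of cluster.quorum-type: auto, client quorum
is met when more than half of the bricks in the replica set are up, with the
first brick acting as tie-breaker when exactly half are up. The sketch below
only models that rule. The function name is hypothetical, and it does not
cover cluster.server-quorum-type, which is a separate check.

```python
def client_quorum_met(bricks_up):
    """Model of the 'auto' client-quorum rule for one replica set.

    bricks_up: list of booleans in brick order (Brick1 first).
    This is an illustrative model, not gluster code.
    """
    total = len(bricks_up)
    if total == 0:
        return False
    active = sum(bricks_up)
    if 2 * active > total:
        return True
    # Exactly half up: assumed tie-breaker is the first brick.
    return 2 * active == total and bricks_up[0]


# Replica 3 with ccgl1 and ccgl2 up and ccgl3 down.
print(client_quorum_met([True, True, False]))
```

With two of the three bricks up, the rule as modelled here is satisfied, which
matches the volume running for a week without ccgl3.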

Shortly after the update I added these settings to enable sharding, but I do
not have any VM images sharded as of yet.
features.shard-block-size: 64MB
features.shard: on
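
To compare the configuration before and after a change like this, a small
sketch that parses the "Options Reconfigured" block of `gluster volume info`
output into a dict and reports what was added. The helper names parse_options
and added_options are hypothetical, and the "after" state is built by hand from
the two settings above rather than from real command output.

```python
def parse_options(volume_info_text):
    """Return the 'Options Reconfigured' block as a {name: value} dict."""
    options = {}
    in_block = False
    for raw in volume_info_text.splitlines():
        line = raw.strip()
        if line == "Options Reconfigured:":
            in_block = True
            continue
        if in_block and ": " in line:
            key, value = line.split(": ", 1)
            options[key] = value
    return options


def added_options(before, after):
    """Options present in `after` but not in `before`."""
    return {k: v for k, v in after.items() if k not in before}


# Shortened copy of the pre-update settings listed above.
pre = parse_options("""Volume Name: GLUSTER1
Options Reconfigured:
performance.readdir-ahead: on
cluster.quorum-type: auto
cluster.server-quorum-type: server
""")

# Hand-built post-update state: the same options plus the shard settings.
post = dict(pre)
post.update({"features.shard-block-size": "64MB", "features.shard": "on"})

print(added_options(pre, post))
```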




*David Gossage*
*Carousel Checks Inc. | System Administrator*
*Office* 708.613.2284

On Fri, Jul 22, 2016 at 5:00 AM, Krutika Dhananjay <kdhananj at redhat.com>
wrote:

> Hi David,
>
> Could you also share the brick logs from the affected volume? They're
> located at
> /var/log/glusterfs/bricks/<hyphenated-path-to-the-brick-directory>.log.
>
> Also, could you share the volume configuration (output of `gluster volume
> info <VOL>`) for the affected volume(s) AND at the time you actually saw
> this issue?
>
> -Krutika
>
>
>
>
> On Thu, Jul 21, 2016 at 11:23 PM, David Gossage <
> dgossage at carouselchecks.com> wrote:
>
>> On Thu, Jul 21, 2016 at 11:47 AM, Scott <romracer at gmail.com> wrote:
>>
>>> Hi David,
>>>
>>> My backend storage is ZFS.
>>>
>>> I thought about moving from FUSE to NFS mounts for my Gluster volumes to
>>> help test.  But since I use hosted engine this would be a real pain.  It's
>>> difficult to modify the storage domain type/path in the
>>> hosted-engine.conf.  And I don't want to go through the process of
>>> re-deploying hosted engine.
>>>
>>>
>> I found this
>>
>> https://bugzilla.redhat.com/show_bug.cgi?id=1347553
>>
>> Not sure if related.
>>
>> But I also have a zfs backend. Another user on the gluster mailing list who
>> had issues also used a zfs backend, although she used proxmox, and got it
>> working by changing the disk to writeback cache, I think it was.
>>
>> I also use hosted engine, but I run my gluster volume for HE on an LVM
>> volume separate from zfs, on xfs, and if I recall it did not have the issues
>> my gluster on zfs did.  I'm wondering now if the issue was zfs settings.
>>
>> Hopefully I should have a test machine up soon that I can play around with more.
>>
>> Scott
>>>
>>> On Thu, Jul 21, 2016 at 11:36 AM David Gossage <
>>> dgossage at carouselchecks.com> wrote:
>>>
>>>> What back end storage do you run gluster on?  xfs/zfs/ext4 etc?
>>>>
>>>> *David Gossage*
>>>> *Carousel Checks Inc. | System Administrator*
>>>> *Office* 708.613.2284
>>>>
>>>> On Thu, Jul 21, 2016 at 8:18 AM, Scott <romracer at gmail.com> wrote:
>>>>
>>>>> I get similar problems with oVirt 4.0.1 and hosted engine.  After
>>>>> upgrading all my hosts to Gluster 3.7.13 (client and server), I get the
>>>>> following:
>>>>>
>>>>> $ sudo hosted-engine --set-maintenance --mode=none
>>>>> Traceback (most recent call last):
>>>>>   File "/usr/lib64/python2.7/runpy.py", line 162, in
>>>>> _run_module_as_main
>>>>>     "__main__", fname, loader, pkg_name)
>>>>>   File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
>>>>>     exec code in run_globals
>>>>>   File
>>>>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_setup/set_maintenance.py",
>>>>> line 73, in <module>
>>>>>     if not maintenance.set_mode(sys.argv[1]):
>>>>>   File
>>>>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_setup/set_maintenance.py",
>>>>> line 61, in set_mode
>>>>>     value=m_global,
>>>>>   File
>>>>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py",
>>>>> line 259, in set_maintenance_mode
>>>>>     str(value))
>>>>>   File
>>>>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py",
>>>>> line 204, in set_global_md_flag
>>>>>     all_stats = broker.get_stats_from_storage(service)
>>>>>   File
>>>>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py",
>>>>> line 232, in get_stats_from_storage
>>>>>     result = self._checked_communicate(request)
>>>>>   File
>>>>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py",
>>>>> line 260, in _checked_communicate
>>>>>     .format(message or response))
>>>>> ovirt_hosted_engine_ha.lib.exceptions.RequestError: Request failed:
>>>>> failed to read metadata: [Errno 1] Operation not permitted
>>>>>
>>>>> If I only upgrade one host, then things will continue to work but my
>>>>> nodes are constantly healing shards.  My logs are also flooded with:
>>>>>
>>>>> [2016-07-21 13:15:14.137734] W [fuse-bridge.c:2227:fuse_readv_cbk]
>>>>> 0-glusterfs-fuse: 274714: READ => -1
>>>>> gfid=441f2789-f6b1-4918-a280-1b9905a11429 fd=0x7f19bc0041d0 (Operation not
>>>>> permitted)
>>>>> The message "W [MSGID: 114031]
>>>>> [client-rpc-fops.c:3050:client3_3_readv_cbk] 0-data-client-0: remote
>>>>> operation failed [Operation not permitted]" repeated 6 times between
>>>>> [2016-07-21 13:13:24.134985] and [2016-07-21 13:15:04.132226]
>>>>> The message "W [MSGID: 114031]
>>>>> [client-rpc-fops.c:3050:client3_3_readv_cbk] 0-data-client-1: remote
>>>>> operation failed [Operation not permitted]" repeated 8 times between
>>>>> [2016-07-21 13:13:34.133116] and [2016-07-21 13:15:14.137178]
>>>>> The message "W [MSGID: 114031]
>>>>> [client-rpc-fops.c:3050:client3_3_readv_cbk] 0-data-client-2: remote
>>>>> operation failed [Operation not permitted]" repeated 7 times between
>>>>> [2016-07-21 13:13:24.135071] and [2016-07-21 13:15:14.137666]
>>>>> [2016-07-21 13:15:24.134647] W [MSGID: 114031]
>>>>> [client-rpc-fops.c:3050:client3_3_readv_cbk] 0-data-client-0: remote
>>>>> operation failed [Operation not permitted]
>>>>> [2016-07-21 13:15:24.134764] W [MSGID: 114031]
>>>>> [client-rpc-fops.c:3050:client3_3_readv_cbk] 0-data-client-2: remote
>>>>> operation failed [Operation not permitted]
>>>>> [2016-07-21 13:15:24.134793] W [fuse-bridge.c:2227:fuse_readv_cbk]
>>>>> 0-glusterfs-fuse: 274741: READ => -1
>>>>> gfid=441f2789-f6b1-4918-a280-1b9905a11429 fd=0x7f19bc0038f4 (Operation not
>>>>> permitted)
>>>>> [2016-07-21 13:15:34.135413] W [fuse-bridge.c:2227:fuse_readv_cbk]
>>>>> 0-glusterfs-fuse: 274756: READ => -1
>>>>> gfid=441f2789-f6b1-4918-a280-1b9905a11429 fd=0x7f19bc0041d0 (Operation not
>>>>> permitted)
>>>>> [2016-07-21 13:15:44.141062] W [fuse-bridge.c:2227:fuse_readv_cbk]
>>>>> 0-glusterfs-fuse: 274818: READ => -1
>>>>> gfid=441f2789-f6b1-4918-a280-1b9905a11429 fd=0x7f19bc0038f4 (Operation not
>>>>> permitted)
>>>>> [2016-07-21 13:15:54.133582] W [MSGID: 114031]
>>>>> [client-rpc-fops.c:3050:client3_3_readv_cbk] 0-data-client-1: remote
>>>>> operation failed [Operation not permitted]
>>>>> [2016-07-21 13:15:54.133629] W [fuse-bridge.c:2227:fuse_readv_cbk]
>>>>> 0-glusterfs-fuse: 274853: READ => -1
>>>>> gfid=441f2789-f6b1-4918-a280-1b9905a11429 fd=0x7f19bc0036d8 (Operation not
>>>>> permitted)
>>>>> [2016-07-21 13:16:04.133666] W [fuse-bridge.c:2227:fuse_readv_cbk]
>>>>> 0-glusterfs-fuse: 274879: READ => -1
>>>>> gfid=441f2789-f6b1-4918-a280-1b9905a11429 fd=0x7f19bc0041d0 (Operation not
>>>>> permitted)
>>>>> [2016-07-21 13:16:14.134954] W [fuse-bridge.c:2227:fuse_readv_cbk]
>>>>> 0-glusterfs-fuse: 274894: READ => -1
>>>>> gfid=441f2789-f6b1-4918-a280-1b9905a11429 fd=0x7f19bc0036d8 (Operation not
>>>>> permitted)
>>>>>
>>>>> Scott
>>>>>
>>>>>
>>>>> On Thu, Jul 21, 2016 at 6:57 AM Frank Rothenstein <
>>>>> f.rothenstein at bodden-kliniken.de> wrote:
>>>>>
>>>>>> Hey David,
>>>>>>
>>>>>> I have the very same problem on my test cluster, despite running
>>>>>> oVirt 4.0.
>>>>>> If you access your volumes via NFS all is fine; the problem is FUSE. I
>>>>>> stayed on 3.7.13 but have no solution yet, so for now I use NFS.
>>>>>>
>>>>>> Frank
>>>>>>
>>>>>> On Thursday, 21.07.2016, at 04:28 -0500, David Gossage wrote:
>>>>>>
>>>>>> Is anyone running one of the recent 3.6.x lines with gluster 3.7.13?  I
>>>>>> am looking to upgrade gluster from 3.7.11->3.7.13 for some bug fixes, but
>>>>>> have been told by users on the gluster mailing list that, due to some
>>>>>> gluster changes, I'd need to change the disk parameters to use writeback
>>>>>> cache.  Something to do with aio support being removed.
>>>>>>
>>>>>> I believe this could be done with custom parameters?  But I believe
>>>>>> storage tests are done using dd, so would they fail with the current
>>>>>> settings then? On my last upgrade to 3.7.13 I had to roll back to 3.7.11
>>>>>> due to stability issues where the gluster storage would go into a down
>>>>>> state and always show N/A for space available/used, even though the hosts
>>>>>> still saw the storage and VMs were running on it on all 3 hosts.
>>>>>>
>>>>>> I saw a lot of messages like these, which went away once the gluster
>>>>>> rollback finished:
>>>>>>
>>>>>> [2016-07-09 15:27:46.935694] I [fuse-bridge.c:4083:fuse_init]
>>>>>> 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.22 kernel
>>>>>> 7.22
>>>>>> [2016-07-09 15:27:49.555466] W [MSGID: 114031]
>>>>>> [client-rpc-fops.c:3050:client3_3_readv_cbk] 0-GLUSTER1-client-1: remote
>>>>>> operation failed [Operation not permitted]
>>>>>> [2016-07-09 15:27:49.556574] W [MSGID: 114031]
>>>>>> [client-rpc-fops.c:3050:client3_3_readv_cbk] 0-GLUSTER1-client-0: remote
>>>>>> operation failed [Operation not permitted]
>>>>>> [2016-07-09 15:27:49.556659] W [fuse-bridge.c:2227:fuse_readv_cbk]
>>>>>> 0-glusterfs-fuse: 80: READ => -1 gfid=deb61291-5176-4b81-8315-3f1cf8e3534d
>>>>>> fd=0x7f5224002f68 (Operation not permitted)
>>>>>> [2016-07-09 15:27:59.612477] W [MSGID: 114031]
>>>>>> [client-rpc-fops.c:3050:client3_3_readv_cbk] 0-GLUSTER1-client-1: remote
>>>>>> operation failed [Operation not permitted]
>>>>>> [2016-07-09 15:27:59.613700] W [MSGID: 114031]
>>>>>> [client-rpc-fops.c:3050:client3_3_readv_cbk] 0-GLUSTER1-client-0: remote
>>>>>> operation failed [Operation not permitted]
>>>>>> [2016-07-09 15:27:59.613781] W [fuse-bridge.c:2227:fuse_readv_cbk]
>>>>>> 0-glusterfs-fuse: 168: READ => -1 gfid=deb61291-5176-4b81-8315-3f1cf8e3534d
>>>>>> fd=0x7f5224002f68 (Operation not permitted)
>>>>>>
>>>>>> *David Gossage*
>>>>>> *Carousel Checks Inc. | System Administrator*
>>>>>> *Office* 708.613.2284
>>>>>>
>>>>>> _______________________________________________
>>>>>> Users mailing list
>>>>>> Users at ovirt.org
>>>>>> http://lists.ovirt.org/mailman/listinfo/users
>>>>>>
>>>>>>
>>>>>>
>>>>>> ------------------------------
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> ______________________________________________________________________________
>>>>>> BODDEN-KLINIKEN Ribnitz-Damgarten GmbH
>>>>>> Sandhufe 2
>>>>>> 18311 Ribnitz-Damgarten
>>>>>>
>>>>>> Phone: 03821-700-0
>>>>>> Fax:       03821-700-240
>>>>>>
>>>>>> E-Mail: info at bodden-kliniken.de   Internet:
>>>>>> http://www.bodden-kliniken.de
>>>>>>
>>>>>>
>>>>>> Registered office: Ribnitz-Damgarten, District court: Stralsund, HRB 2919, Tax no.: 079/133/40188
>>>>>>
>>>>>> Chair of the supervisory board: Carmen Schröter, Managing director: Dr. Falko Milski
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>
>>
>>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: gluster1-BRICK1-1-ccgl1.log.gz
Type: application/x-gzip
Size: 447121 bytes
Desc: not available
URL: <http://lists.ovirt.org/pipermail/users/attachments/20160722/93880cae/attachment-0002.gz>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: gluster1-BRICK1-1-ccgl2.log.gz
Type: application/x-gzip
Size: 137219 bytes
Desc: not available
URL: <http://lists.ovirt.org/pipermail/users/attachments/20160722/93880cae/attachment-0003.gz>

