[ovirt-users] ovirt 3.6.6 and gluster 3.7.13

Scott romracer at gmail.com
Sat Aug 13 19:08:28 EDT 2016


Nope, not in any official repo.  I only use those suggested by oVirt, i.e.:

http://centos.bhs.mirrors.ovh.net/ftp.centos.org/7/storage/x86_64/gluster-3.7/

No 3.7.14 there.  Thanks though.

Scott

On Sat, Aug 13, 2016 at 11:23 AM David Gossage <dgossage at carouselchecks.com>
wrote:

> On Sat, Aug 13, 2016 at 11:00 AM, David Gossage <
> dgossage at carouselchecks.com> wrote:
>
>> On Sat, Aug 13, 2016 at 8:19 AM, Scott <romracer at gmail.com> wrote:
>>
>>> Had a chance to upgrade my cluster to Gluster 3.7.14 and can confirm it
>>> works for me too where 3.7.12/13 did not.
>>>
>>> I did find that you should NOT turn off network.remote-dio or turn
>>> on performance.strict-o-direct as suggested earlier in the thread.  They
>>> will prevent dd (using the direct flag) and other things from working
>>> properly.  I'd leave those at network.remote-dio=enabled
>>> and performance.strict-o-direct=off.
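For anyone wanting to double-check these on their own volume, a minimal sketch (the volume name GLUSTER1 is taken from the example output later in the thread and is a placeholder here; `gluster volume get` needs a reasonably recent 3.7.x):

```shell
# Show the current values of the two options discussed above:
gluster volume get GLUSTER1 network.remote-dio
gluster volume get GLUSTER1 performance.strict-o-direct

# Put them back to the values recommended here, if they were changed:
gluster volume set GLUSTER1 network.remote-dio enable
gluster volume set GLUSTER1 performance.strict-o-direct off
```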
>>>
>>
>> Those were actually just suggested during a testing phase while trying to
>> trace down the issue.  I don't think either of those two has ever been
>> suggested as good practice, at least not for VM storage.
>>
>>
>>> Hopefully we can see Gluster 3.7.14 moved out of the testing repo soon.
>>>
>>
> Is it still in the testing repo? I updated my production cluster, I think
> two weeks ago, from the default repo on CentOS 7.
>
>
>>> Scott
>>>
>>> On Tue, Aug 2, 2016 at 9:05 AM, David Gossage <
>>> dgossage at carouselchecks.com> wrote:
>>>
>>>> So far gluster 3.7.14 seems to have resolved the issues, at least on my
>>>> test box.  dd commands that failed previously now work with sharding on a
>>>> ZFS backend.
>>>>
>>>> Where before I couldn't even mount a new storage domain, it now mounts
>>>> and I have a test VM being created.
>>>>
>>>> Still have to let the VM run for a few days and make sure no locking or
>>>> freezing occurs, but it looks hopeful so far.
>>>>
>>>> *David Gossage*
>>>> *Carousel Checks Inc. | System Administrator*
>>>> *Office* 708.613.2284
>>>>
>>>> On Tue, Jul 26, 2016 at 8:15 AM, David Gossage <
>>>> dgossage at carouselchecks.com> wrote:
>>>>
>>>>> On Tue, Jul 26, 2016 at 4:37 AM, Krutika Dhananjay <
>>>>> kdhananj at redhat.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> 1.  Could you please attach the glustershd logs from all three nodes?
>>>>>>
>>>>>
>>>>> Here are ccgl1 and ccgl2.  As previously mentioned, the third node,
>>>>> ccgl3, was down from a bad NIC, so no relevant logs would be on that node.
>>>>>
>>>>>
>>>>>>
>>>>>> 2. Also, so far what we know is that the 'Operation not permitted'
>>>>>> errors are on the main VM image itself and not its individual shards (e.g.
>>>>>> deb61291-5176-4b81-8315-3f1cf8e3534d). Could you do the following:
>>>>>> Get the inode number of
>>>>>> .glusterfs/de/b6/deb61291-5176-4b81-8315-3f1cf8e3534d (ls -li) from the
>>>>>> first brick. I'll call this number INODE_NUMBER.
>>>>>> Execute `find . -inum INODE_NUMBER` from the brick root on first
>>>>>> brick to print the hard links against the file in the prev step and share
>>>>>> the output.
>>>>>>
>>>>> [dgossage at ccgl1 ~]$ sudo ls -li
>>>>> /gluster1/BRICK1/1/.glusterfs/de/b6/deb61291-5176-4b81-8315-3f1cf8e3534d
>>>>> 16407 -rw-r--r--. 2 36 36 466 Jun  5 16:52
>>>>> /gluster1/BRICK1/1/.glusterfs/de/b6/deb61291-5176-4b81-8315-3f1cf8e3534d
>>>>> [dgossage at ccgl1 ~]$ cd /gluster1/BRICK1/1/
>>>>> [dgossage at ccgl1 1]$ sudo find . -inum 16407
>>>>> ./7c73a8dd-a72e-4556-ac88-7f6813131e64/dom_md/metadata
>>>>> ./.glusterfs/de/b6/deb61291-5176-4b81-8315-3f1cf8e3534d
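As an aside, the `.glusterfs/de/b6/...` location used above follows GlusterFS's gfid layout: the first two and next two hex characters of the gfid become the two directory levels under `.glusterfs`. A small, self-contained sketch of that mapping, using the gfid from this thread:

```shell
# Derive the .glusterfs backend path for a given gfid (illustrative only;
# run anywhere, no gluster installation needed).
gfid=deb61291-5176-4b81-8315-3f1cf8e3534d
d1=$(printf %s "$gfid" | cut -c1-2)   # first two hex chars -> "de"
d2=$(printf %s "$gfid" | cut -c3-4)   # next two hex chars  -> "b6"
echo ".glusterfs/$d1/$d2/$gfid"
# Prints: .glusterfs/de/b6/deb61291-5176-4b81-8315-3f1cf8e3534d
```

The `ls -li` / `find . -inum` pair above then reveals which user-visible file hard-links to that gfid entry (here, the domain's `dom_md/metadata`).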
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>> 3. Did you delete any vms at any point before or after the upgrade?
>>>>>>
>>>>>
>>>>> I'm pretty sure I deleted nothing immediately before or after, on the
>>>>> same day.  During the week prior I deleted a few dev VMs that were never
>>>>> set up, and some the week after the upgrade as I was preparing to move
>>>>> disks off and back onto storage to get them sharded; I felt it would be
>>>>> easier to just recreate some disks that had no data yet rather than move
>>>>> them off and on later.
>>>>>
>>>>>>
>>>>>> -Krutika
>>>>>>
>>>>>> On Mon, Jul 25, 2016 at 11:30 PM, David Gossage <
>>>>>> dgossage at carouselchecks.com> wrote:
>>>>>>
>>>>>>>
>>>>>>> On Mon, Jul 25, 2016 at 9:58 AM, Krutika Dhananjay <
>>>>>>> kdhananj at redhat.com> wrote:
>>>>>>>
>>>>>>>> OK, could you try the following:
>>>>>>>>
>>>>>>>> i. Set network.remote-dio to off
>>>>>>>>         # gluster volume set <VOL> network.remote-dio off
>>>>>>>>
>>>>>>>> ii. Set performance.strict-o-direct to on
>>>>>>>>         # gluster volume set <VOL> performance.strict-o-direct on
>>>>>>>>
>>>>>>>> iii. Stop the affected vm(s) and start again
>>>>>>>>
>>>>>>>> and tell me if you notice any improvement?
>>>>>>>>
>>>>>>>>
>>>>>>> The previous install I had the issue with is still on gluster 3.7.11.
>>>>>>>
>>>>>>> My test install of ovirt 3.6.7 and gluster 3.7.13, with 3 bricks on a
>>>>>>> local disk, right now isn't allowing me to add the gluster storage at all.
>>>>>>>
>>>>>>> I keep getting some type of UI error:
>>>>>>>
>>>>>>> 2016-07-25 12:49:09,277 ERROR
>>>>>>> [org.ovirt.engine.ui.frontend.server.gwt.OvirtRemoteLoggingService]
>>>>>>> (default task-33) [] Permutation name: 430985F23DFC1C8BE1C7FDD91EDAA785
>>>>>>> 2016-07-25 12:49:09,277 ERROR
>>>>>>> [org.ovirt.engine.ui.frontend.server.gwt.OvirtRemoteLoggingService]
>>>>>>> (default task-33) [] Uncaught exception: : java.lang.ClassCastException
>>>>>>>         at Unknown.ps(
>>>>>>> https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@3837)
>>>>>>>    at Unknown.ts(
>>>>>>> https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@20)
>>>>>>>      at Unknown.vs(
>>>>>>> https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@18)
>>>>>>>      at Unknown.iJf(
>>>>>>> https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@19)
>>>>>>>     at Unknown.Xab(
>>>>>>> https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@48)
>>>>>>>     at Unknown.P8o(
>>>>>>> https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@4447)
>>>>>>>   at Unknown.jQr(
>>>>>>> https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@21)
>>>>>>>     at Unknown.A8o(
>>>>>>> https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@51)
>>>>>>>     at Unknown.u8o(
>>>>>>> https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@101)
>>>>>>>    at Unknown.Eap(
>>>>>>> https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@10718)
>>>>>>>  at Unknown.p8n(
>>>>>>> https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@161)
>>>>>>>    at Unknown.Cao(
>>>>>>> https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@31)
>>>>>>>     at Unknown.Bap(
>>>>>>> https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@10469)
>>>>>>>  at Unknown.kRn(
>>>>>>> https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@49)
>>>>>>>     at Unknown.nRn(
>>>>>>> https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@438)
>>>>>>>    at Unknown.eVn(
>>>>>>> https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@40)
>>>>>>>     at Unknown.hVn(
>>>>>>> https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@25827)
>>>>>>>  at Unknown.MTn(
>>>>>>> https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@25)
>>>>>>>     at Unknown.PTn(
>>>>>>> https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@24052)
>>>>>>>  at Unknown.KJe(
>>>>>>> https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@21125)
>>>>>>>  at Unknown.Izk(
>>>>>>> https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@10384)
>>>>>>>  at Unknown.P3(
>>>>>>> https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@137)
>>>>>>>     at Unknown.g4(
>>>>>>> https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@8271)
>>>>>>>    at Unknown.<anonymous>(
>>>>>>> https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@65)
>>>>>>>     at Unknown._t(
>>>>>>> https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@29)
>>>>>>>      at Unknown.du(
>>>>>>> https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@57)
>>>>>>>      at Unknown.<anonymous>(
>>>>>>> https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@54
>>>>>>> )
>>>>>>>
>>>>>>>
>>>>>>>> -Krutika
>>>>>>>>
>>>>>>>> On Mon, Jul 25, 2016 at 4:57 PM, Samuli Heinonen <
>>>>>>>> samppah at neutraali.net> wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> > On 25 Jul 2016, at 12:34, David Gossage <
>>>>>>>>> dgossage at carouselchecks.com> wrote:
>>>>>>>>> >
>>>>>>>>> > On Mon, Jul 25, 2016 at 1:01 AM, Krutika Dhananjay <
>>>>>>>>> kdhananj at redhat.com> wrote:
>>>>>>>>> > Hi,
>>>>>>>>> >
>>>>>>>>> > Thanks for the logs. So I have identified one issue from the
>>>>>>>>> logs for which the fix is this:
>>>>>>>>> http://review.gluster.org/#/c/14669/. Because of a bug in the
>>>>>>>>> code, ENOENT was getting converted to EPERM and being propagated up the
>>>>>>>>> stack causing the reads to bail out early with 'Operation not permitted'
>>>>>>>>> errors.
>>>>>>>>> > I still need to find out two things:
>>>>>>>>> > i) why there was a readv() sent on a non-existent (ENOENT) file
>>>>>>>>> (this is important since some of the other users have not faced or reported
>>>>>>>>> this issue on gluster-users with 3.7.13)
>>>>>>>>> > ii) need to see if there's a way to work around this issue.
>>>>>>>>> >
>>>>>>>>> > Do you mind sharing the steps needed to be executed to run into
>>>>>>>>> this issue? This is so that we can apply our patches, test and ensure they
>>>>>>>>> fix the problem.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Unfortunately I can’t test this right away, nor give exact steps
>>>>>>>>> for testing it.  This is just a theory, but please correct me if you
>>>>>>>>> see any mistakes.
>>>>>>>>>
>>>>>>>>> oVirt uses cache=none settings for VMs by default, which requires
>>>>>>>>> direct I/O. oVirt also uses dd with iflag=direct to check that storage has
>>>>>>>>> direct I/O enabled. Problems exist with GlusterFS with sharding enabled and
>>>>>>>>> bricks running on ZFS on Linux. Everything seems to be fine with GlusterFS
>>>>>>>>> 3.7.11, and problems exist at least with versions .12 and .13. There have
>>>>>>>>> been some posts saying that GlusterFS 3.8.x is also affected.
>>>>>>>>>
>>>>>>>>> Steps to reproduce:
>>>>>>>>> 1. Sharded file is created with GlusterFS 3.7.11. Everything works
>>>>>>>>> ok.
>>>>>>>>> 2. GlusterFS is upgraded to 3.7.12+
>>>>>>>>> 3. Sharded file cannot be read or written with direct I/O enabled.
>>>>>>>>> (I.e., oVirt checks the storage connection with the command "dd
>>>>>>>>> if=/rhev/data-center/00000001-0001-0001-0001-0000000002b6/mastersd/dom_md/inbox
>>>>>>>>> iflag=direct,fullblock count=1 bs=1024000")
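The probe itself can be exercised against any filesystem to see what the check does; here is a minimal, self-contained sketch (the temporary file is generated locally and is not from the thread — whether the O_DIRECT read succeeds depends entirely on the underlying filesystem and mount):

```shell
# Create a 1000 KiB test file, then read it back the way oVirt's probe does.
tmp=$(mktemp)
dd if=/dev/zero of="$tmp" bs=1024000 count=1 2>/dev/null

# The oVirt-style check: a single direct-I/O read of one block.  On an
# affected sharded gluster-on-ZFS mount, this is the step that fails with
# "Operation not permitted".
dd if="$tmp" iflag=direct,fullblock count=1 bs=1024000 of=/dev/null \
  || echo "direct read failed (this filesystem may not support O_DIRECT)"

rm -f "$tmp"
```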
>>>>>>>>>
>>>>>>>>> Please let me know if you need more information.
>>>>>>>>>
>>>>>>>>> -samuli
>>>>>>>>>
>>>>>>>>> > Well, after the gluster upgrade all I did was start the ovirt hosts
>>>>>>>>> up, which launched and started their ha-agent and broker processes.  I
>>>>>>>>> don't believe I started getting any errors till it mounted GLUSTER1.  I had
>>>>>>>>> enabled sharding but had no sharded disk images yet.  Not sure if the check
>>>>>>>>> for shards would have caused that.  Unfortunately I can't just update this
>>>>>>>>> cluster and try to see what caused it, as it has some VMs users expect to
>>>>>>>>> be available in a few hours.
>>>>>>>>> >
>>>>>>>>> > I can see if I can get my test setup to recreate it.  I think
>>>>>>>>> I'll need to de-activate the data center so I can detach the storage that's
>>>>>>>>> on xfs and attach the one that's over zfs with sharding enabled.  My test is
>>>>>>>>> 3 bricks on the same local machine, with 3 different volumes, but I think
>>>>>>>>> I'm running into a sanlock issue or something, as it won't mount more than
>>>>>>>>> one volume that was created locally.
>>>>>>>>> >
>>>>>>>>> >
>>>>>>>>> > -Krutika
>>>>>>>>> >
>>>>>>>>> > On Fri, Jul 22, 2016 at 7:17 PM, David Gossage <
>>>>>>>>> dgossage at carouselchecks.com> wrote:
>>>>>>>>> > Trimmed the logs down to just about when I was shutting down the
>>>>>>>>> ovirt servers for updates, which was 14:30 UTC 2016-07-09.
>>>>>>>>> >
>>>>>>>>> > Pre-update settings were
>>>>>>>>> >
>>>>>>>>> > Volume Name: GLUSTER1
>>>>>>>>> > Type: Replicate
>>>>>>>>> > Volume ID: 167b8e57-28c3-447a-95cc-8410cbdf3f7f
>>>>>>>>> > Status: Started
>>>>>>>>> > Number of Bricks: 1 x 3 = 3
>>>>>>>>> > Transport-type: tcp
>>>>>>>>> > Bricks:
>>>>>>>>> > Brick1: ccgl1.gl.local:/gluster1/BRICK1/1
>>>>>>>>> > Brick2: ccgl2.gl.local:/gluster1/BRICK1/1
>>>>>>>>> > Brick3: ccgl3.gl.local:/gluster1/BRICK1/1
>>>>>>>>> > Options Reconfigured:
>>>>>>>>> > performance.readdir-ahead: on
>>>>>>>>> > storage.owner-uid: 36
>>>>>>>>> > storage.owner-gid: 36
>>>>>>>>> > performance.quick-read: off
>>>>>>>>> > performance.read-ahead: off
>>>>>>>>> > performance.io-cache: off
>>>>>>>>> > performance.stat-prefetch: off
>>>>>>>>> > cluster.eager-lock: enable
>>>>>>>>> > network.remote-dio: enable
>>>>>>>>> > cluster.quorum-type: auto
>>>>>>>>> > cluster.server-quorum-type: server
>>>>>>>>> > server.allow-insecure: on
>>>>>>>>> > cluster.self-heal-window-size: 1024
>>>>>>>>> > cluster.background-self-heal-count: 16
>>>>>>>>> > performance.strict-write-ordering: off
>>>>>>>>> > nfs.disable: on
>>>>>>>>> > nfs.addr-namelookup: off
>>>>>>>>> > nfs.enable-ino32: off
>>>>>>>>> >
>>>>>>>>> > At the time of the updates ccgl3 was offline from a bad NIC on the
>>>>>>>>> server, but had been so for about a week with no issues in the volume.
>>>>>>>>> >
>>>>>>>>> > Shortly after the update I added these settings to enable sharding,
>>>>>>>>> but did not as of yet have any sharded VM images:
>>>>>>>>> > features.shard-block-size: 64MB
>>>>>>>>> > features.shard: on
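For reference, the equivalent commands (the volume name is a placeholder; note that sharding applies only to files created after it is enabled — existing images stay unsharded until moved off and back onto the volume, as David describes elsewhere in this thread):

```shell
# Enable sharding and set the shard size on an existing volume:
gluster volume set GLUSTER1 features.shard on
gluster volume set GLUSTER1 features.shard-block-size 64MB

# Verify the setting; shard pieces land under the hidden .shard
# directory on each brick, with only the first 64MB at the file's path.
gluster volume get GLUSTER1 features.shard-block-size
```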
>>>>>>>>> >
>>>>>>>>> >
>>>>>>>>> >
>>>>>>>>> >
>>>>>>>>> > David Gossage
>>>>>>>>> > Carousel Checks Inc. | System Administrator
>>>>>>>>> > Office 708.613.2284
>>>>>>>>> >
>>>>>>>>> > On Fri, Jul 22, 2016 at 5:00 AM, Krutika Dhananjay <
>>>>>>>>> kdhananj at redhat.com> wrote:
>>>>>>>>> > Hi David,
>>>>>>>>> >
>>>>>>>>> > Could you also share the brick logs from the affected volume?
>>>>>>>>> They're located at
>>>>>>>>> /var/log/glusterfs/bricks/<hyphenated-path-to-the-brick-directory>.log.
>>>>>>>>> >
>>>>>>>>> > Also, could you share the volume configuration (output of
>>>>>>>>> `gluster volume info <VOL>`) for the affected volume(s), as it was at the
>>>>>>>>> time you actually saw this issue?
>>>>>>>>> >
>>>>>>>>> > -Krutika
>>>>>>>>> >
>>>>>>>>> >
>>>>>>>>> >
>>>>>>>>> >
>>>>>>>>> > On Thu, Jul 21, 2016 at 11:23 PM, David Gossage <
>>>>>>>>> dgossage at carouselchecks.com> wrote:
>>>>>>>>> > On Thu, Jul 21, 2016 at 11:47 AM, Scott <romracer at gmail.com>
>>>>>>>>> wrote:
>>>>>>>>> > Hi David,
>>>>>>>>> >
>>>>>>>>> > My backend storage is ZFS.
>>>>>>>>> >
>>>>>>>>> > I thought about moving from FUSE to NFS mounts for my Gluster
>>>>>>>>> volumes to help test.  But since I use hosted engine this would be a real
>>>>>>>>> pain.  It's difficult to modify the storage domain type/path in
>>>>>>>>> hosted-engine.conf, and I don't want to go through the process of
>>>>>>>>> re-deploying hosted engine.
>>>>>>>>> >
>>>>>>>>> >
>>>>>>>>> > I found this
>>>>>>>>> >
>>>>>>>>> > https://bugzilla.redhat.com/show_bug.cgi?id=1347553
>>>>>>>>> >
>>>>>>>>> > Not sure if related.
>>>>>>>>> >
>>>>>>>>> > But I also have a zfs backend; another user on the gluster mailing
>>>>>>>>> list with a zfs backend had issues too, although she used Proxmox and got it
>>>>>>>>> working by changing the disk to writeback cache, I think it was.
>>>>>>>>> >
>>>>>>>>> > I also use hosted engine, but I actually run my gluster volume for
>>>>>>>>> HE on an LVM separate from zfs, on xfs, and if I recall it did not have
>>>>>>>>> the issues my gluster on zfs did.  I'm wondering now if the issue was the
>>>>>>>>> zfs settings.
>>>>>>>>> >
>>>>>>>>> > Hopefully I should have a test machine up soon that I can play
>>>>>>>>> around with more.
>>>>>>>>> >
>>>>>>>>> > Scott
>>>>>>>>> >
>>>>>>>>> > On Thu, Jul 21, 2016 at 11:36 AM David Gossage <
>>>>>>>>> dgossage at carouselchecks.com> wrote:
>>>>>>>>> > What back end storage do you run gluster on?  xfs/zfs/ext4 etc?
>>>>>>>>> >
>>>>>>>>> > David Gossage
>>>>>>>>> > Carousel Checks Inc. | System Administrator
>>>>>>>>> > Office 708.613.2284
>>>>>>>>> >
>>>>>>>>> > On Thu, Jul 21, 2016 at 8:18 AM, Scott <romracer at gmail.com>
>>>>>>>>> wrote:
>>>>>>>>> > I get similar problems with oVirt 4.0.1 and hosted engine.
>>>>>>>>> After upgrading all my hosts to Gluster 3.7.13 (client and server), I get
>>>>>>>>> the following:
>>>>>>>>> >
>>>>>>>>> > $ sudo hosted-engine --set-maintenance --mode=none
>>>>>>>>> > Traceback (most recent call last):
>>>>>>>>> >   File "/usr/lib64/python2.7/runpy.py", line 162, in
>>>>>>>>> _run_module_as_main
>>>>>>>>> >     "__main__", fname, loader, pkg_name)
>>>>>>>>> >   File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
>>>>>>>>> >     exec code in run_globals
>>>>>>>>> >   File
>>>>>>>>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_setup/set_maintenance.py",
>>>>>>>>> line 73, in <module>
>>>>>>>>> >     if not maintenance.set_mode(sys.argv[1]):
>>>>>>>>> >   File
>>>>>>>>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_setup/set_maintenance.py",
>>>>>>>>> line 61, in set_mode
>>>>>>>>> >     value=m_global,
>>>>>>>>> >   File
>>>>>>>>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py",
>>>>>>>>> line 259, in set_maintenance_mode
>>>>>>>>> >     str(value))
>>>>>>>>> >   File
>>>>>>>>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py",
>>>>>>>>> line 204, in set_global_md_flag
>>>>>>>>> >     all_stats = broker.get_stats_from_storage(service)
>>>>>>>>> >   File
>>>>>>>>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py",
>>>>>>>>> line 232, in get_stats_from_storage
>>>>>>>>> >     result = self._checked_communicate(request)
>>>>>>>>> >   File
>>>>>>>>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py",
>>>>>>>>> line 260, in _checked_communicate
>>>>>>>>> >     .format(message or response))
>>>>>>>>> > ovirt_hosted_engine_ha.lib.exceptions.RequestError: Request
>>>>>>>>> failed: failed to read metadata: [Errno 1] Operation not permitted
>>>>>>>>> >
>>>>>>>>> > If I only upgrade one host, then things will continue to work
>>>>>>>>> but my nodes are constantly healing shards.  My logs are also flooded with:
>>>>>>>>> >
>>>>>>>>> > [2016-07-21 13:15:14.137734] W
>>>>>>>>> [fuse-bridge.c:2227:fuse_readv_cbk] 0-glusterfs-fuse: 274714: READ => -1
>>>>>>>>> gfid=4
>>>>>>>>> > 41f2789-f6b1-4918-a280-1b9905a11429 fd=0x7f19bc0041d0 (Operation
>>>>>>>>> not permitted)
>>>>>>>>> > The message "W [MSGID: 114031]
>>>>>>>>> [client-rpc-fops.c:3050:client3_3_readv_cbk] 0-data-client-0: remote
>>>>>>>>> operation failed [Operation not permitted]" repeated 6 times between
>>>>>>>>> [2016-07-21 13:13:24.134985] and [2016-07-21 13:15:04.132226]
>>>>>>>>> > The message "W [MSGID: 114031]
>>>>>>>>> [client-rpc-fops.c:3050:client3_3_readv_cbk] 0-data-client-1: remote
>>>>>>>>> operation failed [Operation not permitted]" repeated 8 times between
>>>>>>>>> [2016-07-21 13:13:34.133116] and [2016-07-21 13:15:14.137178]
>>>>>>>>> > The message "W [MSGID: 114031]
>>>>>>>>> [client-rpc-fops.c:3050:client3_3_readv_cbk] 0-data-client-2: remote
>>>>>>>>> operation failed [Operation not permitted]" repeated 7 times between
>>>>>>>>> [2016-07-21 13:13:24.135071] and [2016-07-21 13:15:14.137666]
>>>>>>>>> > [2016-07-21 13:15:24.134647] W [MSGID: 114031]
>>>>>>>>> [client-rpc-fops.c:3050:client3_3_readv_cbk] 0-data-client-0: remote
>>>>>>>>> operation failed [Operation not permitted]
>>>>>>>>> > [2016-07-21 13:15:24.134764] W [MSGID: 114031]
>>>>>>>>> [client-rpc-fops.c:3050:client3_3_readv_cbk] 0-data-client-2: remote
>>>>>>>>> operation failed [Operation not permitted]
>>>>>>>>> > [2016-07-21 13:15:24.134793] W
>>>>>>>>> [fuse-bridge.c:2227:fuse_readv_cbk] 0-glusterfs-fuse: 274741: READ => -1
>>>>>>>>> gfid=441f2789-f6b1-4918-a280-1b9905a11429 fd=0x7f19bc0038f4 (Operation not
>>>>>>>>> permitted)
>>>>>>>>> > [2016-07-21 13:15:34.135413] W
>>>>>>>>> [fuse-bridge.c:2227:fuse_readv_cbk] 0-glusterfs-fuse: 274756: READ => -1
>>>>>>>>> gfid=441f2789-f6b1-4918-a280-1b9905a11429 fd=0x7f19bc0041d0 (Operation not
>>>>>>>>> permitted)
>>>>>>>>> > [2016-07-21 13:15:44.141062] W
>>>>>>>>> [fuse-bridge.c:2227:fuse_readv_cbk] 0-glusterfs-fuse: 274818: READ => -1
>>>>>>>>> gfid=441f2789-f6b1-4918-a280-1b9905a11429 fd=0x7f19bc0038f4 (Operation not
>>>>>>>>> permitted)
>>>>>>>>> > [2016-07-21 13:15:54.133582] W [MSGID: 114031]
>>>>>>>>> [client-rpc-fops.c:3050:client3_3_readv_cbk] 0-data-client-1: remote
>>>>>>>>> operation failed [Operation not permitted]
>>>>>>>>> > [2016-07-21 13:15:54.133629] W
>>>>>>>>> [fuse-bridge.c:2227:fuse_readv_cbk] 0-glusterfs-fuse: 274853: READ => -1
>>>>>>>>> gfid=441f2789-f6b1-4918-a280-1b9905a11429 fd=0x7f19bc0036d8 (Operation not
>>>>>>>>> permitted)
>>>>>>>>> > [2016-07-21 13:16:04.133666] W
>>>>>>>>> [fuse-bridge.c:2227:fuse_readv_cbk] 0-glusterfs-fuse: 274879: READ => -1
>>>>>>>>> gfid=441f2789-f6b1-4918-a280-1b9905a11429 fd=0x7f19bc0041d0 (Operation not
>>>>>>>>> permitted)
>>>>>>>>> > [2016-07-21 13:16:14.134954] W
>>>>>>>>> [fuse-bridge.c:2227:fuse_readv_cbk] 0-glusterfs-fuse: 274894: READ => -1
>>>>>>>>> gfid=441f2789-f6b1-4918-a280-1b9905a11429 fd=0x7f19bc0036d8 (Operation not
>>>>>>>>> permitted)
>>>>>>>>> >
>>>>>>>>> > Scott
>>>>>>>>> >
>>>>>>>>> >
>>>>>>>>> > On Thu, Jul 21, 2016 at 6:57 AM Frank Rothenstein <
>>>>>>>>> f.rothenstein at bodden-kliniken.de> wrote:
>>>>>>>>> > Hey David,
>>>>>>>>> >
>>>>>>>>> > I have the very same problem on my test cluster, despite running
>>>>>>>>> ovirt 4.0.
>>>>>>>>> > If you access your volumes via NFS all is fine; the problem is FUSE.
>>>>>>>>> I stayed on 3.7.13 but have no solution yet, so now I use NFS.
>>>>>>>>> >
>>>>>>>>> > Frank
>>>>>>>>> >
>>>>>>>>> > On Thursday, 21.07.2016, at 04:28 -0500, David Gossage wrote:
>>>>>>>>> >> Is anyone running one of the recent 3.6.x lines with gluster
>>>>>>>>> 3.7.13?  I am looking to upgrade gluster from 3.7.11->3.7.13 for some bug
>>>>>>>>> fixes, but have been told by users on the gluster mailing list that due to
>>>>>>>>> some gluster changes I'd need to change the disk parameters to use writeback
>>>>>>>>> cache.  Something to do with aio support being removed.
>>>>>>>>> >>
>>>>>>>>> >> I believe this could be done with custom parameters?  But I
>>>>>>>>> believe storage tests are done using dd, so would they fail with current
>>>>>>>>> settings then?  On my last upgrade to 3.7.13 I had to roll back to 3.7.11
>>>>>>>>> due to stability issues where gluster storage would go into a down state
>>>>>>>>> and always show N/A as space available/used, even though hosts still saw
>>>>>>>>> the storage and VMs were running on it on all 3 hosts.
>>>>>>>>> >>
>>>>>>>>> >> Saw a lot of messages like these that went away once the gluster
>>>>>>>>> rollback finished:
>>>>>>>>> >>
>>>>>>>>> >> [2016-07-09 15:27:46.935694] I [fuse-bridge.c:4083:fuse_init]
>>>>>>>>> 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.22 kernel
>>>>>>>>> 7.22
>>>>>>>>> >> [2016-07-09 15:27:49.555466] W [MSGID: 114031]
>>>>>>>>> [client-rpc-fops.c:3050:client3_3_readv_cbk] 0-GLUSTER1-client-1: remote
>>>>>>>>> operation failed [Operation not permitted]
>>>>>>>>> >> [2016-07-09 15:27:49.556574] W [MSGID: 114031]
>>>>>>>>> [client-rpc-fops.c:3050:client3_3_readv_cbk] 0-GLUSTER1-client-0: remote
>>>>>>>>> operation failed [Operation not permitted]
>>>>>>>>> >> [2016-07-09 15:27:49.556659] W
>>>>>>>>> [fuse-bridge.c:2227:fuse_readv_cbk] 0-glusterfs-fuse: 80: READ => -1
>>>>>>>>> gfid=deb61291-5176-4b81-8315-3f1cf8e3534d fd=0x7f5224002f68 (Operation not
>>>>>>>>> permitted)
>>>>>>>>> >> [2016-07-09 15:27:59.612477] W [MSGID: 114031]
>>>>>>>>> [client-rpc-fops.c:3050:client3_3_readv_cbk] 0-GLUSTER1-client-1: remote
>>>>>>>>> operation failed [Operation not permitted]
>>>>>>>>> >> [2016-07-09 15:27:59.613700] W [MSGID: 114031]
>>>>>>>>> [client-rpc-fops.c:3050:client3_3_readv_cbk] 0-GLUSTER1-client-0: remote
>>>>>>>>> operation failed [Operation not permitted]
>>>>>>>>> >> [2016-07-09 15:27:59.613781] W
>>>>>>>>> [fuse-bridge.c:2227:fuse_readv_cbk] 0-glusterfs-fuse: 168: READ => -1
>>>>>>>>> gfid=deb61291-5176-4b81-8315-3f1cf8e3534d fd=0x7f5224002f68 (Operation not
>>>>>>>>> permitted)
>>>>>>>>> >>
>>>>>>>>> >> David Gossage
>>>>>>>>> >> Carousel Checks Inc. | System Administrator
>>>>>>>>> >> Office 708.613.2284
>>>>>>>>> >> _______________________________________________
>>>>>>>>> >> Users mailing list
>>>>>>>>> >>
>>>>>>>>> >> Users at ovirt.org
>>>>>>>>> >> http://lists.ovirt.org/mailman/listinfo/users
>>>>>>>>> >
>>>>>>>>> >
>>>>>>>>> >
>>>>>>>>> >
>>>>>>>>> >
>>>>>>>>> >
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>