On Mon, Jul 25, 2016 at 1:07 PM, David Gossage <dgossage(a)carouselchecks.com>
wrote:
On Mon, Jul 25, 2016 at 1:00 PM, David Gossage <
dgossage(a)carouselchecks.com> wrote:
>
> On Mon, Jul 25, 2016 at 9:58 AM, Krutika Dhananjay <kdhananj(a)redhat.com>
> wrote:
>
>> OK, could you try the following:
>>
>> i. Set network.remote-dio to off
>> # gluster volume set <VOL> network.remote-dio off
>>
>> ii. Set performance.strict-o-direct to on
>> # gluster volume set <VOL> performance.strict-o-direct on
>>
>> iii. Stop the affected vm(s) and start again
>>
>> and tell me if you notice any improvement?
>>
>
Not sure if this is helpful, but on the gluster mount it creates (even though
it won't attach to the data center) I get the errors below in the brick log
when running the following:
dd if=/dev/zero
of=/rhev/data-center/mnt/glusterSD/192.168.71.10\:_glustershard/5b8a4477-4d87-43a1-aa52-b664b1bd9e08/images/test
oflag=direct count=100 bs=1M
dd: error writing
‘/rhev/data-center/mnt/glusterSD/192.168.71.10:_glustershard/5b8a4477-4d87-43a1-aa52-b664b1bd9e08/images/test’:
Invalid argument
dd: closing output file
‘/rhev/data-center/mnt/glusterSD/192.168.71.10:_glustershard/5b8a4477-4d87-43a1-aa52-b664b1bd9e08/images/test’:
Invalid argument
[2016-07-25 18:20:19.393121] E [MSGID: 113039] [posix.c:2939:posix_open]
0-glustershard-posix: open on
/gluster2/brick1/1/.glusterfs/02/f4/02f4783b-2799-46d9-b787-53e4ccd9a052,
flags: 16385 [Invalid argument]
[2016-07-25 18:20:19.393204] E [MSGID: 115070]
[server-rpc-fops.c:1568:server_open_cbk] 0-glustershard-server: 120: OPEN
/5b8a4477-4d87-43a1-aa52-b664b1bd9e08/images/test
(02f4783b-2799-46d9-b787-53e4ccd9a052) ==> (Invalid argument) [Invalid
argument]
and
/var/log/glusterfs/rhev-data-center-mnt-glusterSD-192.168.71.10\:_glustershard.log
[2016-07-25 18:20:19.393275] E [MSGID: 114031]
[client-rpc-fops.c:466:client3_3_open_cbk] 0-glustershard-client-0: remote
operation failed. Path: /5b8a4477-4d87-43a1-aa52-b664b1bd9e08/images/test
(02f4783b-2799-46d9-b787-53e4ccd9a052) [Invalid argument]
[2016-07-25 18:20:19.393270] E [MSGID: 114031]
[client-rpc-fops.c:466:client3_3_open_cbk] 0-glustershard-client-1: remote
operation failed. Path: /5b8a4477-4d87-43a1-aa52-b664b1bd9e08/images/test
(02f4783b-2799-46d9-b787-53e4ccd9a052) [Invalid argument]
[2016-07-25 18:20:19.393317] E [MSGID: 114031]
[client-rpc-fops.c:466:client3_3_open_cbk] 0-glustershard-client-2: remote
operation failed. Path: /5b8a4477-4d87-43a1-aa52-b664b1bd9e08/images/test
(02f4783b-2799-46d9-b787-53e4ccd9a052) [Invalid argument]
[2016-07-25 18:20:19.393357] W [fuse-bridge.c:2311:fuse_writev_cbk]
0-glusterfs-fuse: 117: WRITE => -1
gfid=02f4783b-2799-46d9-b787-53e4ccd9a052 fd=0x7f5fec0ba08c (Invalid
argument)
[2016-07-25 18:20:19.393389] W [fuse-bridge.c:2311:fuse_writev_cbk]
0-glusterfs-fuse: 118: WRITE => -1
gfid=02f4783b-2799-46d9-b787-53e4ccd9a052 fd=0x7f5fec0ba08c (Invalid
argument)
[2016-07-25 18:20:19.393611] W [fuse-bridge.c:2311:fuse_writev_cbk]
0-glusterfs-fuse: 119: WRITE => -1
gfid=02f4783b-2799-46d9-b787-53e4ccd9a052 fd=0x7f5fec0ba08c (Invalid
argument)
[2016-07-25 18:20:19.393708] W [fuse-bridge.c:2311:fuse_writev_cbk]
0-glusterfs-fuse: 120: WRITE => -1
gfid=02f4783b-2799-46d9-b787-53e4ccd9a052 fd=0x7f5fec0ba08c (Invalid
argument)
[2016-07-25 18:20:19.393771] W [fuse-bridge.c:2311:fuse_writev_cbk]
0-glusterfs-fuse: 121: WRITE => -1
gfid=02f4783b-2799-46d9-b787-53e4ccd9a052 fd=0x7f5fec0ba08c (Invalid
argument)
[2016-07-25 18:20:19.393840] W [fuse-bridge.c:2311:fuse_writev_cbk]
0-glusterfs-fuse: 122: WRITE => -1
gfid=02f4783b-2799-46d9-b787-53e4ccd9a052 fd=0x7f5fec0ba08c (Invalid
argument)
[2016-07-25 18:20:19.393914] W [fuse-bridge.c:2311:fuse_writev_cbk]
0-glusterfs-fuse: 123: WRITE => -1
gfid=02f4783b-2799-46d9-b787-53e4ccd9a052 fd=0x7f5fec0ba08c (Invalid
argument)
[2016-07-25 18:20:19.393982] W [fuse-bridge.c:2311:fuse_writev_cbk]
0-glusterfs-fuse: 124: WRITE => -1
gfid=02f4783b-2799-46d9-b787-53e4ccd9a052 fd=0x7f5fec0ba08c (Invalid
argument)
[2016-07-25 18:20:19.394045] W [fuse-bridge.c:709:fuse_truncate_cbk]
0-glusterfs-fuse: 125: FTRUNCATE() ERR => -1 (Invalid argument)
[2016-07-25 18:20:19.394338] W [fuse-bridge.c:1290:fuse_err_cbk]
0-glusterfs-fuse: 126: FLUSH() ERR => -1 (Invalid argument)
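
For anyone reading the brick log above: the "flags: 16385" value in the
posix_open error appears to decode to O_WRONLY plus O_DIRECT, which matches
the oflag=direct passed to dd. A minimal Python sketch of that decoding,
assuming the usual Linux x86_64 flag values (the table and the function name
decode_open_flags are illustrative and not taken from gluster):

```python
# Open-flag bit values as defined on Linux x86_64 (assumption: the bricks
# run on that platform; other architectures may use different numbers).
ACCESS_MODES = {0: "O_RDONLY", 1: "O_WRONLY", 2: "O_RDWR"}
FLAG_BITS = [
    (0o100, "O_CREAT"),
    (0o200, "O_EXCL"),
    (0o1000, "O_TRUNC"),
    (0o2000, "O_APPEND"),
    (0o4000, "O_NONBLOCK"),
    (0o40000, "O_DIRECT"),
    (0o100000, "O_LARGEFILE"),
]


def decode_open_flags(flags):
    """Return the symbolic names contained in a numeric open(2) flags value."""
    # The two lowest bits hold the access mode rather than independent flags.
    names = [ACCESS_MODES.get(flags & 0o3, "invalid access mode")]
    remaining = flags & ~0o3
    for bit, name in FLAG_BITS:
        if remaining & bit:
            names.append(name)
            remaining &= ~bit
    if remaining:
        # Report anything the small table above does not cover.
        names.append("unknown bits: " + oct(remaining))
    return names


if __name__ == "__main__":
    print(decode_open_flags(16385))  # prints ['O_WRONLY', 'O_DIRECT']
```

If that reading is right, the brick rejected an open with O_DIRECT set, which
is consistent with the EINVAL that dd reports.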
>>
> The previous install I had issues with is still on gluster 3.7.11.
>
> My test install of ovirt 3.6.7 and gluster 3.7.13 with 3 bricks on a
> local disk right now isn't allowing me to add the gluster storage at all.
>
> I keep getting some type of UI error:
>
> 2016-07-25 12:49:09,277 ERROR
> [org.ovirt.engine.ui.frontend.server.gwt.OvirtRemoteLoggingService]
> (default task-33) [] Permutation name: 430985F23DFC1C8BE1C7FDD91EDAA785
> 2016-07-25 12:49:09,277 ERROR
> [org.ovirt.engine.ui.frontend.server.gwt.OvirtRemoteLoggingService]
> (default task-33) [] Uncaught exception: : java.lang.ClassCastException
> at Unknown.ps(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC...)
> at Unknown.ts(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC...)
> at Unknown.vs(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC...)
> at Unknown.iJf(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC...)
> at Unknown.Xab(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC...)
> at Unknown.P8o(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC...)
> at Unknown.jQr(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC...)
> at Unknown.A8o(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC...)
> at Unknown.u8o(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC...)
> at Unknown.Eap(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC...)
> at Unknown.p8n(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC...)
> at Unknown.Cao(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC...)
> at Unknown.Bap(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC...)
> at Unknown.kRn(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC...)
> at Unknown.nRn(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC...)
> at Unknown.eVn(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC...)
> at Unknown.hVn(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC...)
> at Unknown.MTn(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC...)
> at Unknown.PTn(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC...)
> at Unknown.KJe(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC...)
> at Unknown.Izk(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC...)
> at Unknown.P3(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC...)
> at Unknown.g4(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC...)
> at Unknown.<anonymous>(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC...)
> at Unknown._t(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC...)
> at Unknown.du(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC...)
> at Unknown.<anonymous>(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC...)
>
>
If I add it from the Storage tab, it creates the storage domain but won't
attach it to a data center:
Error while executing action Attach Storage Domain: AcquireHostIdFailure
engine.log
2016-07-25 13:04:45,186 ERROR
[org.ovirt.engine.core.vdsbroker.vdsbroker.CreateStoragePoolVDSCommand]
(default task-90) [4e0e7cbd] Failed in 'CreateStoragePoolVDS' method
2016-07-25 13:04:45,211 ERROR
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(default task-90) [4e0e7cbd] Correlation ID: null, Call Stack: null, Custom
Event ID: -1, Message: VDSM local command failed: Cannot acquire host id:
(u'5b8a4477-4d87-43a1-aa52-b664b1bd9e08', SanlockException(1, 'Sanlock
lockspace add failure', 'Operation not permitted'))
2016-07-25 13:04:45,211 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.CreateStoragePoolVDSCommand]
(default task-90) [4e0e7cbd] Command
'org.ovirt.engine.core.vdsbroker.vdsbroker.CreateStoragePoolVDSCommand'
return value 'StatusOnlyReturnForXmlRpc [status=StatusForXmlRpc [code=661,
message=Cannot acquire host id: (u'5b8a4477-4d87-43a1-aa52-b664b1bd9e08',
SanlockException(1, 'Sanlock lockspace add failure', 'Operation not
permitted'))]]'
2016-07-25 13:04:45,211 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.CreateStoragePoolVDSCommand]
(default task-90) [4e0e7cbd] HostName = local
2016-07-25 13:04:45,212 ERROR
[org.ovirt.engine.core.vdsbroker.vdsbroker.CreateStoragePoolVDSCommand]
(default task-90) [4e0e7cbd] Command 'CreateStoragePoolVDSCommand(HostName
= local, CreateStoragePoolVDSCommandParameters:{runAsync='true',
hostId='b4d03420-3de8-45b8-a671-45bbe7c05e06',
storagePoolId='7fe4f6ec-71aa-485b-8bba-958e493b66eb',
storagePoolName='NewDefault',
masterDomainId='5b8a4477-4d87-43a1-aa52-b664b1bd9e08',
domainsIdList='[5b8a4477-4d87-43a1-aa52-b664b1bd9e08]',
masterVersion='4'})' execution failed: VDSGenericException:
VDSErrorException: Failed to CreateStoragePoolVDS, error = Cannot acquire
host id: (u'5b8a4477-4d87-43a1-aa52-b664b1bd9e08', SanlockException(1,
'Sanlock lockspace add failure', 'Operation not permitted')), code = 661
2016-07-25 13:04:45,212 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.CreateStoragePoolVDSCommand]
(default task-90) [4e0e7cbd] FINISH, CreateStoragePoolVDSCommand, log id:
2ed8b2b6
2016-07-25 13:04:45,212 ERROR
[org.ovirt.engine.core.bll.storage.AddStoragePoolWithStoragesCommand]
(default task-90) [4e0e7cbd] Command
'org.ovirt.engine.core.bll.storage.AddStoragePoolWithStoragesCommand'
failed: EngineException:
org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException:
VDSGenericException: VDSErrorException: Failed to CreateStoragePoolVDS,
error = Cannot acquire host id: (u'5b8a4477-4d87-43a1-aa52-b664b1bd9e08',
SanlockException(1, 'Sanlock lockspace add failure', 'Operation not
permitted')), code = 661 (Failed with error AcquireHostIdFailure and code
661)
2016-07-25 13:04:45,220 ERROR
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(default task-90) [4e0e7cbd] Correlation ID: 4f77f0e0, Job ID:
6aae65f2-ff61-4bec-a513-18b31828442b, Call Stack: null, Custom Event ID:
-1, Message: Failed to attach Storage Domains to Data Center NewDefault.
(User: admin@internal)
2016-07-25 13:04:45,228 INFO
[org.ovirt.engine.core.bll.storage.AddStoragePoolWithStoragesCommand]
(default task-90) [4e0e7cbd] Lock freed to object
'EngineLock:{exclusiveLocks='[5b8a4477-4d87-43a1-aa52-b664b1bd9e08=<STORAGE,
ACTION_TYPE_FAILED_OBJECT_LOCKED>]', sharedLocks='null'}'
2016-07-25 13:04:45,229 INFO
[org.ovirt.engine.core.bll.storage.AttachStorageDomainToPoolCommand]
(default task-90) [4e0e7cbd] Command
[id=d08f24d6-f0f9-4df8-aa34-3718ab44f454]: Compensating
DELETED_OR_UPDATED_ENTITY of
org.ovirt.engine.core.common.businessentities.StoragePool; snapshot:
id=7fe4f6ec-71aa-485b-8bba-958e493b66eb.
2016-07-25 13:04:45,231 INFO
[org.ovirt.engine.core.bll.storage.AttachStorageDomainToPoolCommand]
(default task-90) [4e0e7cbd] Command
[id=d08f24d6-f0f9-4df8-aa34-3718ab44f454]: Compensating NEW_ENTITY_ID of
org.ovirt.engine.core.common.businessentities.StoragePoolIsoMap; snapshot:
StoragePoolIsoMapId:{storagePoolId='7fe4f6ec-71aa-485b-8bba-958e493b66eb',
storageId='5b8a4477-4d87-43a1-aa52-b664b1bd9e08'}.
2016-07-25 13:04:45,231 INFO
[org.ovirt.engine.core.bll.storage.AttachStorageDomainToPoolCommand]
(default task-90) [4e0e7cbd] Command
[id=d08f24d6-f0f9-4df8-aa34-3718ab44f454]: Compensating
DELETED_OR_UPDATED_ENTITY of
org.ovirt.engine.core.common.businessentities.StorageDomainStatic;
snapshot: id=5b8a4477-4d87-43a1-aa52-b664b1bd9e08.
2016-07-25 13:04:45,245 ERROR
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(default task-90) [4e0e7cbd] Correlation ID: 6cae9150, Job ID:
6aae65f2-ff61-4bec-a513-18b31828442b, Call Stack: null, Custom Event ID:
-1, Message: Failed to attach Storage Domain newone to Data Center
NewDefault. (User: admin@internal)
2016-07-25 13:04:45,253 WARN
[org.ovirt.engine.core.bll.lock.InMemoryLockManager] (default task-90)
[4e0e7cbd] Trying to release exclusive lock which does not exist, lock key:
'5b8a4477-4d87-43a1-aa52-b664b1bd9e08STORAGE'
2016-07-25 13:04:45,253 INFO
[org.ovirt.engine.core.bll.storage.AttachStorageDomainToPoolCommand]
(default task-90) [4e0e7cbd] Lock freed to object
'EngineLock:{exclusiveLocks='[5b8a4477-4d87-43a1-aa52-b664b1bd9e08=<STORAGE,
ACTION_TYPE_FAILED_OBJECT_LOCKED>]', sharedLocks='null'}'
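
The useful part of the engine.log excerpt above is the repeated "Cannot
acquire host id" message. A small Python sketch for pulling the storage
domain UUID and the sanlock error out of such a line, assuming the message is
on a single line in the real log (the e-mail wrapped it) and using a
hypothetical helper name parse_acquire_failure:

```python
import re

# Pattern for the VDSM error text as quoted in the excerpt above.
ACQUIRE_RE = re.compile(
    r"Cannot acquire host id: \(u'(?P<domain>[0-9a-f-]{36})', "
    r"SanlockException\((?P<errno>\d+), '(?P<what>[^']*)', '(?P<why>[^']*)'\)\)"
)

# One message from the excerpt, re-joined onto a single line.
SAMPLE = (
    "Message: VDSM local command failed: Cannot acquire host id: "
    "(u'5b8a4477-4d87-43a1-aa52-b664b1bd9e08', SanlockException(1, "
    "'Sanlock lockspace add failure', 'Operation not permitted'))"
)


def parse_acquire_failure(line):
    """Return the fields of an AcquireHostIdFailure message, or None."""
    match = ACQUIRE_RE.search(line)
    if match is None:
        return None
    return {
        "domain": match.group("domain"),
        "errno": int(match.group("errno")),
        "what": match.group("what"),
        "why": match.group("why"),
    }


if __name__ == "__main__":
    print(parse_acquire_failure(SAMPLE))
```

Errno 1 is EPERM, the same "Operation not permitted" seen in the FUSE read
errors elsewhere in this thread; whether the two share a cause is not
established here.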
> -Krutika
>>
>> On Mon, Jul 25, 2016 at 4:57 PM, Samuli Heinonen <samppah(a)neutraali.net>
>> wrote:
>>
>>> Hi,
>>>
>>> > On 25 Jul 2016, at 12:34, David Gossage <dgossage(a)carouselchecks.com> wrote:
>>> >
>>> > On Mon, Jul 25, 2016 at 1:01 AM, Krutika Dhananjay <
>>> kdhananj(a)redhat.com> wrote:
>>> > Hi,
>>> >
>>> > Thanks for the logs. So I have identified one issue from the logs, for
>>> > which the fix is this: http://review.gluster.org/#/c/14669/. Because of
>>> > a bug in the code, ENOENT was getting converted to EPERM and being
>>> > propagated up the stack, causing the reads to bail out early with
>>> > 'Operation not permitted' errors.
>>> > I still need to find out two things:
>>> > i) why there was a readv() sent on a non-existent (ENOENT) file (this
>>> is important since some of the other users have not faced or reported this
>>> issue on gluster-users with 3.7.13)
>>> > ii) need to see if there's a way to work around this issue.
>>> >
>>> > Do you mind sharing the steps needed to be executed to run into this
>>> issue? This is so that we can apply our patches, test and ensure they fix
>>> the problem.
>>>
>>>
>>> Unfortunately I can't test this right away, nor can I give exact steps for
>>> testing it. This is just a theory, so please correct me if you see any
>>> mistakes.
>>>
>>> oVirt uses cache=none settings for VMs by default, which requires
>>> direct I/O. oVirt also uses dd with iflag=direct to check that storage has
>>> direct I/O enabled. Problems exist with GlusterFS with sharding enabled and
>>> bricks running on ZFS on Linux. Everything seems to be fine with GlusterFS
>>> 3.7.11, and problems exist at least with versions 3.7.12 and 3.7.13. There
>>> have been some posts saying that GlusterFS 3.8.x is also affected.
>>>
>>> Steps to reproduce:
>>> 1. Sharded file is created with GlusterFS 3.7.11. Everything works ok.
>>> 2. GlusterFS is upgraded to 3.7.12+
>>> 3. Sharded file cannot be read or written with direct I/O enabled (e.g.
>>> oVirt checks the storage connection with the command
>>> "dd if=/rhev/data-center/00000001-0001-0001-0001-0000000002b6/mastersd/dom_md/inbox
>>> iflag=direct,fullblock count=1 bs=1024000").
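
The direct I/O check in step 3 can be approximated without dd. This is a
minimal Python 3 sketch, not what oVirt itself runs: it opens a file with
O_DIRECT and attempts a single aligned read, roughly what dd does with
iflag=direct. The function name probe_direct_read and the 4096-byte default
block size are assumptions made for the example.

```python
import errno
import mmap
import os


def probe_direct_read(path, block_size=4096):
    """Attempt one O_DIRECT read of block_size bytes from path.

    Returns (True, bytes_read) on success, or (False, reason) where reason
    is the errno name (for example "EINVAL" or "EPERM").
    """
    o_direct = getattr(os, "O_DIRECT", None)
    if o_direct is None:
        return (False, "O_DIRECT not available on this platform")
    # Anonymous mmap memory is page-aligned, which satisfies the buffer
    # alignment that O_DIRECT reads normally require.
    buf = mmap.mmap(-1, block_size)
    try:
        fd = os.open(path, os.O_RDONLY | o_direct)
        try:
            return (True, os.readv(fd, [buf]))
        finally:
            os.close(fd)
    except OSError as exc:
        return (False, errno.errorcode.get(exc.errno, str(exc.errno)))
    finally:
        buf.close()


if __name__ == "__main__":
    # Placeholder path; point this at a file on the gluster FUSE mount.
    print(probe_direct_read("/path/to/file/on/gluster/mount"))
```

On an affected mount one would expect this to return (False, "EPERM") or
(False, "EINVAL"), in line with the errors quoted in this thread, but that is
untested here.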
>>>
>>> Please let me know if you need more information.
>>>
>>> -samuli
>>>
>>> > Well, after the gluster upgrade all I did was start the ovirt hosts up,
>>> > which launched and started their ha-agent and broker processes. I don't
>>> > believe I started getting any errors till it mounted GLUSTER1. I had
>>> > enabled sharding but had no sharded disk images yet. Not sure if the
>>> > check for shards would have caused that. Unfortunately I can't just
>>> > update this cluster and try to see what caused it, as it has some VMs
>>> > that users expect to be available in a few hours.
>>> >
>>> > I can see if I can get my test setup to recreate it. I think I'll
>>> > need to de-activate the data center so I can detach the storage that's
>>> > on xfs and attach the one that's over zfs with sharding enabled. My test
>>> > is 3 bricks on the same local machine, with 3 different volumes, but I
>>> > think I'm running into a sanlock issue or something, as it won't mount
>>> > more than one volume that was created locally.
>>> >
>>> >
>>> > -Krutika
>>> >
>>> > On Fri, Jul 22, 2016 at 7:17 PM, David Gossage <
>>> dgossage(a)carouselchecks.com> wrote:
>>> > Trimmed out the logs to just about when I was shutting down ovirt
>>> servers for updates which was 14:30 UTC 2016-07-09
>>> >
>>> > Pre-update settings were
>>> >
>>> > Volume Name: GLUSTER1
>>> > Type: Replicate
>>> > Volume ID: 167b8e57-28c3-447a-95cc-8410cbdf3f7f
>>> > Status: Started
>>> > Number of Bricks: 1 x 3 = 3
>>> > Transport-type: tcp
>>> > Bricks:
>>> > Brick1: ccgl1.gl.local:/gluster1/BRICK1/1
>>> > Brick2: ccgl2.gl.local:/gluster1/BRICK1/1
>>> > Brick3: ccgl3.gl.local:/gluster1/BRICK1/1
>>> > Options Reconfigured:
>>> > performance.readdir-ahead: on
>>> > storage.owner-uid: 36
>>> > storage.owner-gid: 36
>>> > performance.quick-read: off
>>> > performance.read-ahead: off
>>> > performance.io-cache: off
>>> > performance.stat-prefetch: off
>>> > cluster.eager-lock: enable
>>> > network.remote-dio: enable
>>> > cluster.quorum-type: auto
>>> > cluster.server-quorum-type: server
>>> > server.allow-insecure: on
>>> > cluster.self-heal-window-size: 1024
>>> > cluster.background-self-heal-count: 16
>>> > performance.strict-write-ordering: off
>>> > nfs.disable: on
>>> > nfs.addr-namelookup: off
>>> > nfs.enable-ino32: off
>>> >
>>> > At the time of the updates ccgl3 was offline because of a bad NIC on the
>>> > server, but it had been so for about a week with no issues in the volume.
>>> >
>>> > Shortly after the update I added these settings to enable sharding, but
>>> > did not yet have any VM images sharded.
>>> > features.shard-block-size: 64MB
>>> > features.shard: on
>>> >
>>> >
>>> >
>>> >
>>> > David Gossage
>>> > Carousel Checks Inc. | System Administrator
>>> > Office 708.613.2284
>>> >
>>> > On Fri, Jul 22, 2016 at 5:00 AM, Krutika Dhananjay <
>>> kdhananj(a)redhat.com> wrote:
>>> > Hi David,
>>> >
>>> > Could you also share the brick logs from the affected volume? They're
>>> > located at
>>> > /var/log/glusterfs/bricks/<hyphenated-path-to-the-brick-directory>.log.
>>> >
>>> > Also, could you share the volume configuration (output of `gluster
>>> > volume info <VOL>`) for the affected volume(s) as it was at the time you
>>> > actually saw this issue?
>>> >
>>> > -Krutika
>>> >
>>> >
>>> >
>>> >
>>> > On Thu, Jul 21, 2016 at 11:23 PM, David Gossage <
>>> dgossage(a)carouselchecks.com> wrote:
>>> > On Thu, Jul 21, 2016 at 11:47 AM, Scott <romracer(a)gmail.com> wrote:
>>> > Hi David,
>>> >
>>> > My backend storage is ZFS.
>>> >
>>> > I thought about moving from FUSE to NFS mounts for my Gluster volumes
>>> > to help test. But since I use hosted engine this would be a real pain.
>>> > It's difficult to modify the storage domain type/path in the
>>> > hosted-engine.conf, and I don't want to go through the process of
>>> > re-deploying hosted engine.
>>> >
>>> >
>>> > I found this
>>> >
>>> >
https://bugzilla.redhat.com/show_bug.cgi?id=1347553
>>> >
>>> > Not sure if related.
>>> >
>>> > But I also have a zfs backend. Another user on the gluster mailing list
>>> > had issues and also used a zfs backend, although she used proxmox and got
>>> > it working by changing the disk to writeback cache, I think it was.
>>> >
>>> > I also use hosted engine, but I run my gluster volume for HE on an LVM
>>> > volume with xfs, separate from zfs, and if I recall it did not have the
>>> > issues my gluster on zfs did. I'm wondering now if the issue was zfs
>>> > settings.
>>> >
>>> > Hopefully I should have a test machine up soon that I can play around
>>> > with more.
>>> >
>>> > Scott
>>> >
>>> > On Thu, Jul 21, 2016 at 11:36 AM David Gossage <
>>> dgossage(a)carouselchecks.com> wrote:
>>> > What back end storage do you run gluster on? xfs/zfs/ext4 etc?
>>> >
>>> > David Gossage
>>> > Carousel Checks Inc. | System Administrator
>>> > Office 708.613.2284
>>> >
>>> > On Thu, Jul 21, 2016 at 8:18 AM, Scott <romracer(a)gmail.com> wrote:
>>> > I get similar problems with oVirt 4.0.1 and hosted engine. After
>>> upgrading all my hosts to Gluster 3.7.13 (client and server), I get the
>>> following:
>>> >
>>> > $ sudo hosted-engine --set-maintenance --mode=none
>>> > Traceback (most recent call last):
>>> >   File "/usr/lib64/python2.7/runpy.py", line 162, in _run_module_as_main
>>> >     "__main__", fname, loader, pkg_name)
>>> >   File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
>>> >     exec code in run_globals
>>> >   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_setup/set_maintenance.py", line 73, in <module>
>>> >     if not maintenance.set_mode(sys.argv[1]):
>>> >   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_setup/set_maintenance.py", line 61, in set_mode
>>> >     value=m_global,
>>> >   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line 259, in set_maintenance_mode
>>> >     str(value))
>>> >   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line 204, in set_global_md_flag
>>> >     all_stats = broker.get_stats_from_storage(service)
>>> >   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 232, in get_stats_from_storage
>>> >     result = self._checked_communicate(request)
>>> >   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 260, in _checked_communicate
>>> >     .format(message or response))
>>> > ovirt_hosted_engine_ha.lib.exceptions.RequestError: Request failed: failed to read metadata: [Errno 1] Operation not permitted
>>> >
>>> > If I only upgrade one host, then things will continue to work but my
>>> nodes are constantly healing shards. My logs are also flooded with:
>>> >
>>> > [2016-07-21 13:15:14.137734] W [fuse-bridge.c:2227:fuse_readv_cbk]
>>> 0-glusterfs-fuse: 274714: READ => -1
>>> gfid=441f2789-f6b1-4918-a280-1b9905a11429 fd=0x7f19bc0041d0 (Operation not
>>> permitted)
>>> > The message "W [MSGID: 114031]
>>> [client-rpc-fops.c:3050:client3_3_readv_cbk] 0-data-client-0: remote
>>> operation failed [Operation not permitted]" repeated 6 times between
>>> [2016-07-21 13:13:24.134985] and [2016-07-21 13:15:04.132226]
>>> > The message "W [MSGID: 114031]
>>> [client-rpc-fops.c:3050:client3_3_readv_cbk] 0-data-client-1: remote
>>> operation failed [Operation not permitted]" repeated 8 times between
>>> [2016-07-21 13:13:34.133116] and [2016-07-21 13:15:14.137178]
>>> > The message "W [MSGID: 114031]
>>> [client-rpc-fops.c:3050:client3_3_readv_cbk] 0-data-client-2: remote
>>> operation failed [Operation not permitted]" repeated 7 times between
>>> [2016-07-21 13:13:24.135071] and [2016-07-21 13:15:14.137666]
>>> > [2016-07-21 13:15:24.134647] W [MSGID: 114031]
>>> [client-rpc-fops.c:3050:client3_3_readv_cbk] 0-data-client-0: remote
>>> operation failed [Operation not permitted]
>>> > [2016-07-21 13:15:24.134764] W [MSGID: 114031]
>>> [client-rpc-fops.c:3050:client3_3_readv_cbk] 0-data-client-2: remote
>>> operation failed [Operation not permitted]
>>> > [2016-07-21 13:15:24.134793] W [fuse-bridge.c:2227:fuse_readv_cbk]
>>> 0-glusterfs-fuse: 274741: READ => -1
>>> gfid=441f2789-f6b1-4918-a280-1b9905a11429 fd=0x7f19bc0038f4 (Operation not
>>> permitted)
>>> > [2016-07-21 13:15:34.135413] W [fuse-bridge.c:2227:fuse_readv_cbk]
>>> 0-glusterfs-fuse: 274756: READ => -1
>>> gfid=441f2789-f6b1-4918-a280-1b9905a11429 fd=0x7f19bc0041d0 (Operation not
>>> permitted)
>>> > [2016-07-21 13:15:44.141062] W [fuse-bridge.c:2227:fuse_readv_cbk]
>>> 0-glusterfs-fuse: 274818: READ => -1
>>> gfid=441f2789-f6b1-4918-a280-1b9905a11429 fd=0x7f19bc0038f4 (Operation not
>>> permitted)
>>> > [2016-07-21 13:15:54.133582] W [MSGID: 114031]
>>> [client-rpc-fops.c:3050:client3_3_readv_cbk] 0-data-client-1: remote
>>> operation failed [Operation not permitted]
>>> > [2016-07-21 13:15:54.133629] W [fuse-bridge.c:2227:fuse_readv_cbk]
>>> 0-glusterfs-fuse: 274853: READ => -1
>>> gfid=441f2789-f6b1-4918-a280-1b9905a11429 fd=0x7f19bc0036d8 (Operation not
>>> permitted)
>>> > [2016-07-21 13:16:04.133666] W [fuse-bridge.c:2227:fuse_readv_cbk]
>>> 0-glusterfs-fuse: 274879: READ => -1
>>> gfid=441f2789-f6b1-4918-a280-1b9905a11429 fd=0x7f19bc0041d0 (Operation not
>>> permitted)
>>> > [2016-07-21 13:16:14.134954] W [fuse-bridge.c:2227:fuse_readv_cbk]
>>> 0-glusterfs-fuse: 274894: READ => -1
>>> gfid=441f2789-f6b1-4918-a280-1b9905a11429 fd=0x7f19bc0036d8 (Operation not
>>> permitted)
>>> >
>>> > Scott
>>> >
>>> >
>>> > On Thu, Jul 21, 2016 at 6:57 AM Frank Rothenstein <
>>> f.rothenstein(a)bodden-kliniken.de> wrote:
>>> > Hey David,
>>> >
>>> > I have the very same problem on my test cluster, despite running
>>> > ovirt 4.0.
>>> > If you access your volumes via NFS all is fine; the problem is FUSE. I
>>> > stayed on 3.7.13 but have no solution yet, so for now I use NFS.
>>> >
>>> > Frank
>>> >
>>> > On Thursday, 21.07.2016, at 04:28 -0500, David Gossage wrote:
>>> >> Is anyone running one of the recent oVirt 3.6.x lines with gluster
>>> >> 3.7.13? I am looking to upgrade gluster from 3.7.11->3.7.13 for some bug
>>> >> fixes, but have been told by users on the gluster mailing list that, due
>>> >> to some gluster changes, I'd need to change the disk parameters to use
>>> >> writeback cache. Something to do with aio support being removed.
>>> >>
>>> >> I believe this could be done with custom parameters? But I believe
>>> >> storage tests are done using dd, so would they fail with the current
>>> >> settings then? On my last upgrade to 3.7.13 I had to roll back to 3.7.11
>>> >> due to stability issues where gluster storage would go into a down state
>>> >> and always show N/A as space available/used, even though the hosts still
>>> >> saw the storage and VMs were running on it on all 3 hosts.
>>> >>
>>> >> Saw a lot of messages like these that went away once gluster
>>> rollback finished
>>> >>
>>> >> [2016-07-09 15:27:46.935694] I [fuse-bridge.c:4083:fuse_init]
>>> 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.22 kernel
>>> 7.22
>>> >> [2016-07-09 15:27:49.555466] W [MSGID: 114031]
>>> [client-rpc-fops.c:3050:client3_3_readv_cbk] 0-GLUSTER1-client-1: remote
>>> operation failed [Operation not permitted]
>>> >> [2016-07-09 15:27:49.556574] W [MSGID: 114031]
>>> [client-rpc-fops.c:3050:client3_3_readv_cbk] 0-GLUSTER1-client-0: remote
>>> operation failed [Operation not permitted]
>>> >> [2016-07-09 15:27:49.556659] W [fuse-bridge.c:2227:fuse_readv_cbk]
>>> 0-glusterfs-fuse: 80: READ => -1
>>> gfid=deb61291-5176-4b81-8315-3f1cf8e3534d fd=0x7f5224002f68 (Operation not
>>> permitted)
>>> >> [2016-07-09 15:27:59.612477] W [MSGID: 114031]
>>> [client-rpc-fops.c:3050:client3_3_readv_cbk] 0-GLUSTER1-client-1: remote
>>> operation failed [Operation not permitted]
>>> >> [2016-07-09 15:27:59.613700] W [MSGID: 114031]
>>> [client-rpc-fops.c:3050:client3_3_readv_cbk] 0-GLUSTER1-client-0: remote
>>> operation failed [Operation not permitted]
>>> >> [2016-07-09 15:27:59.613781] W [fuse-bridge.c:2227:fuse_readv_cbk]
>>> 0-glusterfs-fuse: 168: READ => -1
>>> gfid=deb61291-5176-4b81-8315-3f1cf8e3534d fd=0x7f5224002f68 (Operation not
>>> permitted)
>>> >>
>>> >> David Gossage
>>> >> Carousel Checks Inc. | System Administrator
>>> >> Office 708.613.2284
>>> >> _______________________________________________
>>> >> Users mailing list
>>> >>
>>> >> Users(a)ovirt.org
>>> >>
http://lists.ovirt.org/mailman/listinfo/users
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>>
______________________________________________________________________________
>>> > BODDEN-KLINIKEN Ribnitz-Damgarten GmbH
>>> > Sandhufe 2
>>> > 18311 Ribnitz-Damgarten
>>> >
>>> > Phone: 03821-700-0
>>> > Fax: 03821-700-240
>>> >
>>> > E-Mail: info(a)bodden-kliniken.de Internet: http://www.bodden-kliniken.de
>>> >
>>> > Registered office: Ribnitz-Damgarten, District court: Stralsund, HRB 2919,
>>> > Tax no.: 079/133/40188
>>> > Chair of the supervisory board: Carmen Schröter, Managing director: Dr.
>>> > Falko Milski
>>>
>>>
>>
>