https://bugzilla.redhat.com/show_bug.cgi?id=1677160 doesn't seem relevant to me. Is that the correct link?

As I mentioned in a previous email, I'm also having problems with Gluster bricks going offline since upgrading to oVirt 4.3 yesterday (previously I never had a single issue with Gluster, nor had a brick ever gone down). I suspect this will continue to happen daily, as some other users on this list have suggested. I was able to pull some logs from the engine and Gluster from around the time the brick dropped. My setup is a 3-node HCI cluster, and I was previously running the latest 4.2 updates before upgrading to 4.3. My hardware has plenty of overhead and I'm on a 10GbE Gluster backend (the servers were certainly not under any significant load when the brick went offline). To recover I had to place the host in maintenance mode and reboot, although I suspect I could have simply restarted the brick or unmounted and remounted the Gluster mounts.
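
For reference, this is the kind of non-reboot recovery I had in mind (a sketch only, assuming non_prod_b is the affected volume; "volume start ... force" restarts only the offline brick processes and leaves running bricks alone):

# gluster volume status non_prod_b
# gluster volume start non_prod_b force
# gluster volume heal non_prod_b info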

grep "2019-02-14" engine.log-20190214 | grep "GLUSTER_BRICK_STATUS_CHANGED"
2019-02-14 02:41:48,018-04 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler1) [5ff5b093] EVENT_ID: GLUSTER_BRICK_STATUS_CHANGED(4,086), Detected change in status of brick host2.replaced.domain.com:/gluster_bricks/non_prod_b/non_prod_b of volume non_prod_b of cluster Default from UP to DOWN via cli.
2019-02-14 03:20:11,189-04 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler3) [760f7851] EVENT_ID: GLUSTER_BRICK_STATUS_CHANGED(4,086), Detected change in status of brick host2.replaced.domain.com:/gluster_bricks/engine/engine of volume engine of cluster Default from DOWN to UP via cli.
2019-02-14 03:20:14,819-04 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler3) [760f7851] EVENT_ID: GLUSTER_BRICK_STATUS_CHANGED(4,086), Detected change in status of brick host2.replaced.domain.com:/gluster_bricks/prod_b/prod_b of volume prod_b of cluster Default from DOWN to UP via cli.
2019-02-14 03:20:19,692-04 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler3) [760f7851] EVENT_ID: GLUSTER_BRICK_STATUS_CHANGED(4,086), Detected change in status of brick host2.replaced.domain.com:/gluster_bricks/isos/isos of volume isos of cluster Default from DOWN to UP via cli.
2019-02-14 03:20:25,022-04 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler3) [760f7851] EVENT_ID: GLUSTER_BRICK_STATUS_CHANGED(4,086), Detected change in status of brick host2.replaced.domain.com:/gluster_bricks/prod_a/prod_a of volume prod_a of cluster Default from DOWN to UP via cli.
2019-02-14 03:20:29,088-04 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler3) [760f7851] EVENT_ID: GLUSTER_BRICK_STATUS_CHANGED(4,086), Detected change in status of brick host2.replaced.domain.com:/gluster_bricks/non_prod_b/non_prod_b of volume non_prod_b of cluster Default from DOWN to UP via cli.
2019-02-14 03:20:34,099-04 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler3) [760f7851] EVENT_ID: GLUSTER_BRICK_STATUS_CHANGED(4,086), Detected change in status of brick host2.replaced.domain.com:/gluster_bricks/non_prod_a/non_prod_a of volume non_prod_a of cluster Default from DOWN to UP via cli

glusterd.log

# grep -B20 -A20 "2019-02-14 02:41" glusterd.log
[2019-02-14 02:36:49.585034] I [MSGID: 106499] [glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management: Received status volume req for volume non_prod_b
[2019-02-14 02:36:49.597788] E [MSGID: 101191] [event-epoll.c:671:event_dispatch_epoll_worker] 0-epoll: Failed to dispatch handler
The message "E [MSGID: 101191] [event-epoll.c:671:event_dispatch_epoll_worker] 0-epoll: Failed to dispatch handler" repeated 2 times between [2019-02-14 02:36:49.597788] and [2019-02-14 02:36:49.900505]
[2019-02-14 02:36:53.437539] I [MSGID: 106499] [glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management: Received status volume req for volume non_prod_a
[2019-02-14 02:36:53.452816] E [MSGID: 101191] [event-epoll.c:671:event_dispatch_epoll_worker] 0-epoll: Failed to dispatch handler
[2019-02-14 02:36:53.864153] I [MSGID: 106499] [glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management: Received status volume req for volume non_prod_a
[2019-02-14 02:36:53.875835] E [MSGID: 101191] [event-epoll.c:671:event_dispatch_epoll_worker] 0-epoll: Failed to dispatch handler
[2019-02-14 02:36:30.958649] I [MSGID: 106499] [glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management: Received status volume req for volume engine
[2019-02-14 02:36:35.322129] I [MSGID: 106499] [glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management: Received status volume req for volume prod_b
[2019-02-14 02:36:39.639645] I [MSGID: 106499] [glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management: Received status volume req for volume isos
[2019-02-14 02:36:45.301275] I [MSGID: 106499] [glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management: Received status volume req for volume prod_a
The message "E [MSGID: 101191] [event-epoll.c:671:event_dispatch_epoll_worker] 0-epoll: Failed to dispatch handler" repeated 2 times between [2019-02-14 02:36:53.875835] and [2019-02-14 02:36:54.180780]
[2019-02-14 02:37:59.193409] E [MSGID: 101191] [event-epoll.c:671:event_dispatch_epoll_worker] 0-epoll: Failed to dispatch handler
[2019-02-14 02:38:44.065560] I [MSGID: 106499] [glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management: Received status volume req for volume engine
[2019-02-14 02:38:44.072680] I [MSGID: 106499] [glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management: Received status volume req for volume isos
[2019-02-14 02:38:44.077841] I [MSGID: 106499] [glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management: Received status volume req for volume non_prod_a
[2019-02-14 02:38:44.082798] I [MSGID: 106499] [glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management: Received status volume req for volume non_prod_b
[2019-02-14 02:38:44.088237] I [MSGID: 106499] [glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management: Received status volume req for volume prod_a
[2019-02-14 02:38:44.093518] I [MSGID: 106499] [glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management: Received status volume req for volume prod_b
The message "E [MSGID: 101191] [event-epoll.c:671:event_dispatch_epoll_worker] 0-epoll: Failed to dispatch handler" repeated 2 times between [2019-02-14 02:37:59.193409] and [2019-02-14 02:38:44.100494]
[2019-02-14 02:41:58.649683] E [MSGID: 101191] [event-epoll.c:671:event_dispatch_epoll_worker] 0-epoll: Failed to dispatch handler
The message "E [MSGID: 101191] [event-epoll.c:671:event_dispatch_epoll_worker] 0-epoll: Failed to dispatch handler" repeated 6 times between [2019-02-14 02:41:58.649683] and [2019-02-14 02:43:00.286999]
[2019-02-14 02:43:46.366743] I [MSGID: 106499] [glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management: Received status volume req for volume engine
[2019-02-14 02:43:46.373587] I [MSGID: 106499] [glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management: Received status volume req for volume isos
[2019-02-14 02:43:46.378997] I [MSGID: 106499] [glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management: Received status volume req for volume non_prod_a
[2019-02-14 02:43:46.384324] I [MSGID: 106499] [glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management: Received status volume req for volume non_prod_b
[2019-02-14 02:43:46.390310] I [MSGID: 106499] [glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management: Received status volume req for volume prod_a
[2019-02-14 02:43:46.397031] I [MSGID: 106499] [glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management: Received status volume req for volume prod_b
[2019-02-14 02:43:46.404083] E [MSGID: 101191] [event-epoll.c:671:event_dispatch_epoll_worker] 0-epoll: Failed to dispatch handler
[2019-02-14 02:45:47.302884] I [MSGID: 106499] [glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management: Received status volume req for volume engine
[2019-02-14 02:45:47.309697] I [MSGID: 106499] [glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management: Received status volume req for volume isos
[2019-02-14 02:45:47.315149] I [MSGID: 106499] [glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management: Received status volume req for volume non_prod_a
[2019-02-14 02:45:47.320806] I [MSGID: 106499] [glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management: Received status volume req for volume non_prod_b
[2019-02-14 02:45:47.326865] I [MSGID: 106499] [glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management: Received status volume req for volume prod_a
[2019-02-14 02:45:47.332192] I [MSGID: 106499] [glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management: Received status volume req for volume prod_b
[2019-02-14 02:45:47.338991] E [MSGID: 101191] [event-epoll.c:671:event_dispatch_epoll_worker] 0-epoll: Failed to dispatch handler
[2019-02-14 02:46:47.789575] I [MSGID: 106499] [glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management: Received status volume req for volume non_prod_b
[2019-02-14 02:46:47.795276] I [MSGID: 106499] [glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management: Received status volume req for volume prod_a
[2019-02-14 02:46:47.800584] I [MSGID: 106499] [glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management: Received status volume req for volume prod_b
[2019-02-14 02:46:47.770601] I [MSGID: 106499] [glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management: Received status volume req for volume engine
[2019-02-14 02:46:47.778161] I [MSGID: 106499] [glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management: Received status volume req for volume isos
[2019-02-14 02:46:47.784020] I [MSGID: 106499] [glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management: Received status volume req for volume non_prod_a
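
I haven't pulled the brick's own log yet, but for completeness: the brick process logs under /var/log/glusterfs/bricks/, named after the brick path, so something like the following should show what the brick itself saw around that time (the filename is my guess from the brick path, and Gluster log timestamps may be UTC rather than local time):

# grep "2019-02-14 02:4" /var/log/glusterfs/bricks/gluster_bricks-non_prod_b-non_prod_b.log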

engine.log

# grep -B20 -A20 "2019-02-14 02:41:48" engine.log-20190214
2019-02-14 02:41:43,495-04 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GetGlusterLocalLogicalVolumeListVDSCommand] (DefaultQuartzScheduler1) [5ff5b093] START, GetGlusterLocalLogicalVolumeListVDSCommand(HostName = Host1, VdsIdVDSCommandParametersBase:{hostId='fb1e62d5-1dc1-4ccc-8b2b-cf48f7077d0d'}), log id: 172c9ee8
2019-02-14 02:41:43,609-04 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GetGlusterLocalLogicalVolumeListVDSCommand] (DefaultQuartzScheduler1) [5ff5b093] FINISH, GetGlusterLocalLogicalVolumeListVDSCommand, return: [org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@479fcb69, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@6443e68f, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@2b4cf035, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@5864f06a, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@6119ac8c, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@1a9549be, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@5614cf81, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@290c9289, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@5dd26e8, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@35355754, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@452deeb4, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@8f8b442, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@647e29d3, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@7bee4dff, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@511c4478, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@1c0bb0bd, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@92e325e, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@260731, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@33aaacc9, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@72657c59, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@aa10c89], log id: 172c9ee8
2019-02-14 02:41:43,610-04 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GetGlusterLocalPhysicalVolumeListVDSCommand] (DefaultQuartzScheduler1) [5ff5b093] START, GetGlusterLocalPhysicalVolumeListVDSCommand(HostName = Host1, VdsIdVDSCommandParametersBase:{hostId='fb1e62d5-1dc1-4ccc-8b2b-cf48f7077d0d'}), log id: 3a0e9d63
2019-02-14 02:41:43,703-04 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GetGlusterLocalPhysicalVolumeListVDSCommand] (DefaultQuartzScheduler1) [5ff5b093] FINISH, GetGlusterLocalPhysicalVolumeListVDSCommand, return: [org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalPhysicalVolume@5ca4a20f, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalPhysicalVolume@57a8a76, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalPhysicalVolume@7bd1b14], log id: 3a0e9d63
2019-02-14 02:41:43,704-04 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GetGlusterVDOVolumeListVDSCommand] (DefaultQuartzScheduler1) [5ff5b093] START, GetGlusterVDOVolumeListVDSCommand(HostName = Host1, VdsIdVDSCommandParametersBase:{hostId='fb1e62d5-1dc1-4ccc-8b2b-cf48f7077d0d'}), log id: 49966b05
2019-02-14 02:41:44,213-04 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GetGlusterVDOVolumeListVDSCommand] (DefaultQuartzScheduler1) [5ff5b093] FINISH, GetGlusterVDOVolumeListVDSCommand, return: [], log id: 49966b05
2019-02-14 02:41:44,214-04 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GetGlusterLocalLogicalVolumeListVDSCommand] (DefaultQuartzScheduler1) [5ff5b093] START, GetGlusterLocalLogicalVolumeListVDSCommand(HostName = Host2, VdsIdVDSCommandParametersBase:{hostId='fd0752d8-2d41-45b0-887a-0ffacbb8a237'}), log id: 30db0ce2
2019-02-14 02:41:44,311-04 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GetGlusterLocalLogicalVolumeListVDSCommand] (DefaultQuartzScheduler1) [5ff5b093] FINISH, GetGlusterLocalLogicalVolumeListVDSCommand, return: [org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@61a309b5, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@ea9cb2e, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@749d57bd, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@1c49f9d0, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@655eb54d, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@256ee273, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@3bd079dc, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@6804900f, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@78e0a49f, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@2acfbc8a, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@12e92e96, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@5ea1502c, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@2398c33b, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@7464102e, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@2f221daa, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@7b561852, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@1eb29d18, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@4a030b80, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@75739027, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@3eac8253, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@34fc82c3], log id: 30db0ce2
2019-02-14 02:41:44,312-04 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GetGlusterLocalPhysicalVolumeListVDSCommand] (DefaultQuartzScheduler1) [5ff5b093] START, GetGlusterLocalPhysicalVolumeListVDSCommand(HostName = Host2, VdsIdVDSCommandParametersBase:{hostId='fd0752d8-2d41-45b0-887a-0ffacbb8a237'}), log id: 6671d0d7
2019-02-14 02:41:44,329-04 INFO  [org.ovirt.engine.core.bll.lock.InMemoryLockManager] (DefaultQuartzScheduler3) [7b9bd2d] Failed to acquire lock and wait lock 'EngineLock:{exclusiveLocks='[a45fe964-9989-11e8-b3f7-00163e4bf18a=GLUSTER]', sharedLocks=''}'
2019-02-14 02:41:44,345-04 INFO  [org.ovirt.engine.core.bll.lock.InMemoryLockManager] (DefaultQuartzScheduler3) [7b9bd2d] Failed to acquire lock and wait lock 'EngineLock:{exclusiveLocks='[a45fe964-9989-11e8-b3f7-00163e4bf18a=GLUSTER]', sharedLocks=''}'
2019-02-14 02:41:44,374-04 INFO  [org.ovirt.engine.core.bll.lock.InMemoryLockManager] (DefaultQuartzScheduler3) [7b9bd2d] Failed to acquire lock and wait lock 'EngineLock:{exclusiveLocks='[a45fe964-9989-11e8-b3f7-00163e4bf18a=GLUSTER]', sharedLocks=''}'
2019-02-14 02:41:44,405-04 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GetGlusterLocalPhysicalVolumeListVDSCommand] (DefaultQuartzScheduler1) [5ff5b093] FINISH, GetGlusterLocalPhysicalVolumeListVDSCommand, return: [org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalPhysicalVolume@f6a9696, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalPhysicalVolume@558e3332, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalPhysicalVolume@5b449da], log id: 6671d0d7
2019-02-14 02:41:44,406-04 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GetGlusterVDOVolumeListVDSCommand] (DefaultQuartzScheduler1) [5ff5b093] START, GetGlusterVDOVolumeListVDSCommand(HostName = Host2, VdsIdVDSCommandParametersBase:{hostId='fd0752d8-2d41-45b0-887a-0ffacbb8a237'}), log id: 6d2bc6d3
2019-02-14 02:41:44,908-04 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GetGlusterVDOVolumeListVDSCommand] (DefaultQuartzScheduler1) [5ff5b093] FINISH, GetGlusterVDOVolumeListVDSCommand, return: [], log id: 6d2bc6d3
2019-02-14 02:41:44,909-04 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GetGlusterVolumeAdvancedDetailsVDSCommand] (DefaultQuartzScheduler1) [5ff5b093] START, GetGlusterVolumeAdvancedDetailsVDSCommand(HostName = Host0, GlusterVolumeAdvancedDetailsVDSParameters:{hostId='771c67eb-56e6-4736-8c67-668502d4ecf5', volumeName='non_prod_b'}), log id: 36ae23c6
2019-02-14 02:41:47,336-04 INFO  [org.ovirt.engine.core.bll.lock.InMemoryLockManager] (DefaultQuartzScheduler3) [7b9bd2d] Failed to acquire lock and wait lock 'EngineLock:{exclusiveLocks='[a45fe964-9989-11e8-b3f7-00163e4bf18a=GLUSTER]', sharedLocks=''}'
2019-02-14 02:41:47,351-04 INFO  [org.ovirt.engine.core.bll.lock.InMemoryLockManager] (DefaultQuartzScheduler3) [7b9bd2d] Failed to acquire lock and wait lock 'EngineLock:{exclusiveLocks='[a45fe964-9989-11e8-b3f7-00163e4bf18a=GLUSTER]', sharedLocks=''}'
2019-02-14 02:41:47,379-04 INFO  [org.ovirt.engine.core.bll.lock.InMemoryLockManager] (DefaultQuartzScheduler3) [7b9bd2d] Failed to acquire lock and wait lock 'EngineLock:{exclusiveLocks='[a45fe964-9989-11e8-b3f7-00163e4bf18a=GLUSTER]', sharedLocks=''}'
2019-02-14 02:41:47,979-04 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GetGlusterVolumeAdvancedDetailsVDSCommand] (DefaultQuartzScheduler1) [5ff5b093] FINISH, GetGlusterVolumeAdvancedDetailsVDSCommand, return: org.ovirt.engine.core.common.businessentities.gluster.GlusterVolumeAdvancedDetails@7a4a787b, log id: 36ae23c6
2019-02-14 02:41:48,018-04 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler1) [5ff5b093] EVENT_ID: GLUSTER_BRICK_STATUS_CHANGED(4,086), Detected change in status of brick host2.replaced.domain.com:/gluster_bricks/non_prod_b/non_prod_b of volume non_prod_b of cluster Default from UP to DOWN via cli.
2019-02-14 02:41:48,046-04 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler1) [5ff5b093] EVENT_ID: GLUSTER_BRICK_STATUS_DOWN(4,151), Status of brick host2.replaced.domain.com:/gluster_bricks/non_prod_b/non_prod_b of volume non_prod_b on cluster Default is down.
2019-02-14 02:41:48,139-04 INFO  [org.ovirt.engine.core.bll.lock.InMemoryLockManager] (DefaultQuartzScheduler1) [5ff5b093] Failed to acquire lock and wait lock 'EngineLock:{exclusiveLocks='[a45fe964-9989-11e8-b3f7-00163e4bf18a=GLUSTER]', sharedLocks=''}'
2019-02-14 02:41:48,140-04 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand] (DefaultQuartzScheduler3) [7b9bd2d] START, GlusterServersListVDSCommand(HostName = Host0, VdsIdVDSCommandParametersBase:{hostId='771c67eb-56e6-4736-8c67-668502d4ecf5'}), log id: e1fb23
2019-02-14 02:41:48,911-04 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand] (DefaultQuartzScheduler3) [7b9bd2d] FINISH, GlusterServersListVDSCommand, return: [10.12.0.220/24:CONNECTED, host1.replaced.domain.com:CONNECTED, host2.replaced.domain.com:CONNECTED], log id: e1fb23
2019-02-14 02:41:48,930-04 INFO  [org.ovirt.engine.core.bll.lock.InMemoryLockManager] (DefaultQuartzScheduler1) [5ff5b093] Failed to acquire lock and wait lock 'EngineLock:{exclusiveLocks='[a45fe964-9989-11e8-b3f7-00163e4bf18a=GLUSTER]', sharedLocks=''}'
2019-02-14 02:41:48,931-04 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand] (DefaultQuartzScheduler3) [7b9bd2d] START, GlusterVolumesListVDSCommand(HostName = Host0, GlusterVolumesListVDSParameters:{hostId='771c67eb-56e6-4736-8c67-668502d4ecf5'}), log id: 68f1aecc
2019-02-14 02:41:49,366-04 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand] (DefaultQuartzScheduler3) [7b9bd2d] FINISH, GlusterVolumesListVDSCommand, return: {6c05dfc6-4dc0-41e3-a12f-55b4767f1d35=org.ovirt.engine.core.common.businessentities.gluster.GlusterVolumeEntity@1952a85, 3f8f6a0f-aed4-48e3-9129-18a2a3f64eef=org.ovirt.engine.core.common.businessentities.gluster.GlusterVolumeEntity@2f6688ae, 71ff56d9-79b8-445d-b637-72ffc974f109=org.ovirt.engine.core.common.businessentities.gluster.GlusterVolumeEntity@730210fb, 752a9438-cd11-426c-b384-bc3c5f86ed07=org.ovirt.engine.core.common.businessentities.gluster.GlusterVolumeEntity@c3be510c, c3e7447e-8514-4e4a-9ff5-a648fe6aa537=org.ovirt.engine.core.common.businessentities.gluster.GlusterVolumeEntity@450befac, 79e8e93c-57c8-4541-a360-726cec3790cf=org.ovirt.engine.core.common.businessentities.gluster.GlusterVolumeEntity@1926e392}, log id: 68f1aecc
2019-02-14 02:41:49,489-04 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GetGlusterLocalLogicalVolumeListVDSCommand] (DefaultQuartzScheduler1) [5ff5b093] START, GetGlusterLocalLogicalVolumeListVDSCommand(HostName = Host0, VdsIdVDSCommandParametersBase:{hostId='771c67eb-56e6-4736-8c67-668502d4ecf5'}), log id: 38debe74
2019-02-14 02:41:49,581-04 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GetGlusterLocalLogicalVolumeListVDSCommand] (DefaultQuartzScheduler1) [5ff5b093] FINISH, GetGlusterLocalLogicalVolumeListVDSCommand, return: [org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@5e5a7925, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@2cdf5c9e, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@443cb62, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@49a3e880, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@443d23c0, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@1250bc75, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@8d27d86, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@5e6363f4, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@73ed78db, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@64c9d1c7, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@7fecbe95, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@3a551e5f, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@2266926e, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@88b380c, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@1209279e, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@3c6466, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@16df63ed, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@47456262, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@1c2b88c3, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@7f57c074, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@12fa0478], log id: 38debe74
2019-02-14 02:41:49,582-04 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GetGlusterLocalPhysicalVolumeListVDSCommand] (DefaultQuartzScheduler1) [5ff5b093] START, GetGlusterLocalPhysicalVolumeListVDSCommand(HostName = Host0, VdsIdVDSCommandParametersBase:{hostId='771c67eb-56e6-4736-8c67-668502d4ecf5'}), log id: 7ec02237
2019-02-14 02:41:49,660-04 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GetGlusterLocalPhysicalVolumeListVDSCommand] (DefaultQuartzScheduler1) [5ff5b093] FINISH, GetGlusterLocalPhysicalVolumeListVDSCommand, return: [org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalPhysicalVolume@3eedd0bc, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalPhysicalVolume@7f78e375, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalPhysicalVolume@3d63e126], log id: 7ec02237
2019-02-14 02:41:49,661-04 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GetGlusterVDOVolumeListVDSCommand] (DefaultQuartzScheduler1) [5ff5b093] START, GetGlusterVDOVolumeListVDSCommand(HostName = Host0, VdsIdVDSCommandParametersBase:{hostId='771c67eb-56e6-4736-8c67-668502d4ecf5'}), log id: 42cdad27
2019-02-14 02:41:50,142-04 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GetGlusterVDOVolumeListVDSCommand] (DefaultQuartzScheduler1) [5ff5b093] FINISH, GetGlusterVDOVolumeListVDSCommand, return: [], log id: 42cdad27
2019-02-14 02:41:50,143-04 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GetGlusterLocalLogicalVolumeListVDSCommand] (DefaultQuartzScheduler1) [5ff5b093] START, GetGlusterLocalLogicalVolumeListVDSCommand(HostName = Host1, VdsIdVDSCommandParametersBase:{hostId='fb1e62d5-1dc1-4ccc-8b2b-cf48f7077d0d'}), log id: 12f5fdf2
2019-02-14 02:41:50,248-04 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GetGlusterLocalLogicalVolumeListVDSCommand] (DefaultQuartzScheduler1) [5ff5b093] FINISH, GetGlusterLocalLogicalVolumeListVDSCommand, return: [org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@2aaed792, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@8e66930, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@276d599e, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@1aca2aec, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@46846c60, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@7d103269, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@30fc25fc, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@7baae445, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@1ea8603c, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@62578afa, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@33d58089, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@1f71d27a, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@4205e828, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@1c5bbac8, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@395a002, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@12664008, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@7f4faec4, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@3e03d61f, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@1038e46d, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@307e8062, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@32453127], log id: 12f5fdf2
2019-02-14 02:41:50,249-04 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GetGlusterLocalPhysicalVolumeListVDSCommand] (DefaultQuartzScheduler1) [5ff5b093] START, GetGlusterLocalPhysicalVolumeListVDSCommand(HostName = Host1, VdsIdVDSCommandParametersBase:{hostId='fb1e62d5-1dc1-4ccc-8b2b-cf48f7077d0d'}), log id: 1256aa5e
2019-02-14 02:41:50,338-04 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GetGlusterLocalPhysicalVolumeListVDSCommand] (DefaultQuartzScheduler1) [5ff5b093] FINISH, GetGlusterLocalPhysicalVolumeListVDSCommand, return: [org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalPhysicalVolume@459a2ff5, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalPhysicalVolume@123cab4, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalPhysicalVolume@1af41fbe], log id: 1256aa5e
2019-02-14 02:41:50,339-04 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GetGlusterVDOVolumeListVDSCommand] (DefaultQuartzScheduler1) [5ff5b093] START, GetGlusterVDOVolumeListVDSCommand(HostName = Host1, VdsIdVDSCommandParametersBase:{hostId='fb1e62d5-1dc1-4ccc-8b2b-cf48f7077d0d'}), log id: 3dd752e4
2019-02-14 02:41:50,847-04 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GetGlusterVDOVolumeListVDSCommand] (DefaultQuartzScheduler1) [5ff5b093] FINISH, GetGlusterVDOVolumeListVDSCommand, return: [], log id: 3dd752e4
2019-02-14 02:41:50,848-04 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GetGlusterLocalLogicalVolumeListVDSCommand] (DefaultQuartzScheduler1) [5ff5b093] START, GetGlusterLocalLogicalVolumeListVDSCommand(HostName = Host2, VdsIdVDSCommandParametersBase:{hostId='fd0752d8-2d41-45b0-887a-0ffacbb8a237'}), log id: 29a6272c
2019-02-14 02:41:50,954-04 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GetGlusterLocalLogicalVolumeListVDSCommand] (DefaultQuartzScheduler1) [5ff5b093] FINISH, GetGlusterLocalLogicalVolumeListVDSCommand, return: [org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@364f3ec6, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@c7cce5e, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@b3bed47, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@13bc244b, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@5cca81f4, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@36aeba0d, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@62ab384a, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@1047d628, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@188a30f5, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@5bb79f3b, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@60e5956f, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@4e3df9cd, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@7796567, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@60d06cf4, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@2cd2d36c, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@d80a4aa, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@411eaa20, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@22cac93b, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@18b927bd, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@101465f4, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@246f927c], log id: 29a6272c
2019-02-14 02:41:50,955-04 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GetGlusterLocalPhysicalVolumeListVDSCommand] (DefaultQuartzScheduler1) [5ff5b093] START, GetGlusterLocalPhysicalVolumeListVDSCommand(HostName = Host2, VdsIdVDSCommandParametersBase:{hostId='fd0752d8-2d41-45b0-887a-0ffacbb8a237'}), log id: 501814db
2019-02-14 02:41:51,044-04 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GetGlusterLocalPhysicalVolumeListVDSCommand] (DefaultQuartzScheduler1) [5ff5b093] FINISH, GetGlusterLocalPhysicalVolumeListVDSCommand, return: [org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalPhysicalVolume@1cd55aa, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalPhysicalVolume@32c5aba2, org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalPhysicalVolume@6ae123f4], log id: 501814db
2019-02-14 02:41:51,045-04 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GetGlusterVDOVolumeListVDSCommand] (DefaultQuartzScheduler1) [5ff5b093] START, GetGlusterVDOVolumeListVDSCommand(HostName = Host2, VdsIdVDSCommandParametersBase:{hostId='fd0752d8-2d41-45b0-887a-0ffacbb8a237'}), log id: 7acf4cbf
2019-02-14 02:41:51,546-04 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GetGlusterVDOVolumeListVDSCommand] (DefaultQuartzScheduler1) [5ff5b093] FINISH, GetGlusterVDOVolumeListVDSCommand, return: [], log id: 7acf4cbf
2019-02-14 02:41:51,547-04 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GetGlusterVolumeAdvancedDetailsVDSCommand] (DefaultQuartzScheduler1) [5ff5b093] START, GetGlusterVolumeAdvancedDetailsVDSCommand(HostName = Host0, GlusterVolumeAdvancedDetailsVDSParameters:{hostId='771c67eb-56e6-4736-8c67-668502d4ecf5', volumeName='non_prod_a'}), log id: 11c42649

On Thu, Feb 14, 2019 at 10:16 AM Sandro Bonazzola <sbonazzo@redhat.com> wrote:


On Thu, Feb 14, 2019 at 7:54 AM Jayme <jaymef@gmail.com> wrote:
I have a three-node HCI Gluster cluster which was previously running 4.2 with zero problems. I just upgraded it yesterday. I ran into a few bugs right away with the upgrade process, but aside from that I also discovered other users with severe GlusterFS problems since the upgrade to the new GlusterFS version. It is less than 24 hours since I upgraded my cluster, and I just got a notice that one of my GlusterFS bricks is offline. There does appear to be a very real and serious issue here with the latest updates.

We are tracking the issue on the Gluster side in this bug: https://bugzilla.redhat.com/show_bug.cgi?id=1677160
If you can help the Gluster community by providing the requested logs, that would be great.
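
(For example, something along these lines on each host should capture the glusterd and brick logs that usually get requested; just a sketch, the bug may ask for a different set:

# tar czf gluster-logs-$(hostname).tar.gz /var/log/glusterfs

Then attach the archives to the Bugzilla report.)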

On Wed, Feb 13, 2019 at 7:26 PM <dscott@umbctraining.com> wrote:
I'm abandoning my production oVirt cluster due to instability. I have a 7-host cluster running about 300 VMs and have been for over a year. It has become unstable over the past three days. I have random hosts, both compute and storage, disconnecting, and many VMs disconnecting and becoming unusable.

The 7 hosts are 4 compute hosts running oVirt 4.2.8 and 3 GlusterFS hosts running 3.12.5. I submitted a Bugzilla bug and it was immediately assigned to the storage people, but they have not responded with any meaningful information. I have submitted several logs.

I have found some discussion of instability problems with Gluster 3.12.5. I would be willing to upgrade my Gluster to a more stable version if that's the culprit. I installed Gluster using the oVirt GUI, and this is the version it installed.
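
(To confirm exactly what's installed, something like this on each storage host should do it; a sketch, the package name may vary by distribution:

# gluster --version
# rpm -q glusterfs-server)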

Is there an oVirt health monitor available? Where should I be looking to get a resolution to the problems I'm facing?


--
Sandro Bonazzola
Manager, Software Engineering, EMEA R&D RHV
Red Hat EMEA
sbonazzo@redhat.com