https://bugzilla.redhat.com/show_bug.cgi?id=1677160 doesn't seem relevant
to me. Is that the correct link?
As I mentioned in a previous email, I'm also having problems with Gluster
bricks going offline since upgrading to oVirt 4.3 yesterday (previously I
never had a single issue with Gluster, nor did a brick ever go down). I
suspect this will continue to happen daily, as some other users on this
group have suggested. I was able to pull some logs from the engine and from
Gluster from around the time the brick dropped. My setup is a 3-node HCI
cluster, and I was running the latest 4.2 updates before upgrading to 4.3.
My hardware has a lot of headroom and I'm on a 10GbE Gluster backend (the
servers were certainly not under any significant load when the brick went
offline). To recover I had to place the host in maintenance mode and reboot
(although I suspect I could have simply unmounted and remounted the Gluster
mounts).
# grep "2019-02-14" engine.log-20190214 | grep "GLUSTER_BRICK_STATUS_CHANGED"
2019-02-14 02:41:48,018-04 WARN
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(DefaultQuartzScheduler1) [5ff5b093] EVENT_ID:
GLUSTER_BRICK_STATUS_CHANGED(4,086), Detected change in status of brick
host2.replaced.domain.com:/gluster_bricks/non_prod_b/non_prod_b of volume
non_prod_b of cluster Default from UP to DOWN via cli.
2019-02-14 03:20:11,189-04 WARN
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(DefaultQuartzScheduler3) [760f7851] EVENT_ID:
GLUSTER_BRICK_STATUS_CHANGED(4,086), Detected change in status of brick
host2.replaced.domain.com:/gluster_bricks/engine/engine of volume engine of
cluster Default from DOWN to UP via cli.
2019-02-14 03:20:14,819-04 WARN
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(DefaultQuartzScheduler3) [760f7851] EVENT_ID:
GLUSTER_BRICK_STATUS_CHANGED(4,086), Detected change in status of brick
host2.replaced.domain.com:/gluster_bricks/prod_b/prod_b of volume prod_b of
cluster Default from DOWN to UP via cli.
2019-02-14 03:20:19,692-04 WARN
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(DefaultQuartzScheduler3) [760f7851] EVENT_ID:
GLUSTER_BRICK_STATUS_CHANGED(4,086), Detected change in status of brick
host2.replaced.domain.com:/gluster_bricks/isos/isos of volume isos of
cluster Default from DOWN to UP via cli.
2019-02-14 03:20:25,022-04 WARN
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(DefaultQuartzScheduler3) [760f7851] EVENT_ID:
GLUSTER_BRICK_STATUS_CHANGED(4,086), Detected change in status of brick
host2.replaced.domain.com:/gluster_bricks/prod_a/prod_a of volume prod_a of
cluster Default from DOWN to UP via cli.
2019-02-14 03:20:29,088-04 WARN
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(DefaultQuartzScheduler3) [760f7851] EVENT_ID:
GLUSTER_BRICK_STATUS_CHANGED(4,086), Detected change in status of brick
host2.replaced.domain.com:/gluster_bricks/non_prod_b/non_prod_b of volume
non_prod_b of cluster Default from DOWN to UP via cli.
2019-02-14 03:20:34,099-04 WARN
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(DefaultQuartzScheduler3) [760f7851] EVENT_ID:
GLUSTER_BRICK_STATUS_CHANGED(4,086), Detected change in status of brick
host2.replaced.domain.com:/gluster_bricks/non_prod_a/non_prod_a of volume
non_prod_a of cluster Default from DOWN to UP via cli
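For anyone comparing notes, the brick status events above can be condensed to one line per transition. This is only a rough sketch: the function name brick_transitions is made up for illustration, and it assumes each event occupies a single line in the real engine.log (the excerpt above is wrapped by the mail client).

```bash
#!/usr/bin/env bash
# Hypothetical helper: print "timestamp brick OLD->NEW" for every
# GLUSTER_BRICK_STATUS_CHANGED event read from stdin.
# Assumption: each event is a single, unwrapped line in engine.log.
brick_transitions() {
    grep 'GLUSTER_BRICK_STATUS_CHANGED' |
    sed -E 's/^([0-9-]+ [0-9:,]+)[^ ]* .*status of brick ([^ ]+) of volume .* from ([A-Z]+) to ([A-Z]+) via.*/\1 \2 \3->\4/'
}
```

It would be run as `brick_transitions < engine.log-20190214`, which makes it easier to see which bricks went DOWN and when they came back UP.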
glusterd.log
# grep -B20 -A20 "2019-02-14 02:41" glusterd.log
[2019-02-14 02:36:49.585034] I [MSGID: 106499]
[glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume non_prod_b
[2019-02-14 02:36:49.597788] E [MSGID: 101191]
[event-epoll.c:671:event_dispatch_epoll_worker] 0-epoll: Failed to dispatch
handler
The message "E [MSGID: 101191]
[event-epoll.c:671:event_dispatch_epoll_worker] 0-epoll: Failed to dispatch
handler" repeated 2 times between [2019-02-14 02:36:49.597788] and
[2019-02-14 02:36:49.900505]
[2019-02-14 02:36:53.437539] I [MSGID: 106499]
[glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume non_prod_a
[2019-02-14 02:36:53.452816] E [MSGID: 101191]
[event-epoll.c:671:event_dispatch_epoll_worker] 0-epoll: Failed to dispatch
handler
[2019-02-14 02:36:53.864153] I [MSGID: 106499]
[glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume non_prod_a
[2019-02-14 02:36:53.875835] E [MSGID: 101191]
[event-epoll.c:671:event_dispatch_epoll_worker] 0-epoll: Failed to dispatch
handler
[2019-02-14 02:36:30.958649] I [MSGID: 106499]
[glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume engine
[2019-02-14 02:36:35.322129] I [MSGID: 106499]
[glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume prod_b
[2019-02-14 02:36:39.639645] I [MSGID: 106499]
[glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume isos
[2019-02-14 02:36:45.301275] I [MSGID: 106499]
[glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume prod_a
The message "E [MSGID: 101191]
[event-epoll.c:671:event_dispatch_epoll_worker] 0-epoll: Failed to dispatch
handler" repeated 2 times between [2019-02-14 02:36:53.875835] and
[2019-02-14 02:36:54.180780]
[2019-02-14 02:37:59.193409] E [MSGID: 101191]
[event-epoll.c:671:event_dispatch_epoll_worker] 0-epoll: Failed to dispatch
handler
[2019-02-14 02:38:44.065560] I [MSGID: 106499]
[glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume engine
[2019-02-14 02:38:44.072680] I [MSGID: 106499]
[glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume isos
[2019-02-14 02:38:44.077841] I [MSGID: 106499]
[glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume non_prod_a
[2019-02-14 02:38:44.082798] I [MSGID: 106499]
[glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume non_prod_b
[2019-02-14 02:38:44.088237] I [MSGID: 106499]
[glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume prod_a
[2019-02-14 02:38:44.093518] I [MSGID: 106499]
[glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume prod_b
The message "E [MSGID: 101191]
[event-epoll.c:671:event_dispatch_epoll_worker] 0-epoll: Failed to dispatch
handler" repeated 2 times between [2019-02-14 02:37:59.193409] and
[2019-02-14 02:38:44.100494]
[2019-02-14 02:41:58.649683] E [MSGID: 101191]
[event-epoll.c:671:event_dispatch_epoll_worker] 0-epoll: Failed to dispatch
handler
The message "E [MSGID: 101191]
[event-epoll.c:671:event_dispatch_epoll_worker] 0-epoll: Failed to dispatch
handler" repeated 6 times between [2019-02-14 02:41:58.649683] and
[2019-02-14 02:43:00.286999]
[2019-02-14 02:43:46.366743] I [MSGID: 106499]
[glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume engine
[2019-02-14 02:43:46.373587] I [MSGID: 106499]
[glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume isos
[2019-02-14 02:43:46.378997] I [MSGID: 106499]
[glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume non_prod_a
[2019-02-14 02:43:46.384324] I [MSGID: 106499]
[glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume non_prod_b
[2019-02-14 02:43:46.390310] I [MSGID: 106499]
[glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume prod_a
[2019-02-14 02:43:46.397031] I [MSGID: 106499]
[glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume prod_b
[2019-02-14 02:43:46.404083] E [MSGID: 101191]
[event-epoll.c:671:event_dispatch_epoll_worker] 0-epoll: Failed to dispatch
handler
[2019-02-14 02:45:47.302884] I [MSGID: 106499]
[glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume engine
[2019-02-14 02:45:47.309697] I [MSGID: 106499]
[glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume isos
[2019-02-14 02:45:47.315149] I [MSGID: 106499]
[glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume non_prod_a
[2019-02-14 02:45:47.320806] I [MSGID: 106499]
[glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume non_prod_b
[2019-02-14 02:45:47.326865] I [MSGID: 106499]
[glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume prod_a
[2019-02-14 02:45:47.332192] I [MSGID: 106499]
[glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume prod_b
[2019-02-14 02:45:47.338991] E [MSGID: 101191]
[event-epoll.c:671:event_dispatch_epoll_worker] 0-epoll: Failed to dispatch
handler
[2019-02-14 02:46:47.789575] I [MSGID: 106499]
[glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume non_prod_b
[2019-02-14 02:46:47.795276] I [MSGID: 106499]
[glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume prod_a
[2019-02-14 02:46:47.800584] I [MSGID: 106499]
[glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume prod_b
[2019-02-14 02:46:47.770601] I [MSGID: 106499]
[glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume engine
[2019-02-14 02:46:47.778161] I [MSGID: 106499]
[glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume isos
[2019-02-14 02:46:47.784020] I [MSGID: 106499]
[glusterd-handler.c:4389:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume non_prod_a
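To see whether the "Failed to dispatch handler" errors cluster around the time the brick dropped, they can be counted per minute. Again just a sketch under assumptions: the function name dispatch_errors_per_minute is hypothetical, each log entry is assumed to be a single line in the real glusterd.log, and suppressed repeats (the "repeated N times" summary lines) are not counted.

```bash
#!/usr/bin/env bash
# Hypothetical helper: count MSGID 101191 error entries per minute,
# reading glusterd.log from stdin. Prints "YYYY-MM-DD HH:MM count".
# Assumption: every entry starts with "[timestamp] E [MSGID: 101191]"
# on one line; "The message ... repeated N times" lines are skipped.
dispatch_errors_per_minute() {
    grep -E '^\[[0-9-]+ [0-9:.]+\] E \[MSGID: 101191\]' |
    cut -c2-17 |            # keep "YYYY-MM-DD HH:MM" from the timestamp
    sort | uniq -c |
    awk '{ print $2, $3, $1 }'
}
```

It would be run as `dispatch_errors_per_minute < glusterd.log`.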
engine.log
# grep -B20 -A20 "2019-02-14 02:41:48" engine.log-20190214
2019-02-14 02:41:43,495-04 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GetGlusterLocalLogicalVolumeListVDSCommand]
(DefaultQuartzScheduler1) [5ff5b093] START,
GetGlusterLocalLogicalVolumeListVDSCommand(HostName = Host1,
VdsIdVDSCommandParametersBase:{hostId='fb1e62d5-1dc1-4ccc-8b2b-cf48f7077d0d'}),
log id: 172c9ee8
2019-02-14 02:41:43,609-04 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GetGlusterLocalLogicalVolumeListVDSCommand]
(DefaultQuartzScheduler1) [5ff5b093] FINISH,
GetGlusterLocalLogicalVolumeListVDSCommand, return:
[org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@479fcb69,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@6443e68f,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@2b4cf035,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@5864f06a,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@6119ac8c,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@1a9549be,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@5614cf81,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@290c9289,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@5dd26e8,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@35355754,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@452deeb4,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@8f8b442,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@647e29d3,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@7bee4dff,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@511c4478,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@1c0bb0bd,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@92e325e,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@260731,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@33aaacc9,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@72657c59,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@aa10c89],
log id: 172c9ee8
2019-02-14 02:41:43,610-04 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GetGlusterLocalPhysicalVolumeListVDSCommand]
(DefaultQuartzScheduler1) [5ff5b093] START,
GetGlusterLocalPhysicalVolumeListVDSCommand(HostName = Host1,
VdsIdVDSCommandParametersBase:{hostId='fb1e62d5-1dc1-4ccc-8b2b-cf48f7077d0d'}),
log id: 3a0e9d63
2019-02-14 02:41:43,703-04 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GetGlusterLocalPhysicalVolumeListVDSCommand]
(DefaultQuartzScheduler1) [5ff5b093] FINISH,
GetGlusterLocalPhysicalVolumeListVDSCommand, return:
[org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalPhysicalVolume@5ca4a20f,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalPhysicalVolume@57a8a76,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalPhysicalVolume@7bd1b14],
log id: 3a0e9d63
2019-02-14 02:41:43,704-04 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GetGlusterVDOVolumeListVDSCommand]
(DefaultQuartzScheduler1) [5ff5b093] START,
GetGlusterVDOVolumeListVDSCommand(HostName = Host1,
VdsIdVDSCommandParametersBase:{hostId='fb1e62d5-1dc1-4ccc-8b2b-cf48f7077d0d'}),
log id: 49966b05
2019-02-14 02:41:44,213-04 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GetGlusterVDOVolumeListVDSCommand]
(DefaultQuartzScheduler1) [5ff5b093] FINISH,
GetGlusterVDOVolumeListVDSCommand, return: [], log id: 49966b05
2019-02-14 02:41:44,214-04 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GetGlusterLocalLogicalVolumeListVDSCommand]
(DefaultQuartzScheduler1) [5ff5b093] START,
GetGlusterLocalLogicalVolumeListVDSCommand(HostName = Host2,
VdsIdVDSCommandParametersBase:{hostId='fd0752d8-2d41-45b0-887a-0ffacbb8a237'}),
log id: 30db0ce2
2019-02-14 02:41:44,311-04 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GetGlusterLocalLogicalVolumeListVDSCommand]
(DefaultQuartzScheduler1) [5ff5b093] FINISH,
GetGlusterLocalLogicalVolumeListVDSCommand, return:
[org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@61a309b5,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@ea9cb2e,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@749d57bd,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@1c49f9d0,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@655eb54d,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@256ee273,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@3bd079dc,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@6804900f,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@78e0a49f,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@2acfbc8a,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@12e92e96,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@5ea1502c,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@2398c33b,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@7464102e,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@2f221daa,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@7b561852,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@1eb29d18,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@4a030b80,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@75739027,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@3eac8253,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@34fc82c3],
log id: 30db0ce2
2019-02-14 02:41:44,312-04 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GetGlusterLocalPhysicalVolumeListVDSCommand]
(DefaultQuartzScheduler1) [5ff5b093] START,
GetGlusterLocalPhysicalVolumeListVDSCommand(HostName = Host2,
VdsIdVDSCommandParametersBase:{hostId='fd0752d8-2d41-45b0-887a-0ffacbb8a237'}),
log id: 6671d0d7
2019-02-14 02:41:44,329-04 INFO
[org.ovirt.engine.core.bll.lock.InMemoryLockManager]
(DefaultQuartzScheduler3) [7b9bd2d] Failed to acquire lock and wait lock
'EngineLock:{exclusiveLocks='[a45fe964-9989-11e8-b3f7-00163e4bf18a=GLUSTER]',
sharedLocks=''}'
2019-02-14 02:41:44,345-04 INFO
[org.ovirt.engine.core.bll.lock.InMemoryLockManager]
(DefaultQuartzScheduler3) [7b9bd2d] Failed to acquire lock and wait lock
'EngineLock:{exclusiveLocks='[a45fe964-9989-11e8-b3f7-00163e4bf18a=GLUSTER]',
sharedLocks=''}'
2019-02-14 02:41:44,374-04 INFO
[org.ovirt.engine.core.bll.lock.InMemoryLockManager]
(DefaultQuartzScheduler3) [7b9bd2d] Failed to acquire lock and wait lock
'EngineLock:{exclusiveLocks='[a45fe964-9989-11e8-b3f7-00163e4bf18a=GLUSTER]',
sharedLocks=''}'
2019-02-14 02:41:44,405-04 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GetGlusterLocalPhysicalVolumeListVDSCommand]
(DefaultQuartzScheduler1) [5ff5b093] FINISH,
GetGlusterLocalPhysicalVolumeListVDSCommand, return:
[org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalPhysicalVolume@f6a9696,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalPhysicalVolume@558e3332,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalPhysicalVolume@5b449da],
log id: 6671d0d7
2019-02-14 02:41:44,406-04 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GetGlusterVDOVolumeListVDSCommand]
(DefaultQuartzScheduler1) [5ff5b093] START,
GetGlusterVDOVolumeListVDSCommand(HostName = Host2,
VdsIdVDSCommandParametersBase:{hostId='fd0752d8-2d41-45b0-887a-0ffacbb8a237'}),
log id: 6d2bc6d3
2019-02-14 02:41:44,908-04 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GetGlusterVDOVolumeListVDSCommand]
(DefaultQuartzScheduler1) [5ff5b093] FINISH,
GetGlusterVDOVolumeListVDSCommand, return: [], log id: 6d2bc6d3
2019-02-14 02:41:44,909-04 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GetGlusterVolumeAdvancedDetailsVDSCommand]
(DefaultQuartzScheduler1) [5ff5b093] START,
GetGlusterVolumeAdvancedDetailsVDSCommand(HostName = Host0,
GlusterVolumeAdvancedDetailsVDSParameters:{hostId='771c67eb-56e6-4736-8c67-668502d4ecf5',
volumeName='non_prod_b'}), log id: 36ae23c6
2019-02-14 02:41:47,336-04 INFO
[org.ovirt.engine.core.bll.lock.InMemoryLockManager]
(DefaultQuartzScheduler3) [7b9bd2d] Failed to acquire lock and wait lock
'EngineLock:{exclusiveLocks='[a45fe964-9989-11e8-b3f7-00163e4bf18a=GLUSTER]',
sharedLocks=''}'
2019-02-14 02:41:47,351-04 INFO
[org.ovirt.engine.core.bll.lock.InMemoryLockManager]
(DefaultQuartzScheduler3) [7b9bd2d] Failed to acquire lock and wait lock
'EngineLock:{exclusiveLocks='[a45fe964-9989-11e8-b3f7-00163e4bf18a=GLUSTER]',
sharedLocks=''}'
2019-02-14 02:41:47,379-04 INFO
[org.ovirt.engine.core.bll.lock.InMemoryLockManager]
(DefaultQuartzScheduler3) [7b9bd2d] Failed to acquire lock and wait lock
'EngineLock:{exclusiveLocks='[a45fe964-9989-11e8-b3f7-00163e4bf18a=GLUSTER]',
sharedLocks=''}'
2019-02-14 02:41:47,979-04 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GetGlusterVolumeAdvancedDetailsVDSCommand]
(DefaultQuartzScheduler1) [5ff5b093] FINISH,
GetGlusterVolumeAdvancedDetailsVDSCommand, return:
org.ovirt.engine.core.common.businessentities.gluster.GlusterVolumeAdvancedDetails@7a4a787b,
log id: 36ae23c6
2019-02-14 02:41:48,018-04 WARN
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(DefaultQuartzScheduler1) [5ff5b093] EVENT_ID:
GLUSTER_BRICK_STATUS_CHANGED(4,086), Detected change in status of brick
host2.replaced.domain.com:/gluster_bricks/non_prod_b/non_prod_b of volume
non_prod_b of cluster Default from UP to DOWN via cli.
2019-02-14 02:41:48,046-04 WARN
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(DefaultQuartzScheduler1) [5ff5b093] EVENT_ID:
GLUSTER_BRICK_STATUS_DOWN(4,151), Status of brick
host2.replaced.domain.com:/gluster_bricks/non_prod_b/non_prod_b
of volume non_prod_b on cluster Default is down.
2019-02-14 02:41:48,139-04 INFO
[org.ovirt.engine.core.bll.lock.InMemoryLockManager]
(DefaultQuartzScheduler1) [5ff5b093] Failed to acquire lock and wait lock
'EngineLock:{exclusiveLocks='[a45fe964-9989-11e8-b3f7-00163e4bf18a=GLUSTER]',
sharedLocks=''}'
2019-02-14 02:41:48,140-04 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand]
(DefaultQuartzScheduler3) [7b9bd2d] START,
GlusterServersListVDSCommand(HostName = Host0,
VdsIdVDSCommandParametersBase:{hostId='771c67eb-56e6-4736-8c67-668502d4ecf5'}),
log id: e1fb23
2019-02-14 02:41:48,911-04 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand]
(DefaultQuartzScheduler3) [7b9bd2d] FINISH, GlusterServersListVDSCommand,
return: [10.12.0.220/24:CONNECTED, host1.replaced.domain.com:CONNECTED,
host2.replaced.domain.com:CONNECTED], log id: e1fb23
2019-02-14 02:41:48,930-04 INFO
[org.ovirt.engine.core.bll.lock.InMemoryLockManager]
(DefaultQuartzScheduler1) [5ff5b093] Failed to acquire lock and wait lock
'EngineLock:{exclusiveLocks='[a45fe964-9989-11e8-b3f7-00163e4bf18a=GLUSTER]',
sharedLocks=''}'
2019-02-14 02:41:48,931-04 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand]
(DefaultQuartzScheduler3) [7b9bd2d] START,
GlusterVolumesListVDSCommand(HostName = Host0,
GlusterVolumesListVDSParameters:{hostId='771c67eb-56e6-4736-8c67-668502d4ecf5'}),
log id: 68f1aecc
2019-02-14 02:41:49,366-04 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand]
(DefaultQuartzScheduler3) [7b9bd2d] FINISH, GlusterVolumesListVDSCommand,
return:
{6c05dfc6-4dc0-41e3-a12f-55b4767f1d35=org.ovirt.engine.core.common.businessentities.gluster.GlusterVolumeEntity@1952a85,
3f8f6a0f-aed4-48e3-9129-18a2a3f64eef=org.ovirt.engine.core.common.businessentities.gluster.GlusterVolumeEntity@2f6688ae,
71ff56d9-79b8-445d-b637-72ffc974f109=org.ovirt.engine.core.common.businessentities.gluster.GlusterVolumeEntity@730210fb,
752a9438-cd11-426c-b384-bc3c5f86ed07=org.ovirt.engine.core.common.businessentities.gluster.GlusterVolumeEntity@c3be510c,
c3e7447e-8514-4e4a-9ff5-a648fe6aa537=org.ovirt.engine.core.common.businessentities.gluster.GlusterVolumeEntity@450befac,
79e8e93c-57c8-4541-a360-726cec3790cf=org.ovirt.engine.core.common.businessentities.gluster.GlusterVolumeEntity@1926e392},
log id: 68f1aecc
2019-02-14 02:41:49,489-04 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GetGlusterLocalLogicalVolumeListVDSCommand]
(DefaultQuartzScheduler1) [5ff5b093] START,
GetGlusterLocalLogicalVolumeListVDSCommand(HostName = Host0,
VdsIdVDSCommandParametersBase:{hostId='771c67eb-56e6-4736-8c67-668502d4ecf5'}),
log id: 38debe74
2019-02-14 02:41:49,581-04 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GetGlusterLocalLogicalVolumeListVDSCommand]
(DefaultQuartzScheduler1) [5ff5b093] FINISH,
GetGlusterLocalLogicalVolumeListVDSCommand, return:
[org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@5e5a7925,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@2cdf5c9e,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@443cb62,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@49a3e880,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@443d23c0,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@1250bc75,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@8d27d86,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@5e6363f4,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@73ed78db,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@64c9d1c7,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@7fecbe95,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@3a551e5f,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@2266926e,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@88b380c,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@1209279e,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@3c6466,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@16df63ed,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@47456262,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@1c2b88c3,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@7f57c074,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@12fa0478],
log id: 38debe74
2019-02-14 02:41:49,582-04 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GetGlusterLocalPhysicalVolumeListVDSCommand]
(DefaultQuartzScheduler1) [5ff5b093] START,
GetGlusterLocalPhysicalVolumeListVDSCommand(HostName = Host0,
VdsIdVDSCommandParametersBase:{hostId='771c67eb-56e6-4736-8c67-668502d4ecf5'}),
log id: 7ec02237
2019-02-14 02:41:49,660-04 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GetGlusterLocalPhysicalVolumeListVDSCommand]
(DefaultQuartzScheduler1) [5ff5b093] FINISH,
GetGlusterLocalPhysicalVolumeListVDSCommand, return:
[org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalPhysicalVolume@3eedd0bc,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalPhysicalVolume@7f78e375,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalPhysicalVolume@3d63e126],
log id: 7ec02237
2019-02-14 02:41:49,661-04 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GetGlusterVDOVolumeListVDSCommand]
(DefaultQuartzScheduler1) [5ff5b093] START,
GetGlusterVDOVolumeListVDSCommand(HostName = Host0,
VdsIdVDSCommandParametersBase:{hostId='771c67eb-56e6-4736-8c67-668502d4ecf5'}),
log id: 42cdad27
2019-02-14 02:41:50,142-04 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GetGlusterVDOVolumeListVDSCommand]
(DefaultQuartzScheduler1) [5ff5b093] FINISH,
GetGlusterVDOVolumeListVDSCommand, return: [], log id: 42cdad27
2019-02-14 02:41:50,143-04 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GetGlusterLocalLogicalVolumeListVDSCommand]
(DefaultQuartzScheduler1) [5ff5b093] START,
GetGlusterLocalLogicalVolumeListVDSCommand(HostName = Host1,
VdsIdVDSCommandParametersBase:{hostId='fb1e62d5-1dc1-4ccc-8b2b-cf48f7077d0d'}),
log id: 12f5fdf2
2019-02-14 02:41:50,248-04 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GetGlusterLocalLogicalVolumeListVDSCommand]
(DefaultQuartzScheduler1) [5ff5b093] FINISH,
GetGlusterLocalLogicalVolumeListVDSCommand, return:
[org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@2aaed792,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@8e66930,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@276d599e,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@1aca2aec,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@46846c60,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@7d103269,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@30fc25fc,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@7baae445,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@1ea8603c,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@62578afa,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@33d58089,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@1f71d27a,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@4205e828,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@1c5bbac8,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@395a002,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@12664008,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@7f4faec4,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@3e03d61f,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@1038e46d,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@307e8062,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@32453127],
log id: 12f5fdf2
2019-02-14 02:41:50,249-04 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GetGlusterLocalPhysicalVolumeListVDSCommand]
(DefaultQuartzScheduler1) [5ff5b093] START,
GetGlusterLocalPhysicalVolumeListVDSCommand(HostName = Host1,
VdsIdVDSCommandParametersBase:{hostId='fb1e62d5-1dc1-4ccc-8b2b-cf48f7077d0d'}),
log id: 1256aa5e
2019-02-14 02:41:50,338-04 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GetGlusterLocalPhysicalVolumeListVDSCommand]
(DefaultQuartzScheduler1) [5ff5b093] FINISH,
GetGlusterLocalPhysicalVolumeListVDSCommand, return:
[org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalPhysicalVolume@459a2ff5,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalPhysicalVolume@123cab4,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalPhysicalVolume@1af41fbe],
log id: 1256aa5e
2019-02-14 02:41:50,339-04 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GetGlusterVDOVolumeListVDSCommand]
(DefaultQuartzScheduler1) [5ff5b093] START,
GetGlusterVDOVolumeListVDSCommand(HostName = Host1,
VdsIdVDSCommandParametersBase:{hostId='fb1e62d5-1dc1-4ccc-8b2b-cf48f7077d0d'}),
log id: 3dd752e4
2019-02-14 02:41:50,847-04 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GetGlusterVDOVolumeListVDSCommand]
(DefaultQuartzScheduler1) [5ff5b093] FINISH,
GetGlusterVDOVolumeListVDSCommand, return: [], log id: 3dd752e4
2019-02-14 02:41:50,848-04 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GetGlusterLocalLogicalVolumeListVDSCommand]
(DefaultQuartzScheduler1) [5ff5b093] START,
GetGlusterLocalLogicalVolumeListVDSCommand(HostName = Host2,
VdsIdVDSCommandParametersBase:{hostId='fd0752d8-2d41-45b0-887a-0ffacbb8a237'}),
log id: 29a6272c
2019-02-14 02:41:50,954-04 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GetGlusterLocalLogicalVolumeListVDSCommand]
(DefaultQuartzScheduler1) [5ff5b093] FINISH,
GetGlusterLocalLogicalVolumeListVDSCommand, return:
[org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@364f3ec6,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@c7cce5e,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@b3bed47,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@13bc244b,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@5cca81f4,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@36aeba0d,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@62ab384a,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@1047d628,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@188a30f5,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@5bb79f3b,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@60e5956f,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@4e3df9cd,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@7796567,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@60d06cf4,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@2cd2d36c,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@d80a4aa,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@411eaa20,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@22cac93b,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@18b927bd,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@101465f4,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalLogicalVolume@246f927c],
log id: 29a6272c
2019-02-14 02:41:50,955-04 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GetGlusterLocalPhysicalVolumeListVDSCommand]
(DefaultQuartzScheduler1) [5ff5b093] START,
GetGlusterLocalPhysicalVolumeListVDSCommand(HostName = Host2,
VdsIdVDSCommandParametersBase:{hostId='fd0752d8-2d41-45b0-887a-0ffacbb8a237'}),
log id: 501814db
2019-02-14 02:41:51,044-04 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GetGlusterLocalPhysicalVolumeListVDSCommand]
(DefaultQuartzScheduler1) [5ff5b093] FINISH,
GetGlusterLocalPhysicalVolumeListVDSCommand, return:
[org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalPhysicalVolume@1cd55aa,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalPhysicalVolume@32c5aba2,
org.ovirt.engine.core.common.businessentities.gluster.GlusterLocalPhysicalVolume@6ae123f4],
log id: 501814db
2019-02-14 02:41:51,045-04 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GetGlusterVDOVolumeListVDSCommand]
(DefaultQuartzScheduler1) [5ff5b093] START,
GetGlusterVDOVolumeListVDSCommand(HostName = Host2,
VdsIdVDSCommandParametersBase:{hostId='fd0752d8-2d41-45b0-887a-0ffacbb8a237'}),
log id: 7acf4cbf
2019-02-14 02:41:51,546-04 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GetGlusterVDOVolumeListVDSCommand]
(DefaultQuartzScheduler1) [5ff5b093] FINISH,
GetGlusterVDOVolumeListVDSCommand, return: [], log id: 7acf4cbf
2019-02-14 02:41:51,547-04 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GetGlusterVolumeAdvancedDetailsVDSCommand]
(DefaultQuartzScheduler1) [5ff5b093] START,
GetGlusterVolumeAdvancedDetailsVDSCommand(HostName = Host0,
GlusterVolumeAdvancedDetailsVDSParameters:{hostId='771c67eb-56e6-4736-8c67-668502d4ecf5',
volumeName='non_prod_a'}), log id: 11c42649
On Thu, Feb 14, 2019 at 10:16 AM Sandro Bonazzola <sbonazzo(a)redhat.com>
wrote:
On Thu, Feb 14, 2019 at 07:54 Jayme <jaymef(a)gmail.com> wrote:
> I have a three node HCI gluster which was previously running 4.2 with
> zero problems. I just upgraded it yesterday. I ran into a few bugs right
> away with the upgrade process, but aside from that I also discovered other
> users with severe GlusterFS problems since the upgrade to the new GlusterFS
> version. It is less than 24 hours since I upgraded my cluster and I just
> got a notice that one of my GlusterFS bricks is offline. There does appear
> to be a very real and serious issue here with the latest updates.
>
We are tracking the issue on the Gluster side in this bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1677160
If you can help the Gluster community by providing the requested logs, it
would be great.
>
>
> On Wed, Feb 13, 2019 at 7:26 PM <dscott(a)umbctraining.com> wrote:
>
>> I'm abandoning my production oVirt cluster due to instability. I have
>> a 7 host cluster running about 300 VMs and have been for over a year. It
>> has become unstable over the past three days. Random hosts, both compute
>> and storage, are disconnecting, and many VMs are disconnecting and
>> becoming unusable.
>>
>> The 7 hosts are 4 compute hosts running oVirt 4.2.8 and three GlusterFS
>> hosts running 3.12.5. I submitted a Bugzilla bug and they immediately
>> assigned it to the storage people but have not responded with any
>> meaningful information. I have submitted several logs.
>>
>> I have found some discussion on problems with instability with gluster
>> 3.12.5. I would be willing to upgrade my gluster to a more stable version
>> if that's the culprit. I installed gluster using the ovirt gui and this is
>> the version the ovirt gui installed.
>>
>> Is there an oVirt health monitor available? Where should I be looking
>> to get a resolution to the problems I'm facing?
>> _______________________________________________
>> Users mailing list -- users(a)ovirt.org
>> To unsubscribe send an email to users-leave(a)ovirt.org
>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>> oVirt Code of Conduct:
>> https://www.ovirt.org/community/about/community-guidelines/
>> List Archives:
>> https://lists.ovirt.org/archives/list/users@ovirt.org/message/BL4M3JQA3IE...
>>
--
SANDRO BONAZZOLA
MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV
Red Hat EMEA <https://www.redhat.com/>
sbonazzo(a)redhat.com
<https://red.ht/sig>