
Hello,
can anybody help me with these timeouts?
The volumes are not active yet (bricks down).

Description of the Gluster setup is below...

/var/log/glusterfs/etc-glusterfs-glusterd.vol.log:

[2015-11-26 14:44:47.174221] I [MSGID: 106004] [glusterd-handler.c:5065:__glusterd_peer_rpc_notify] 0-management: Peer <1hp1-SAN> (<87fc7db8-aba8-41f2-a1cd-b77e83b17436>), in state <Peer in Cluster>, has disconnected from glusterd.
[2015-11-26 14:44:47.174354] W [glusterd-locks.c:681:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.7.6/xlator/mgmt/glusterd.so(glusterd_big_locked_notify+0x4c) [0x7fb7039d44dc] -->/usr/lib64/glusterfs/3.7.6/xlator/mgmt/glusterd.so(__glusterd_peer_rpc_notify+0x162) [0x7fb7039de542] -->/usr/lib64/glusterfs/3.7.6/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_unlock+0x58a) [0x7fb703a79b4a] ) 0-management: Lock for vol 1HP12-P1 not held
[2015-11-26 14:44:47.174444] W [glusterd-locks.c:681:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.7.6/xlator/mgmt/glusterd.so(glusterd_big_locked_notify+0x4c) [0x7fb7039d44dc] -->/usr/lib64/glusterfs/3.7.6/xlator/mgmt/glusterd.so(__glusterd_peer_rpc_notify+0x162) [0x7fb7039de542] -->/usr/lib64/glusterfs/3.7.6/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_unlock+0x58a) [0x7fb703a79b4a] ) 0-management: Lock for vol 1HP12-P3 not held
[2015-11-26 14:44:47.174521] W [glusterd-locks.c:681:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.7.6/xlator/mgmt/glusterd.so(glusterd_big_locked_notify+0x4c) [0x7fb7039d44dc] -->/usr/lib64/glusterfs/3.7.6/xlator/mgmt/glusterd.so(__glusterd_peer_rpc_notify+0x162) [0x7fb7039de542] -->/usr/lib64/glusterfs/3.7.6/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_unlock+0x58a) [0x7fb703a79b4a] ) 0-management: Lock for vol 2HP12-P1 not held
[2015-11-26 14:44:47.174662] W [glusterd-locks.c:681:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.7.6/xlator/mgmt/glusterd.so(glusterd_big_locked_notify+0x4c) [0x7fb7039d44dc] -->/usr/lib64/glusterfs/3.7.6/xlator/mgmt/glusterd.so(__glusterd_peer_rpc_notify+0x162) [0x7fb7039de542] -->/usr/lib64/glusterfs/3.7.6/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_unlock+0x58a) [0x7fb703a79b4a] ) 0-management: Lock for vol 2HP12-P3 not held
[2015-11-26 14:44:47.174532] W [MSGID: 106118] [glusterd-handler.c:5087:__glusterd_peer_rpc_notify] 0-management: Lock not released for 2HP12-P1
[2015-11-26 14:44:47.174675] W [MSGID: 106118] [glusterd-handler.c:5087:__glusterd_peer_rpc_notify] 0-management: Lock not released for 2HP12-P3
[2015-11-26 14:44:49.423334] I [MSGID: 106488] [glusterd-handler.c:1472:__glusterd_handle_cli_get_volume] 0-glusterd: Received get vol req
The message "I [MSGID: 106488] [glusterd-handler.c:1472:__glusterd_handle_cli_get_volume] 0-glusterd: Received get vol req" repeated 4 times between [2015-11-26 14:44:49.423334] and [2015-11-26 14:44:49.429781]
[2015-11-26 14:44:51.148711] I [MSGID: 106163] [glusterd-handshake.c:1193:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 30702
[2015-11-26 14:44:52.177266] W [socket.c:869:__socket_keepalive] 0-socket: failed to set TCP_USER_TIMEOUT -1000 on socket 12, Invalid argument
[2015-11-26 14:44:52.177291] E [socket.c:2965:socket_connect] 0-management: Failed to set keep-alive: Invalid argument
[2015-11-26 14:44:53.180426] W [socket.c:869:__socket_keepalive] 0-socket: failed to set TCP_USER_TIMEOUT -1000 on socket 17, Invalid argument
[2015-11-26 14:44:53.180447] E [socket.c:2965:socket_connect] 0-management: Failed to set keep-alive: Invalid argument
[2015-11-26 14:44:52.395468] I [MSGID: 106163] [glusterd-handshake.c:1193:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 30702
[2015-11-26 14:44:54.851958] I [MSGID: 106488] [glusterd-handler.c:1472:__glusterd_handle_cli_get_volume] 0-glusterd: Received get vol req
[2015-11-26 14:44:57.183969] W [socket.c:869:__socket_keepalive] 0-socket: failed to set TCP_USER_TIMEOUT -1000 on socket 19, Invalid argument
[2015-11-26 14:44:57.183990] E [socket.c:2965:socket_connect] 0-management: Failed to set keep-alive: Invalid argument
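The "failed to set TCP_USER_TIMEOUT -1000" warnings look like a negative keepalive value being passed down to the socket, so perhaps the timeouts come from glusterd itself rather than from the network. This is only a sketch of what I can check from each node; the hostnames are the ones from my setup, and I am assuming "gluster volume get" is available in 3.7.6:

# is glusterd (management port 24007) reachable from this node over the SAN network?
gluster peer status
for h in 1hp1-SAN 1hp2-SAN 2hp1-SAN 2hp2-SAN; do nc -zv "$h" 24007; done

# which keepalive / ping-timeout values are actually in effect for one of the volumes?
gluster volume get 1HP12-P1 all | grep -iE 'keepalive|ping-timeout'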
After creating the volumes everything works fine (volumes up), but after several reboots (yum updates) the volumes failed due to these timeouts.

Gluster description:

4 nodes with 4 volumes, replica 2
oVirt 3.6 - the latest
gluster 3.7.6 - the latest
vdsm 4.17.999 - from the git repo
oVirt - mgmt nodes 172.16.0.0
oVirt - bricks 16.0.0.0 ("SAN" - defined as the "gluster" net)
The network works fine, no lost packets.

# gluster volume status
Staging failed on 2hp1-SAN. Please check log file for details.
Staging failed on 1hp2-SAN. Please check log file for details.
Staging failed on 2hp2-SAN. Please check log file for details.

# gluster volume info

Volume Name: 1HP12-P1
Type: Replicate
Volume ID: 6991e82c-9745-4203-9b0a-df202060f455
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 1hp1-SAN:/STORAGE/p1/G
Brick2: 1hp2-SAN:/STORAGE/p1/G
Options Reconfigured:
performance.readdir-ahead: on

Volume Name: 1HP12-P3
Type: Replicate
Volume ID: 8bbdf0cb-f9b9-4733-8388-90487aa70b30
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 1hp1-SAN:/STORAGE/p3/G
Brick2: 1hp2-SAN:/STORAGE/p3/G
Options Reconfigured:
performance.readdir-ahead: on

Volume Name: 2HP12-P1
Type: Replicate
Volume ID: e2cd5559-f789-4636-b06a-683e43e0d6bb
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 2hp1-SAN:/STORAGE/p1/G
Brick2: 2hp2-SAN:/STORAGE/p1/G
Options Reconfigured:
performance.readdir-ahead: on

Volume Name: 2HP12-P3
Type: Replicate
Volume ID: b5300c68-10b3-4ebe-9f29-805d3a641702
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 2hp1-SAN:/STORAGE/p3/G
Brick2: 2hp2-SAN:/STORAGE/p3/G
Options Reconfigured:
performance.readdir-ahead: on
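If the "Staging failed" messages simply mean that glusterd on the other nodes lost its view of the cluster after the reboots, restarting the management daemon everywhere and re-checking the peer and volume state would be my first attempt. This is only a rough sketch of what I would run on each node in turn (assuming systemd-based hosts); I have not applied it yet:

systemctl restart glusterd            # restart the management daemon on this node
gluster peer status                   # every peer should show "Peer in Cluster (Connected)"
gluster volume status                 # staging should succeed again once all peers are back
gluster volume start 1HP12-P1 force   # only if a volume still reports its bricks as down

Does that sequence make sense here, or is something else needed first?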
Regards, and thanks for any hints.
Paf1