Adding Aravinda
On Sat, May 13, 2017 at 11:21 PM, Jim Kusznir <jim@palousetech.com> wrote:
Hi All:
I've been trying to set up geo-replication for a while now, but can't seem
to make it work. I've found documentation on the web (mostly
https://gluster.readthedocs.io/en/refactor/Administrator%20Guide/Geo%20Replication/), and I found
http://blog.gluster.org/2015/09/introducing-georepsetup-gluster-geo-replication-setup-tool/
Unfortunately, it seems that some critical steps are missing from both,
and I can't figure out for sure what they are.
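For context, the georepsetup tool from that blog post is invoked roughly like this. This is a sketch only: the argument order is assumed from the blog post, and the slave volume name `engine-rep` is taken from the output below; the command is printed rather than executed since it needs a live cluster.

```shell
# Assumed syntax of the georepsetup tool (not verified against this exact
# version): georepsetup <MASTERVOL> <SLAVEHOST> <SLAVEVOL>
MASTERVOL=engine
SLAVEHOST=root@georep.nwfiber.com
SLAVEVOL=engine-rep

# Print the command rather than run it, since it needs a live cluster:
CMD="georepsetup $MASTERVOL $SLAVEHOST $SLAVEVOL"
echo "$CMD"
```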
My environment:
Production: replica 2 + arbiter running on my 3-node oVirt cluster, 3
volumes (engine, data, iso).
New geo-replication target: a Raspberry Pi 3 with a USB hard drive, shoved in
some other data closet off-site.
I've installed Raspbian Lite, and after much fighting, got
glusterfs-*-3.8.11 installed. I've created my mountpoint (the USB hard drive,
much larger than my gluster volumes), and then ran the command. I get this
far:
[    OK] georep.nwfiber.com is Reachable (Port 22)
[    OK] SSH Connection established root@georep.nwfiber.com
[    OK] Master Volume and Slave Volume are compatible (Version: 3.8.11)
[NOT OK] Unable to Mount Gluster Volume georep.nwfiber.com:engine-rep
Trying it with the steps in the gluster docs runs into the same problem.
No log files are generated on the slave. Log files on the master include:
[root@ovirt1 geo-replication]# more georepsetup.mount.log
[2017-05-13 17:26:27.318599] I [MSGID: 100030] [glusterfsd.c:2454:main] 0-glusterfs: Started running glusterfs version 3.8.11 (args: glusterfs --xlator-option="*dht.lookup-unhashed=off" --volfile-server localhost --volfile-id engine -l /var/log/glusterfs/geo-replication/georepsetup.mount.log --client-pid=-1 /tmp/georepsetup_wZtfkN)
[2017-05-13 17:26:27.341170] I [MSGID: 101190] [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2017-05-13 17:26:27.341260] E [socket.c:2309:socket_connect_finish] 0-glusterfs: connection to ::1:24007 failed (Connection refused)
[2017-05-13 17:26:27.341846] E [glusterfsd-mgmt.c:1908:mgmt_rpc_notify] 0-glusterfsd-mgmt: failed to connect with remote-host: localhost (Transport endpoint is not connected)
[2017-05-13 17:26:31.335849] I [MSGID: 101190] [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started thread with index 2
[2017-05-13 17:26:31.337545] I [MSGID: 114020] [client.c:2356:notify] 0-engine-client-0: parent translators are ready, attempting connect on transport
[2017-05-13 17:26:31.344485] I [MSGID: 114020] [client.c:2356:notify] 0-engine-client-1: parent translators are ready, attempting connect on transport
[2017-05-13 17:26:31.345146] I [rpc-clnt.c:1965:rpc_clnt_reconfig] 0-engine-client-0: changing port to 49157 (from 0)
[2017-05-13 17:26:31.350868] I [MSGID: 114020] [client.c:2356:notify] 0-engine-client-2: parent translators are ready, attempting connect on transport
[2017-05-13 17:26:31.355946] I [MSGID: 114057] [client-handshake.c:1440:select_server_supported_programs] 0-engine-client-0: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2017-05-13 17:26:31.356280] I [rpc-clnt.c:1965:rpc_clnt_reconfig] 0-engine-client-1: changing port to 49157 (from 0)
Final graph:
+------------------------------------------------------------------------------+
  1: volume engine-client-0
  2:     type protocol/client
  3:     option clnt-lk-version 1
  4:     option volfile-checksum 0
  5:     option volfile-key engine
  6:     option client-version 3.8.11
  7:     option process-uuid ovirt1.nwfiber.com-25660-2017/05/13-17:26:27:311929-engine-client-0-0-0
  8:     option fops-version 1298437
  9:     option ping-timeout 30
 10:     option remote-host ovirt1.nwfiber.com
 11:     option remote-subvolume /gluster/brick1/engine
 12:     option transport-type socket
 13:     option username 028984cf-0399-42e6-b04b-bb9b1685c536
 14:     option password eae737cc-9659-405f-865e-9a7ef97a3307
 15:     option filter-O_DIRECT off
 16:     option send-gids true
 17: end-volume
 18:
 19: volume engine-client-1
 20:     type protocol/client
 21:     option ping-timeout 30
 22:     option remote-host ovirt2.nwfiber.com
 23:     option remote-subvolume /gluster/brick1/engine
 24:     option transport-type socket
 25:     option username 028984cf-0399-42e6-b04b-bb9b1685c536
 26:     option password eae737cc-9659-405f-865e-9a7ef97a3307
 27:     option filter-O_DIRECT off
 28:     option send-gids true
 29: end-volume
 30:
 31: volume engine-client-2
 32:     type protocol/client
 33:     option ping-timeout 30
 34:     option remote-host ovirt3.nwfiber.com
 35:     option remote-subvolume /gluster/brick1/engine
 36:     option transport-type socket
 37:     option username 028984cf-0399-42e6-b04b-bb9b1685c536
 38:     option password eae737cc-9659-405f-865e-9a7ef97a3307
 39:     option filter-O_DIRECT off
 40:     option send-gids true
 41: end-volume
 42:
 43: volume engine-replicate-0
 44:     type cluster/replicate
 45:     option arbiter-count 1
 46:     option data-self-heal-algorithm full
 47:     option eager-lock enable
 48:     option quorum-type auto
 49:     option shd-max-threads 6
 50:     option shd-wait-qlength 10000
 51:     option locking-scheme granular
 52:     subvolumes engine-client-0 engine-client-1 engine-client-2
 53: end-volume
 54:
 55: volume engine-dht
 56:     type cluster/distribute
 57:     option lock-migration off
 58:     subvolumes engine-replicate-0
 59: end-volume
 60:
 61: volume engine-shard
 62:     type features/shard
 63:     option shard-block-size 512MB
 64:     subvolumes engine-dht
 65: end-volume
 66:
 67: volume engine-write-behind
 68:     type performance/write-behind
 69:     option strict-O_DIRECT on
 70:     subvolumes engine-shard
 71: end-volume
 72:
 73: volume engine-readdir-ahead
 74:     type performance/readdir-ahead
 75:     subvolumes engine-write-behind
 76: end-volume
 77:
 78: volume engine-open-behind
 79:     type performance/open-behind
 80:     subvolumes engine-readdir-ahead
 81: end-volume
 82:
 83: volume engine
 84:     type debug/io-stats
 85:     option log-level INFO
 86:     option latency-measurement off
 87:     option count-fop-hits off
 88:     subvolumes engine-open-behind
 89: end-volume
 90:
 91: volume meta-autoload
 92:     type meta
 93:     subvolumes engine
 94: end-volume
 95:
+------------------------------------------------------------------------------+
[2017-05-13 17:26:31.360579] I [MSGID: 114046] [client-handshake.c:1216:client_setvolume_cbk] 0-engine-client-0: Connected to engine-client-0, attached to remote volume '/gluster/brick1/engine'.
[2017-05-13 17:26:31.360599] I [MSGID: 114047] [client-handshake.c:1227:client_setvolume_cbk] 0-engine-client-0: Server and Client lk-version numbers are not same, reopening the fds
[2017-05-13 17:26:31.360707] I [MSGID: 108005] [afr-common.c:4387:afr_notify] 0-engine-replicate-0: Subvolume 'engine-client-0' came back up; going online.
[2017-05-13 17:26:31.360793] I [MSGID: 114035] [client-handshake.c:202:client_set_lk_version_cbk] 0-engine-client-0: Server lk version = 1
[2017-05-13 17:26:31.361284] I [rpc-clnt.c:1965:rpc_clnt_reconfig] 0-engine-client-2: changing port to 49158 (from 0)
[2017-05-13 17:26:31.365070] I [MSGID: 114057] [client-handshake.c:1440:select_server_supported_programs] 0-engine-client-1: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2017-05-13 17:26:31.365788] I [MSGID: 114046] [client-handshake.c:1216:client_setvolume_cbk] 0-engine-client-1: Connected to engine-client-1, attached to remote volume '/gluster/brick1/engine'.
[2017-05-13 17:26:31.365821] I [MSGID: 114047] [client-handshake.c:1227:client_setvolume_cbk] 0-engine-client-1: Server and Client lk-version numbers are not same, reopening the fds
[2017-05-13 17:26:31.366059] I [MSGID: 114035] [client-handshake.c:202:client_set_lk_version_cbk] 0-engine-client-1: Server lk version = 1
[2017-05-13 17:26:31.369948] I [MSGID: 114057] [client-handshake.c:1440:select_server_supported_programs] 0-engine-client-2: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2017-05-13 17:26:31.370657] I [MSGID: 114046] [client-handshake.c:1216:client_setvolume_cbk] 0-engine-client-2: Connected to engine-client-2, attached to remote volume '/gluster/brick1/engine'.
[2017-05-13 17:26:31.370683] I [MSGID: 114047] [client-handshake.c:1227:client_setvolume_cbk] 0-engine-client-2: Server and Client lk-version numbers are not same, reopening the fds
[2017-05-13 17:26:31.383548] I [MSGID: 114035] [client-handshake.c:202:client_set_lk_version_cbk] 0-engine-client-2: Server lk version = 1
[2017-05-13 17:26:31.383649] I [fuse-bridge.c:4147:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.24 kernel 7.22
[2017-05-13 17:26:31.383676] I [fuse-bridge.c:4832:fuse_graph_sync] 0-fuse: switched to graph 0
[2017-05-13 17:26:31.385453] I [MSGID: 108031] [afr-common.c:2157:afr_local_discovery_cbk] 0-engine-replicate-0: selecting local read_child engine-client-0
[2017-05-13 17:26:31.396741] I [fuse-bridge.c:5080:fuse_thread_proc] 0-fuse: unmounting /tmp/georepsetup_wZtfkN
[2017-05-13 17:26:31.397086] W [glusterfsd.c:1327:cleanup_and_exit] (-->/lib64/libpthread.so.0(+0x7dc5) [0x7f8838df6dc5] -->glusterfs(glusterfs_sigwaiter+0xe5) [0x7f883a488cd5] -->glusterfs(cleanup_and_exit+0x6b) [0x7f883a488b4b] ) 0-: received signum (15), shutting down
[2017-05-13 17:26:31.397112] I [fuse-bridge.c:5788:fini] 0-fuse: Unmounting '/tmp/georepsetup_wZtfkN'.
[2017-05-13 17:26:31.413901] I [MSGID: 100030] [glusterfsd.c:2454:main] 0-glusterfs: Started running glusterfs version 3.8.11 (args: glusterfs --xlator-option="*dht.lookup-unhashed=off" --volfile-server georep.nwfiber.com --volfile-id engine -l /var/log/glusterfs/geo-replication/georepsetup.mount.log --client-pid=-1 /tmp/georepsetup_M5poIr)
[2017-05-13 17:26:31.458733] I [MSGID: 101190] [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2017-05-13 17:26:31.458833] E [socket.c:2309:socket_connect_finish] 0-glusterfs: connection to 192.168.8.126:24007 failed (Connection refused)
[2017-05-13 17:26:31.458886] E [glusterfsd-mgmt.c:1908:mgmt_rpc_notify] 0-glusterfsd-mgmt: failed to connect with remote-host: georep.nwfiber.com (Transport endpoint is not connected)
[2017-05-13 17:26:31.458900] I [glusterfsd-mgmt.c:1926:mgmt_rpc_notify] 0-glusterfsd-mgmt: Exhausted all volfile servers
[2017-05-13 17:26:31.459173] W [glusterfsd.c:1327:cleanup_and_exit] (-->/lib64/libgfrpc.so.0(rpc_clnt_notify+0xdb) [0x7f18d6c89aab] -->glusterfs(+0x10309) [0x7f18d73b9309] -->glusterfs(cleanup_and_exit+0x6b) [0x7f18d73b2b4b] ) 0-: received signum (1), shutting down
[2017-05-13 17:26:31.459218] I [fuse-bridge.c:5788:fini] 0-fuse: Unmounting '/tmp/georepsetup_M5poIr'.
[2017-05-13 17:26:31.459887] W [glusterfsd.c:1327:cleanup_and_exit] (-->/lib64/libpthread.so.0(+0x7dc5) [0x7f18d5d20dc5] -->glusterfs(glusterfs_sigwaiter+0xe5) [0x7f18d73b2cd5] -->glusterfs(cleanup_and_exit+0x6b) [0x7f18d73b2b4b] ) 0-: received signum (15), shutting down
I don't know what to make of that.
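For what it's worth, both errors in that log are a refused connection to TCP port 24007 (glusterd's management port) on the slave. One way to reproduce the failing step by hand from a master node is to attempt the same FUSE mount the setup tool does; this is a sketch only (the temporary mountpoint is illustrative, and it needs root plus a reachable glusterd on the slave):

```shell
# Try to mount the slave volume roughly as georepsetup does internally
# (sketch only -- needs root and a reachable glusterd on the slave).
SLAVE_HOST=georep.nwfiber.com
SLAVE_VOL=engine-rep
MNT=$(mktemp -d)

if mount -t glusterfs "${SLAVE_HOST}:/${SLAVE_VOL}" "$MNT" 2>/dev/null; then
    echo "mount OK"
    umount "$MNT"
else
    echo "mount failed: check glusterd on ${SLAVE_HOST} and TCP port 24007"
fi
rmdir "$MNT" 2>/dev/null
```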
On a whim, I thought that perhaps the georep setup does not set up the
remote volume. (I assumed it would; I thought that was what the ssh access
was required for, and none of the instructions mention creating your
destination (replication) volume.) So I tried to create it, but it won't
let me create a volume with replica 1. This is already a backup; I don't
need a backup of a backup. That further supported my thought that the
volume is supposed to be created by the georep setup commands.
The destination or slave volume needs to be created prior to setting up the
geo-replication session. You should be able to create a replica 1 volume as
destination volume. How did you try to create this?
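Note that a replica 1 volume is just a plain distribute volume, so the `replica` keyword is omitted entirely. A sketch of creating it on the slave follows; the brick path under the USB drive is an assumption, and the commands are printed rather than executed here since they need glusterd running:

```shell
VOL=engine-rep
BRICK=georep.nwfiber.com:/mnt/usbdrive/engine-rep   # illustrative brick path

# Printed rather than executed here; run on the slave with glusterd up.
# "force" is typically needed when the brick sits directly on a mountpoint.
CREATE="gluster volume create $VOL $BRICK force"
START="gluster volume start $VOL"
printf '%s\n%s\n' "$CREATE" "$START"
```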
Is glusterd running on the slave?
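The "Connection refused" entries on port 24007 in the log usually mean exactly that. A quick check to run on the slave (a sketch; `ss` is from the standard iproute2 package):

```shell
# Is anything listening on glusterd's management port (TCP 24007)?
check_glusterd_port() {
    if ss -ltn 2>/dev/null | grep -q ':24007 '; then
        echo "glusterd port 24007 is open"
    else
        echo "nothing listening on 24007 -- is glusterd running?"
    fi
}
check_glusterd_port
# systemctl status glusterd   # the direct check, where systemd is available
```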
Where am I wrong / what do I need to do to fix this?
--Jim
_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users