Hi All,

I've been trying to set up geo-replication for a while now, but can't seem to make it work. I've found documentation on the web (mostly https://gluster.readthedocs.io/en/refactor/ ), and I found http://blog.gluster.org/Administrator%20Guide/Geo%20Replication/2015/09/introducing-georepsetup-gluster-geo-replication-setup-tool/

Unfortunately, it seems that some critical steps are missing from both, and I can't figure out for sure what they are.

My environment:

Production: replica 2 + arbiter running on my 3-node oVirt cluster, 3 volumes (engine, data, iso).

New geo-replication: Raspberry Pi 3 with a USB hard drive, shoved in some other data closet off-site.

I've installed Raspbian Lite and, after much fighting, got glusterfs-*-3.8.11 installed. I've created my mountpoint (the USB hard drive, much larger than my gluster volumes), and then ran the georepsetup command. I get this far:

[ OK] georep.nwfiber.com is Reachable(Port 22)
[ OK] SSH Connection established root@georep.nwfiber.com
[ OK] Master Volume and Slave Volume are compatible (Version: 3.8.11)
[NOT OK] Unable to Mount Gluster Volume georep.nwfiber.com:engine-rep

Trying it with the steps in the gluster docs gives the same problem. No log files are generated on the slave.
Log files on the master include:

[root@ovirt1 geo-replication]# more georepsetup.mount.log
[2017-05-13 17:26:27.318599] I [MSGID: 100030] [glusterfsd.c:2454:main] 0-glusterfs: Started running glusterfs version 3.8.11 (args:glusterfs --xlator-option="*dht.lookup-unhashed=off" --volfile-server localhost --volfile-id engine -l /var/log/glusterfs/geo-replication/georepsetup.mount.log --client-pid=-1 /tmp/georepsetup_wZtfkN)
[2017-05-13 17:26:27.341170] I [MSGID: 101190] [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2017-05-13 17:26:27.341260] E [socket.c:2309:socket_connect_finish] 0-glusterfs: connection to ::1:24007 failed (Connection refused)
[2017-05-13 17:26:27.341846] E [glusterfsd-mgmt.c:1908:mgmt_rpc_notify] 0-glusterfsd-mgmt: failed to connect with remote-host: localhost (Transport endpoint is not connected)
[2017-05-13 17:26:31.335849] I [MSGID: 101190] [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started thread with index 2
[2017-05-13 17:26:31.337545] I [MSGID: 114020] [client.c:2356:notify] 0-engine-client-0: parent translators are ready, attempting connect on transport
[2017-05-13 17:26:31.344485] I [MSGID: 114020] [client.c:2356:notify] 0-engine-client-1: parent translators are ready, attempting connect on transport
[2017-05-13 17:26:31.345146] I [rpc-clnt.c:1965:rpc_clnt_reconfig] 0-engine-client-0: changing port to 49157 (from 0)
[2017-05-13 17:26:31.350868] I [MSGID: 114020] [client.c:2356:notify] 0-engine-client-2: parent translators are ready, attempting connect on transport
[2017-05-13 17:26:31.355946] I [MSGID: 114057] [client-handshake.c:1440:select_server_supported_programs] 0-engine-client-0: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2017-05-13 17:26:31.356280] I [rpc-clnt.c:1965:rpc_clnt_reconfig] 0-engine-client-1: changing port to 49157 (from 0)
Final graph:
+------------------------------------------------------------------------------+
  1: volume engine-client-0
  2:     type protocol/client
  3:     option clnt-lk-version 1
  4:     option volfile-checksum 0
  5:     option volfile-key engine
  6:     option client-version 3.8.11
  7:     option process-uuid ovirt1.nwfiber.com-25660-2017/05/13-17:26:27:311929-engine-client-0-0-0
  8:     option fops-version 1298437
  9:     option ping-timeout 30
 10:     option remote-host ovirt1.nwfiber.com
 11:     option remote-subvolume /gluster/brick1/engine
 12:     option transport-type socket
 13:     option username 028984cf-0399-42e6-b04b-bb9b1685c536
 14:     option password eae737cc-9659-405f-865e-9a7ef97a3307
 15:     option filter-O_DIRECT off
 16:     option send-gids true
 17: end-volume
 18:
 19: volume engine-client-1
 20:     type protocol/client
 21:     option ping-timeout 30
 22:     option remote-host ovirt2.nwfiber.com
 23:     option remote-subvolume /gluster/brick1/engine
 24:     option transport-type socket
 25:     option username 028984cf-0399-42e6-b04b-bb9b1685c536
 26:     option password eae737cc-9659-405f-865e-9a7ef97a3307
 27:     option filter-O_DIRECT off
 28:     option send-gids true
 29: end-volume
 30:
 31: volume engine-client-2
 32:     type protocol/client
 33:     option ping-timeout 30
 34:     option remote-host ovirt3.nwfiber.com
 35:     option remote-subvolume /gluster/brick1/engine
 36:     option transport-type socket
 37:     option username 028984cf-0399-42e6-b04b-bb9b1685c536
 38:     option password eae737cc-9659-405f-865e-9a7ef97a3307
 39:     option filter-O_DIRECT off
 40:     option send-gids true
 41: end-volume
 42:
 43: volume engine-replicate-0
 44:     type cluster/replicate
 45:     option arbiter-count 1
 46:     option data-self-heal-algorithm full
 47:     option eager-lock enable
 48:     option quorum-type auto
 49:     option shd-max-threads 6
 50:     option shd-wait-qlength 10000
 51:     option locking-scheme granular
 52:     subvolumes engine-client-0 engine-client-1 engine-client-2
 53: end-volume
 54:
 55: volume engine-dht
 56:     type cluster/distribute
 57:     option lock-migration off
 58:     subvolumes engine-replicate-0
 59: end-volume
 60:
 61: volume engine-shard
 62:     type features/shard
 63:     option shard-block-size 512MB
 64:     subvolumes engine-dht
 65: end-volume
 66:
 67: volume engine-write-behind
 68:     type performance/write-behind
 69:     option strict-O_DIRECT on
 70:     subvolumes engine-shard
 71: end-volume
 72:
 73: volume engine-readdir-ahead
 74:     type performance/readdir-ahead
 75:     subvolumes engine-write-behind
 76: end-volume
 77:
 78: volume engine-open-behind
 79:     type performance/open-behind
 80:     subvolumes engine-readdir-ahead
 81: end-volume
 82:
 83: volume engine
 84:     type debug/io-stats
 85:     option log-level INFO
 86:     option latency-measurement off
 87:     option count-fop-hits off
 88:     subvolumes engine-open-behind
 89: end-volume
 90:
 91: volume meta-autoload
 92:     type meta
 93:     subvolumes engine
 94: end-volume
 95:
+------------------------------------------------------------------------------+
[2017-05-13 17:26:31.360579] I [MSGID: 114046] [client-handshake.c:1216:client_setvolume_cbk] 0-engine-client-0: Connected to engine-client-0, attached to remote volume '/gluster/brick1/engine'.
[2017-05-13 17:26:31.360599] I [MSGID: 114047] [client-handshake.c:1227:client_setvolume_cbk] 0-engine-client-0: Server and Client lk-version numbers are not same, reopening the fds
[2017-05-13 17:26:31.360707] I [MSGID: 108005] [afr-common.c:4387:afr_notify] 0-engine-replicate-0: Subvolume 'engine-client-0' came back up; going online.
[2017-05-13 17:26:31.360793] I [MSGID: 114035] [client-handshake.c:202:client_set_lk_version_cbk] 0-engine-client-0: Server lk version = 1
[2017-05-13 17:26:31.361284] I [rpc-clnt.c:1965:rpc_clnt_reconfig] 0-engine-client-2: changing port to 49158 (from 0)
[2017-05-13 17:26:31.365070] I [MSGID: 114057] [client-handshake.c:1440:select_server_supported_programs] 0-engine-client-1: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2017-05-13 17:26:31.365788] I [MSGID: 114046] [client-handshake.c:1216:client_setvolume_cbk] 0-engine-client-1: Connected to engine-client-1, attached to remote volume '/gluster/brick1/engine'.
[2017-05-13 17:26:31.365821] I [MSGID: 114047] [client-handshake.c:1227:client_setvolume_cbk] 0-engine-client-1: Server and Client lk-version numbers are not same, reopening the fds
[2017-05-13 17:26:31.366059] I [MSGID: 114035] [client-handshake.c:202:client_set_lk_version_cbk] 0-engine-client-1: Server lk version = 1
[2017-05-13 17:26:31.369948] I [MSGID: 114057] [client-handshake.c:1440:select_server_supported_programs] 0-engine-client-2: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2017-05-13 17:26:31.370657] I [MSGID: 114046] [client-handshake.c:1216:client_setvolume_cbk] 0-engine-client-2: Connected to engine-client-2, attached to remote volume '/gluster/brick1/engine'.
[2017-05-13 17:26:31.370683] I [MSGID: 114047] [client-handshake.c:1227:client_setvolume_cbk] 0-engine-client-2: Server and Client lk-version numbers are not same, reopening the fds
[2017-05-13 17:26:31.383548] I [MSGID: 114035] [client-handshake.c:202:client_set_lk_version_cbk] 0-engine-client-2: Server lk version = 1
[2017-05-13 17:26:31.383649] I [fuse-bridge.c:4147:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.24 kernel 7.22
[2017-05-13 17:26:31.383676] I [fuse-bridge.c:4832:fuse_graph_sync] 0-fuse: switched to graph 0
[2017-05-13 17:26:31.385453] I [MSGID: 108031] [afr-common.c:2157:afr_local_discovery_cbk] 0-engine-replicate-0: selecting local read_child engine-client-0
[2017-05-13 17:26:31.396741] I [fuse-bridge.c:5080:fuse_thread_proc] 0-fuse: unmounting /tmp/georepsetup_wZtfkN
[2017-05-13 17:26:31.397086] W [glusterfsd.c:1327:cleanup_and_exit] (-->/lib64/libpthread.so.0(+0x7dc5) [0x7f8838df6dc5] -->glusterfs(glusterfs_sigwaiter+0xe5) [0x7f883a488cd5] -->glusterfs(cleanup_and_exit+0x6b) [0x7f883a488b4b] ) 0-: received signum (15), shutting down
[2017-05-13 17:26:31.397112] I [fuse-bridge.c:5788:fini] 0-fuse: Unmounting '/tmp/georepsetup_wZtfkN'.
[2017-05-13 17:26:31.413901] I [MSGID: 100030] [glusterfsd.c:2454:main] 0-glusterfs: Started running glusterfs version 3.8.11 (args:glusterfs --xlator-option="*dht.lookup-unhashed=off" --volfile-server georep.nwfiber.com --volfile-id engine -l /var/log/glusterfs/geo-replication/georepsetup.mount.log --client-pid=-1 /tmp/georepsetup_M5poIr)
[2017-05-13 17:26:31.458733] I [MSGID: 101190] [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2017-05-13 17:26:31.458833] E [socket.c:2309:socket_connect_finish] 0-glusterfs: connection to 192.168.8.126:24007 failed (Connection refused)
[2017-05-13 17:26:31.458886] E [glusterfsd-mgmt.c:1908:mgmt_rpc_notify] 0-glusterfsd-mgmt: failed to connect with remote-host: georep.nwfiber.com (Transport endpoint is not connected)
[2017-05-13 17:26:31.458900] I [glusterfsd-mgmt.c:1926:mgmt_rpc_notify] 0-glusterfsd-mgmt: Exhausted all volfile servers
[2017-05-13 17:26:31.459173] W [glusterfsd.c:1327:cleanup_and_exit] (-->/lib64/libgfrpc.so.0(rpc_clnt_notify+0xdb) [0x7f18d6c89aab] -->glusterfs(+0x10309) [0x7f18d73b9309] -->glusterfs(cleanup_and_exit+0x6b) [0x7f18d73b2b4b] ) 0-: received signum (1), shutting down
[2017-05-13 17:26:31.459218] I [fuse-bridge.c:5788:fini] 0-fuse: Unmounting '/tmp/georepsetup_M5poIr'.
[2017-05-13 17:26:31.459887] W [glusterfsd.c:1327:cleanup_and_exit] (-->/lib64/libpthread.so.0(+0x7dc5) [0x7f18d5d20dc5] -->glusterfs(glusterfs_sigwaiter+0xe5) [0x7f18d73b2cd5] -->glusterfs(cleanup_and_exit+0x6b) [0x7f18d73b2b4b] ) 0-: received signum (15), shutting down

I don't know what to make of that.

On a whim, I thought that perhaps the georep setup does not set up the remote volume. (I assumed it would: I thought that was what the SSH access was required for, and none of the instructions mentioned creating the destination (replication) volume.) So I tried to create it myself, but it won't let me create a volume with replica 1. This is already a backup; I don't need a backup of a backup. That further supported my thought that the volume needs to be created by the georep setup commands.
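
The one part of the log I can read is the second mount attempt: the connection to 192.168.8.126:24007 (glusterd's management port) is refused, and the client then gives up with "Exhausted all volfile servers". A minimal sketch of how that could be checked from the master, independent of the setup tool, might look like the following. The helper name port_is_open and the 3-second timeout are my own assumptions, not anything from Gluster or georepsetup.

```python
import socket


def port_is_open(host: str, port: int = 24007, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    port_is_open is a hypothetical helper, not part of Gluster. The default
    port 24007 is the one named in the "Connection refused" log lines.
    """
    try:
        # create_connection resolves the name and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Run from ovirt1 as port_is_open("georep.nwfiber.com"), a False result would only confirm what the log already says, namely that nothing on the Pi is accepting connections on that port. It would not tell me why.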

Where am I wrong, and what do I need to do to fix this?

--Jim