
I tried to create a gluster volume on the georep node by running:

    gluster volume create engine-rep replica 1 georep.nwfiber.com:/mnt/gluster/engine-rep

I got back an error saying replica must be > 1. So I tried to create it again:

    gluster volume create engine-rep replica 2 georep.nwfiber.com:/mnt/gluster/engine-rep server2.nwfiber.com:/mnt/gluster/engine-rep

where server2 did not exist. That failed too, but I don't recall the error message.

gluster is installed, but when I try to start it with the init script, it fails with a complaint about reading the block file; my googling indicated that's the error you get until you've created a gluster volume, and that was the first clue that maybe I needed to create one first. So, how do I create a replica 1 volume?

Thinking way ahead, I have a related replica question: currently my ovirt nodes are also my gluster nodes (replica 2 + arbiter 1). Eventually I suspect I'll want to pull gluster off onto dedicated hardware. If I do so, do I need 3 servers, or is replica 2 sufficient? I guess I could have an ovirt node continue to act as the arbiter...

I would eventually like to distribute my ovirt cluster across multiple locations with the option of remote failover (say location A loses all its network and/or power; important VMs get started at location B in addition to location B's normal VMs). I assume at this point the recommended architecture would be: 2 gluster servers at each location; each location has a gluster volume for that location and is the geo-replication target for the other location (so all my data will physically exist on 4 gluster servers). I probably won't have more than 2 or 3 ovirt hosts at each location, so I don't expect this to be a "heavy use" system. Am I on track? I'd be interested to learn what others suggest for this deployment model.

On Sun, May 14, 2017 at 11:09 PM, Sahina Bose <sabose@redhat.com> wrote:
Adding Aravinda
On Sat, May 13, 2017 at 11:21 PM, Jim Kusznir <jim@palousetech.com> wrote:
Hi All:
I've been trying to set up geo-replication for a while now, but can't seem to make it work. I've found documentation on the web (mostly https://gluster.readthedocs.io/en/refactor/Administrator%20Guide/Geo%20Replication/), and I found http://blog.gluster.org/2015/09/introducing-georepsetup-gluster-geo-replication-setup-tool/
Unfortunately, it seems that some critical steps are missing from both, and I can't figure out for sure what they are.
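For reference, the sequence I pieced together from those docs is roughly the following (host and volume names are mine):

    # run on one of the master (oVirt/gluster) nodes:
    gluster system:: execute gsec_create
    gluster volume geo-replication engine georep.nwfiber.com::engine-rep create push-pem
    gluster volume geo-replication engine georep.nwfiber.com::engine-rep start
    gluster volume geo-replication engine georep.nwfiber.com::engine-rep status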
My environment:
Production: replica 2 + arbiter running on my 3-node oVirt cluster, 3 volumes (engine, data, iso).
New geo-replication: Raspberry Pi 3 with a USB hard drive, shoved in some other data closet off-site.
I've installed Raspbian Lite, and after much fighting, got glusterfs-*-3.8.11 installed. I've created my mountpoint (a USB hard drive, much larger than my gluster volumes), and then ran the georepsetup command. I get this far:
[    OK] georep.nwfiber.com is Reachable(Port 22)
[    OK] SSH Connection established root@georep.nwfiber.com
[    OK] Master Volume and Slave Volume are compatible (Version: 3.8.11)
[NOT OK] Unable to Mount Gluster Volume georep.nwfiber.com:engine-rep
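(That output comes from the georepsetup tool described in the blog post above; it is invoked as georepsetup <MASTERVOL> <SLAVEHOST> <SLAVEVOL>, so presumably something like:

    georepsetup engine georep.nwfiber.com engine-rep
)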
Trying it with the steps in the gluster docs also has the same problem. No log files are generated on the slave. Log files on the master include:
[root@ovirt1 geo-replication]# more georepsetup.mount.log
[2017-05-13 17:26:27.318599] I [MSGID: 100030] [glusterfsd.c:2454:main] 0-glusterfs: Started running glusterfs version 3.8.11 (args: glusterfs --xlator-option="*dht.lookup-unhashed=off" --volfile-server localhost --volfile-id engine -l /var/log/glusterfs/geo-replication/georepsetup.mount.log --client-pid=-1 /tmp/georepsetup_wZtfkN)
[2017-05-13 17:26:27.341170] I [MSGID: 101190] [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2017-05-13 17:26:27.341260] E [socket.c:2309:socket_connect_finish] 0-glusterfs: connection to ::1:24007 failed (Connection refused)
[2017-05-13 17:26:27.341846] E [glusterfsd-mgmt.c:1908:mgmt_rpc_notify] 0-glusterfsd-mgmt: failed to connect with remote-host: localhost (Transport endpoint is not connected)
[2017-05-13 17:26:31.335849] I [MSGID: 101190] [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started thread with index 2
[2017-05-13 17:26:31.337545] I [MSGID: 114020] [client.c:2356:notify] 0-engine-client-0: parent translators are ready, attempting connect on transport
[2017-05-13 17:26:31.344485] I [MSGID: 114020] [client.c:2356:notify] 0-engine-client-1: parent translators are ready, attempting connect on transport
[2017-05-13 17:26:31.345146] I [rpc-clnt.c:1965:rpc_clnt_reconfig] 0-engine-client-0: changing port to 49157 (from 0)
[2017-05-13 17:26:31.350868] I [MSGID: 114020] [client.c:2356:notify] 0-engine-client-2: parent translators are ready, attempting connect on transport
[2017-05-13 17:26:31.355946] I [MSGID: 114057] [client-handshake.c:1440:select_server_supported_programs] 0-engine-client-0: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2017-05-13 17:26:31.356280] I [rpc-clnt.c:1965:rpc_clnt_reconfig] 0-engine-client-1: changing port to 49157 (from 0)
Final graph:
+------------------------------------------------------------------------------+
  1: volume engine-client-0
  2:     type protocol/client
  3:     option clnt-lk-version 1
  4:     option volfile-checksum 0
  5:     option volfile-key engine
  6:     option client-version 3.8.11
  7:     option process-uuid ovirt1.nwfiber.com-25660-2017/05/13-17:26:27:311929-engine-client-0-0-0
  8:     option fops-version 1298437
  9:     option ping-timeout 30
 10:     option remote-host ovirt1.nwfiber.com
 11:     option remote-subvolume /gluster/brick1/engine
 12:     option transport-type socket
 13:     option username 028984cf-0399-42e6-b04b-bb9b1685c536
 14:     option password eae737cc-9659-405f-865e-9a7ef97a3307
 15:     option filter-O_DIRECT off
 16:     option send-gids true
 17: end-volume
 18:
 19: volume engine-client-1
 20:     type protocol/client
 21:     option ping-timeout 30
 22:     option remote-host ovirt2.nwfiber.com
 23:     option remote-subvolume /gluster/brick1/engine
 24:     option transport-type socket
 25:     option username 028984cf-0399-42e6-b04b-bb9b1685c536
 26:     option password eae737cc-9659-405f-865e-9a7ef97a3307
 27:     option filter-O_DIRECT off
 28:     option send-gids true
 29: end-volume
 30:
 31: volume engine-client-2
 32:     type protocol/client
 33:     option ping-timeout 30
 34:     option remote-host ovirt3.nwfiber.com
 35:     option remote-subvolume /gluster/brick1/engine
 36:     option transport-type socket
 37:     option username 028984cf-0399-42e6-b04b-bb9b1685c536
 38:     option password eae737cc-9659-405f-865e-9a7ef97a3307
 39:     option filter-O_DIRECT off
 40:     option send-gids true
 41: end-volume
 42:
 43: volume engine-replicate-0
 44:     type cluster/replicate
 45:     option arbiter-count 1
 46:     option data-self-heal-algorithm full
 47:     option eager-lock enable
 48:     option quorum-type auto
 49:     option shd-max-threads 6
 50:     option shd-wait-qlength 10000
 51:     option locking-scheme granular
 52:     subvolumes engine-client-0 engine-client-1 engine-client-2
 53: end-volume
 54:
 55: volume engine-dht
 56:     type cluster/distribute
 57:     option lock-migration off
 58:     subvolumes engine-replicate-0
 59: end-volume
 60:
 61: volume engine-shard
 62:     type features/shard
 63:     option shard-block-size 512MB
 64:     subvolumes engine-dht
 65: end-volume
 66:
 67: volume engine-write-behind
 68:     type performance/write-behind
 69:     option strict-O_DIRECT on
 70:     subvolumes engine-shard
 71: end-volume
 72:
 73: volume engine-readdir-ahead
 74:     type performance/readdir-ahead
 75:     subvolumes engine-write-behind
 76: end-volume
 77:
 78: volume engine-open-behind
 79:     type performance/open-behind
 80:     subvolumes engine-readdir-ahead
 81: end-volume
 82:
 83: volume engine
 84:     type debug/io-stats
 85:     option log-level INFO
 86:     option latency-measurement off
 87:     option count-fop-hits off
 88:     subvolumes engine-open-behind
 89: end-volume
 90:
 91: volume meta-autoload
 92:     type meta
 93:     subvolumes engine
 94: end-volume
 95:
+------------------------------------------------------------------------------+
[2017-05-13 17:26:31.360579] I [MSGID: 114046] [client-handshake.c:1216:client_setvolume_cbk] 0-engine-client-0: Connected to engine-client-0, attached to remote volume '/gluster/brick1/engine'.
[2017-05-13 17:26:31.360599] I [MSGID: 114047] [client-handshake.c:1227:client_setvolume_cbk] 0-engine-client-0: Server and Client lk-version numbers are not same, reopening the fds
[2017-05-13 17:26:31.360707] I [MSGID: 108005] [afr-common.c:4387:afr_notify] 0-engine-replicate-0: Subvolume 'engine-client-0' came back up; going online.
[2017-05-13 17:26:31.360793] I [MSGID: 114035] [client-handshake.c:202:client_set_lk_version_cbk] 0-engine-client-0: Server lk version = 1
[2017-05-13 17:26:31.361284] I [rpc-clnt.c:1965:rpc_clnt_reconfig] 0-engine-client-2: changing port to 49158 (from 0)
[2017-05-13 17:26:31.365070] I [MSGID: 114057] [client-handshake.c:1440:select_server_supported_programs] 0-engine-client-1: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2017-05-13 17:26:31.365788] I [MSGID: 114046] [client-handshake.c:1216:client_setvolume_cbk] 0-engine-client-1: Connected to engine-client-1, attached to remote volume '/gluster/brick1/engine'.
[2017-05-13 17:26:31.365821] I [MSGID: 114047] [client-handshake.c:1227:client_setvolume_cbk] 0-engine-client-1: Server and Client lk-version numbers are not same, reopening the fds
[2017-05-13 17:26:31.366059] I [MSGID: 114035] [client-handshake.c:202:client_set_lk_version_cbk] 0-engine-client-1: Server lk version = 1
[2017-05-13 17:26:31.369948] I [MSGID: 114057] [client-handshake.c:1440:select_server_supported_programs] 0-engine-client-2: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2017-05-13 17:26:31.370657] I [MSGID: 114046] [client-handshake.c:1216:client_setvolume_cbk] 0-engine-client-2: Connected to engine-client-2, attached to remote volume '/gluster/brick1/engine'.
[2017-05-13 17:26:31.370683] I [MSGID: 114047] [client-handshake.c:1227:client_setvolume_cbk] 0-engine-client-2: Server and Client lk-version numbers are not same, reopening the fds
[2017-05-13 17:26:31.383548] I [MSGID: 114035] [client-handshake.c:202:client_set_lk_version_cbk] 0-engine-client-2: Server lk version = 1
[2017-05-13 17:26:31.383649] I [fuse-bridge.c:4147:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.24 kernel 7.22
[2017-05-13 17:26:31.383676] I [fuse-bridge.c:4832:fuse_graph_sync] 0-fuse: switched to graph 0
[2017-05-13 17:26:31.385453] I [MSGID: 108031] [afr-common.c:2157:afr_local_discovery_cbk] 0-engine-replicate-0: selecting local read_child engine-client-0
[2017-05-13 17:26:31.396741] I [fuse-bridge.c:5080:fuse_thread_proc] 0-fuse: unmounting /tmp/georepsetup_wZtfkN
[2017-05-13 17:26:31.397086] W [glusterfsd.c:1327:cleanup_and_exit] (-->/lib64/libpthread.so.0(+0x7dc5) [0x7f8838df6dc5] -->glusterfs(glusterfs_sigwaiter+0xe5) [0x7f883a488cd5] -->glusterfs(cleanup_and_exit+0x6b) [0x7f883a488b4b] ) 0-: received signum (15), shutting down
[2017-05-13 17:26:31.397112] I [fuse-bridge.c:5788:fini] 0-fuse: Unmounting '/tmp/georepsetup_wZtfkN'.
[2017-05-13 17:26:31.413901] I [MSGID: 100030] [glusterfsd.c:2454:main] 0-glusterfs: Started running glusterfs version 3.8.11 (args: glusterfs --xlator-option="*dht.lookup-unhashed=off" --volfile-server georep.nwfiber.com --volfile-id engine -l /var/log/glusterfs/geo-replication/georepsetup.mount.log --client-pid=-1 /tmp/georepsetup_M5poIr)
[2017-05-13 17:26:31.458733] I [MSGID: 101190] [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2017-05-13 17:26:31.458833] E [socket.c:2309:socket_connect_finish] 0-glusterfs: connection to 192.168.8.126:24007 failed (Connection refused)
[2017-05-13 17:26:31.458886] E [glusterfsd-mgmt.c:1908:mgmt_rpc_notify] 0-glusterfsd-mgmt: failed to connect with remote-host: georep.nwfiber.com (Transport endpoint is not connected)
[2017-05-13 17:26:31.458900] I [glusterfsd-mgmt.c:1926:mgmt_rpc_notify] 0-glusterfsd-mgmt: Exhausted all volfile servers
[2017-05-13 17:26:31.459173] W [glusterfsd.c:1327:cleanup_and_exit] (-->/lib64/libgfrpc.so.0(rpc_clnt_notify+0xdb) [0x7f18d6c89aab] -->glusterfs(+0x10309) [0x7f18d73b9309] -->glusterfs(cleanup_and_exit+0x6b) [0x7f18d73b2b4b] ) 0-: received signum (1), shutting down
[2017-05-13 17:26:31.459218] I [fuse-bridge.c:5788:fini] 0-fuse: Unmounting '/tmp/georepsetup_M5poIr'.
[2017-05-13 17:26:31.459887] W [glusterfsd.c:1327:cleanup_and_exit] (-->/lib64/libpthread.so.0(+0x7dc5) [0x7f18d5d20dc5] -->glusterfs(glusterfs_sigwaiter+0xe5) [0x7f18d73b2cd5] -->glusterfs(cleanup_and_exit+0x6b) [0x7f18d73b2b4b] ) 0-: received signum (15), shutting down
I don't know what to make of that.
On a whim, I thought that perhaps the georep setup does not set up the remote volume (I had assumed it would; I thought that was what the SSH access was required for, and none of the instructions mentioned creating your destination/replication volume). So I tried to create it, but it won't let me create a volume with replica 1. This is already a backup; I don't need a backup of a backup. That further supported my thought that the volume needs to be created by the georep setup commands.
The destination (slave) volume needs to be created prior to setting up the geo-replication session. You should be able to create a replica 1 volume as the destination volume. How did you try to create this?
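For example, a minimal sketch (a plain single-brick distribute volume works as a geo-replication slave; the brick path is the one from your earlier attempt):

    # run on the slave node:
    gluster volume create engine-rep georep.nwfiber.com:/mnt/gluster/engine-rep
    gluster volume start engine-rep

Omitting the "replica" keyword entirely gives you a single-brick volume, which is what "replica 1" effectively is. Note gluster may ask you to append "force" to the create command if the brick directory sits directly on a mount point.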
Is glusterd running on georep.nwfiber.com? And are the gluster ports open?
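For instance, a quick check on the slave:

    # run on georep.nwfiber.com:
    service glusterd status     # or: systemctl status glusterd
    ss -tln | grep 24007        # the glusterd management port should be listening

The "connection to 192.168.8.126:24007 failed (Connection refused)" lines in the log above are exactly what you would expect if glusterd is not running on the slave.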
Where am I wrong / what do I need to do to fix this?
--Jim