[ovirt-users] Setting up GeoReplication
Jim Kusznir
jim at palousetech.com
Mon May 15 14:10:41 UTC 2017
I tried to create a gluster volume on the georep node by running:
gluster volume create engine-rep replica 1 georep.nwfiber.com:/mnt/gluster/engine-rep
I got back an error saying the replica count must be greater than 1. So I
tried to create it again:
gluster volume create engine-rep replica 2 georep.nwfiber.com:/mnt/gluster/engine-rep server2.nwfiber.com:/mnt/gluster/engine-rep
where server2 did not exist. That failed too, but I don't recall the error
message.
Gluster is installed, but when I try to start it with the init script, it
fails with a complaint about reading the block file. My googling indicated
that's the error you get until you've created a gluster volume, which was
my first clue that maybe I needed to create one first.
So, how do I create a replica 1 volume?
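(My best guess from re-reading the CLI docs, not yet verified on this Pi:
a single-brick volume is created by leaving the replica keyword off
entirely, something like the following, using the host and brick path from
earlier in this message.)

```shell
# Untested guess: a one-brick "distribute" volume takes no replica keyword
# at all; "replica 1" is rejected by the CLI by design.
# Host and brick path are the ones from this thread.
gluster volume create engine-rep georep.nwfiber.com:/mnt/gluster/engine-rep
gluster volume start engine-rep
gluster volume info engine-rep   # should show Type: Distribute, 1 brick
```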
Thinking way ahead, I have a related replica question: currently my oVirt
nodes are also my gluster nodes (replica 2 + arbiter 1). Eventually I
suspect I'll want to pull gluster off onto dedicated hardware. If I do so,
do I need 3 servers, or is replica 2 sufficient? I guess I could have an
oVirt node continue to act as the arbiter... I would eventually like to
distribute my oVirt cluster across multiple locations with the option for
remote failover (say location A loses all its network and/or power; have
important VMs started at location B in addition to location B's normal
VMs). I assume at this point the recommended architecture would be:
2 Gluster servers at each location
Each location has a gluster volume for that location and is the
geo-replication target for the other location (so all my data will
physically exist on 4 gluster servers). I probably won't have more than 2
or 3 oVirt hosts at each location, so I don't expect this to be a "heavy
use" system.
Am I on track? I'd be interested to learn what others suggest for this
deployment model.
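Concretely, I imagine the cross-site sessions would look something like
this (volume and host names here are made up for illustration; the command
form is from the gluster geo-replication docs):

```shell
# Hypothetical names: data-a/data-b are each site's primary volumes,
# data-a-rep/data-b-rep are single-brick slave volumes at the other site.
# On a site-A node: replicate site A's volume to site B.
gluster volume geo-replication data-a siteb1.example.com::data-a-rep create push-pem
gluster volume geo-replication data-a siteb1.example.com::data-a-rep start
# On a site-B node: replicate site B's volume back to site A.
gluster volume geo-replication data-b sitea1.example.com::data-b-rep create push-pem
gluster volume geo-replication data-b sitea1.example.com::data-b-rep start
```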
On Sun, May 14, 2017 at 11:09 PM, Sahina Bose <sabose at redhat.com> wrote:
> Adding Aravinda
>
> On Sat, May 13, 2017 at 11:21 PM, Jim Kusznir <jim at palousetech.com> wrote:
>
>> Hi All:
>>
>> I've been trying to set up georeplication for a while now, but can't seem
>> to make it work. I've found documentation on the web (mostly
>> https://gluster.readthedocs.io/en/refactor/Administrator%20Guide/Geo%20Replication/),
>> and I found
>> http://blog.gluster.org/2015/09/introducing-georepsetup-gluster-geo-replication-setup-tool/
>>
>> Unfortunately, it seems that some critical steps are missing from both,
>> and I can't figure out for sure what they are.
>>
>> My environment:
>>
>> Production: replica 2 + arbitrator running on my 3-node oVirt cluster, 3
>> volumes (engine, data, iso).
>>
>> New geo-replication: Raspberry Pi3 with USB hard drive shoved in some
>> other data closet off-site.
>>
>> I've installed Raspbian Lite, and after much fighting, got
>> glusterfs-*-3.8.11 installed. I've created my mountpoint (USB hard drive,
>> much larger than my gluster volumes), and then ran the command. I get this
>> far:
>>
>> [ OK] georep.nwfiber.com is Reachable(Port 22)
>> [ OK] SSH Connection established root at georep.nwfiber.com
>> [ OK] Master Volume and Slave Volume are compatible (Version: 3.8.11)
>> [NOT OK] Unable to Mount Gluster Volume georep.nwfiber.com:engine-rep
>>
>> Trying it with the steps in the gluster docs also has the same problem.
>> No log files are generated on the slave. Log files on the master include:
>>
>> [root at ovirt1 geo-replication]# more georepsetup.mount.log
>> [2017-05-13 17:26:27.318599] I [MSGID: 100030] [glusterfsd.c:2454:main]
>> 0-glusterfs: Started running glusterfs version 3.8.11 (args:
>> glusterfs --xlator-option="*dht.lookup-unhashed=off" --volfile-server
>> localhost --volfile-id engine -l /var/log/glusterfs/geo-repli
>> cation/georepsetup.mount.log --client-pid=-1 /tmp/georepsetup_wZtfkN)
>> [2017-05-13 17:26:27.341170] I [MSGID: 101190]
>> [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started thread
>> with index 1
>> [2017-05-13 17:26:27.341260] E [socket.c:2309:socket_connect_finish]
>> 0-glusterfs: connection to ::1:24007 failed (Connection refused)
>> [2017-05-13 17:26:27.341846] E [glusterfsd-mgmt.c:1908:mgmt_rpc_notify]
>> 0-glusterfsd-mgmt: failed to connect with remote-host: localhost
>> (Transport endpoint is not connected)
>> [2017-05-13 17:26:31.335849] I [MSGID: 101190]
>> [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started thread
>> with index 2
>> [2017-05-13 17:26:31.337545] I [MSGID: 114020] [client.c:2356:notify]
>> 0-engine-client-0: parent translators are ready, attempting connect on transport
>> [2017-05-13 17:26:31.344485] I [MSGID: 114020] [client.c:2356:notify]
>> 0-engine-client-1: parent translators are ready, attempting connect on transport
>> [2017-05-13 17:26:31.345146] I [rpc-clnt.c:1965:rpc_clnt_reconfig]
>> 0-engine-client-0: changing port to 49157 (from 0)
>> [2017-05-13 17:26:31.350868] I [MSGID: 114020] [client.c:2356:notify]
>> 0-engine-client-2: parent translators are ready, attempting connect on transport
>> [2017-05-13 17:26:31.355946] I [MSGID: 114057]
>> [client-handshake.c:1440:select_server_supported_programs]
>> 0-engine-client-0: Using Program GlusterFS 3.3, Num (1298437), Version (330)
>> [2017-05-13 17:26:31.356280] I [rpc-clnt.c:1965:rpc_clnt_reconfig]
>> 0-engine-client-1: changing port to 49157 (from 0)
>> Final graph:
>> +------------------------------------------------------------------------------+
>> 1: volume engine-client-0
>> 2: type protocol/client
>> 3: option clnt-lk-version 1
>> 4: option volfile-checksum 0
>> 5: option volfile-key engine
>> 6: option client-version 3.8.11
>> 7: option process-uuid ovirt1.nwfiber.com-25660-2017/05/13-17:26:27:311929-engine-client-0-0-0
>> 8: option fops-version 1298437
>> 9: option ping-timeout 30
>> 10: option remote-host ovirt1.nwfiber.com
>> 11: option remote-subvolume /gluster/brick1/engine
>> 12: option transport-type socket
>> 13: option username 028984cf-0399-42e6-b04b-bb9b1685c536
>> 14: option password eae737cc-9659-405f-865e-9a7ef97a3307
>> 15: option filter-O_DIRECT off
>> 16: option send-gids true
>> 17: end-volume
>> 18:
>> 19: volume engine-client-1
>> 20: type protocol/client
>> 21: option ping-timeout 30
>> 22: option remote-host ovirt2.nwfiber.com
>> 23: option remote-subvolume /gluster/brick1/engine
>> 24: option transport-type socket
>> 25: option username 028984cf-0399-42e6-b04b-bb9b1685c536
>> 26: option password eae737cc-9659-405f-865e-9a7ef97a3307
>> 27: option filter-O_DIRECT off
>> 28: option send-gids true
>> 29: end-volume
>> 30:
>> 31: volume engine-client-2
>> 32: type protocol/client
>> 33: option ping-timeout 30
>> 34: option remote-host ovirt3.nwfiber.com
>> 35: option remote-subvolume /gluster/brick1/engine
>> 36: option transport-type socket
>> 37: option username 028984cf-0399-42e6-b04b-bb9b1685c536
>> 38: option password eae737cc-9659-405f-865e-9a7ef97a3307
>> 39: option filter-O_DIRECT off
>> 40: option send-gids true
>> 41: end-volume
>> 42:
>> 43: volume engine-replicate-0
>> 44: type cluster/replicate
>> 45: option arbiter-count 1
>> 46: option data-self-heal-algorithm full
>> 47: option eager-lock enable
>> 48: option quorum-type auto
>> 49: option shd-max-threads 6
>> 50: option shd-wait-qlength 10000
>> 51: option locking-scheme granular
>> 52: subvolumes engine-client-0 engine-client-1 engine-client-2
>> 53: end-volume
>> 54:
>> 55: volume engine-dht
>> 56: type cluster/distribute
>> 57: option lock-migration off
>> 58: subvolumes engine-replicate-0
>> 59: end-volume
>> 60:
>> 61: volume engine-shard
>> 62: type features/shard
>> 63: option shard-block-size 512MB
>> 64: subvolumes engine-dht
>> 65: end-volume
>> 66:
>> 67: volume engine-write-behind
>> 68: type performance/write-behind
>> 69: option strict-O_DIRECT on
>> 70: subvolumes engine-shard
>> 71: end-volume
>> 72:
>> 73: volume engine-readdir-ahead
>> 74: type performance/readdir-ahead
>> 75: subvolumes engine-write-behind
>> 76: end-volume
>> 77:
>> 78: volume engine-open-behind
>> 79: type performance/open-behind
>> 80: subvolumes engine-readdir-ahead
>> 81: end-volume
>> 82:
>> 83: volume engine
>> 84: type debug/io-stats
>> 85: option log-level INFO
>> 86: option latency-measurement off
>> 87: option count-fop-hits off
>> 88: subvolumes engine-open-behind
>> 89: end-volume
>> 90:
>> 91: volume meta-autoload
>> 92: type meta
>> 93: subvolumes engine
>> 94: end-volume
>> 95:
>> +------------------------------------------------------------------------------+
>> [2017-05-13 17:26:31.360579] I [MSGID: 114046]
>> [client-handshake.c:1216:client_setvolume_cbk] 0-engine-client-0:
>> Connected to engine-client-0, attached to remote volume '/gluster/brick1/engine'.
>> [2017-05-13 17:26:31.360599] I [MSGID: 114047]
>> [client-handshake.c:1227:client_setvolume_cbk] 0-engine-client-0: Server
>> and Client lk-version numbers are not same, reopening the fds
>> [2017-05-13 17:26:31.360707] I [MSGID: 108005]
>> [afr-common.c:4387:afr_notify] 0-engine-replicate-0: Subvolume
>> 'engine-client-0' came back up; going online.
>> [2017-05-13 17:26:31.360793] I [MSGID: 114035]
>> [client-handshake.c:202:client_set_lk_version_cbk] 0-engine-client-0:
>> Server lk version = 1
>> [2017-05-13 17:26:31.361284] I [rpc-clnt.c:1965:rpc_clnt_reconfig]
>> 0-engine-client-2: changing port to 49158 (from 0)
>> [2017-05-13 17:26:31.365070] I [MSGID: 114057]
>> [client-handshake.c:1440:select_server_supported_programs]
>> 0-engine-client-1: Using Program GlusterFS 3.3, Num (1298437), Version (330)
>> [2017-05-13 17:26:31.365788] I [MSGID: 114046]
>> [client-handshake.c:1216:client_setvolume_cbk] 0-engine-client-1:
>> Connected to engine-client-1, attached to remote volume '/gluster/brick1/engine'.
>> [2017-05-13 17:26:31.365821] I [MSGID: 114047]
>> [client-handshake.c:1227:client_setvolume_cbk] 0-engine-client-1: Server
>> and Client lk-version numbers are not same, reopening the fds
>> [2017-05-13 17:26:31.366059] I [MSGID: 114035]
>> [client-handshake.c:202:client_set_lk_version_cbk] 0-engine-client-1:
>> Server lk version = 1
>> [2017-05-13 17:26:31.369948] I [MSGID: 114057]
>> [client-handshake.c:1440:select_server_supported_programs]
>> 0-engine-client-2: Using Program GlusterFS 3.3, Num (1298437), Version (330)
>> [2017-05-13 17:26:31.370657] I [MSGID: 114046]
>> [client-handshake.c:1216:client_setvolume_cbk] 0-engine-client-2:
>> Connected to engine-client-2, attached to remote volume '/gluster/brick1/engine'.
>> [2017-05-13 17:26:31.370683] I [MSGID: 114047]
>> [client-handshake.c:1227:client_setvolume_cbk] 0-engine-client-2: Server
>> and Client lk-version numbers are not same, reopening the fds
>> [2017-05-13 17:26:31.383548] I [MSGID: 114035]
>> [client-handshake.c:202:client_set_lk_version_cbk] 0-engine-client-2:
>> Server lk version = 1
>> [2017-05-13 17:26:31.383649] I [fuse-bridge.c:4147:fuse_init]
>> 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.24 kernel 7.22
>> [2017-05-13 17:26:31.383676] I [fuse-bridge.c:4832:fuse_graph_sync]
>> 0-fuse: switched to graph 0
>> [2017-05-13 17:26:31.385453] I [MSGID: 108031]
>> [afr-common.c:2157:afr_local_discovery_cbk] 0-engine-replicate-0:
>> selecting local read_child engine-client-0
>> [2017-05-13 17:26:31.396741] I [fuse-bridge.c:5080:fuse_thread_proc]
>> 0-fuse: unmounting /tmp/georepsetup_wZtfkN
>> [2017-05-13 17:26:31.397086] W [glusterfsd.c:1327:cleanup_and_exit]
>> (-->/lib64/libpthread.so.0(+0x7dc5) [0x7f8838df6dc5]
>> -->glusterfs(glusterfs_sigwaiter+0xe5) [0x7f883a488cd5]
>> -->glusterfs(cleanup_and_exit+0x6b) [0x7f883a488b4b] ) 0-: received
>> signum (15), shutting down
>> [2017-05-13 17:26:31.397112] I [fuse-bridge.c:5788:fini] 0-fuse:
>> Unmounting '/tmp/georepsetup_wZtfkN'.
>> [2017-05-13 17:26:31.413901] I [MSGID: 100030] [glusterfsd.c:2454:main]
>> 0-glusterfs: Started running glusterfs version 3.8.11 (args:
>> glusterfs --xlator-option="*dht.lookup-unhashed=off" --volfile-server
>> georep.nwfiber.com --volfile-id engine -l /var/log/glusterfs/
>> geo-replication/georepsetup.mount.log --client-pid=-1
>> /tmp/georepsetup_M5poIr)
>> [2017-05-13 17:26:31.458733] I [MSGID: 101190]
>> [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started thread
>> with index 1
>> [2017-05-13 17:26:31.458833] E [socket.c:2309:socket_connect_finish]
>> 0-glusterfs: connection to 192.168.8.126:24007 failed (Connection refused)
>> [2017-05-13 17:26:31.458886] E [glusterfsd-mgmt.c:1908:mgmt_rpc_notify]
>> 0-glusterfsd-mgmt: failed to connect with remote-host:
>> georep.nwfiber.com (Transport endpoint is not connected)
>> [2017-05-13 17:26:31.458900] I [glusterfsd-mgmt.c:1926:mgmt_rpc_notify]
>> 0-glusterfsd-mgmt: Exhausted all volfile servers
>> [2017-05-13 17:26:31.459173] W [glusterfsd.c:1327:cleanup_and_exit]
>> (-->/lib64/libgfrpc.so.0(rpc_clnt_notify+0xdb) [0x7f18d6c89aab]
>> -->glusterfs(+0x10309) [0x7f18d73b9309] -->glusterfs(cleanup_and_exit+0x6b)
>> [0x7f18d73b2b4b] ) 0-: received signum (1), shutting down
>> [2017-05-13 17:26:31.459218] I [fuse-bridge.c:5788:fini] 0-fuse:
>> Unmounting '/tmp/georepsetup_M5poIr'.
>> [2017-05-13 17:26:31.459887] W [glusterfsd.c:1327:cleanup_and_exit]
>> (-->/lib64/libpthread.so.0(+0x7dc5) [0x7f18d5d20dc5]
>> -->glusterfs(glusterfs_sigwaiter+0xe5) [0x7f18d73b2cd5]
>> -->glusterfs(cleanup_and_exit+0x6b) [0x7f18d73b2b4b] ) 0-: received
>> signum (15), shutting down
>>
>> I don't know what to make of that.
>>
>> On a whim, I thought that perhaps the georep setup does not set up the
>> remote volume (I had assumed it would; I thought that was what the ssh
>> access was required for, and none of the instructions mentioned creating
>> your destination (replication) volume). So I tried to create it, but it
>> won't let me create a volume with replica 1. This is already a backup; I
>> don't need a backup of a backup. That further supported my thought that
>> the volume is supposed to be created by the georep setup commands.
>>
>
> The destination or slave volume needs to be created prior to setting up
> the geo-replication session. You should be able to create a replica 1
> volume as the destination volume. How did you try to create this?
>
> Is glusterd running on georep.nwfiber.com, and are the gluster ports
> open?
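> A quick way to check both from the master is to probe glusterd's
> management port (24007) on the slave. A bash-only sketch (the hostname is
> the slave from this thread; bash's /dev/tcp is used so nothing extra
> needs to be installed):

```shell
#!/usr/bin/env bash
# Probe glusterd's management port (24007) on a remote host.
check_glusterd() {
    local host="$1"
    # bash opens a TCP connection via the /dev/tcp pseudo-device;
    # timeout bounds the wait for unreachable hosts.
    if timeout 3 bash -c "exec 3<>/dev/tcp/${host}/24007" 2>/dev/null; then
        echo "glusterd reachable on ${host}:24007"
    else
        echo "cannot reach ${host}:24007 (glusterd down or port blocked)"
        return 1
    fi
}

check_glusterd georep.nwfiber.com || true  # "|| true" keeps the exit status clean
```

> If this prints the failure line, starting glusterd on the slave and
> opening 24007 (plus the brick ports) should come before retrying the
> georepsetup tool.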
>
>
>> Where am I wrong / what do I need to do to fix this?
>>
>> --Jim
>>
>> _______________________________________________
>> Users mailing list
>> Users at ovirt.org
>> http://lists.ovirt.org/mailman/listinfo/users
>>
>>
>