[ovirt-users] ovirt 4.1 hosted engine hyper converged on glusterfs 3.8.10 : "engine" storage domain always complains about "unsynced" elements
Ravishankar N
ravishankar at redhat.com
Thu Jul 20 09:34:00 UTC 2017
On 07/20/2017 02:20 PM, yayo (j) wrote:
> Hi,
>
> Thank you for the answer, and sorry for the delay:
>
> 2017-07-19 16:55 GMT+02:00 Ravishankar N <ravishankar at redhat.com>:
>
> 1. What does the glustershd.log say on all 3 nodes when you run
> the command? Does it complain about these files?
>
>
> No, glustershd.log is clean: no extra log entries after running the
> command on any of the 3 nodes
Could you check if the self-heal daemon on all nodes is connected to the
3 bricks? You will need to check the glustershd.log for that.
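For example, something like this (assuming the default log location;
the exact message wording varies between gluster versions) should show
whether shd has a live connection to each brick:

    # run on each node; in glustershd.log each brick of the "engine"
    # volume appears as engine-client-0/1/2
    grep -iE 'connect' /var/log/glusterfs/glustershd.log | tail -n 20

A recent "disconnected" message for one of the clients with no later
"Connected" one would explain why the heals are not progressing.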
If it is not connected, try restarting the shd using `gluster volume
start engine force`, then launch the heal command like you did earlier
and see if heals happen.
If they don't, please provide the getfattr output of the 12 files from
all 3 nodes using `getfattr -d -m . -e hex
/gluster/engine/brick/<path-to-file>`.
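If it helps, here is a rough loop to collect all of them in one go
(just a sketch: `unsynced-paths.txt` is a hypothetical file holding the
12 paths reported by `gluster volume heal engine info`, relative to the
brick root):

    # run on each of the 3 nodes
    while read -r p; do
        echo "== $p =="
        getfattr -d -m . -e hex "/gluster/engine/brick/${p#/}"
    done < unsynced-paths.txt

That will let us compare the trusted.afr.* xattrs across the nodes.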
Thanks,
Ravi
> 2. Are these 12 files also present in the 3rd data brick?
>
>
> I've just checked: all the files exist on all 3 nodes
>
> 3. Can you provide the output of `gluster volume info` for
> this volume?
>
>
>
> Volume Name: engine
> Type: Replicate
> Volume ID: d19c19e3-910d-437b-8ba7-4f2a23d17515
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> Brick1: node01:/gluster/engine/brick
> Brick2: node02:/gluster/engine/brick
> Brick3: node04:/gluster/engine/brick
> Options Reconfigured:
> nfs.disable: on
> performance.readdir-ahead: on
> transport.address-family: inet
> storage.owner-uid: 36
> performance.quick-read: off
> performance.read-ahead: off
> performance.io-cache: off
> performance.stat-prefetch: off
> performance.low-prio-threads: 32
> network.remote-dio: off
> cluster.eager-lock: enable
> cluster.quorum-type: auto
> cluster.server-quorum-type: server
> cluster.data-self-heal-algorithm: full
> cluster.locking-scheme: granular
> cluster.shd-max-threads: 8
> cluster.shd-wait-qlength: 10000
> features.shard: on
> user.cifs: off
> storage.owner-gid: 36
> features.shard-block-size: 512MB
> network.ping-timeout: 30
> performance.strict-o-direct: on
> cluster.granular-entry-heal: on
> auth.allow: *
>
> server.allow-insecure: on
>
>
>
>
>> Some extra info:
>>
>> We have recently changed the gluster volume from replica 2 (fully
>> replicated) + 1 arbiter to a fully replicated replica 3 cluster
>>
>
> Just curious, how did you do this? A `remove-brick` of the arbiter
> brick followed by an `add-brick` to increase it to replica 3?
>
>
> Yes
>
>
> # gluster volume remove-brick engine replica 2 node03:/gluster/data/brick force *(OK!)*
>
> # gluster volume heal engine info *(no entries!)*
>
> # gluster volume add-brick engine replica 3 node04:/gluster/engine/brick *(OK!)*
>
> *After a few minutes*
>
> [root at node01 ~]# gluster volume heal engine info
> Brick node01:/gluster/engine/brick
> Status: Connected
> Number of entries: 0
>
> Brick node02:/gluster/engine/brick
> Status: Connected
> Number of entries: 0
>
> Brick node04:/gluster/engine/brick
> Status: Connected
> Number of entries: 0
>
> Thanks,
> Ravi
>
>
> Another bit of extra info (I don't know if this can be the problem):
> five days ago a blackout suddenly shut down the network switch (also
> carrying the gluster network) of node03 and node04... but I don't know
> whether this problem appeared after that blackout
>
> Thank you!
>
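Regarding the blackout: that could well be related. It is worth
confirming that all bricks and self-heal daemons came back up once the
switch was restored, for example with:

    gluster peer status
    gluster volume status engine

In the `volume status` output, the brick and "Self-heal Daemon" rows
should all show "Y" in the Online column.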