Problem with GlusterFS Distributed-Replicated volume: almost full, but less than 50% used by files.

Hi, I have a weird problem with my GlusterFS/oVirt pair. I have two data domains for my VMs on oVirt. The main one (data) is a Distributed-Replicate volume with 2 x 3 = 6 bricks of ~1.4 TB on 3 servers (2 bricks per server). It is an extension of a previous 1 x 3 = 3 Replicated volume on the same servers: I added one new disk (and one new brick) on each of these 3 servers.

My concern is the reported usage of this volume (almost 100%). df -h gives me:

df -h .
Filesystem              Size  Used Avail Use% Mounted on
ovirt1.pmmg.priv:/data  3.4T  3.3T   84G  98% /rhev/data-center/mnt/glusterSD/ovirt1.pmmg.priv:_data

but a du of the same partition gives me:

du -hs /rhev/data-center/mnt/glusterSD/ovirt1.pmmg.priv:_data
1.4T    /rhev/data-center/mnt/glusterSD/ovirt1.pmmg.priv:_data

and there is no hidden folder I could have missed in this mount point. The second weird point is that when I delete a VM and its disks (main disk and snapshots), I don't get any space back on this volume.

Does anyone have an idea about this problem? Thanks.

Here is the info of the gluster volume:

gluster volume info data

Volume Name: data
Type: Distributed-Replicate
Volume ID: 22b7942e-2952-40d4-ab92-64636e3ce409
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x 3 = 6
Transport-type: tcp
Bricks:
Brick1: ovirt1.pmmg.priv:/gluster_bricks/data/data
Brick2: ovirt2.pmmg.priv:/gluster_bricks/data/data
Brick3: ovirt3.pmmg.priv:/gluster_bricks/data/data
Brick4: ovirt1.pmmg.priv:/gluster_bricks/data_extend1/data
Brick5: ovirt2.pmmg.priv:/gluster_bricks/data_extend1/data
Brick6: ovirt3.pmmg.priv:/gluster_bricks/data_extend1/data
Options Reconfigured:
features.shard-block-size: 128MB
cluster.lookup-optimize: off
server.keepalive-count: 5
server.keepalive-interval: 2
server.keepalive-time: 10
server.tcp-user-timeout: 20
performance.client-io-threads: on
nfs.disable: on
storage.fips-mode-rchecksum: on
transport.address-family: inet
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.low-prio-threads: 32
network.remote-dio: disable
cluster.eager-lock: enable
cluster.quorum-type: auto
cluster.server-quorum-type: server
cluster.data-self-heal-algorithm: full
cluster.locking-scheme: granular
cluster.shd-max-threads: 8
cluster.shd-wait-qlength: 10000
features.shard: on
user.cifs: off
cluster.choose-local: off
client.event-threads: 4
server.event-threads: 4
storage.owner-uid: 36
storage.owner-gid: 36
network.ping-timeout: 20
performance.strict-o-direct: on
cluster.granular-entry-heal: enable
features.trash: off
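One check worth making with a layout like this (not shown in the thread, and only a guess at the cause): with features.shard on, files bigger than the 128MB shard-block-size are split into pieces stored under the hidden .shard directory at the root of each brick. Shards whose base file has already been deleted still consume brick space that df reports, but du on the FUSE mount cannot see them. A quick measurement on a brick host, using the brick paths from the volume info above:

    du -hs /gluster_bricks/data/data/.shard
    du -hs /gluster_bricks/data_extend1/data/.shard

If those numbers are much larger than what is visible on the mount, stale shards would account for part of the df/du gap and for space not being released after VM deletion.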

Hi Olivier,

I have seen a similar issue when a disk was mounted incorrectly or a disk had an incorrect size. Can you please post here what df and du give you on each host? Please run on each host:

df -h /gluster_bricks/data/data
du -hs /gluster_bricks/data/data
df -h /gluster_bricks/data_extend1/data
du -hs /gluster_bricks/data_extend1/data

Thanks,
Maxim
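A quick way to rule out the wrong-mount / wrong-size case Maxim describes is to compare the block devices against what is actually mounted on each brick path; a small sketch using standard util-linux tools, assuming only the brick paths already named:

    lsblk -o NAME,SIZE,TYPE,MOUNTPOINT
    findmnt /gluster_bricks/data
    findmnt /gluster_bricks/data_extend1

Each brick path should show exactly one mounted filesystem of the expected size; a missing mount would mean the brick directory is sitting on the root filesystem instead.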

Hi Maxim,

Thanks for your answer; here is the output.

First server:

[root@ovirt1 ~]# df -h /gluster_bricks/data*
Filesystem                       Size  Used Avail Use% Mounted on
/dev/mapper/gluster_vg_sda-data  1.6T  1.6T   35G  98% /gluster_bricks/data
/dev/mapper/gluster_vg_sdb-data  1.8T  593G  1.2T  34% /gluster_bricks/data_extend1
[root@ovirt1 ~]# du -hs /gluster_bricks/data*
1.6T    /gluster_bricks/data
581G    /gluster_bricks/data_extend1

Second server:

[root@ovirt2 ~]# df -h /gluster_bricks/data* | grep -v data2
Filesystem                       Size  Used Avail Use% Mounted on
/dev/mapper/gluster_vg_sdb-data  1.6T  1.6T   35G  98% /gluster_bricks/data
/dev/mapper/gluster_vg_sdc-data  1.8T  593G  1.2T  34% /gluster_bricks/data_extend1
[root@ovirt2 ~]# du -hs /gluster_bricks/data /gluster_bricks/data_extend1
1.6T    /gluster_bricks/data
581G    /gluster_bricks/data_extend1

Third one:

[root@ovirt3 ~]# df -h /gluster_bricks/data* | grep -v data2
Filesystem                           Size  Used Avail Use% Mounted on
/dev/mapper/gluster_vg_sda-data      1.6T  1.6T   35G  98% /gluster_bricks/data
/dev/mapper/gluster_vg_sdb_new-data  1.8T  593G  1.2T  34% /gluster_bricks/data_extend1
[root@ovirt3 ~]# du -hs /gluster_bricks/data /gluster_bricks/data_extend1/
1.6T    /gluster_bricks/data
581G    /gluster_bricks/data_extend1/

Note: I also have a second data domain. I have just finished removing what could be removed (exported as OVA first) and moving to the second domain what needs to stay. I will stop and remove the gluster volume, and I plan to recreate the bricks from scratch and rebuild a new gluster volume with only 3 replicated bricks, each brick being a logical volume spanning the 2 physical disks.

--Olivier
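A minimal sketch of the brick layout Olivier plans (one logical volume per server spanning both physical disks); the device names /dev/sdb and /dev/sdc and the VG/LV names here are illustrative assumptions, not taken from the thread:

    # create one volume group from both physical disks, then a single LV for the brick
    pvcreate /dev/sdb /dev/sdc
    vgcreate gluster_vg_data /dev/sdb /dev/sdc
    lvcreate -l 100%FREE -n data gluster_vg_data
    mkfs.xfs -i size=512 /dev/gluster_vg_data/data
    mkdir -p /gluster_bricks/data
    mount /dev/gluster_vg_data/data /gluster_bricks/data

A linear LV like this concatenates the two disks into a single brick filesystem, which avoids the uneven 2 x 3 distribution, but it also means losing either disk takes down the whole brick on that server.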

Hi Olivier,

It looks like the problem is outside gluster. The issue is probably that your filesystem contains some data (1.6T), likely located both in the brick directory and in the directory above it. If you can delete that data, it should help.

Thanks,
Maxim
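A sketch of how Maxim's hypothesis could be checked on one host; -x keeps du on the brick filesystem, and the paths are the ones from the outputs above:

    du -xhs /gluster_bricks/data
    du -xhs /gluster_bricks/data/data
    ls -la /gluster_bricks/data

If the first number is noticeably larger than the second, data is sitting on the brick filesystem outside the brick directory, where gluster never sees it but df still counts it.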

Thanks! I solved the problem by moving the VMs to another datastore, erasing this datastore, destroying the gluster volume, and running wipefs on the bricks. I then recreated the filesystems on the bricks and recreated the gluster volume, and now it's OK.
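For anyone following the same path, a hedged sketch of that rebuild sequence (the volume name matches the thread, but the LV path is only an example and none of these commands are taken from Olivier's actual session):

    gluster volume stop data
    gluster volume delete data
    # on every host:
    umount /gluster_bricks/data
    wipefs -a /dev/gluster_vg_sda/data          # example LV path
    mkfs.xfs -i size=512 /dev/gluster_vg_sda/data
    mount /dev/gluster_vg_sda/data /gluster_bricks/data
    mkdir /gluster_bricks/data/data
    # then from one host:
    gluster volume create data replica 3 \
        ovirt1.pmmg.priv:/gluster_bricks/data/data \
        ovirt2.pmmg.priv:/gluster_bricks/data/data \
        ovirt3.pmmg.priv:/gluster_bricks/data/data
    gluster volume start data

In an oVirt setup the usual virtualization options from the volume info above (storage.owner-uid/gid 36, sharding, quorum settings, and so on) would also need to be reapplied before attaching the volume as a storage domain again.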
participants (2)
- max.lyapin@gmail.com
- Olivier