Hi,
I have a weird problem with my GlusterFS + oVirt setup.
I have two data domains for my VMs in oVirt. The main one (data) is a Distributed-Replicate
volume with 2 x 3 = 6 bricks of ~1.4 TB each on 3 servers (2 bricks per server).
It is an extension of a previous 1 x 3 = 3 Replicate volume on the same servers: I added one
new disk (and one new brick) on each of these 3 servers.
My concern is the reported usage of this volume (almost 100%).
A df -h gives me:
df -h .
Filesystem Size Used Avail Use% Mounted on
ovirt1.pmmg.priv:/data  3.4T  3.3T   84G  98% /rhev/data-center/mnt/glusterSD/ovirt1.pmmg.priv:_data
but a du of the same mount point gives me:
du -hs /rhev/data-center/mnt/glusterSD/ovirt1.pmmg.priv:_data
1.4T /rhev/data-center/mnt/glusterSD/ovirt1.pmmg.priv:_data
and there is no hidden folder I could have missed in this mount point.
The second weird point is that when I delete a VM and its disks (main disk and
snapshots), I don't get any space back on this volume.
Does anyone have an idea about this problem?
Thanks
Here is the info of the gluster volume:
gluster volume info data
Volume Name: data
Type: Distributed-Replicate
Volume ID: 22b7942e-2952-40d4-ab92-64636e3ce409
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x 3 = 6
Transport-type: tcp
Bricks:
Brick1: ovirt1.pmmg.priv:/gluster_bricks/data/data
Brick2: ovirt2.pmmg.priv:/gluster_bricks/data/data
Brick3: ovirt3.pmmg.priv:/gluster_bricks/data/data
Brick4: ovirt1.pmmg.priv:/gluster_bricks/data_extend1/data
Brick5: ovirt2.pmmg.priv:/gluster_bricks/data_extend1/data
Brick6: ovirt3.pmmg.priv:/gluster_bricks/data_extend1/data
Options Reconfigured:
features.shard-block-size: 128MB
cluster.lookup-optimize: off
server.keepalive-count: 5
server.keepalive-interval: 2
server.keepalive-time: 10
server.tcp-user-timeout: 20
performance.client-io-threads: on
nfs.disable: on
storage.fips-mode-rchecksum: on
transport.address-family: inet
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.low-prio-threads: 32
network.remote-dio: disable
cluster.eager-lock: enable
cluster.quorum-type: auto
cluster.server-quorum-type: server
cluster.data-self-heal-algorithm: full
cluster.locking-scheme: granular
cluster.shd-max-threads: 8
cluster.shd-wait-qlength: 10000
features.shard: on
user.cifs: off
cluster.choose-local: off
client.event-threads: 4
server.event-threads: 4
storage.owner-uid: 36
storage.owner-gid: 36
network.ping-timeout: 20
performance.strict-o-direct: on
cluster.granular-entry-heal: enable
features.trash: off
Hi Olivier,
I have seen a similar issue when a disk was mounted incorrectly or a disk had an incorrect
size.
Can you please post here what df and du give you on each host? Please run on each
host:
df -h /gluster_bricks/data/data
du -hs /gluster_bricks/data/data
df /gluster_bricks/data_extend1/data
du /gluster_bricks/data_extend1/data
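Also, since features.shard is on for this volume, it may be worth checking (just a guess)
whether leftover shard fragments or deleted-but-still-open files are holding space on the
bricks, something like:
du -hs /gluster_bricks/data/data/.shard           # shard fragments of the VM images; not visible on the FUSE mount
du -hs /gluster_bricks/data_extend1/data/.shard
lsof +L1 | grep gluster_bricks                    # files already deleted but still held open by a process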
Thanks,
Maxim
Hi Maxim,
Thanks for your answer.
Here is the output:
First server:
[root@ovirt1 ~]# df -h /gluster_bricks/data*
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/gluster_vg_sda-data 1.6T 1.6T 35G 98% /gluster_bricks/data
/dev/mapper/gluster_vg_sdb-data 1.8T 593G 1.2T 34% /gluster_bricks/data_extend1
[root@ovirt1 ~]# du -hs /gluster_bricks/data*
1.6T /gluster_bricks/data
581G /gluster_bricks/data_extend1
Second server:
[root@ovirt2 ~]# df -h /gluster_bricks/data* | grep -v data2
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/gluster_vg_sdb-data 1.6T 1.6T 35G 98% /gluster_bricks/data
/dev/mapper/gluster_vg_sdc-data 1.8T 593G 1.2T 34% /gluster_bricks/data_extend1
[root@ovirt2 ~]# du -hs /gluster_bricks/data /gluster_bricks/data_extend1
1.6T /gluster_bricks/data
581G /gluster_bricks/data_extend1
Third one:
[root@ovirt3 ~]# df -h /gluster_bricks/data* | grep -v data2
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/gluster_vg_sda-data 1.6T 1.6T 35G 98% /gluster_bricks/data
/dev/mapper/gluster_vg_sdb_new-data 1.8T 593G 1.2T 34% /gluster_bricks/data_extend1
[root@ovirt3 ~]# du -hs /gluster_bricks/data /gluster_bricks/data_extend1/
1.6T /gluster_bricks/data
581G /gluster_bricks/data_extend1/
Note: I had a second data domain. I have just finished removing what could be
removed (exported as OVA beforehand) and moving to that second domain what had to stay. I
stopped and removed the gluster volume, and I plan to recreate the bricks from scratch and
build a new gluster volume with only 3 replicated bricks, each brick being a logical volume
spanning the 2 physical disks. A rough sketch of what I have in mind for each server follows
(device and mount-point names are just examples).
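pvcreate /dev/sda /dev/sdb                          # the 2 physical disks (example device names)
vgcreate gluster_vg_data /dev/sda /dev/sdb          # one VG spanning both disks
lvcreate -l 100%FREE -n data gluster_vg_data        # a single LV used as the brick
mkfs.xfs -i size=512 /dev/gluster_vg_data/data
mount /dev/gluster_vg_data/data /gluster_bricks/data
mkdir -p /gluster_bricks/data/data                  # brick directory inside the mount point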
--Olivier
Hi Olivier,
It looks like the problem is outside Gluster. The issue is probably that your filesystem
contains some data (1.6T) located both in the brick directory and in the directory above it.
If you can delete that data, it should help.
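For example, on one host you could compare (assuming the LV is mounted on
/gluster_bricks/data and the brick is the data subdirectory under it):
du -hx --max-depth=1 /gluster_bricks/data           # everything on that filesystem, one level deep
du -hs /gluster_bricks/data/data                    # the brick directory itself
If the first total is much larger than the second, the extra data lives outside the brick
directory and Gluster never sees it.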
Thanks,
Maxim
Thanks,
I solved the problem by moving the VMs to another datastore, erasing this datastore,
destroying the gluster volume, running wipefs on the bricks, recreating the filesystem on the
bricks, and recreating the gluster volume. Everything is OK now.
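Roughly, the sequence looked like this (the LV path is an example, not the exact commands;
the bricks were rebuilt as in the sketch from my previous mail before recreating the volume):
gluster volume stop data
gluster volume delete data
wipefs -a /dev/gluster_vg_data/data                 # example LV path; clears the old XFS signature
# ... recreate the filesystem and brick directory as in the earlier sketch, then:
gluster volume create data replica 3 \
    ovirt1.pmmg.priv:/gluster_bricks/data/data \
    ovirt2.pmmg.priv:/gluster_bricks/data/data \
    ovirt3.pmmg.priv:/gluster_bricks/data/data
gluster volume set data storage.owner-uid 36        # ownership for oVirt/vdsm, as in the old volume options
gluster volume set data storage.owner-gid 36
gluster volume start data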