Add gluster storage domain incomplete documentation

Hello, I think both the oVirt and the Red Hat official docs are quite confusing and misaligned regarding the steps for adding a new Gluster storage domain to an existing installation. Below I explain my reasons for both.

In the meantime, suppose I have a single-host HCI with self-hosted engine (but it could be useful to have hints for multi-host setups too) and the 3 storage domains configured during install (engine and data on one disk, vmstore on another). The system initially had 3 disks: the first used for the ovirt-node-ng system, the second for the engine and data initial Gluster storage domains, the third for the vmstore one.

Now I add a fourth disk, say 4 TB in size, and I would like to create a new Gluster storage domain on it. What are the suggested steps?

BTW: after booting, the new disk has been automatically included in multipath:

eui.01000000010000005cd2e4b5e7db4d51 dm-6 NVME,INTEL SSDPEDKX040T7
size=3.6T features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=0 status=active
  `- 1:0:1:0 nvme1n1 259:1 active undef running

Coming back to the docs:

- oVirt documentation

I base my considerations on the page https://www.ovirt.org/documentation/admin-guide/chap-Working_with_Gluster_St...

1) The picture on that page is still based on the 3.6 GUI, so at first sight the page feels out of date.

2) I would put the section "Creating a Storage Volume" before "Attaching a Gluster Storage Volume as a Storage Domain", not the other way around as it is now.

3) In "Creating a Storage Volume" there is the note that "You must create brick directories or mountpoints before you can add them to volumes." In my opinion this sentence is not very clear (see also the details at the end)... What is a user expected to have done on the hypervisors? Creation of directories or of file systems (and of which type: is XFS or ext4 preferred)? Perhaps an example of commands would be useful.

4) In the volume creation workflow, item 7 says: "Click the Add Bricks button to select bricks to add to the volume. Bricks must be created externally on the Gluster Storage nodes." I would expect an indication of the commands to be run instead... This implies knowledge of Gluster that the GUI functionality is supposed to hide, but apparently not completely... Going to the advanced details of my existing bricks I can see: "xfs rw,seclabel,noatime,nodiratime,attr2,inode64,logbsize=128k,sunit=256,swidth=512,noquota". There is also the fstab part of creating bricks...

5) In the volume creation workflow, item 9: it is not clear whether the default "*" value for access is the recommended one or not. I presume not.

- RHV documentation

https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.3/htm... 4.3 Administrator Guide, downloaded today, Chapter 8, section 8.6 "PREPARING AND ADDING RED HAT GLUSTER STORAGE", refers to Red Hat Gluster Storage version 3.4. I think it should be 3.5 instead, because the 3.4 docs in turn reference RHV 4.1, not 4.3, while "Configuring Red Hat Virtualization with Red Hat Gluster Storage" version 3.5 correctly references RHV 4.3.

Anyway, this mix of product documentation is not optimal in my opinion. I would include the Gluster part directly inside the RHV docs, without jumping between the two complete document sets, which risks misalignment over time. Also, the referenced Gluster Storage guide covers the part related to volumes but not to bricks... Possibly the correct references for the brick part, which is the one missing in the webadmin GUI, are https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.5/ht... and/or https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.5/ht... ?

Thanks for reading, and I hope to get a commands' workflow to configure this new disk as a new Gluster storage domain. The guides seem to oversimplify the process of creating bricks, which according to the documentation are actually XFS filesystems mounted over thin-pool based logical volumes residing on top of volume groups created with particular alignment settings...

Gianluca

On April 6, 2020 5:29:10 PM GMT+03:00, Gianluca Cecchi <gianluca.cecchi@gmail.com> wrote:
[snip]
Hi Gianluca,

Actually the situation is just like Ceph & OpenStack... You have OpenStack (in our case oVirt) that can manage basic tasks with the storage, but many administrators do not rely on the UI for complex tasks. In order to properly run an HCI, some Gluster knowledge is "mandatory" (personal opinion - you will never find that word anywhere :) ). In your case, you need to:

1. Blacklist the disks in multipath.conf. As the file is managed by VDSM, you need to put the special comment '# VDSM PRIVATE' (without the quotes!) in it to prevent VDSM from modifying it. I don't know if this is the best approach, yet it works for me.
2. Create a VDO (skip if not needed).
3. Create a PV from the VDO/disk/array.
4. Either add it to an existing VG or create a new one.
5. Create a thin LVM pool and a thin LV (if you want gluster-level snapshots). I use this approach to snapshot my HostedEngine VM. For details, I can tell you in a separate thread.
6. Create an XFS filesystem and define it either in fstab or in a systemd unit (the second option is better as you can define dependencies; see the sketch right after this message). I would recommend these mount options: noatime,nodiratime,context="system_u:object_r:glusterd_brick_t:s0" - keep the quotes and mount the brick on all nodes. I assumed that you are adding bricks on the same HCI nodes, but that could be a bad assumption. If not, you will need to extend the storage pool and then create your volume.
7. Last, create a storage domain via the API or the UI.

In the end you can use storage migration (if you are not using qemu's libgfapi integration) to utilize the new storage without any downtime.

P.S.: Documentation contributions are welcome, and if I have some time I will be able to add some of my experience :)

Best Regards,
Strahil Nikolov
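To illustrate the systemd-unit option mentioned in step 6 above, here is a minimal sketch of what a brick mount unit could look like. All names are hypothetical (a made-up LV gluster_lv_newbrick in a made-up VG gluster_vg_newdisk, mounted on /gluster_bricks/newbrick); adjust them to your own layout.

# /etc/systemd/system/gluster_bricks-newbrick.mount
# (the unit file name must be the systemd-escaped form of the mount path)
[Unit]
Description=Gluster brick for the new storage domain
Before=glusterd.service

[Mount]
What=/dev/mapper/gluster_vg_newdisk-gluster_lv_newbrick
Where=/gluster_bricks/newbrick
Type=xfs
Options=inode64,noatime,nodiratime,context="system_u:object_r:glusterd_brick_t:s0"

[Install]
WantedBy=multi-user.target

# activate it:
#   systemctl daemon-reload
#   systemctl enable --now gluster_bricks-newbrick.mount

The Before=glusterd.service ordering is the dependency advantage Strahil refers to: at boot the brick gets mounted before glusterd starts.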

On Mon, Apr 6, 2020 at 7:15 PM Strahil Nikolov <hunter86_bg@yahoo.com> wrote: [snip]
Hi Gianluca,
Actually the situation is just like CEPH & Openstack... You have Openstack (in our case oVirt) that can manage basic tasks with the storage, but many administrators do not rely on the UI for complex tasks.
Hi Strahil, thanks for your answers. Actually what is missing here are the basic steps of Gluster brick setup, while only the more complex operations are apparently exposed at GUI level...
In order to properly run an HCI, some Gluster knowledge is "mandatory" (personal opinion - you will never find that word anywhere :) ). In your case, you need to:
1. Blacklist the disks in multipath.conf. As the file is managed by VDSM, you need to put the special comment '# VDSM PRIVATE' (without the quotes!) in it to prevent VDSM from modifying it. I don't know if this is the best approach, yet it works for me.
Actually when you complete the initial supported GUI-based HCI setup, it doesn't blacklist anything in multipath.conf and it doesn't mark the file as private, so I would like to avoid that; I don't think it should be necessary. The only blacklist section inside the setup-generated file is:

blacklist {
    protocol "(scsi:adt|scsi:sbp)"
}

In the HCI single host setup you give the GUI the whole disks' names: in my case they were /dev/nvme0n1 (for the engine and data bricks/volumes) and /dev/nvme1n1 (for vmstore), all as JBOD. And the final configuration has an approach similar to yours, resembling the Red Hat Gluster Storage link I sent: https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.5/ht...
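For reference, if one did want to follow Strahil's blacklist approach for the new disk anyway, the change to /etc/multipath.conf might look like the sketch below. This is only an illustration of his suggestion, not what the HCI setup generates; the WWID is the one reported for the new 4 TB disk in the first message, and the existing disks are deliberately left alone here.

# /etc/multipath.conf
# add this comment near the top, below the existing '# VDSM REVISION ...' header,
# so that VDSM stops rewriting the file:
# VDSM PRIVATE

blacklist {
    protocol "(scsi:adt|scsi:sbp)"
    # local NVMe disk intended for the new Gluster brick
    wwid "eui.01000000010000005cd2e4b5e7db4d51"
}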
2. Create a VDO (skip if not needed)
I didn't check it during the initial setup, so it was skipped.

3. Create a PV from the VDO/disk/array.
Yes, the setup created PVs, but not on /dev/nvme0n1 and /dev/nvme1n1: it created them on their multipath side of the moon... On my system, after setup, I have this for my two disks dedicated to Gluster volumes:

[root@ovirt ~]# multipath -l
nvme.8086-50484b53373530353031325233373541474e-494e54454c205353 dm-5 NVME,INTEL SSDPED1K375GA
size=349G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=0 status=active
  `- 0:0:1:0 nvme0n1 259:0 active undef running
eui.01000000010000005cd2e4e359284f51 dm-7 NVME,INTEL SSDPE2KX010T7
size=932G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=0 status=active
  `- 2:0:1:0 nvme1n1 259:2 active undef running
[root@ovirt ~]#

So there are two possibilities:
- the setup workflow has done something wrong and it should have blacklisted the disks
- it is correct that the multipath devices are in place and the PVs are created on top of them

I don't know which one is correct. Can anyone answer what the expected configuration is after the initial setup?

The Gluster Storage guide says that in my case I should run:

pvcreate --dataalignment 256K multipath_device

NOTE: 256K is the value specified in the Gluster Storage guide for JBOD.

It seems confirmed by the existing PVs:

[root@ovirt ~]# pvs -o +pe_start /dev/mapper/eui.01000000010000005cd2e4e359284f51
  PV                                               VG                 Fmt  Attr PSize   PFree 1st PE
  /dev/mapper/eui.01000000010000005cd2e4e359284f51 gluster_vg_nvme1n1 lvm2 a--  931.51g    0  256.00k
[root@ovirt ~]#

4. Either add it to an existing VG or create a new one.
Yes, the setup created two VGs:
- gluster_vg_nvme0n1 on the first multipath device
- gluster_vg_nvme1n1 on the second multipath device

Just to confirm, I re-created a very similar setup (the only difference is that I used a single disk for all 3 Gluster volumes and one disk for the operating system) inside this oVirt installation as a nested environment. Here the disk to configure for Gluster in the HCI single host setup is /dev/sdb, and the final result after reboot is as follows (note the "n" for "nested" in front of the host name, which is not the same host as before):

[root@novirt ~]# multipath -l
0QEMU_QEMU_HARDDISK_4daa576b-2020-4747-b dm-5 QEMU,QEMU HARDDISK
size=150G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=0 status=active
  `- 2:0:0:1 sdb 8:16 active undef running
[root@novirt ~]#

[root@novirt ~]# vgs
  VG             #PV #LV #SN Attr   VSize    VFree
  gluster_vg_sdb   1   4   0 wz--n- <150.00g      0
  onn_novirt       1  11   0 wz--n-  <99.00g <17.88g
[root@novirt ~]#

[root@novirt ~]# lvs gluster_vg_sdb
  LV                              VG             Attr       LSize   Pool                            Origin Data%  Meta%  Move Log Cpy%Sync Convert
  gluster_lv_data                 gluster_vg_sdb Vwi-aot--- 500.00g gluster_thinpool_gluster_vg_sdb         0.05
  gluster_lv_engine               gluster_vg_sdb -wi-ao---- 100.00g
  gluster_lv_vmstore              gluster_vg_sdb Vwi-aot--- 500.00g gluster_thinpool_gluster_vg_sdb         0.56
  gluster_thinpool_gluster_vg_sdb gluster_vg_sdb twi-aot--- <44.00g                                         6.98   0.54
[root@novirt ~]#

So this confirms that the setup creates PVs on top of a multipath device (even if composed of only one path). I don't know whether the approach would have been different with a 3-host HCI setup... anyone chiming in?

So I should simply execute, for JBOD (more considerations on PE size for RAID scenarios are in the Red Hat Gluster Storage admin guide):

vgcreate VG_NAME multipath_device

5. Create a thin LVM pool and a thin LV (if you want gluster-level snapshots). I use this approach to snapshot my HostedEngine VM. For details, I can tell you in a separate thread.
It seems the setup also creates thin LVs (except for the engine domain, as the manual says). Coming back to the physical environment and concentrating on the vmstore volume, I indeed have:

[root@ovirt ~]# lvs gluster_vg_nvme1n1
  LV                                  VG                 Attr       LSize   Pool                                Origin Data%  Meta%  Move Log Cpy%Sync Convert
  gluster_lv_vmstore                  gluster_vg_nvme1n1 Vwi-aot--- 930.00g gluster_thinpool_gluster_vg_nvme1n1        48.67
  gluster_thinpool_gluster_vg_nvme1n1 gluster_vg_nvme1n1 twi-aot--- 921.51g                                            49.12  1.46
[root@ovirt ~]#

In my case it seems I can execute:

lvcreate --thin VG_NAME/POOL_NAME --extents 100%FREE --chunksize CHUNKSIZE --poolmetadatasize METASIZE --zero n

The docs recommend creating the pool metadata device with the maximum possible size, which is 16 GiB. As my disk is 4 TB, I think that is OK, for maximum safety. Also, for JBOD the chunk size has to be 256K. So my commands are:

lvcreate --thin VG_NAME/POOL_NAME --extents 100%FREE --chunksize 256k --poolmetadatasize 16G --zero n

and, supposing an overprovisioning of 25%:

lvcreate --thin --name LV_NAME --virtualsize 5T VG_NAME/POOL_NAME
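Putting the PV, VG and thin LV steps together, a possible sequence for the new 4 TB disk could look like the sketch below. All names (gluster_vg_newdisk, gluster_thinpool_newdisk, gluster_lv_newbrick) are made up, the multipath device path is the one the new disk got in the first message, and the alignment/chunk/metadata values simply follow the JBOD reasoning above, not any official recommendation.

NEWDEV=/dev/mapper/eui.01000000010000005cd2e4b5e7db4d51

# PV with the 256K data alignment suggested for JBOD
pvcreate --dataalignment 256K $NEWDEV

# dedicated VG for the new disk
vgcreate gluster_vg_newdisk $NEWDEV

# thin pool over the whole VG, 16 GiB metadata, 256K chunk size for JBOD
lvcreate --thin gluster_vg_newdisk/gluster_thinpool_newdisk \
         --extents 100%FREE --chunksize 256k --poolmetadatasize 16G --zero n

# thin LV for the brick, overprovisioned to 5T on a 4 TB disk (~25%)
lvcreate --thin --name gluster_lv_newbrick --virtualsize 5T \
         gluster_vg_newdisk/gluster_thinpool_newdisk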
6. Create an XFS filesystem and define it either in fstab or in a systemd unit (the second option is better as you can define dependencies). I would recommend you to use these options:

noatime,nodiratime,context="system_u:object_r:glusterd_brick_t:s0"

Keep the quotes and mount the brick on all nodes.
Going to my system, I see this for the 3 existing bricks in fstab:

UUID=fa5dd3cb-aeef-470e-b982-432ac896d87a /gluster_bricks/engine xfs inode64,noatime,nodiratime 0 0
UUID=43bed7de-66b1-491d-8055-5b4ef9b0482f /gluster_bricks/data xfs inode64,noatime,nodiratime 0 0
UUID=b81a491c-0a4c-4c11-89d8-9db7fe82888e /gluster_bricks/vmstore xfs inode64,noatime,nodiratime 0 0

and, for the biggest one (I realign to new lines for readability):

[root@ovirt ~]# xfs_admin -lu /dev/mapper/gluster_vg_nvme1n1-gluster_lv_vmstore
label = ""
UUID = b81a491c-0a4c-4c11-89d8-9db7fe82888e
[root@ovirt ~]#

[root@ovirt ~]# xfs_info /gluster_bricks/vmstore
meta-data /dev/mapper/gluster_vg_nvme1n1-gluster_lv_vmstore isize=512 agcount=32, agsize=7618528 blks sectsz=512 attr=2, projid32bit=1 crc=1 finobt=0 spinodes=0
data      bsize=4096 blocks=243792896, imaxpct=25 sunit=32 swidth=64 blks
naming    version 2 bsize=8192 ascii-ci=0 ftype=1
log       internal bsize=4096 blocks=119040, version=2 sectsz=512 sunit=32 blks, lazy-count=1
realtime  none extsz=4096 blocks=0, rtextents=0
[root@ovirt ~]#

This confirms the recommendations in the Gluster admin guide:
- inode size of 512 bytes
- for RAID 10 and JBOD, the -d su=<>,sw=<> option can be omitted; by default, XFS will use the thin-pool chunk size and other parameters to make layout decisions
- logical block size of 8192 for the directory area

So the final command is:

mkfs.xfs -i size=512 -n size=8192 /dev/VG_NAME/LV_NAME

and then I get its UUID with the xfs_admin command above, to put in fstab.
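Continuing the hypothetical example from above (same made-up names gluster_vg_newdisk, gluster_lv_newbrick, /gluster_bricks/newbrick), the brick filesystem and its mount could then be created roughly like this. Treat it as a sketch consolidating the thread, not official guidance.

mkfs.xfs -i size=512 -n size=8192 /dev/mapper/gluster_vg_newdisk-gluster_lv_newbrick
mkdir -p /gluster_bricks/newbrick

# grab the UUID for fstab (or use the systemd mount unit sketched earlier instead)
xfs_admin -lu /dev/mapper/gluster_vg_newdisk-gluster_lv_newbrick

# /etc/fstab entry, matching the options of the existing bricks
# (replace the placeholder with the UUID printed by xfs_admin):
# UUID=<uuid-from-xfs_admin> /gluster_bricks/newbrick xfs inode64,noatime,nodiratime 0 0

mount /gluster_bricks/newbrick
mkdir -p /gluster_bricks/newbrick/newbrick   # directory that will actually be used as the brick

# SELinux: either add the context= mount option Strahil suggested,
# or label the path permanently:
semanage fcontext -a -t glusterd_brick_t "/gluster_bricks/newbrick(/.*)?"
restorecon -Rv /gluster_bricks/newbrick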
I assumed that you are adding bricks on the same HCI nodes, but that could be a bad assumption. If not, you will need to extend the storage pool and then create your volume.
7. Last, create a storage domain via the API or the UI.
OK, this should be the (hopefully easy) part in the webadmin GUI. Let's see.
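For completeness, the Gluster volume itself still has to be created and tuned before the storage domain can be attached. On a single-host HCI that part could look roughly like the sketch below; the volume name "newvolume", the brick path and the host FQDN are assumptions, and the option group / ownership settings mirror what is commonly recommended for oVirt data volumes rather than anything taken from this thread.

# single-brick (distribute) volume on this host; a replica volume would list more bricks
gluster volume create newvolume ovirt.example.com:/gluster_bricks/newbrick/newbrick

# apply the virtualization tuning profile and the ownership vdsm expects (uid/gid 36)
gluster volume set newvolume group virt
gluster volume set newvolume storage.owner-uid 36
gluster volume set newvolume storage.owner-gid 36

gluster volume start newvolume

The started volume can then be attached from the Admin Portal (Storage -> Domains -> New Domain, Storage Type: GlusterFS, path ovirt.example.com:/newvolume) or via the REST API.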
In the end you can use storage migration (if you are not using qemu's libgfapi integration) to utilize the new storage without any downtime.
P.S.: Documentation contributions are welcomed and if I have some time - I will be able to add some of my experience :)
Best Regards, Strahil Nikolov
Thank you very much Strahil for your inputs. I'm going to test on my nested oVirt first, adding a disk to it, and then on the physical one. Comments welcome.

On the physical host I then have a device naming problem, because the newly inserted disk has taken the name of a previous one and, strangely, there is a conflict when creating the VG, even though LVM2 entities have their own UUIDs; but for this particular problem I'm going to open a separate thread.

Gianluca

On April 7, 2020 2:21:53 AM GMT+03:00, Gianluca Cecchi <gianluca.cecchi@gmail.com> wrote:
[snip]
Hey Gianluca,

Let me clarify the multipath story. In your case we have a single path (because we don't use a SAN) and it is not normal to keep local disks in multipath.conf... Many monitoring scripts would raise an alert in such a case, so best practice is to blacklist local devices. I'm not sure if this can be done from the engine's UI (blacklisting local disks), but it's worth checking.

About the PV stuff: you have to use the multipath device, if it exists, as it is one layer above the SCSI devices. In your case, you won't be able to use the plain block device, even if you wish. But as I said, it's crazy to keep local devices in multipath.

Best Regards,
Strahil Nikolov
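If the blacklist route sketched earlier in the thread is taken for the new disk, the result could be verified with something like the following; these are standard multipath-tools commands, shown only as a sketch.

multipathd reconfigure                               # or: systemctl reload multipathd
multipath -f eui.01000000010000005cd2e4b5e7db4d51    # flush the map of the new 4 TB disk, if still present
multipath -ll                                        # the blacklisted disk should no longer be listed
# pvcreate can then target the plain /dev/nvmeXn1 device instead of a /dev/mapper/... map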