Gluster problems with new disk: device name change and overlap


On April 7, 2020 10:45:18 AM GMT+03:00, Gianluca Cecchi <gianluca.cecchi@gmail.com> wrote:
Hi, I have configured a single-host HCI environment through the GUI wizard in 4.3.9. The initial setup has this layout of disks, as seen by the operating system:
/dev/sda --> ovirt-node-ng OS
/dev/nvme0n1 --> gluster, engine and data volumes
/dev/nvme1n1 --> gluster, vmstore volume
So far so good and all is OK. I notice that, even with single-path internal disks, oVirt ends up configuring the Gluster disks as multipath devices, with the LVM2 PV structure on top of the multipath devices. Is this for low-level "code optimization", or what is the rationale, given that with Gluster you normally use local disks and therefore a single path? The multipath structure generated:
[root@ovirt ~]# multipath -l
nvme.8086-50484b53373530353031325233373541474e-494e54454c205353 dm-5 NVME,INTEL SSDPED1K375GA
size=349G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=0 status=active
  `- 0:0:1:0 nvme0n1 259:0 active undef running
eui.01000000010000005cd2e4b5e7db4d51 dm-6 NVME,INTEL SSDPEDKX040T7
size=932G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=0 status=active
  `- 2:0:1:0 nvme1n1 259:2 active undef running
[root@ovirt ~]#
Anyway, on top of the multipath devices:
On /dev/nvme0n1: the gluster_vg_nvme0n1 volume group with the gluster_lv_data and gluster_lv_engine logical volumes
On /dev/nvme1n1: the gluster_vg_nvme1n1 volume group with the gluster_lv_vmstore logical volume
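Which physical device each of those LVs actually sits on can be confirmed with the standard LVM "devices" report column, for example:
lvs -a -o +devices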
The problem arises when I add another NVMe disk: because of the PCI slot it occupies, it apparently always gets enumerated ahead of the previous /dev/nvme1n1 disk and so takes over its name.
After booting the node:
old nvme0n1 --> name unchanged
old nvme1n1 --> becomes nvme2n1
new disk --> gets the name nvme1n1
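By the way, to map those kernel names back to the physical disks regardless of enumeration order, I suppose something like this works (assuming the nvme-cli package is installed; the /dev/disk/by-id symlinks carry the same serial information):
nvme list
ls -l /dev/disk/by-id/ | grep nvme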
From a functional point of view I have no problems, apart from the LVM warnings I include below, also because the XFS entries in fstab use UUIDs:
UUID=fa5dd3cb-aeef-470e-b982-432ac896d87a /gluster_bricks/engine xfs inode64,noatime,nodiratime 0 0
UUID=43bed7de-66b1-491d-8055-5b4ef9b0482f /gluster_bricks/data xfs inode64,noatime,nodiratime 0 0
UUID=b81a491c-0a4c-4c11-89d8-9db7fe82888e /gluster_bricks/vmstore xfs inode64,noatime,nodiratime 0 0
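The UUID-to-brick mapping can be double-checked with blkid against the LV paths, for example (path as per the VG/LV names above):
blkid /dev/gluster_vg_nvme1n1/gluster_lv_vmstore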
The lvs command gives:
[root@ovirt ~]# lvs
  WARNING: Not using device /dev/nvme0n1 for PV eYfuXw-yaPd-cMUE-0dnA-tVON-uZ9g-5x4BDp.
  WARNING: Not using device /dev/nvme2n1 for PV O43LFq-46Gc-RRgS-Sk1F-5mFZ-Qw4n-oxXgJl.
  WARNING: PV eYfuXw-yaPd-cMUE-0dnA-tVON-uZ9g-5x4BDp prefers device /dev/mapper/nvme.8086-50484b53373530353031325233373541474e-494e54454c20535344504544314b3337354741-00000001 because device is used by LV.
  WARNING: PV O43LFq-46Gc-RRgS-Sk1F-5mFZ-Qw4n-oxXgJl prefers device /dev/mapper/eui.01000000010000005cd2e4e359284f51 because device is used by LV.
  LV  VG  Attr  LSize  Pool  Origin ...
Or, for the multipath device of the old nvme1n1 disk (now nvme2n1):
[root@ovirt ~]# pvdisplay /dev/mapper/eui.01000000010000005cd2e4e359284f51
  WARNING: Not using device /dev/nvme0n1 for PV eYfuXw-yaPd-cMUE-0dnA-tVON-uZ9g-5x4BDp.
  WARNING: Not using device /dev/nvme2n1 for PV O43LFq-46Gc-RRgS-Sk1F-5mFZ-Qw4n-oxXgJl.
  WARNING: PV eYfuXw-yaPd-cMUE-0dnA-tVON-uZ9g-5x4BDp prefers device /dev/mapper/nvme.8086-50484b53373530353031325233373541474e-494e54454c20535344504544314b3337354741-00000001 because device is used by LV.
  WARNING: PV O43LFq-46Gc-RRgS-Sk1F-5mFZ-Qw4n-oxXgJl prefers device /dev/mapper/eui.01000000010000005cd2e4e359284f51 because device is used by LV.
  --- Physical volume ---
  PV Name               /dev/mapper/eui.01000000010000005cd2e4e359284f51
  VG Name               gluster_vg_nvme1n1
  PV Size               931.51 GiB / not usable 1.71 MiB
  Allocatable           yes (but full)
  PE Size               4.00 MiB
  Total PE              238467
  Free PE               0
  Allocated PE          238467
  PV UUID               O43LFq-46Gc-RRgS-Sk1F-5mFZ-Qw4n-oxXgJl
[root@ovirt ~]#
I'm able to create a PV on top of the new multipath device detected by the system (note the nvme1n1 name of the underlying disk):
eui.01000000010000005cd2e4b5e7db4d51 dm-6 NVME,INTEL SSDPEDKX040T7
size=3.6T features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=0 status=active
  `- 1:0:1:0 nvme1n1 259:1 active undef running
[root@ovirt ~]# pvcreate --dataalignment 256K /dev/mapper/eui.01000000010000005cd2e4b5e7db4d51
  WARNING: Not using device /dev/nvme0n1 for PV eYfuXw-yaPd-cMUE-0dnA-tVON-uZ9g-5x4BDp.
  WARNING: Not using device /dev/nvme2n1 for PV O43LFq-46Gc-RRgS-Sk1F-5mFZ-Qw4n-oxXgJl.
  WARNING: PV eYfuXw-yaPd-cMUE-0dnA-tVON-uZ9g-5x4BDp prefers device /dev/mapper/nvme.8086-50484b53373530353031325233373541474e-494e54454c20535344504544314b3337354741-00000001 because device is used by LV.
  WARNING: PV O43LFq-46Gc-RRgS-Sk1F-5mFZ-Qw4n-oxXgJl prefers device /dev/mapper/eui.01000000010000005cd2e4e359284f51 because device is used by LV.
  Physical volume "/dev/mapper/eui.01000000010000005cd2e4b5e7db4d51" successfully created.
[root@ovirt ~]#
But then I'm unable to create a VG on top of it:
[root@ovirt ~]# vgcreate gluster_vg_4t /dev/mapper/eui.01000000010000005cd2e4b5e7db4d51
  WARNING: Not using device /dev/nvme0n1 for PV eYfuXw-yaPd-cMUE-0dnA-tVON-uZ9g-5x4BDp.
  WARNING: Not using device /dev/nvme1n1 for PV 56ON99-hFFP-cGpZ-g4MX-GfjW-jXeE-fKZVG9.
  WARNING: Not using device /dev/nvme2n1 for PV O43LFq-46Gc-RRgS-Sk1F-5mFZ-Qw4n-oxXgJl.
  WARNING: PV eYfuXw-yaPd-cMUE-0dnA-tVON-uZ9g-5x4BDp prefers device /dev/mapper/nvme.8086-50484b53373530353031325233373541474e-494e54454c20535344504544314b3337354741-00000001 because device is used by LV.
  WARNING: PV 56ON99-hFFP-cGpZ-g4MX-GfjW-jXeE-fKZVG9 prefers device /dev/mapper/eui.01000000010000005cd2e4b5e7db4d51 because device is in dm subsystem.
  WARNING: PV O43LFq-46Gc-RRgS-Sk1F-5mFZ-Qw4n-oxXgJl prefers device /dev/mapper/eui.01000000010000005cd2e4e359284f51 because device is used by LV.
  WARNING: Not using device /dev/nvme0n1 for PV eYfuXw-yaPd-cMUE-0dnA-tVON-uZ9g-5x4BDp.
  WARNING: Not using device /dev/nvme1n1 for PV 56ON99-hFFP-cGpZ-g4MX-GfjW-jXeE-fKZVG9.
  WARNING: Not using device /dev/nvme2n1 for PV O43LFq-46Gc-RRgS-Sk1F-5mFZ-Qw4n-oxXgJl.
  WARNING: PV eYfuXw-yaPd-cMUE-0dnA-tVON-uZ9g-5x4BDp prefers device /dev/mapper/nvme.8086-50484b53373530353031325233373541474e-494e54454c20535344504544314b3337354741-00000001 because of previous preference.
  WARNING: PV 56ON99-hFFP-cGpZ-g4MX-GfjW-jXeE-fKZVG9 prefers device /dev/mapper/eui.01000000010000005cd2e4b5e7db4d51 because of previous preference.
  WARNING: PV O43LFq-46Gc-RRgS-Sk1F-5mFZ-Qw4n-oxXgJl prefers device /dev/mapper/eui.01000000010000005cd2e4e359284f51 because of previous preference.
  Cannot use device /dev/mapper/eui.01000000010000005cd2e4b5e7db4d51 with duplicates.
[root@ovirt ~]#
The same happens with the "-f" option.
I suspect I can solve the problem by filtering out the /dev/nvme* devices in lvm.conf, but I'm not sure. The OS disk is seen as sda, so it should not be affected by this. Something like this:
filter = [ "r|/dev/nvme|", "a|.*/|" ]
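If I understand the LVM docs correctly, the effect of such a filter could be tested without touching lvm.conf by passing it on the command line, something like this (untested):
pvs --config 'devices { filter = [ "r|/dev/nvme|", "a|.*/|" ] }'
If the nvme* duplicates disappear from that output, the filter should be the right one to make persistent.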
And I am also not sure whether I have to rebuild the initrd in this case, and if so, what the exact sequence of commands to execute would be.
Any suggestions?
Thanks in advance, Gianluca
The simplest way would be to say that 'blacklisting everything in multipath.conf' will solve your problems. In reality it is a little bit more complicated. You have got some options in comparison with other OSes (Windows) :)
1. The /dev/nvme*** names are not persistent, so forget about those. You can create udev rules for your NVMes in order to guarantee their names. For example, you can use the following in order to find the serial:
/lib/udev/scsi_id -g -u -x -d /dev/nvme0n1
Then you can use:
ATTRS{serial}=="some string"
Note: '=' is assignment, while '==' means it is equal.
2. You can tell LVM to use preferred names like /dev/disk/by-id/dm-uuid-mpath-<WWID> (which is the persistent name of the mpath device). If that doesn't work, you can just filter out everything with nvme like this: 'r|/dev/nvme|'
I would go with udev rules, but I've used LVM preferred names also.
If you change the LVM config, make a backup of your working initramfs and then use:
dracut -f
If it boots (after a reboot) without issues, you can rebuild all images via:
dracut -f --regenerate-all
Best Regards,
Strahil Nikolov
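As an illustration of the udev-rule approach, a minimal sketch; the serial string and the symlink name below are placeholders and would have to be replaced with the real values taken from the scsi_id output:
# /etc/udev/rules.d/99-local-nvme.rules (example only)
KERNEL=="nvme?n1", ATTRS{serial}=="PLACEHOLDER_SERIAL", SYMLINK+="nvme-vmstore"
After reloading the rules (udevadm control --reload; udevadm trigger) or rebooting, the disk should also be reachable as /dev/nvme-vmstore regardless of enumeration order.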

On Tue, Apr 7, 2020 at 12:22 PM Strahil Nikolov <hunter86_bg@yahoo.com> wrote:
The simplest way would be to say that 'blacklisting everything in multipath.conf' will solve your problems. In reality it is a little bit more complicated.
Interesting arguments, Strahil. To be dug into more on my part. In the meantime, the approach below seems to have solved all the problems.
Preamble: I was able to put the new disk in a different PCI slot so that the /dev/nvmeXX names remained consistent with the previous setup, but LVM still complained and I was again unable to create the VG on the PV. So my suspicion was confirmed: the nvme disks were not filtered out by LVM and created confusion.
- under /etc/lvm
[root@ovirt lvm]# diff lvm.conf lvm.conf.orig
142d141
< filter = [ "r|nvme|", "a|.*/|" ]
153d151
< global_filter = [ "r|nvme|", "a|.*/|" ]
[root@ovirt lvm]#
NOTE: the "filter" directive alone was not sufficient, even if in theory, from what I read, global_filter should only come into play when lvmetad is active, while on oVirt Node it is not... To be understood better...
- under /etc
I noticed that the OS disk was also picked up by multipath, so I blacklisted it, making the file private at the end...
[root@ovirt etc]# diff -u3 multipath.conf.orig multipath.conf
--- multipath.conf.orig 2020-04-07 16:25:12.148044435 +0200
+++ multipath.conf 2020-04-07 10:55:44.728734050 +0200
@@ -1,4 +1,5 @@
 # VDSM REVISION 1.8
+# VDSM PRIVATE
 
 # This file is managed by vdsm.
 #
@@ -164,6 +165,7 @@
 
 blacklist {
         protocol "(scsi:adt|scsi:sbp)"
+        wwid INTEL_SSDSCKKI256G8_PHLA835602TE256J
 }
 
 # Remove devices entries when overrides section is available.
[root@ovirt etc]#
- rebuild initramfs
cp /boot/$(imgbase layer --current)/initramfs-$(uname -r).img /root/
dracut -f /boot/$(imgbase layer --current)/initramfs-$(uname -r).img
cp -p /boot/$(imgbase layer --current)/initramfs-$(uname -r).img /boot/
After reboot I see the disks to be used for Gluster configured as multipath devices:
[root@ovirt etc]# multipath -l
nvme.8086-50484b53373530353031325233373541474e-494e54454c205353 dm-2 NVME,INTEL SSDPED1K375GA
size=349G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=0 status=active
  `- 0:0:1:0 nvme0n1 259:0 active undef running
eui.01000000010000005cd2e4b5e7db4d51 dm-0 NVME,INTEL SSDPEDKX040T7
size=3.6T features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=0 status=active
  `- 2:0:1:0 nvme2n1 259:1 active undef running
eui.01000000010000005cd2e4e359284f51 dm-1 NVME,INTEL SSDPE2KX010T7
size=932G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=0 status=active
  `- 1:0:1:0 nvme1n1 259:2 active undef running
[root@ovirt etc]#
And LVM commands don't complain about duplicates any more:
[root@ovirt etc]# pvs
  PV                                                                                                         VG                 Fmt  Attr PSize    PFree
  /dev/mapper/eui.01000000010000005cd2e4b5e7db4d51                                                           gluster_vg_4t      lvm2 a--    <3.64t      0
  /dev/mapper/eui.01000000010000005cd2e4e359284f51                                                           gluster_vg_nvme1n1 lvm2 a--   931.51g      0
  /dev/mapper/nvme.8086-50484b53373530353031325233373541474e-494e54454c20535344504544314b3337354741-00000001 gluster_vg_nvme0n1 lvm2 a--   349.32g      0
  /dev/sda2                                                                                                  onn                lvm2 a--  <228.40g <43.87g
[root@ovirt etc]#
And as you can see, I was now able to create the VG on top of the new PV on the 4 TB disk:
[root@ovirt etc]# vgs
  VG                 #PV #LV #SN Attr   VSize    VFree
  gluster_vg_4t        1   2   0 wz--n-   <3.64t      0
  gluster_vg_nvme0n1   1   3   0 wz--n-  349.32g      0
  gluster_vg_nvme1n1   1   2   0 wz--n-  931.51g      0
  onn                  1  11   0 wz--n- <228.40g <43.87g
[root@ovirt etc]#
[root@ovirt etc]# lvs gluster_vg_4t
  LV             VG            Attr       LSize  Pool    Origin Data%  Meta%  Move Log Cpy%Sync Convert
  gluster_lv_big gluster_vg_4t Vwi-aot--- <4.35t my_pool        0.05
  my_pool        gluster_vg_4t twi-aot--- <3.61t                0.05   0.14
[root@ovirt etc]#
Let's go hunting for the next problem... ;-)
Gianluca
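For the record, one way to double-check that the rebuilt initramfs really carries the new LVM filter is lsinitrd from the dracut package, along these lines (assuming the host-only image includes the local lvm.conf, as it normally does):
lsinitrd -f etc/lvm/lvm.conf /boot/$(imgbase layer --current)/initramfs-$(uname -r).img | grep filter
If both the filter and global_filter lines show up there, early-boot LVM should skip the nvme devices as well.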