On Mon, Feb 1, 2021 at 3:10 PM Nir Soffer <nsoffer@redhat.com> wrote:
[snip]

> So at the end I have the multipath.conf default file installed by vdsm (so without the # PRIVATE line)
> and this in /etc/multipath/conf.d/eql.conf
>
> devices {
>     device {
>         vendor                  "EQLOGIC"
>         product                 "100E-00"

Ben, why is this device missing from multipath builtin devices?

I have been using Equallogic storage since oVirt 3.6, so CentOS/RHEL 6, and as far as I remember it has never been inside the multipath builtin database.
But I don't know why.
The parameters I put in came from the latest EQL best practices, but those were last updated in the CentOS 7 time frame.
I would like to use the same parameters on CentOS 8 now and see if they work OK.
The PS line of EQL is somewhat deprecated (in the sense of no new features and so on..) but anyway still supported.
 

>         path_selector           "round-robin 0"
>         path_grouping_policy    multibus
>         path_checker            tur
>         rr_min_io_rq            10
>         rr_weight               priorities
>         failback                immediate
>         features                "0"

This is never needed, multipath generates this value.

Those were the recommended values from EQL.
The latest doc is dated April 2016, when 8 was not out yet:
http://downloads.dell.com/solutions/storage-solution-resources/(3199-CD-L)RHEL-PSseries-Configuration.pdf

 

Ben: please correct me if needed

>         no_path_retry            16

I don't think you need this, since you should inherit the value from vdsm's
multipath.conf, either from the "defaults" section or from the
"overrides" section.

You need to add no_path_retry here only if you want to use another value
and don't want to use the vdsm default.
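For example, the relevant parts of the vdsm-generated /etc/multipath.conf
look something like this (a sketch, exact values depend on the vdsm
version, check the file on your host):

defaults {
    # Queue I/O for no_path_retry * polling_interval seconds
    # when all paths are lost, then fail all I/O.
    polling_interval    5
    no_path_retry       16
    ...
}

overrides {
    # Wins over builtin and user device sections.
    no_path_retry       16
}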

You are right; I see the value of 16 both in defaults and overrides. But during my tests I also put it inside the device section, in doubt that it was not being picked up, hoping to see output similar to CentOS 7:

36090a0c8d04f21111fc4251c7c08d0a3 dm-14 EQLOGIC ,100E-00        
size=2.4T features='1 queue_if_no_path' hwhandler='0' wp=rw

where you can see the hwhandler='0'

Originally I remember the default value for no_path_retry was 4, but probably it has been changed to 16 in 4.4, correct?
If I want to see the default that vdsm would create from scratch, should I look inside /usr/lib/python3.6/site-packages/vdsm/tool/configurators/multipath.py of my version?
On my system with vdsm-python-4.40.40-1.el8.noarch I have this inside that file
_NO_PATH_RETRY = 16
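And I guess (my assumption, from the vdsm-tool man page) that I can also
verify that the installed file is still the one vdsm owns with:

# vdsm-tool is-configured --module multipath

which should complain if /etc/multipath.conf is not the vdsm-generated one.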

 

Note that if you use your own value, you need to match it to sanlock io_timeout.
See this document for more info:
https://github.com/oVirt/vdsm/blob/master/doc/io-timeouts.md

>     }

Yes I set this:

# cat /etc/vdsm/vdsm.conf.d/99-FooIO.conf
# Configuration for FooIO storage.

[sanlock]
# Set renewal timeout to 80 seconds
# (8 * io_timeout == 80).
io_timeout = 10

And for another environment with Netapp MetroCluster across two different sites (I'm on RHV there...) I plan to set no_path_retry to 24 and io_timeout to 15, to manage disaster recovery scenarios and planned maintenance, where a Netapp node failover between sites can take up to 120 seconds.
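If I did the math right (assuming the default polling_interval of 5
seconds from the vdsm config):

multipath queueing time = no_path_retry * polling_interval = 16 * 5 = 80s
sanlock renewal timeout = 8 * io_timeout = 8 * 10 = 80s

and for the MetroCluster environment: 24 * 5 = 120s, matching 8 * 15 = 120s.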

> But still I see this
>
> # multipath -l
> 36090a0c8d04f21111fc4251c7c08d0a3 dm-13 EQLOGIC,100E-00
> size=2.4T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
> `-+- policy='round-robin 0' prio=0 status=active
>   |- 16:0:0:0 sdc 8:32 active undef running
>   `- 18:0:0:0 sde 8:64 active undef running
> 36090a0d88034667163b315f8c906b0ac dm-12 EQLOGIC,100E-00
> size=2.0T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
> `-+- policy='round-robin 0' prio=0 status=active
>   |- 15:0:0:0 sdb 8:16 active undef running
>   `- 17:0:0:0 sdd 8:48 active undef running
>
> that makes me think I'm not using the no_path_retry setting, but queue_if_no_path... I could be wrong anyway..

No, this is expected. What it means, if I understand multipath behavior
correctly, is that the device queues I/O for no_path_retry * polling_interval
seconds when all paths have failed. After that the device fails all pending
and new I/O until at least one path is recovered.

> How to verify for sure (without dropping the paths, at least at the moment) from the config?
> Any option with multipath and/or dmsetup commands?

multipathd show config -> find your device section, it will show the current
value for no_path_retry.
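For example (a sketch using your device from the output above, the exact
output layout differs between versions):

# multipathd show config | grep -A 20 '"EQLOGIC"'

should print your device section with the effective no_path_retry value.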

Nir


I would just like to be confident about the no_path_retry setting, because the multipath output, even with -v2, -v3, -v4, does not seem very clear to me.
On CentOS 7 (as Benjamin suggested 4 years ago.. ;-) I have this:

# multipath -r -v3 | grep no_path_retry
Feb 01 15:45:27 | 36090a0d88034667163b315f8c906b0ac: no_path_retry = 4 (config file default)
Feb 01 15:45:27 | 36090a0c8d04f21111fc4251c7c08d0a3: no_path_retry = 4 (config file default)

On CentOS 8.3 I only get these messages on standard error...:

# multipath -r -v3
Feb 01 15:46:32 | set open fds limit to 8192/262144
Feb 01 15:46:32 | loading /lib64/multipath/libchecktur.so checker
Feb 01 15:46:32 | checker tur: message table size = 3
Feb 01 15:46:32 | loading /lib64/multipath/libprioconst.so prioritizer
Feb 01 15:46:32 | foreign library "nvme" loaded successfully
Feb 01 15:46:32 | delegating command to multipathd
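For the dmsetup part of my question, I suppose (my assumption, not
verified) that I can at least check the runtime state of a map with:

# dmsetup table 36090a0c8d04f21111fc4251c7c08d0a3

where the features field of the multipath target line shows whether
queue_if_no_path is currently set, even if it does not report
no_path_retry itself.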

Thanks,
Gianluca