On Wed, Feb 1, 2017 at 8:39 AM, Nir Soffer <nsoffer@redhat.com> wrote:


Hi Gianluca,

This should be a number, not a string, maybe multipath is having trouble
parsing this and it ignores your value?
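To illustrate the string-vs-number point (a minimal sketch; the stanza below is only for illustration, based on the no_path_retry setting discussed further down):

        device {
                vendor "IBM"
                product "^1814"
                no_path_retry "12"   # quoted: read as a string, possibly ignored
                # no_path_retry 12   # unquoted number: what the parser expects
        }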

It seems that in RHEL 7.3 the "show config" command has this behaviour:

"
For example, the following command sequence displays the multipath configuration, including the defaults, before exiting the console.

# multipathd -k
> show config
> CTRL-D
"

So the output has to include the defaults too. Anyway, I changed it; see below.

<<<<< BEGIN OF PARENTHESIS
In theory it should be the same on RHEL 6.8,
but that is not what I see on a system running 6.5, with device-mapper-multipath-0.4.9-93.el6.x86_64 and connected to a NetApp array.

In /usr/share/doc/device-mapper-multipath-0.4.9/multipath.conf.defaults


#       device {
#               vendor "NETAPP"
#               product "LUN.*"
#               path_grouping_policy group_by_prio
#               getuid_callout "/lib/udev/scsi_id --whitelisted --device=/dev/%n"
#               path_selector "round-robin 0"
#               path_checker tur
#               features "3 queue_if_no_path pg_init_retries 50"
#               hardware_handler "0"
#               prio ontap
#               failback immediate
#               rr_weight uniform
#               rr_min_io 128
#               rr_min_io_rq 1
#               flush_on_last_del yes
#               fast_io_fail_tmo 5
#               dev_loss_tmo infinity
#               retain_attached_hw_handler yes
#               detect_prio yes
#               reload_readwrite yes
#       }



My customization in multipath.conf, based on NetApp guidelines and my NetApp storage array setup:

devices {
       device {
               vendor "NETAPP"
               product "LUN.*"
               getuid_callout "/lib/udev/scsi_id -g -u -d /dev/%n"
               hardware_handler "1 alua"
               prio alua
        }
}

If I run "multipathd show config" I see only one entry for the NETAPP/LUN vendor/product, and it is a merge of the defaults and my customizations:

        device {
                vendor "NETAPP"
                product "LUN.*"
                path_grouping_policy group_by_prio
                getuid_callout "/lib/udev/scsi_id -g -u -d /dev/%n"
                path_selector "round-robin 0"
                path_checker tur
                features "3 queue_if_no_path pg_init_retries 50"
                hardware_handler "1 alua"
                prio alua
                failback immediate
                rr_weight uniform
                rr_min_io 128
                rr_min_io_rq 1
                flush_on_last_del yes
                fast_io_fail_tmo 5
                dev_loss_tmo infinity
                retain_attached_hw_handler yes
                detect_prio yes
                reload_readwrite yes
        }

So this difference confused me when configuring multipath in CentOS 7.3. I will have to check whether this changes when I update from 6.5 to 6.8.
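A quick way to compare the two systems is to extract just the NETAPP stanza from the live config on each host; a rough sketch, assuming GNU grep (adjust -A to the stanza length):

# multipathd show config | grep -A 20 'vendor "NETAPP"'

Run on both the 6.x and the 7.3 host, this should show whether you get one merged entry or the default plus the customized one.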

<<<<< END OF PARENTHESIS


>         }
> }
>
> So I put exactly the default device config for my IBM/1814 device, but with
> no_path_retry set to 12.

Why 12?

This will do 12 retries, 5 seconds each, when no path is available. That will
block lvm commands for 60 seconds whenever no path is available, blocking
other stuff in vdsm. Vdsm is not designed to handle this.

I recommend a value of 4.
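Spelling out the arithmetic (assuming the default polling_interval of 5 seconds):

    no_path_retry 12  ->  12 * 5 = 60 seconds of queueing (lvm commands blocked)
    no_path_retry 4   ->   4 * 5 = 20 seconds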

OK.



But note that this is not related to the fact that your devices are not
initialized properly after boot.

In fact it could also be an overall DS4700 problem... The LUNs are configured with the LNX CLUSTER host type, which should be OK in theory, even if this kind of storage was never well supported with Linux.
Initially one had to use proprietary IBM kernel modules/drivers.
I will verify consistency and robustness through testing.
I have to do a POC, this is the hardware I have, and I should at least try to get a working solution on it.
 

> In CentOS 6.x when you do something like this, "show config" gives you the
> modified entry only for your device section.
> Instead, in CentOS 7.3 it seems I get the default one for IBM/1814 anyway, and
> also the customized one at the end of the output....

Maybe your device configuration does not exactly match the builtin config.

I think it is the different behaviour outlined above. You could confirm this on another system where some customization has been done too...
 


Maybe waiting a moment helps the storage/switches to clean up
properly after a server is shut down?

I think so too. If the errors repeat with the new config, I'll try, when possible, to do a stop/start instead of a restart.
 

Does your power management trigger a proper shutdown?
I would avoid using it for normal shutdown.

I have not understood exactly what you mean here... Can you elaborate?
Suppose I have to power off one hypervisor (yum update, patching, firmware update, planned server room maintenance, ...); my workflow is this, all from inside the web admin GUI:

Migrate the VMs running on the host (or delegate that to the following step)
Put host into maintenance
Power Mgmt --> Stop

When the planned maintenance has finished:

Power Mgmt --> Start
I should see the host in maintenance
Activate

Or do you mean I should do something from the host itself and not from the GUI?

 

>
> multipath 0 1 rdac
> vs
> multipath 1 queue_if_no_path 1 rdac

This is not expected: multipath is using unlimited queueing, which is the worst
setup for oVirt.

Maybe this is the result of using "12" instead of 12?
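One way to check what is actually in effect, rather than what multipath.conf says, is to look at the live device-mapper tables (this needs root):

# dmsetup table | grep multipath
# multipath -ll

The first shows the features column of each live map (where "1 queue_if_no_path" vs "0" is visible); the second shows the per-map topology and path states.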

Anyway, looking at the multipath source, this is the default configuration for
your device:

{
        /* DS3950 / DS4200 / DS4700 / DS5020 */
        .vendor        = "IBM",
        .product       = "^1814",
        .bl_product    = "Universal Xport",
        .pgpolicy      = GROUP_BY_PRIO,
        .checker_name  = RDAC,
        .features      = "2 pg_init_retries 50",
        .hwhandler     = "1 rdac",
        .prio_name     = PRIO_RDAC,
        .pgfailback    = -FAILBACK_IMMEDIATE,
        .no_path_retry = 30,
},

and this is the commit that updated this (and other rdac devices):
http://git.opensvc.com/gitweb.cgi?p=multipath-tools/.git;a=commit;h=c1ed393b91acace284901f16954ba5c1c0d943c9

So I would try this configuration:

device {
        vendor "IBM"
        product "^1814"

        # defaults from multipathd show config
        product_blacklist "Universal Xport"
        path_grouping_policy "group_by_prio"
        path_checker "rdac"
        hardware_handler "1 rdac"
        prio "rdac"
        failback immediate
        rr_weight "uniform"

        # Based on multipath commit c1ed393b91acace284901f16954ba5c1c0d943c9
        features "2 pg_init_retries 50"

        # Default is 30 seconds; the oVirt recommended value is 4, to avoid
        # blocking in vdsm. This gives 20 seconds (4 * polling_interval) of
        # grace time when no path is available.
        no_path_retry 4
}
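After dropping this into /etc/multipath.conf it should be possible to apply and verify it without a reboot; a sketch, assuming the daemon is running:

# multipathd reconfigure
# multipathd show config | grep -A 25 'vendor "IBM"'

(grep -A 25 is just a rough way to show the stanza; adjust as needed.)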

Ben, do you have any other ideas on debugging this issue and
improving multipath configuration?

Nir

OK. In the meantime I have applied your suggested config and restarted the 2 nodes.
Let me test and see if I find any problems, also running some I/O tests.
Thanks in the meantime,
Gianluca