[ovirt-users] VM has been paused due to storage I/O problem

Gianluca Cecchi gianluca.cecchi at gmail.com
Wed Feb 1 19:22:09 UTC 2017


On Wed, Feb 1, 2017 at 8:39 AM, Nir Soffer <nsoffer at redhat.com> wrote:

>
>
> Hi Gianluca,
>
> This should be a number, not a string, maybe multipath is having trouble
> parsing this and it ignores your value?
>

I don't think so, also because, reading the DM Multipath guide at
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/DM_Multipath/multipath_config_confirm.html
it seems that in RHEL 7.3 the "show config" command has this behaviour:

"
For example, the following command sequence displays the multipath
configuration, including the defaults, before exiting the console.

# multipathd -k
> > show config
> > CTRL-D
"

So the output is expected to include the defaults too. Anyway, I changed it; see below.
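For reference, the same merged output can also be captured non-interactively;
this is just how I dump it for later comparison (assuming multipathd is
running):

# multipathd show config > /tmp/multipath_merged.conf
# grep -c 'device {' /tmp/multipath_merged.conf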

<<<<< BEGIN OF PARENTHESIS
In theory it should be the same on RHEL 6.8
(see https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/DM_Multipath/multipath_config_confirm.html)
but that is not what I see on a system still on 6.5, with
device-mapper-multipath-0.4.9-93.el6.x86_64 and connected to a NetApp array.

In /usr/share/doc/device-mapper-multipath-0.4.9/multipath.conf.defaults the
NETAPP/LUN entry reads:

#       device {
#               vendor "NETAPP"
#               product "LUN.*"
#               path_grouping_policy group_by_prio
#               getuid_callout "/lib/udev/scsi_id --whitelisted --device=/dev/%n"
#               path_selector "round-robin 0"
#               path_checker tur
#               features "3 queue_if_no_path pg_init_retries 50"
#               hardware_handler "0"
#               prio ontap
#               failback immediate
#               rr_weight uniform
#               rr_min_io 128
#               rr_min_io_rq 1
#               flush_on_last_del yes
#               fast_io_fail_tmo 5
#               dev_loss_tmo infinity
#               retain_attached_hw_handler yes
#               detect_prio yes
#               reload_readwrite yes
#       }



My customization in multipath.conf, based on NetApp guidelines and my
NetApp storage array setup:

devices {
       device {
               vendor "NETAPP"
               product "LUN.*"
               getuid_callout "/lib/udev/scsi_id -g -u -d /dev/%n"
               hardware_handler "1 alua"
               prio alua
        }
}

If I run "multipathd show config" I see only one entry for the NETAPP/LUN
vendor/product pair, and it is a merge of the defaults and my custom
settings.
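(This is just how I pull that single section out of the merged dump; the
-B/-A line counts are only what happens to fit my entry:)

# multipathd show config | grep -B 1 -A 20 'vendor "NETAPP"'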

        device {
                vendor "NETAPP"
                product "LUN.*"
                path_grouping_policy group_by_prio
                getuid_callout "/lib/udev/scsi_id -g -u -d /dev/%n"
                path_selector "round-robin 0"
                path_checker tur
                features "3 queue_if_no_path pg_init_retries 50"
                hardware_handler "1 alua"
                prio alua
                failback immediate
                rr_weight uniform
                rr_min_io 128
                rr_min_io_rq 1
                flush_on_last_del yes
                fast_io_fail_tmo 5
                dev_loss_tmo infinity
                retain_attached_hw_handler yes
                detect_prio yes
                reload_readwrite yes
        }

So this difference confused me when configuring multipath on CentOS 7.3. I
will have to check whether this behaviour changes when I update from 6.5 to 6.8.

<<<<< END OF PARENTHESIS


> >         }
> > }
> >
> > So I put exactly the default device config for my IBM/1814 device but
> > no_path_retry set to 12.
>
> Why 12?
>
> This will do 12 retries, 5 seconds each when no path is available. This
> will
> block lvm commands for 60 seconds when no path is available, blocking
> other stuff in vdsm. Vdsm is not designed to handle this.
>
> I recommend value of 4.
>

OK.
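Just to spell out the arithmetic for myself (with the default
polling_interval of 5 seconds, queueing time is roughly no_path_retry *
polling_interval):

no_path_retry 12  ->  12 * 5s = 60s of queued I/O, blocking lvm/vdsm
no_path_retry 4   ->   4 * 5s = 20s of grace time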



> But note that this is not related to the fact that your devices are not
> initialized properly after boot.
>

In fact it could also be an overall DS4700 problem... The LUNs are
configured with the LNX CLUSTER host type, which should be OK in theory,
even if this kind of storage was never well supported with Linux:
initially one had to use proprietary IBM kernel modules/drivers.
I will verify consistency and robustness through testing; see the rough
test sketch below.
I have to do a POC, this is the hardware I have, and I should at least try
to get a working solution with it.
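Something along these lines is what I mean by testing (a rough sketch on
the hypervisor; the device name is just a placeholder for one of my paths):

(start some I/O inside a test VM, then on the host)
# multipathd show paths
# multipathd fail path sdc
(watch /var/log/messages and check whether the VM gets paused)
# multipathd reinstate path sdc
# multipath -ll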


>
> > In CentOS 6.x when you do something like this, "show config" gives you
> > the modified entry only for your device section.
> > Instead in CentOS 7.3 it seems I get anyway the default one for
> > IBM/1814 and also the customized one at the end of the output....
>
> Maybe your device configuration does not match exactly the builtin config.
>

I think it is the difference in behaviour outlined above. You can probably
confirm it on another system where some customization has been applied too...


>
>
> Maybe waiting a moment helps the storage/switches to clean up
> properly after a server is shut down?
>

I think so too. If the errors repeat with the new config, I'll try, when
possible, to do a stop/start instead of a restart.


>
> Does your power management trigger a proper shutdown?
> I would avoid using it for normal shutdown.
>

I have not understood what you mean exactly here... Can you elaborate?
Suppose I have to power off one hypervisor (yum update, patching, fw update
or planned server room maintenance, ...); my workflow is this one, all from
inside the web admin GUI:

Migrate the VMs running on the host (or delegate that to the following step)
Put the host into maintenance
Power Mgmt --> Stop

When the planned maintenance has finished:

Power Mgmt --> Start
I should see the host in maintenance
Activate

Or do you mean I should do something from the host itself and not from the
GUI (something like the sequence below)?
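(i.e. put the host into maintenance from the web admin GUI and then, on the
host itself, just run a plain

# shutdown -h now

keeping Power Mgmt only for fencing? I'm only guessing at what you mean here.)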



>
> >
> > multipath 0 1 rdac
> > vs
> > multipath 1 queue_if_no_path 1 rdac
>
> This is not expected, multipath is using unlimited queueing, which is the
> worst
> setup for ovirt.
>
> Maybe this is the result of using "12" instead of 12?
>
> Anyway, looking in multipath source, this is the default configuration for
> your device:
>
> 405         /* DS3950 / DS4200 / DS4700 / DS5020 */
>  406         .vendor        = "IBM",
>  407         .product       = "^1814",
>  408         .bl_product    = "Universal Xport",
>  409         .pgpolicy      = GROUP_BY_PRIO,
>  410         .checker_name  = RDAC,
>  411         .features      = "2 pg_init_retries 50",
>  412         .hwhandler     = "1 rdac",
>  413         .prio_name     = PRIO_RDAC,
>  414         .pgfailback    = -FAILBACK_IMMEDIATE,
>  415         .no_path_retry = 30,
>  416     },
>
> and this is the commit that updated this (and other rdac devices):
> http://git.opensvc.com/gitweb.cgi?p=multipath-tools/.git;a=commit;h=c1ed393b91acace284901f16954ba5c1c0d943c9
>
> So I would try this configuration:
>
> device {
>                 vendor "IBM"
>                 product "^1814"
>
>                 # defaults from multipathd show config
>                 product_blacklist "Universal Xport"
>                 path_grouping_policy "group_by_prio"
>                 path_checker "rdac"
>                 hardware_handler "1 rdac"
>                 prio "rdac"
>                 failback immediate
>                 rr_weight "uniform"
>
>                 # Based on multipath commit
>                 # c1ed393b91acace284901f16954ba5c1c0d943c9
>                 features "2 pg_init_retries 50"
>
>                 # Default is 30 retries; ovirt recommended value is 4 to
>                 # avoid blocking in vdsm. This gives 20 seconds
>                 # (4 * polling_interval) grace time when no path is
>                 # available.
>                 no_path_retry 4
>         }
>
> Ben, do you have any other ideas on debugging this issue and
> improving multipath configuration?
>
> Nir
>

OK. In the meantime I have applied your suggested config and restarted the
two nodes.
Let me test and see if I find any problems, also running some I/O tests;
below is what I plan to verify first.
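(A minimal check I have in mind, assuming the usual tools: confirm that the
merged IBM/1814 section now shows features "2 pg_init_retries 50" and
no_path_retry 4 instead of queue_if_no_path, and that all paths look
healthy:)

# multipathd show config | grep -B 1 -A 20 '"IBM"'
# multipath -ll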
Thanks in the meantime,
Gianluca

