[ovirt-users] VM has been paused due to storage I/O problem

Thu Feb 2 21:53:44 UTC 2017

On Wed, Feb 01, 2017 at 09:39:45AM +0200, Nir Soffer wrote:
> On Tue, Jan 31, 2017 at 6:09 PM, Gianluca Cecchi
> <gianluca.cecchi at gmail.com> wrote:
> > On Tue, Jan 31, 2017 at 3:23 PM, Nathanaël Blanchet <blanchet at abes.fr>
> > wrote:
> >>
> >> Exactly the same issue here with an FC EMC storage domain...
> >>
> >>
> >
> > I'm trying to mitigate this by inserting a timeout for my SAN devices, but
> > I'm not sure of its effectiveness, as the CentOS 7 behavior of "multipathd -k"
> > and then "show config" seems different from CentOS 6.x.
> > In fact, my attempt at multipath.conf is this:

There was a significant change in how multipath deals with merging
device configurations between RHEL6 and RHEL7.  The short answer is, as
long as you copy the entire existing configuration, and just change what
you want changed (like you did), you can ignore the change.  Also,
multipath doesn't care if you quote numbers.

If you want to verify that no_path_retry is being set as intended, you
can run:

# multipath -r -v3 | grep no_path_retry

to reload your multipath devices with verbosity turned up. You should see
lines like:

Feb 02 09:38:30 | mpatha: no_path_retry = 12 (controller setting)

That will tell you what no_path_retry is set to.
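
If you have many maps, a small script can pull those lines out of the
output for you. This is only a sketch: the pattern is guessed from the
single sample line above, and the names LINE_RE and parse_no_path_retry
are made up for illustration, so adjust it if your output looks different.

```python
import re

# Pattern modelled on the one sample line shown above. The exact layout of
# the -v3 output is an assumption here, not a documented, stable format.
LINE_RE = re.compile(
    r"\|\s*(?P<map>\S+):\s*no_path_retry\s*=\s*(?P<value>\S+)\s*\((?P<origin>[^)]*)\)"
)


def parse_no_path_retry(output):
    """Return {map_name: (value, origin)} for every no_path_retry line."""
    settings = {}
    for line in output.splitlines():
        match = LINE_RE.search(line)
        if match:
            settings[match.group("map")] = (
                match.group("value"),
                match.group("origin"),
            )
    return settings


sample = "Feb 02 09:38:30 | mpatha: no_path_retry = 12 (controller setting)"
print(parse_no_path_retry(sample))
```

You would feed it the saved output of the command above instead of the
sample string.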

The configuration Nir suggested at the end of this email looks good to
me.

Now, here's the long answer:

multipath allows you to merge device configurations.  This means that as
long as you put in the "vendor" and "product" strings, you only need to
set the other values that you care about. On RHEL6, this would work:

        device {
                vendor "IBM"
                product "^1814"
                no_path_retry 12
        }

And it would create a configuration that was exactly the same as the
builtin config for this device, except that no_path_retry was set to 12.
However, this wasn't as easy for users as it was supposed to be.
Specifically, users would often add their device's vendor and product
information, as well as whatever they wanted changed, and then be
surprised when multipath didn't retain all the information from the
builtin configuration as advertised. This is because they used the
actual vendor and product strings for their device, but the builtin
device configuration's vendor and product strings were regexes. In
RHEL6, multipath only merged configurations if the vendor and product
strings matched exactly as strings. So users would try

        device {
                vendor "IBM"
                product "1814 FASt"
                no_path_retry 12
        }

and it wouldn't work as expected, since the product strings didn't
match.  To fix this, when RHEL7 checks if a user configuration should be
merged with a builtin configuration, all that is required is that the
user configuration's vendor and product strings regex match the builtin.
This means that the above configuration will work as expected in RHEL7.
However, the first configuration won't, because the literal string "^1814"
doesn't regex match the pattern "^1814" (the string itself starts with a
"^" character, not with "1814").  This means that multipath would treat it
as a completely new configuration, and not merge any values from the
builtin configuration.  You can re-enable the RHEL6 behaviour in RHEL7 by
setting

hw_str_match yes

in the defaults section.
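
To make the difference between the two matching rules concrete, here is a
rough model of them. It is a sketch, not multipath's actual code: the
function names are invented, and Python's re module stands in for the
regex matching multipath does, which may differ in corner cases.

```python
import re

# Builtin entry discussed in this thread: its vendor and product are regexes.
BUILTIN = {"vendor": "IBM", "product": "^1814"}


def merges_rhel6(user, builtin=BUILTIN):
    """RHEL6-style rule: merge only when the strings are identical."""
    return (
        user["vendor"] == builtin["vendor"]
        and user["product"] == builtin["product"]
    )


def merges_rhel7(user, builtin=BUILTIN):
    """RHEL7-style rule: the user's strings must match the builtin regexes."""
    return (
        re.search(builtin["vendor"], user["vendor"]) is not None
        and re.search(builtin["product"], user["product"]) is not None
    )


copied_regex = {"vendor": "IBM", "product": "^1814"}      # first example
real_strings = {"vendor": "IBM", "product": "1814 FASt"}  # second example

for name, cfg in (("copied regex", copied_regex), ("real strings", real_strings)):
    print(name, "| RHEL6 merge:", merges_rhel6(cfg), "| RHEL7 merge:", merges_rhel7(cfg))
```

The copied regex merges only under the RHEL6 rule, and the real strings
merge only under the RHEL7 rule, which is the behaviour described above.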

Now, because the builtin configurations could handle more than one
device type per configuration, since they used regexes to match the
vendor and product strings, multipath couldn't just remove the original
builtin configuration when users added a new configuration that modified
it.  Otherwise, devices that regex matched the builtin configuration's
vendor and product strings but not the user configuration's vendor and
product strings wouldn't have any device configuration information. So
multipath keeps the original builtin configuration as well as the new
one.  However, when it's time to assign a device configuration to a
device, multipath looks through the device configurations list
backwards, and finds the first match.  This means that it will always
use the user configuration instead of the builtin one (since new
configurations get added to the end of the list).
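
The backwards lookup can be sketched like this. Again, this is a toy
model with made-up names (select_config, the "origin" key), and the
product string "1814 XYZ" is a hypothetical device used only to show the
fallback to the builtin entry.

```python
import re


def device_matches(entry, vendor, product):
    """True if the device's strings match the entry's vendor/product regexes."""
    return (
        re.search(entry["vendor"], vendor) is not None
        and re.search(entry["product"], product) is not None
    )


def select_config(configs, vendor, product):
    """Walk the list from the end and return the first matching entry."""
    for entry in reversed(configs):
        if device_matches(entry, vendor, product):
            return entry
    return None


# The builtin entry comes first; user entries are appended after it.
configs = [
    {"origin": "builtin", "vendor": "IBM", "product": "^1814", "no_path_retry": 30},
]
configs.append(
    {"origin": "user", "vendor": "IBM", "product": "1814 FASt", "no_path_retry": 12}
)

chosen = select_config(configs, "IBM", "1814 FASt")
other = select_config(configs, "IBM", "1814 XYZ")  # hypothetical second device
print(chosen["origin"], chosen["no_path_retry"])
print(other["origin"], other["no_path_retry"])
```

The device named in the user entry gets the user settings, while the
other device that only matches "^1814" still finds the builtin entry,
which is why the builtin one has to stay in the list.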

Like I said before, if you add all the values you want set in your
configuration, instead of relying on them being merged from the builtin
configuration, then you don't need to worry about any of this.

-Ben

> >
> > # VDSM REVISION 1.3
> > # VDSM PRIVATE
> >
> > defaults {
> >     polling_interval            5
> >     no_path_retry               fail
> >     user_friendly_names         no
> >     flush_on_last_del           yes
> >     fast_io_fail_tmo            5
> >     dev_loss_tmo                30
> >     max_fds                     4096
> > }
> >
> > # Remove devices entries when overrides section is available.
> > devices {
> >     device {
> >         # These settings overrides built-in devices settings. It does not apply
> >         # to devices without built-in settings (these use the settings in the
> >         # "defaults" section), or to devices defined in the "devices" section.
> >         # Note: This is not available yet on Fedora 21. For more info see
> >         # https://bugzilla.redhat.com/1253799
> >         all_devs                yes
> >         no_path_retry           fail
> >     }
> >         device {
> >                 vendor "IBM"
> >                 product "^1814"
> >                 product_blacklist "Universal Xport"
> >                 path_grouping_policy "group_by_prio"
> >                 path_checker "rdac"
> >                 features "0"
> >                 hardware_handler "1 rdac"
> >                 prio "rdac"
> >                 failback immediate
> >                 rr_weight "uniform"
> >                 no_path_retry "12"
> 
> Hi Gianluca,
> 
> This should be a number, not a string, maybe multipath is having trouble
> parsing this and it ignores your value?
> 
> >         }
> > }
> >
> > So I put exactly the default device config for my IBM/1814 device but
> > no_path_retry set to 12.
> 
> Why 12?
> 
> This will do 12 retries, 5 seconds each when no path is available. This will
> block lvm commands for 60 seconds when no path is available, blocking
> other stuff in vdsm. Vdsm is not designed to handle this.
> 
> I recommend a value of 4.
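
The arithmetic behind those numbers is simply retries times
polling_interval. A quick sketch, assuming the retries are spaced exactly
one polling_interval apart (real timing may drift a little); the function
name queue_seconds is made up:

```python
def queue_seconds(no_path_retry, polling_interval):
    """Approximate time I/O stays queued after all paths are lost."""
    return no_path_retry * polling_interval


POLLING_INTERVAL = 5  # from the "defaults" section quoted in this thread

# 12 is the value tried here, 4 is Nir's suggestion, 30 is the builtin default.
for retries in (12, 4, 30):
    print(retries, "retries ->", queue_seconds(retries, POLLING_INTERVAL), "seconds")
```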
> 
> But note that this is not related to the fact that your devices are not
> initialized properly after boot.
> 
> > In CentOS 6.x when you do something like this, "show config" gives you the
> > modified entry only for your device section.
> > In CentOS 7.3, however, it seems I still get the default one for IBM/1814 and
> > also the customized one at the end of the output....
> 
> Maybe your device configuration does not match exactly the builtin config.
> 
> >
> > Two facts:
> > - before I could reproduce the problem if I selected
> > Maintenance
> > Power Mgmt ---> Restart
> > (tried 3 times with same behavior)
> >
> > Instead if I executed in separate steps
> > Maintenance
> > Power Mgmt --> Stop
> > wait a moment
> > Power Mgmt --> Start
> >
> > I didn't get problems (tried only one time...)
> 
> Maybe waiting a moment helps the storage/switches to clean up
> properly after a server is shut down?
> 
> Does your power management trigger a proper shutdown?
> I would avoid using it for normal shutdown.
> 
> >
> > With this "new" multipath config (to be confirmed if in effect, how?) I
> > don't get the VM paused problem even with Restart option of Power Mgmt
> > In active host messages I see these ones when the other reboots:
> >
> > Jan 31 16:50:01 ovmsrv06 systemd: Started Session 705 of user root.
> > Jan 31 16:50:01 ovmsrv06 systemd: Starting Session 705 of user root.
> > Jan 31 16:53:47 ovmsrv06 multipathd: 3600a0b8000299aa80000d08955014098: sde
> > - rdac checker reports path is up
> > Jan 31 16:53:47 ovmsrv06 multipathd: 8:64: reinstated
> > Jan 31 16:53:47 ovmsrv06 multipathd: 3600a0b8000299aa80000d08955014098: load
> > table [0 41943040 multipath 1 queue_if_no_path 1 rdac 2 1 service-time 0 2 1
> > 8:224 1 65:0 1 service-time 0 2 1 8:64 1 8:160 1]
> > Jan 31 16:53:47 ovmsrv06 multipathd: 3600a0b8000299aa80000d08955014098: sdo
> > - rdac checker reports path is ghost
> > Jan 31 16:53:47 ovmsrv06 multipathd: 8:224: reinstated
> > Jan 31 16:53:47 ovmsrv06 multipathd: 3600a0b8000299aa80000d08955014098: sdk
> > - rdac checker reports path is up
> > Jan 31 16:53:47 ovmsrv06 multipathd: 8:160: reinstated
> > Jan 31 16:53:47 ovmsrv06 kernel: sd 0:0:1:4: rdac: array Z1_DS4700, ctlr 1,
> > queueing MODE_SELECT command
> > Jan 31 16:53:47 ovmsrv06 multipathd: 3600a0b8000299aa80000d08955014098: sdq
> > - rdac checker reports path is ghost
> > Jan 31 16:53:47 ovmsrv06 multipathd: 65:0: reinstated
> > Jan 31 16:53:48 ovmsrv06 kernel: sd 0:0:1:4: rdac: array Z1_DS4700, ctlr 1,
> > MODE_SELECT returned with sense 05/91/36
> > Jan 31 16:53:48 ovmsrv06 kernel: sd 0:0:1:4: rdac: array Z1_DS4700, ctlr 1,
> > queueing MODE_SELECT command
> > Jan 31 16:53:49 ovmsrv06 kernel: sd 0:0:1:4: rdac: array Z1_DS4700, ctlr 1,
> > MODE_SELECT returned with sense 05/91/36
> > Jan 31 16:53:49 ovmsrv06 kernel: sd 0:0:1:4: rdac: array Z1_DS4700, ctlr 1,
> > queueing MODE_SELECT command
> > Jan 31 16:53:49 ovmsrv06 kernel: sd 0:0:1:4: rdac: array Z1_DS4700, ctlr 1,
> > MODE_SELECT completed
> > Jan 31 16:53:49 ovmsrv06 kernel: sd 2:0:1:4: rdac: array Z1_DS4700, ctlr 1,
> > queueing MODE_SELECT command
> > Jan 31 16:53:49 ovmsrv06 kernel: sd 2:0:1:4: rdac: array Z1_DS4700, ctlr 1,
> > MODE_SELECT completed
> > Jan 31 16:53:52 ovmsrv06 multipathd: 3600a0b8000299aa80000d08955014098: sde
> > - rdac checker reports path is ghost
> > Jan 31 16:53:52 ovmsrv06 multipathd: 8:64: reinstated
> > Jan 31 16:53:52 ovmsrv06 multipathd: 3600a0b8000299aa80000d08955014098: load
> > table [0 41943040 multipath 1 queue_if_no_path 1 rdac 2 1 service-time 0 2 1
> > 8:224 1 65:0 1 service-time 0 2 1 8:64 1 8:160 1]
> > Jan 31 16:53:52 ovmsrv06 multipathd: 3600a0b8000299aa80000d08955014098: sdo
> > - rdac checker reports path is up
> > Jan 31 16:53:52 ovmsrv06 multipathd: 8:224: reinstated
> > Jan 31 16:53:52 ovmsrv06 multipathd: 3600a0b8000299aa80000d08955014098: sdk
> > - rdac checker reports path is ghost
> > Jan 31 16:53:52 ovmsrv06 multipathd: 8:160: reinstated
> > Jan 31 16:53:52 ovmsrv06 multipathd: 3600a0b8000299aa80000d08955014098: sdq
> > - rdac checker reports path is up
> > Jan 31 16:53:52 ovmsrv06 multipathd: 65:0: reinstated
> >
> > But they are not related to the multipath device dedicated to oVirt storage
> > domain in this case....
> > What makes me optimistic is the difference in these lines:
> >
> > before I got
> > Jan 31 10:27:47 ovmsrv06 multipathd: 3600a0b8000299aa80000d08955014098: load
> > table [0 41943040 multipath 0 1 rdac 2 1 service-time 0 2 1 8:224 1 65:0 1
> > service-time 0 2 1 8:64 1 8:160 1]
> >
> > now I get
> > Jan 31 16:53:52 ovmsrv06 multipathd: 3600a0b8000299aa80000d08955014098: load
> > table [0 41943040 multipath 1 queue_if_no_path 1 rdac 2 1 service-time 0 2 1
> > 8:224 1 65:0 1 service-time 0 2 1 8:64 1 8:160 1]
> >
> > multipath 0 1 rdac
> > vs
> > multipath 1 queue_if_no_path 1 rdac
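
If you want to compare the feature list of two "load table" lines without
eyeballing them, the part after "multipath" starts with a count of
feature arguments followed by the arguments themselves. A minimal sketch
that only reads that first group; the function name is made up:

```python
def multipath_features(table):
    """Return the feature arguments from a dm multipath table string."""
    fields = table.split()
    start = fields.index("multipath") + 1
    count = int(fields[start])  # number of feature arguments that follow
    return fields[start + 1:start + 1 + count]


before = (
    "0 41943040 multipath 0 1 rdac 2 1 service-time 0 2 1 8:224 1 65:0 1 "
    "service-time 0 2 1 8:64 1 8:160 1"
)
after = (
    "0 41943040 multipath 1 queue_if_no_path 1 rdac 2 1 service-time 0 2 1 "
    "8:224 1 65:0 1 service-time 0 2 1 8:64 1 8:160 1"
)

print("before:", multipath_features(before))
print("after: ", multipath_features(after))
```

The first table has no features set, and the second has queue_if_no_path.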
> 
> This is not expected, multipath is using unlimited queueing, which is the worst
> setup for ovirt.
> 
> Maybe this is the result of using "12" instead of 12?
> 
> Anyway, looking in multipath source, this is the default configuration for
> your device:
> 
>         /* DS3950 / DS4200 / DS4700 / DS5020 */
>         .vendor        = "IBM",
>         .product       = "^1814",
>         .bl_product    = "Universal Xport",
>         .pgpolicy      = GROUP_BY_PRIO,
>         .checker_name  = RDAC,
>         .features      = "2 pg_init_retries 50",
>         .hwhandler     = "1 rdac",
>         .prio_name     = PRIO_RDAC,
>         .pgfailback    = -FAILBACK_IMMEDIATE,
>         .no_path_retry = 30,
>     },
> 
> and this is the commit that updated this (and other rdac devices):
> http://git.opensvc.com/gitweb.cgi?p=multipath-tools/.git;a=commit;h=c1ed393b91acace284901f16954ba5c1c0d943c9
> 
> So I would try this configuration:
> 
> device {
>                 vendor "IBM"
>                 product "^1814"
> 
>                 # defaults from multipathd show config
>                 product_blacklist "Universal Xport"
>                 path_grouping_policy "group_by_prio"
>                 path_checker "rdac"
>                 hardware_handler "1 rdac"
>                 prio "rdac"
>                 failback immediate
>                 rr_weight "uniform"
> 
>                 # Based on multipath commit c1ed393b91acace284901f16954ba5c1c0d943c9
>                 features "2 pg_init_retries 50"
> 
>                 # Default is 30 retries, ovirt recommended value is 4 to avoid
>                 # blocking in vdsm. This gives 20 seconds (4 * polling_interval)
>                 # grace time when no path is available.
>                 no_path_retry 4
>         }
> 
> Ben, do you have any other ideas on debugging this issue and
> improving multipath configuration?
> 
> Nir