On Thu, Jan 12, 2017 at 12:02 PM, Mark Greenall
<m.greenall(a)iontrading.com> wrote:
Firstly, thanks @Yaniv and thanks @Nir for your responses.
@Yaniv, in answer to this:
>> Why do you have 1 SD per VM?
It's a combination of performance and ease of management. We ran some IO tests with
various configurations and settled on this one as a balance between reduced IO contention and
ease of management. If there is a better recommended way of handling these then I'm
all ears. If you believe having a large number of storage domains adds to the problem then
we can also review the setup.
>> Can you try and disable (mask) the lvmetad service on the hosts and see if it
improves matters?
Disabled and masked the lvmetad service and tried again this morning. The initial
activation of the host seemed quicker and put less load on it, but the end result was
still the same. Just under 10 minutes later the node went non-operational and
the cycle began again. By 09:27 we had the high CPU load and the repeating lvm cycle.
Host Activation: 09:06
Host Up: 09:08
Non-Operational: 09:16
LVM Load: 09:27
Host Reboot: 09:30
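For reference, the masking was done with the standard systemd commands - a minimal sketch,
assuming the usual EL7 unit names (plus use_lvmetad = 0 in /etc/lvm/lvm.conf):
# stop and mask lvmetad so LVM commands scan devices directly
systemctl stop lvm2-lvmetad.service lvm2-lvmetad.socket
systemctl mask lvm2-lvmetad.service lvm2-lvmetad.socket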
From yesterday and today I've also attached the messages, sanlock.log and multipath.conf
files, although I'm not sure the messages file will be of much use, as it looks like log
rate limiting kicked in and suppressed messages for the duration of the process. I'm
booted off the kernel with debugging but maybe that's generating too much info? Let me
know if you want me to change anything here to get additional information.
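If the suppression is coming from journald's rate limiting, it can presumably be relaxed
for the next run with something like the following in /etc/systemd/journald.conf (option
names as on EL7), followed by a restart of systemd-journald:
# disable journald rate limiting while debugging
RateLimitInterval=0
RateLimitBurst=0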
As additional configuration information, we also have the following settings from the
Equallogic and Linux install guide:
/etc/sysctl.conf:
# Prevent ARP Flux for multiple NICs on the same subnet:
net.ipv4.conf.all.arp_ignore = 1
net.ipv4.conf.all.arp_announce = 2
# Loosen RP Filter to allow multiple iSCSI connections
net.ipv4.conf.all.rp_filter = 2
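These can be applied without a reboot by reloading the file, e.g.:
# re-read /etc/sysctl.conf so the ARP and rp_filter settings take effect immediately
sysctl -p /etc/sysctl.conf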
And the following /lib/udev/rules.d/99-eqlsd.rules:
#-----------------------------------------------------------------------------
# Copyright (c) 2010-2012 by Dell, Inc.
#
# All rights reserved. This software may not be copied, disclosed,
# transferred, or used except in accordance with a license granted
# by Dell, Inc. This software embodies proprietary information
# and trade secrets of Dell, Inc.
#
#-----------------------------------------------------------------------------
#
# Various Settings for Dell Equallogic disks based on the Dell Optimizing SAN Environment for Linux Guide
#
# Modify disk scheduler mode to noop
ACTION=="add|change", SUBSYSTEM=="block",
ATTRS{vendor}=="EQLOGIC", RUN+="/bin/sh -c 'echo noop >
/sys/${DEVPATH}/queue/scheduler'"
# Modify disk timeout value to 60 seconds
ACTION!="remove", SUBSYSTEM=="block",
ATTRS{vendor}=="EQLOGIC", RUN+="/bin/sh -c 'echo 60 >
/sys/%p/device/timeout'"
This timeout may cause long delays and timeouts in vdsm commands accessing storage in
various flows, and may cause your domain to become inactive. Since you set this for all
domains, it may cause the entire host to become non-operational.
I recommend removing this rule. (A quick way to check the value currently in effect is
sketched below, after the rules.)
# Modify read ahead value to 1024
ACTION!="remove", SUBSYSTEM=="block",
ATTRS{vendor}=="EQLOGIC", RUN+="/bin/sh -c 'echo 1024 >
/sys/${DEVPATH}/bdi/read_ahead_kb'"
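Regarding the timeout rule above - if you want to see the value that is actually in
effect, something like this should show it for every EQLOGIC path (a sketch; the sysfs
paths depend on your host):
# print the SCSI command timeout for each EQLOGIC block device
for dev in /sys/block/sd*/device; do
    grep -q EQLOGIC "$dev/vendor" 2>/dev/null && echo "$dev: $(cat "$dev"/timeout)s"
done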
In your multipath.conf, I see that you changed a lot of the defaults
recommended by ovirt:
defaults {
deferred_remove yes
dev_loss_tmo 30
fast_io_fail_tmo 5
flush_on_last_del yes
max_fds 4096
no_path_retry fail
polling_interval 5
user_friendly_names no
}
You are using:
defaults {
You are not using "deferred_remove", so you get the default value
("no").
Do you have any reason to change this?
You are not using "dev_loss_tmo", so you get the default value
Do you have any reason to change this?
You are not using "fast_io_fail_tmo", so you will get the default
value (hopefully 5).
Do you have any reason to change this?
You are not using "flush_on_last_del " - any reason to change this?
failback immediate
max_fds 8192
no_path_retry fail
I guess these are the settings recommended for your storage?
path_checker tur
path_grouping_policy multibus
path_selector "round-robin 0"
polling_interval 10
This means multipathd will check paths every 10-40 seconds
(max_polling_interval defaults to 4 times polling_interval).
You should use the default 5, which causes multipathd to check every
5-20 seconds.
rr_min_io 10
rr_weight priorities
user_friendly_names no
}
Also, you are mixing the ovirt defaults with settings that you need for your specific
devices.
You should leave the defaults unchanged, and create a device section
for your device:
devices {
device {
vendor XXX
product YYY
# ovirt specific settings
deferred_remove yes
dev_loss_tmo 30
fast_io_fail_tmo 5
flush_on_last_del yes
no_path_retry fail
polling_interval 5
user_friendly_names no
# device specific settings
max_fds 8192
path_checker tur
path_grouping_policy multibus
path_selector "round-robin 0"
}
}
Note that you must copy the ovirt defaults into the device section, otherwise
you will get multipathd's builtin defaults, which are not the same.
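After editing /etc/multipath.conf, make the running daemon pick up the change - a minimal
sketch (either form should work on EL7):
# tell multipathd to re-read its configuration
multipathd -k"reconfigure"
# or simply restart the service:
systemctl restart multipathd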
Can you also share the output of:
multipath -ll
In this output you can see the vendor and product names.
Using these names, find the effective configuration of your
multipath devices using this command:
multipathd show config
If the device is not listed in the output, you are using
the defaults.
Please share here the configuration for your device, or the defaults,
from the output of multipathd show config.
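To pull out just the section that applies to your storage, something like this should
work (EQLOGIC being the vendor string shown by multipath -ll; adjust the amount of
context as needed):
# show the device section multipathd uses for EQLOGIC LUNs, if there is one
multipathd show config | grep -B 2 -A 20 EQLOGIC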
For example, here is my test storage:
# multipath -ll
3600140549f3b93968d440ac9129d124f dm-11 LIO-ORG ,target1-12
size=50G features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
`- 22:0:0:12 sdi 8:128 active ready running
multipathd does not have any setting for LIO-ORG, so we get the defaults:
defaults {
verbosity 2
polling_interval 5
max_polling_interval 20
reassign_maps "yes"
multipath_dir "/lib64/multipath"
path_selector "service-time 0"
path_grouping_policy "failover"
uid_attribute "ID_SERIAL"
prio "const"
prio_args ""
features "0"
path_checker "directio"
alias_prefix "mpath"
failback "manual"
rr_min_io 1000
rr_min_io_rq 1
max_fds 4096
rr_weight "uniform"
no_path_retry "fail"
queue_without_daemon "no"
flush_on_last_del "yes"
user_friendly_names "no"
fast_io_fail_tmo 5
dev_loss_tmo 30
bindings_file "/etc/multipath/bindings"
wwids_file /etc/multipath/wwids
log_checker_err always
find_multipaths no
retain_attached_hw_handler no
detect_prio no
hw_str_match no
force_sync no
deferred_remove yes
ignore_new_boot_devs no
skip_kpartx no
config_dir "/etc/multipath/conf.d"
delay_watch_checks no
delay_wait_checks no
retrigger_tries 3
retrigger_delay 10
missing_uev_wait_timeout 30
new_bindings_in_boot no
}
Nir