[ovirt-users] Ovirt host activation and lvm looping with high CPU load trying to mount iSCSI storage

Nir Soffer nsoffer at redhat.com
Sun Jan 15 13:36:05 UTC 2017


On Fri, Jan 13, 2017 at 11:29 AM, Mark Greenall
<m.greenall at iontrading.com> wrote:
> Hi Nir,
>
> Thanks very much for your feedback. It's really useful information. I keep my fingers crossed it leads to a solution for us.
>
> All the settings we currently have were to try and optimise the Equallogic for Linux and Ovirt.
>
> The multipath config settings came from this Dell Forum thread re: getting EqualLogic to work with Ovirt http://en.community.dell.com/support-forums/storage/f/3775/t/19529606

I don't think it is a good idea to copy undocumented changes to
multipath.conf like this.

You must understand every change you make in your multipath.conf. If you
cannot explain a change, you should use the default for that setting.
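
One way to see which settings you would need to explain is to diff your
settings against the defaults. A minimal sketch in Python, assuming each
section is a flat list of "keyword value" lines (it does not handle the
nested "devices" section). parse_section and changed_settings are
hypothetical helper names, not part of multipath or vdsm, and the sample
sections are shortened for illustration:

```python
def parse_section(text):
    """Parse 'keyword value' lines from one flat multipath.conf section."""
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line or line.endswith("{") or line == "}":
            continue  # skip blank lines and section braces
        parts = line.split(None, 1)
        value = parts[1] if len(parts) > 1 else ""
        settings[parts[0]] = value.strip().strip('"')
    return settings


def changed_settings(defaults, local):
    """Return {keyword: (default_value, local_value)} for differing keywords."""
    return {
        key: (defaults.get(key), value)
        for key, value in local.items()
        if defaults.get(key) != value
    }


# Illustrative input only: a few keywords, not a complete configuration.
BUILTIN = """
defaults {
    polling_interval 10
    no_path_retry "fail"
    max_fds 8192
}
"""

LOCAL = """
defaults {
    polling_interval 5
    no_path_retry fail
    max_fds 4096
}
"""

changed = changed_settings(parse_section(BUILTIN), parse_section(LOCAL))
for key, (old, new) in sorted(changed.items()):
    print(f"{key}: default={old} local={new}")
```

Every keyword this prints is a change you should be able to justify.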

> The udev settings were from the Dell Optimizing SAN Environment for Linux Guide here: http://en.community.dell.com/dell-groups/dtcmedia/m/mediagallery/20371245/download

I am not sure that anyone has tested these changes with oVirt.

I think the general approach is to first make the system work using
the defaults, applying only the required changes.

Tuning should be done only after the system works and you can show
that you have performance issues that need tuning.

> Perhaps some of the settings are now conflicting with Ovirt best practice as you optimise the releases.
>
> As requested, here is the output of multipath -ll
>
> [root@uk1-ion-ovm-08 rules.d]# multipath -ll
> 364842a3403798409cf7d555c6b8bb82e dm-237 EQLOGIC ,100E-00
> size=1.5T features='0' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=active
>   |- 48:0:0:0  sdan 66:112 active ready running
>   `- 49:0:0:0  sdao 66:128 active ready running
> 364842a34037924a7bf7d25416b8be891 dm-212 EQLOGIC ,100E-00
> size=345G features='0' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=active
>   |- 42:0:0:0  sdah 66:16  active ready running
>   `- 43:0:0:0  sdai 66:32  active ready running
> 364842a340379c497f47ee5fe6c8b9846 dm-459 EQLOGIC ,100E-00
> size=175G features='0' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=active
>   |- 86:0:0:0  sdbz 68:208 active ready running
>   `- 87:0:0:0  sdca 68:224 active ready running
> 364842a34037944f2807fe5d76d8b1842 dm-526 EQLOGIC ,100E-00
> size=200G features='0' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=active
>   |- 96:0:0:0  sdcj 69:112 active ready running
>   `- 97:0:0:0  sdcl 69:144 active ready running
> 364842a3403798426d37e05bc6c8b6843 dm-420 EQLOGIC ,100E-00
> size=250G features='0' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=active
>   |- 82:0:0:0  sdbu 68:128 active ready running
>   `- 83:0:0:0  sdbw 68:160 active ready running
> 364842a340379449fbf7dc5406b8b2818 dm-199 EQLOGIC ,100E-00
> size=200G features='0' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=active
>   |- 38:0:0:0  sdad 65:208 active ready running
>   `- 39:0:0:0  sdae 65:224 active ready running
> 364842a34037984543c7d35a86a8bc8ee dm-172 EQLOGIC ,100E-00
> size=670G features='0' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=active
>   |- 36:0:0:0  sdaa 65:160 active ready running
>   `- 37:0:0:0  sdac 65:192 active ready running
> 364842a340379e4303c7dd5a76a8bd8b4 dm-140 EQLOGIC ,100E-00
> size=1.5T features='0' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=active
>   |- 33:0:0:0  sdx  65:112 active ready running
>   `- 32:0:0:0  sdy  65:128 active ready running
> 364842a340379b44c7c7ed53b6c8ba8c0 dm-359 EQLOGIC ,100E-00
> size=300G features='0' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=active
>   |- 69:0:0:0  sdbi 67:192 active ready running
>   `- 68:0:0:0  sdbh 67:176 active ready running
> 364842a3403790415d37ed5bb6c8b68db dm-409 EQLOGIC ,100E-00
> size=200G features='0' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=active
>   |- 80:0:0:0  sdbt 68:112 active ready running
>   `- 81:0:0:0  sdbv 68:144 active ready running
> 364842a34037964f7807f15d86d8b8860 dm-527 EQLOGIC ,100E-00
> size=200G features='0' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=active
>   |- 98:0:0:0  sdck 69:128 active ready running
>   `- 99:0:0:0  sdcm 69:160 active ready running
> 364842a34037944aebf7d85416b8ba895 dm-226 EQLOGIC ,100E-00
> size=200G features='0' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=active
>   |- 46:0:0:0  sdal 66:80  active ready running
>   `- 47:0:0:0  sdam 66:96  active ready running
> 364842a340379f44f7c7e053c6c8b98d2 dm-360 EQLOGIC ,100E-00
> size=450G features='0' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=active
>   |- 70:0:0:0  sdbj 67:208 active ready running
>   `- 71:0:0:0  sdbk 67:224 active ready running
> 364842a34037924276e7e051e6c8b084f dm-308 EQLOGIC ,100E-00
> size=120G features='0' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=active
>   |- 61:0:0:0  sdba 67:64  active ready running
>   `- 60:0:0:0  sdaz 67:48  active ready running
> 364842a34037994b93b7d85a66a8b789a dm-37 EQLOGIC ,100E-00
> size=270G features='0' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=active
>   |- 20:0:0:0  sdl  8:176  active ready running
>   `- 21:0:0:0  sdm  8:192  active ready running
> 364842a340379348d6e7e351e6c8b4865 dm-319 EQLOGIC ,100E-00
> size=310G features='0' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=active
>   |- 62:0:0:0  sdbb 67:80  active ready running
>   `- 63:0:0:0  sdbc 67:96  active ready running
> 364842a34037994cd3b7db5a66a8bc8ff dm-70 EQLOGIC ,100E-00
> size=270G features='0' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=active
>   |- 23:0:0:0  sdn  8:208  active ready running
>   `- 22:0:0:0  sdo  8:224  active ready running
> 364842a340379e4826d7e451d6c8b6894 dm-269 EQLOGIC ,100E-00
> size=100G features='0' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=active
>   |- 52:0:0:0  sdar 66:176 active ready running
>   `- 53:0:0:0  sdas 66:192 active ready running
> 364842a340379648ff47eb5fe6c8bc8f8 dm-458 EQLOGIC ,100E-00
> size=100G features='0' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=active
>   |- 84:0:0:0  sdbx 68:176 active ready running
>   `- 85:0:0:0  sdby 68:192 active ready running
> 364842a340379d4a3bf7df5406b8b08bf dm-200 EQLOGIC ,100E-00
> size=300G features='0' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=active
>   |- 40:0:0:0  sdaf 65:240 active ready running
>   `- 41:0:0:0  sdag 66:0   active ready running
> 364842a34037984db3b7de5a66a8bf896 dm-84 EQLOGIC ,100E-00
> size=270G features='0' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=active
>   |- 24:0:0:0  sdp  8:240  active ready running
>   `- 25:0:0:0  sdq  65:0   active ready running
> 364842a340379c4fa3b7d75a76a8b88bf dm-113 EQLOGIC ,100E-00
> size=350G features='0' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=active
>   |- 29:0:0:0  sdt  65:48  active ready running
>   `- 28:0:0:0  sdu  65:64  active ready running
> 364842a34a3382221bd7695715c18612a dm-5 EQLOGIC ,100E-00
> size=250G features='0' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=active
>   |- 16:0:0:0  sdh  8:112  active ready running
>   `- 17:0:0:0  sdj  8:144  active ready running
> 364842a3403794465ec84b5fe778b88f9 dm-548 EQLOGIC ,100E-00
> size=500G features='0' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=active
>   |- 105:0:0:0 sdcs 70:0   active ready running
>   `- 104:0:0:0 sdcr 69:240 active ready running
> 364842a340379c4206e7ed51d6c8ba87e dm-280 EQLOGIC ,100E-00
> size=300G features='0' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=active
>   |- 58:0:0:0  sdaw 67:0   active ready running
>   `- 59:0:0:0  sday 67:32  active ready running
> 364842a34a33842ad7e7c15755c18311b dm-15 EQLOGIC ,100E-00
> size=350G features='0' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=active
>   |- 18:0:0:0  sdi  8:128  active ready running
>   `- 19:0:0:0  sdk  8:160  active ready running
> 364842a34a33822199875b5705c18711f dm-1 EQLOGIC ,100E-00
> size=250G features='0' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=active
>   |- 12:0:0:0  sdc  8:32   active ready running
>   `- 13:0:0:0  sde  8:64   active ready running
> 364842a34a338a237947585705c18214e dm-0 EQLOGIC ,100E-00
> size=100G features='0' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=active
>   |- 10:0:0:0  sdb  8:16   active ready running
>   `- 11:0:0:0  sdd  8:48   active ready running
> 364842a34037934a7c18385d0758ba8e1 dm-547 EQLOGIC ,100E-00
> size=100G features='0' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=active
>   |- 103:0:0:0 sdcq 69:224 active ready running
>   `- 102:0:0:0 sdcp 69:208 active ready running
> 364842a340379349fc97ff56f6e8bf843 dm-546 EQLOGIC ,100E-00
> size=500G features='0' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=active
>   |- 100:0:0:0 sdcn 69:176 active ready running
>   `- 101:0:0:0 sdco 69:192 active ready running
> 364842a34037914537c7e353c6c8bf891 dm-382 EQLOGIC ,100E-00
> size=300G features='0' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=active
>   |- 72:0:0:0  sdbl 67:240 active ready running
>   `- 73:0:0:0  sdbm 68:0   active ready running
> 364842a340379347d6d7e151d6c8bf8ad dm-257 EQLOGIC ,100E-00
> size=100G features='0' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=active
>   |- 51:0:0:0  sdap 66:144 active ready running
>   `- 50:0:0:0  sdaq 66:160 active ready running
> 364842a3403798485b57eb59a6c8bc8be dm-405 EQLOGIC ,100E-00
> size=65G features='0' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=active
>   |- 76:0:0:0  sdbp 68:48  active ready running
>   `- 77:0:0:0  sdbr 68:80  active ready running
> 364842a340379d4986e7e951e6c8b48b3 dm-330 EQLOGIC ,100E-00
> size=70G features='0' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=active
>   |- 66:0:0:0  sdbf 67:144 active ready running
>   `- 67:0:0:0  sdbg 67:160 active ready running
> 364842a34037924433c7d05a86a8bc819 dm-154 EQLOGIC ,100E-00
> size=1.1T features='0' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=active
>   |- 34:0:0:0  sdz  65:144 active ready running
>   `- 35:0:0:0  sdab 65:176 active ready running
> 364842a340379a4e5807f85d76d8b087c dm-506 EQLOGIC ,100E-00
> size=200G features='0' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=active
>   |- 92:0:0:0  sdcf 69:48  active ready running
>   `- 93:0:0:0  sdch 69:80  active ready running
> 364842a340379c4876d7e751d6c8be8a6 dm-268 EQLOGIC ,100E-00
> size=100G features='0' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=active
>   |- 54:0:0:0  sdat 66:208 active ready running
>   `- 55:0:0:0  sdau 66:224 active ready running
> 364842a34037964eb807fb5d76d8b586d dm-507 EQLOGIC ,100E-00
> size=200G features='0' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=active
>   |- 94:0:0:0  sdcg 69:64  active ready running
>   `- 95:0:0:0  sdci 69:96  active ready running
> 364842a340379a4cda87ec57a6c8ba85f dm-383 EQLOGIC ,100E-00
> size=50G features='0' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=active
>   |- 74:0:0:0  sdbn 68:16  active ready running
>   `- 75:0:0:0  sdbo 68:32  active ready running
> 364842a340379a49ff47e15ff6c8b18ef dm-482 EQLOGIC ,100E-00
> size=200G features='0' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=active
>   |- 88:0:0:0  sdcb 68:240 active ready running
>   `- 89:0:0:0  sdcc 69:0   active ready running
> 364842a340379f488b57ee59a6c8b789c dm-406 EQLOGIC ,100E-00
> size=300G features='0' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=active
>   |- 78:0:0:0  sdbq 68:64  active ready running
>   `- 79:0:0:0  sdbs 68:96  active ready running
> 364842a340379641f3c7da5a76a8b5804 dm-126 EQLOGIC ,100E-00
> size=1.5T features='0' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=active
>   |- 30:0:0:0  sdv  65:80  active ready running
>   `- 31:0:0:0  sdw  65:96  active ready running
> 364842a340379e4a4f47e45ff6c8b0882 dm-483 EQLOGIC ,100E-00
> size=400G features='0' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=active
>   |- 90:0:0:0  sdcd 69:16  active ready running
>   `- 91:0:0:0  sdce 69:32  active ready running
> 364842a34037924aabf7d55416b8bf8d4 dm-225 EQLOGIC ,100E-00
> size=345G features='0' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=active
>   |- 45:0:0:0  sdak 66:64  active ready running
>   `- 44:0:0:0  sdaj 66:48  active ready running
> 364842a340379841b6e7ea51d6c8b88ef dm-279 EQLOGIC ,100E-00
> size=50G features='0' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=active
>   |- 56:0:0:0  sdav 66:240 active ready running
>   `- 57:0:0:0  sdax 67:16  active ready running
> 364842a340379e4ea3b7d15a76a8b58f6 dm-98 EQLOGIC ,100E-00
> size=270G features='0' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=active
>   |- 26:0:0:0  sdr  65:16  active ready running
>   `- 27:0:0:0  sds  65:32  active ready running
> 364842a34a33832229875e5705c180173 dm-2 EQLOGIC ,100E-00
> size=250G features='0' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=active
>   |- 14:0:0:0  sdf  8:80   active ready running
>   `- 15:0:0:0  sdg  8:96   active ready running
> 364842a34037964946e7e651e6c8b98bc dm-329 EQLOGIC ,100E-00
> size=210G features='0' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=active
>   |- 64:0:0:0  sdbe 67:128 active ready running
>   `- 65:0:0:0  sdbd 67:112 active ready running
>
>
>
> For the used configuration multipathd does not have a device section for EQLOGIC so we are using the defaults which are:
>
> defaults {
>         verbosity 2
>         polling_interval 10
>         max_polling_interval 40
>         reassign_maps "yes"
>         multipath_dir "/lib64/multipath"
>         path_selector "round-robin 0"
>         path_grouping_policy "multibus"
>         uid_attribute "ID_SERIAL"
>         prio "const"
>         prio_args ""
>         features "0"
>         path_checker "tur"
>         alias_prefix "mpath"
>         failback "immediate"
>         rr_min_io 1000
>         rr_min_io_rq 1
>         max_fds 8192
>         rr_weight "priorities"
>         no_path_retry "fail"
>         queue_without_daemon "no"
>         flush_on_last_del "no"
>         user_friendly_names "no"
>         fast_io_fail_tmo 5
>         bindings_file "/etc/multipath/bindings"
>         wwids_file /etc/multipath/wwids
>         log_checker_err always
>         find_multipaths no
>         retain_attached_hw_handler no
>         detect_prio no
>         hw_str_match no
>         force_sync no
>         deferred_remove no
>         ignore_new_boot_devs no
>         skip_kpartx no
>         config_dir "/etc/multipath/conf.d"
>         delay_watch_checks no
>         delay_wait_checks no
>         retrigger_tries 3
>         retrigger_delay 10
>         missing_uev_wait_timeout 30
>         new_bindings_in_boot no
> }
>
> Many Thanks,
> Mark

I suggest using the following multipath.conf.

This may not solve anything, but it should be a good start for documenting
which parameters are modified for this storage.

# VDSM REVISION 1.3
# VDSM PRIVATE

defaults {
    deferred_remove             yes
    dev_loss_tmo                30
    fast_io_fail_tmo            5
    flush_on_last_del           yes
    max_fds                     4096
    no_path_retry               fail
    polling_interval            5
    user_friendly_names         no
}

devices {
    device {
        vendor                  "EQLOGIC"
        product                 "100E-00"

        # Ovirt defaults
        deferred_remove         yes
        dev_loss_tmo            30
        fast_io_fail_tmo        5
        flush_on_last_del       yes
        polling_interval        5
        user_friendly_names     no

        # Local settings
        max_fds                 8192
        path_checker            tur
        path_grouping_policy    multibus
        path_selector           "round-robin 0"

        # Using 4 retries provides an additional 20 seconds of grace time
        # when no path is available before the device is disabled (assuming
        # a 5 second polling interval). This may prevent VMs from pausing
        # when there is a short outage on the storage server or network.
        no_path_retry           4
    }

    device {
        # These settings override the built-in device settings. They do not
        # apply to devices without built-in settings (these use the settings
        # in the "defaults" section), or to devices defined in the "devices"
        # section.
        all_devs                yes
        no_path_retry           fail
    }
}
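
The grace time mentioned in the no_path_retry comment is simple arithmetic:
the number of retries multiplied by the polling interval. A rough sketch in
Python, assuming the estimate is exactly that product, as the comment does.
grace_time_seconds is a hypothetical helper, not part of multipath:

```python
def grace_time_seconds(no_path_retry, polling_interval):
    """Estimate how long I/O is queued after the last path fails.

    Returns 0 for "fail" (no queueing), None for "queue" (queue forever),
    otherwise retries * polling_interval.
    """
    if no_path_retry == "fail":
        return 0
    if no_path_retry == "queue":
        return None
    return int(no_path_retry) * polling_interval


print(grace_time_seconds(4, 5))       # EQLOGIC device section above: 20
print(grace_time_seconds("fail", 5))  # "defaults" section above: 0
```

If VMs still pause on short outages, this makes it easy to see how much
extra time a larger no_path_retry would buy.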

Can you report whether this makes any difference?

Nir

