On Tue, Sep 4, 2018 at 11:30 AM Fabrice Bacchella <fabrice.bacchella@orange.fr> wrote:


> On 3 Sep 2018, at 19:15, Nir Soffer <nsoffer@redhat.com> wrote:

Thank you for your help, but I'm still not out of trouble.

>
> On Mon, Sep 3, 2018 at 8:01 PM Fabrice Bacchella <fabrice.bacchella@orange.fr> wrote:
>
>> On 3 Sep 2018, at 18:31, Nir Soffer <nsoffer@redhat.com> wrote:
>>
>> On Mon, Sep 3, 2018 at 5:07 PM Fabrice Bacchella <fabrice.bacchella@orange.fr> wrote:
>> In the release notes, I see:
>>
>> • BZ 1622700 [downstream clone - 4.2.6] [RFE][Dalton] - Blacklist all local disk in multipath on RHEL / RHEV Host (RHEL 7.5)
>> Feature:
>> Blacklist local devices in multipath.
>>
>> Reason:
>> multipath repeatedly logs irrelevant errors for local devices.
>>
>> Result:
>> Local devices are blacklisted, and no irrelevant errors are logged anymore.
>>
>> What defines a local disk? I'm using a SAN on SAS. Many people think SAS is only for local disks, but that's not the case. Will oVirt 4.2.6 detect that?
>>
>> We don't have any support for SAS.
>>
>> If your SAS drives are attached to the host using FC or iSCSI, you are fine.
>
> Nope, they are attached using SAS.
>
> I guess oVirt sees them as FCP devices?

Yes, in the oVirt UI I configured my storage as FCP, and everything has worked well since 3.6.

>
> Are these disks connected to multiple hosts?

Yes, that's a real SAN, multi-attached to HPE's blades
>
> Please share the output of:
>
>     vdsm-client Host getDeviceList

Things are strange:

    {
        "status": "used",
        "vendorID": "HP iLO",
        "GUID": "HP_iLO_LUN_01_Media_0_000002660A01-0:1",
        "capacity": "1073741824",
        "fwrev": "2.10",
        "discard_zeroes_data": 0,
        "vgUUID": "",
        "pathlist": [],
        "pvsize": "",
        "discard_max_bytes": 0,
        "pathstatus": [
            {
                "capacity": "1073741824",
                "physdev": "sddj",
                "type": "FCP",
                "state": "active",
                "lun": "1"
            }
        ],
        "devtype": "FCP",
        "physicalblocksize": "512",
        "pvUUID": "",
        "serial": "",
        "logicalblocksize": "512",
        "productID": "LUN 01 Media 0"
    },
...
    {
        "status": "used",
        "vendorID": "HP",
        "GUID": "3600c0ff0002631c42168f15601000000",
        "capacity": "1198996324352",
        "fwrev": "G22x",
        "discard_zeroes_data": 0,
        "vgUUID": "xGCmpC-DhHe-3v6v-6LJw-iS24-ExCE-0Hv48U",
        "pathlist": [],
        "pvsize": "1198698528768",
        "discard_max_bytes": 0,
        "pathstatus": [
            {
                "capacity": "1198996324352",
                "physdev": "sdc",
                "type": "FCP",
                "state": "active",
                "lun": "16"
            },
            {
                "capacity": "1198996324352",
                "physdev": "sds",
                "type": "FCP",
                "state": "active",
                "lun": "16"
            },


...

The first one is an embedded flash drive:
lrwxrwxrwx 1 root root 10 Jul 12 17:11 /dev/disk/by-id/usb-HP_iLO_LUN_01_Media_0_000002660A01-0:1 -> ../../sddj
lrwxrwxrwx 1 root root 10 Jul 12 17:11 /dev/disk/by-path/pci-0000:00:14.0-usb-0:3.1:1.0-scsi-0:0:0:1 -> ../../sddj

So why "type": "FCP"?

"FCP" actually means "not iSCSI". This is why your SAS storage works even though
oVirt does not know anything about SAS.
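
Since the reported type only tells iSCSI apart from everything else, the by-path link names you listed are a better hint about the real transport. Here is a rough Python sketch that pulls the transport token out of such a name. The helper name and the token list are my assumptions for illustration, the list is not exhaustive, and this is not what vdsm does:

```python
import re

# Transport tokens as they appear in /dev/disk/by-path link names.
# This list is an assumption for illustration and is not exhaustive.
_TRANSPORT_RE = re.compile(r"(?:^|-)(usb|sas|fc|iscsi|ata)-")


def transport_from_by_path(name):
    """Return the transport token found in a by-path link name, or None."""
    match = _TRANSPORT_RE.search(name)
    return match.group(1) if match else None


# The by-path name of the iLO flash drive from the listing above.
print(transport_from_by_path("pci-0000:00:14.0-usb-0:3.1:1.0-scsi-0:0:0:1"))
```

For the iLO device this reports usb, which matches what the by-id link already suggests.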

This is why the blacklist-by-protocol feature was introduced in 7.5, so that multipath
can grab only shared storage and avoid grabbing local devices like your SSD.
See https://bugzilla.redhat.com/show_bug.cgi?id=1593459

According to this bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1607749

The fix is available in:
device-mapper-multipath-0.4.9-119.el7_5.1.x86_64

Which device-mapper-multipath package are you using?
 
The second is indeed a SAS drive behind a SAS SAN (an MSA 2040 SAS from HPE).


>  ...
> Where do I find the protocol multipath thinks the drives are using ?
>
> multipath.conf(5) says:
>
>        The protocol strings that multipath recognizes are scsi:fcp, scsi:spi, scsi:ssa, scsi:sbp,
>        scsi:srp, scsi:iscsi, scsi:sas, scsi:adt, scsi:ata, scsi:unspec, ccw, cciss, nvme,  and
>        undef.  The protocol that a path is using can be viewed by running multipathd show
>        paths format "%d %P"

I have CentOS 7.5:

lsb_release -a
LSB Version:    :core-4.1-amd64:core-4.1-noarch
Distributor ID: CentOS
Description:    CentOS Linux release 7.5.1804 (Core)
Release:        7.5.1804
Codename:       Core

and this text is not in my multipath.conf(5), although blacklist_exceptions exists.

The given command doesn't work:
multipathd show paths format "%d %P"
dev 
sddi
sddj
sda 
... 

It looks like your system does not have the fix.
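
If you want to check this on several hosts, a small Python sketch along these lines could parse the output of that command and tell whether a protocol column came back at all. The function names are mine, and I am assuming the first non-empty line is the header and the columns are separated by whitespace:

```python
def parse_paths(output):
    """Map device name -> protocol, or None when no protocol column is present.

    Assumes the first non-empty line is the header and that columns are
    separated by whitespace.
    """
    rows = [line.split() for line in output.splitlines() if line.strip()]
    paths = {}
    for fields in rows[1:]:  # skip the header row
        paths[fields[0]] = fields[1] if len(fields) > 1 else None
    return paths


def protocol_reported(paths):
    """True if at least one path came back with a protocol string."""
    return any(proto is not None for proto in paths.values())


# The output quoted above: only the device column is printed.
sample = "dev \nsddi\nsddj\nsda \n"
print(protocol_reported(parse_paths(sample)))
```

With the output you pasted, every device maps to None, so the check reports that no protocol is available.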
 
> So this should work:
>
> blacklist_exceptions {
>         protocol "(scsi:fcp|scsi:iscsi|scsi:sas)"
> }
>
> The best way to make this change is to create a dropin conf file,
> and not touch /etc/multipath.conf, so vdsm will be able to update later.
>
> $ cat /etc/multipath/conf.d/local.conf
> blacklist_exceptions {
>         protocol "(scsi:fcp|scsi:iscsi|scsi:sas)"
> }
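
To see which of the protocol strings from multipath.conf(5) this expression would let through, here is a minimal Python sketch. It approximates multipath's regular expression matching with Python's re module and an unanchored search, which is an assumption on my side, and the helper name is only for illustration:

```python
import re

# The expression from the blacklist_exceptions section above.
EXCEPTION_RE = re.compile(r"(scsi:fcp|scsi:iscsi|scsi:sas)")


def is_exception(protocol):
    """True if the protocol string matches the exception expression.

    Uses an unanchored search; this approximates multipath's matching
    and is an assumption, not taken from the multipath sources.
    """
    return EXCEPTION_RE.search(protocol) is not None


for proto in ("scsi:fcp", "scsi:iscsi", "scsi:sas", "scsi:ata", "undef"):
    print(proto, is_exception(proto))
```

Only the three shared-storage protocols match; scsi:ata and undef do not, so such paths would not be covered by the exception.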

The header in /etc/multipath.conf says:

# The recommended way to add configuration for your storage is to add a
# drop-in configuration file in "/etc/multipath/conf.d/<mydevice>.conf".

Does <mydevice> have any significance, or is it just an arbitrary string that can be used as a reminder?

"mydevice" is not a good name. It is just an arbitrary name that is useful to you;
multipath does not care about the name.

I'll update this to "my.conf" to make it clearer.

Nir