<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Thu, Feb 9, 2017 at 10:03 AM, Grundmann, Christian <span dir="ltr"><<a href="mailto:Christian.Grundmann@fabasoft.com" target="_blank">Christian.Grundmann@fabasoft.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi,<br>
<br>
@ It can also be a low-level issue in the kernel, HBA, switch, or server.<br>
I have the old storage on the same cable, so I don’t think it’s HBA or switch related.<br>
On the same switch I have a few ESXi servers with the same storage setup, which are working without problems.<br>
<br>
@multipath<br>
I use the stock ng-node multipath configuration:<br>
<br>
# VDSM REVISION 1.3<br>
<br>
defaults {<br>
polling_interval 5<br>
no_path_retry fail<br>
user_friendly_names no<br>
flush_on_last_del yes<br>
fast_io_fail_tmo 5<br>
dev_loss_tmo 30<br>
max_fds 4096<br>
}<br>
<br>
# Remove devices entries when overrides section is available.<br>
devices {<br>
device {<br>
# These settings overrides built-in devices settings. It does not apply<br>
# to devices without built-in settings (these use the settings in the<br>
# "defaults" section), or to devices defined in the "devices" section.<br>
# Note: This is not available yet on Fedora 21. For more info see<br>
# <a href="https://bugzilla.redhat.com/1253799" rel="noreferrer" target="_blank">https://bugzilla.redhat.com/<wbr>1253799</a><br>
all_devs yes<br>
no_path_retry fail<br>
}<br>
}<br>
<br>
# Enable when this section is available on all supported platforms.<br>
# Options defined here override device specific options embedded into<br>
# multipathd.<br>
#<br>
# overrides {<br>
# no_path_retry fail<br>
# }<br>
<br>
<br>
multipath -r v3<br>
has no output<br></blockquote><div><br></div><div>My mistake, the correct command is:</div><div><br></div><div>multipath -r -v3<br></div><div><br></div><div>It creates a lot of output, so it is better to redirect it to a file and attach the file:</div><div><br></div><div>multipath -r -v3 > multipath-r-v3.out<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
<br>
Thx Christian<br>
<br>
<br>
From: Nir Soffer [mailto:<a href="mailto:nsoffer@redhat.com">nsoffer@redhat.com</a>]<br>
Sent: Wednesday, 08 February 2017 20:44<br>
To: Grundmann, Christian <<a href="mailto:Christian.Grundmann@fabasoft.com">Christian.Grundmann@fabasoft.<wbr>com</a>><br>
Cc: <a href="mailto:users@ovirt.org">users@ovirt.org</a><br>
Subject: Re: [ovirt-users] Storage domain experienced a high latency<br>
<div><div class="gmail-h5"><br>
On Wed, Feb 8, 2017 at 6:11 PM, Grundmann, Christian <mailto:<a href="mailto:Christian.Grundmann@fabasoft.com">Christian.Grundmann@<wbr>fabasoft.com</a>> wrote:<br>
Hi,<br>
I got a new FC storage (EMC Unity 300F), which is seen by my hosts in addition to my old storage, for migration.<br>
The new storage has only one path until the migration is done.<br>
I already have a few VMs running on the new storage without problems.<br>
But after starting some VMs (I don’t really know what the difference is to the working ones), the path for the new storage fails.<br>
<br>
Engine tells me: Storage Domain <storagedomain> experienced a high latency of 22.4875 seconds from host <host><br>
<br>
Where can I start looking?<br>
<br>
In /var/log/messages I found:<br>
<br>
Feb 8 09:03:53 ovirtnode01 multipathd: 360060160422143002a38935800ae2<wbr>760: sdd - emc_clariion_checker: Active path is healthy.<br>
Feb 8 09:03:53 ovirtnode01 multipathd: 8:48: reinstated<br>
Feb 8 09:03:53 ovirtnode01 multipathd: 360060160422143002a38935800ae2<wbr>760: remaining active paths: 1<br>
Feb 8 09:03:53 ovirtnode01 kernel: blk_update_request: I/O error, dev dm-10, sector 8<br>
Feb 8 09:03:53 ovirtnode01 kernel: blk_update_request: I/O error, dev dm-10, sector 5833475<br>
Feb 8 09:03:53 ovirtnode01 kernel: blk_update_request: I/O error, dev dm-10, sector 5833475<br>
Feb 8 09:03:53 ovirtnode01 kernel: blk_update_request: I/O error, dev dm-10, sector 4294967168<br>
Feb 8 09:03:53 ovirtnode01 kernel: Buffer I/O error on dev dm-207, logical block 97, async page read<br>
Feb 8 09:03:53 ovirtnode01 kernel: blk_update_request: I/O error, dev dm-10, sector 4294967168<br>
Feb 8 09:03:53 ovirtnode01 kernel: blk_update_request: I/O error, dev dm-10, sector 4294967280<br>
Feb 8 09:03:53 ovirtnode01 kernel: blk_update_request: I/O error, dev dm-10, sector 4294967280<br>
Feb 8 09:03:53 ovirtnode01 kernel: blk_update_request: I/O error, dev dm-10, sector 0<br>
Feb 8 09:03:53 ovirtnode01 kernel: blk_update_request: I/O error, dev dm-10, sector 0<br>
Feb 8 09:03:53 ovirtnode01 kernel: blk_update_request: I/O error, dev dm-10, sector 4294967168<br>
Feb 8 09:03:53 ovirtnode01 kernel: device-mapper: multipath: Reinstating path 8:48.<br>
Feb 8 09:03:53 ovirtnode01 kernel: sd 3:0:0:22: alua: port group 01 state A preferred supports tolUsNA<br>
Feb 8 09:03:53 ovirtnode01 sanlock[5192]: 2017-02-08 09:03:53+0100 151809 [11772]: s59 add_lockspace fail result -202<br>
Feb 8 09:04:05 ovirtnode01 multipathd: dm-33: remove map (uevent)<br>
Feb 8 09:04:05 ovirtnode01 multipathd: dm-33: devmap not registered, can't remove<br>
Feb 8 09:04:05 ovirtnode01 multipathd: dm-33: remove map (uevent)<br>
Feb 8 09:04:06 ovirtnode01 multipathd: dm-34: remove map (uevent)<br>
Feb 8 09:04:06 ovirtnode01 multipathd: dm-34: devmap not registered, can't remove<br>
Feb 8 09:04:06 ovirtnode01 multipathd: dm-34: remove map (uevent)<br>
Feb 8 09:04:08 ovirtnode01 multipathd: dm-33: remove map (uevent)<br>
Feb 8 09:04:08 ovirtnode01 multipathd: dm-33: devmap not registered, can't remove<br>
Feb 8 09:04:08 ovirtnode01 multipathd: dm-33: remove map (uevent)<br>
Feb 8 09:04:08 ovirtnode01 kernel: dd: sending ioctl 80306d02 to a partition!<br>
Feb 8 09:04:24 ovirtnode01 sanlock[5192]: 2017-02-08 09:04:24+0100 151840 [15589]: read_sectors delta_leader offset 2560 rv -202 /dev/f9b70017-0a34-47bc-bf2f-<wbr>dfc70200a347/ids<br>
Feb 8 09:04:34 ovirtnode01 sanlock[5192]: 2017-02-08 09:04:34+0100 151850 [15589]: f9b70017 close_task_aio 0 0x7fd78c0008c0 busy<br>
Feb 8 09:04:39 ovirtnode01 multipathd: 360060160422143002a38935800ae2<wbr>760: sdd - emc_clariion_checker: Read error for WWN 60060160422143002a38935800ae27<wbr>60. Sense data are 0x0/0x0/0x0.<br>
Feb 8 09:04:39 ovirtnode01 multipathd: checker failed path 8:48 in map 360060160422143002a38935800ae2<wbr>760<br>
Feb 8 09:04:39 ovirtnode01 multipathd: 360060160422143002a38935800ae2<wbr>760: remaining active paths: 0<br>
Feb 8 09:04:39 ovirtnode01 kernel: qla2xxx [0000:11:00.0]-801c:3: Abort command issued nexus=3:0:22 -- 1 2002.<br>
Feb 8 09:04:39 ovirtnode01 kernel: device-mapper: multipath: Failing path 8:48.<br>
Feb 8 09:04:40 ovirtnode01 kernel: qla2xxx [0000:11:00.0]-801c:3: Abort command issued nexus=3:0:22 -- 1 2002.<br>
Feb 8 09:04:42 ovirtnode01 kernel: blk_update_request: 8 callbacks suppressed<br>
Feb 8 09:04:42 ovirtnode01 kernel: blk_update_request: I/O error, dev dm-10, sector 4294967168<br>
Feb 8 09:04:42 ovirtnode01 kernel: blk_update_request: I/O error, dev dm-10, sector 4294967280<br>
Feb 8 09:04:42 ovirtnode01 kernel: blk_update_request: I/O error, dev dm-10, sector 0<br>
Feb 8 09:04:42 ovirtnode01 kernel: blk_update_request: I/O error, dev dm-10, sector 4294967168<br>
Feb 8 09:04:42 ovirtnode01 kernel: blk_update_request: I/O error, dev dm-10, sector 4294967280<br>
Feb 8 09:04:42 ovirtnode01 kernel: blk_update_request: I/O error, dev dm-10, sector 0<br>
<br>
Maybe you should consult the storage vendor about this?<br>
<br>
It can also be an incorrect multipath configuration: maybe the multipath checker<br>
fails, and because you have only one path, the device moves to a faulty state and<br>
sanlock fails to access the device.<br>
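<br>
For example, a quick way to check this theory (a minimal sketch; the WWID and the ids path are taken from your logs above, adjust them if they differ on your host):<br>
<br>
# show the path state multipath sees for this LUN<br>
multipath -ll 360060160422143002a38935800ae2760<br>
<br>
# read the sanlock ids volume directly, bypassing the page cache<br>
dd if=/dev/f9b70017-0a34-47bc-bf2f-dfc70200a347/ids of=/dev/null bs=512 count=2048 iflag=direct<br>
<br>
If the dd read hangs or fails while the path is marked faulty, sanlock errors like the add_lockspace failure above are expected.<br>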
<br>
It can also be a low-level issue in the kernel, HBA, switch, or server.<br>
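<br>
For the low-level FC side, one thing worth checking (a minimal sketch; host numbers and available counters depend on the HBA driver) is the port state and error counters in sysfs:<br>
<br>
# link state and speed of each FC host<br>
grep . /sys/class/fc_host/host*/port_state /sys/class/fc_host/host*/speed<br>
<br>
# error counters; values that keep rising point at cabling, SFP, or switch problems<br>
grep . /sys/class/fc_host/host*/statistics/link_failure_count /sys/class/fc_host/host*/statistics/invalid_crc_count<br>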
<br>
Let's start by inspecting the multipath configuration. Can you share the<br>
output of:<br>
<br>
cat /etc/multipath.conf<br>
multipath -r v3<br>
<br>
Maybe you can expose one LUN for testing, and blacklist this LUN in<br>
multipath.conf. You will not be able to use this LUN in oVirt, but it can<br>
be used to validate the layers below multipath. If the plain LUN is ok,<br>
and the same LUN used as a multipath device fails, the problem is likely to be<br>
the multipath configuration.<br>
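<br>
For example, a minimal sketch (WWID_OF_TEST_LUN and /dev/sdX are placeholders; take the real values from multipath -ll and lsblk on the host):<br>
<br>
# /etc/multipath.conf: keep the test LUN out of multipath<br>
blacklist {<br>
    wwid "WWID_OF_TEST_LUN"<br>
}<br>
<br>
# make multipathd re-read its configuration<br>
multipathd -k"reconfigure"<br>
<br>
# read from the SCSI device directly, bypassing multipath and the page cache<br>
dd if=/dev/sdX of=/dev/null bs=1M count=1024 iflag=direct<br>
<br>
If the direct read is clean while the multipath device keeps failing, the multipath layer (configuration or path checker) is the first suspect.<br>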
<br>
Nir<br>
<br>
<br>
<br>
multipath -ll output for this domain:<br>
<br>
360060160422143002a38935800ae2<wbr>760 dm-10 DGC ,VRAID<br>
size=2.0T features='1 retain_attached_hw_handler' hwhandler='1 alua' wp=rw<br>
`-+- policy='service-time 0' prio=50 status=active<br>
`- 3:0:0:22 sdd 8:48 active ready running<br>
<br>
<br>
Thx Christian<br>
<br>
<br>
<br>
______________________________<wbr>_________________<br>
Users mailing list<br>
</div></div>mailto:<a href="mailto:Users@ovirt.org">Users@ovirt.org</a><br>
<a href="http://lists.ovirt.org/mailman/listinfo/users" rel="noreferrer" target="_blank">http://lists.ovirt.org/<wbr>mailman/listinfo/users</a><br>
<br>
</blockquote></div><br></div></div>