<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Tue, Jan 31, 2017 at 3:23 PM, Nathanaël Blanchet <span dir="ltr">&lt;<a href="mailto:blanchet@abes.fr" target="_blank">blanchet@abes.fr</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF">
<p>Exactly the same issue here with an FC EMC storage domain...</p><div><div class="gmail-h5"><br></div></div></div></blockquote><div><br></div><div>I'm trying to mitigate this by setting a timeout for my SAN devices, but I'm not sure it is effective, because on CentOS 7 the output of "multipathd -k" followed by "show config" seems to behave differently from CentOS 6.x.</div><div>My attempt at multipath.conf is this:</div><div><br></div><div><br></div><div><div># VDSM REVISION 1.3</div><div># VDSM PRIVATE</div><div><br></div><div>defaults {</div><div> polling_interval 5</div><div> no_path_retry fail</div><div> user_friendly_names no</div><div> flush_on_last_del yes</div><div> fast_io_fail_tmo 5</div><div> dev_loss_tmo 30</div><div> max_fds 4096</div><div>}</div><div><br></div><div># Remove devices entries when overrides section is available.</div><div>devices {</div><div> device {</div><div> # These settings overrides built-in devices settings. It does not apply</div><div> # to devices without built-in settings (these use the settings in the</div><div> # "defaults" section), or to devices defined in the "devices" section.</div><div> # Note: This is not available yet on Fedora 21. 
For more info see</div><div> # <a href="https://bugzilla.redhat.com/1253799">https://bugzilla.redhat.com/1253799</a></div><div> all_devs yes</div><div> no_path_retry fail</div><div> }</div></div><div><div> device {</div><div> vendor "IBM"</div><div> product "^1814"</div><div> product_blacklist "Universal Xport"</div><div> path_grouping_policy "group_by_prio"</div><div> path_checker "rdac"</div><div> features "0"</div><div> hardware_handler "1 rdac"</div><div> prio "rdac"</div><div> failback immediate</div><div> rr_weight "uniform"</div><div> no_path_retry "12"</div><div> }</div><div>}</div></div><div><br></div><div>So I copied exactly the default device config for my IBM/1814 device, with only no_path_retry changed to 12.</div><div><br></div><div>On CentOS 6.x, when you do something like this, "show config" shows only the modified entry for your device.</div><div>On CentOS 7.3, instead, I still seem to get the default entry for IBM/1814, plus the customized one at the end of the output...</div><div><br></div><div>Two facts:</div><div>- Before the change, I could reproduce the problem by selecting</div><div>Maintenance</div><div>Power Mgmt --&gt; Restart</div><div>(tried 3 times, same behavior each time)<br></div><div><br></div><div>If instead I ran it as separate steps</div><div>Maintenance</div><div>Power Mgmt --&gt; Stop</div><div>wait a moment</div><div>Power Mgmt --&gt; Start</div><div><br></div><div>I didn't get any problems (tried only once...)</div><div><br></div><div>- With this "new" multipath config (how can I confirm that it is actually in effect?) 
I don't get the VM paused problem even with Restart option of Power Mgmt</div><div>In active host messages I see these ones when the other reboots:</div><div><br></div><div><div>Jan 31 16:50:01 ovmsrv06 systemd: Started Session 705 of user root.</div><div>Jan 31 16:50:01 ovmsrv06 systemd: Starting Session 705 of user root.</div><div>Jan 31 16:53:47 ovmsrv06 multipathd: 3600a0b8000299aa80000d08955014098: sde - rdac checker reports path is up</div><div>Jan 31 16:53:47 ovmsrv06 multipathd: 8:64: reinstated</div><div>Jan 31 16:53:47 ovmsrv06 multipathd: 3600a0b8000299aa80000d08955014098: load table [0 41943040 multipath 1 queue_if_no_path 1 rdac 2 1 service-time 0 2 1 8:224 1 65:0 1 service-time 0 2 1 8:64 1 8:160 1]</div><div>Jan 31 16:53:47 ovmsrv06 multipathd: 3600a0b8000299aa80000d08955014098: sdo - rdac checker reports path is ghost</div><div>Jan 31 16:53:47 ovmsrv06 multipathd: 8:224: reinstated</div><div>Jan 31 16:53:47 ovmsrv06 multipathd: 3600a0b8000299aa80000d08955014098: sdk - rdac checker reports path is up</div><div>Jan 31 16:53:47 ovmsrv06 multipathd: 8:160: reinstated</div><div>Jan 31 16:53:47 ovmsrv06 kernel: sd 0:0:1:4: rdac: array Z1_DS4700, ctlr 1, queueing MODE_SELECT command</div><div>Jan 31 16:53:47 ovmsrv06 multipathd: 3600a0b8000299aa80000d08955014098: sdq - rdac checker reports path is ghost</div><div>Jan 31 16:53:47 ovmsrv06 multipathd: 65:0: reinstated</div><div>Jan 31 16:53:48 ovmsrv06 kernel: sd 0:0:1:4: rdac: array Z1_DS4700, ctlr 1, MODE_SELECT returned with sense 05/91/36</div><div>Jan 31 16:53:48 ovmsrv06 kernel: sd 0:0:1:4: rdac: array Z1_DS4700, ctlr 1, queueing MODE_SELECT command</div><div>Jan 31 16:53:49 ovmsrv06 kernel: sd 0:0:1:4: rdac: array Z1_DS4700, ctlr 1, MODE_SELECT returned with sense 05/91/36</div><div>Jan 31 16:53:49 ovmsrv06 kernel: sd 0:0:1:4: rdac: array Z1_DS4700, ctlr 1, queueing MODE_SELECT command</div><div>Jan 31 16:53:49 ovmsrv06 kernel: sd 0:0:1:4: rdac: array Z1_DS4700, ctlr 1, MODE_SELECT 
completed</div><div>Jan 31 16:53:49 ovmsrv06 kernel: sd 2:0:1:4: rdac: array Z1_DS4700, ctlr 1, queueing MODE_SELECT command</div><div>Jan 31 16:53:49 ovmsrv06 kernel: sd 2:0:1:4: rdac: array Z1_DS4700, ctlr 1, MODE_SELECT completed</div><div>Jan 31 16:53:52 ovmsrv06 multipathd: 3600a0b8000299aa80000d08955014098: sde - rdac checker reports path is ghost</div><div>Jan 31 16:53:52 ovmsrv06 multipathd: 8:64: reinstated</div><div>Jan 31 16:53:52 ovmsrv06 multipathd: 3600a0b8000299aa80000d08955014098: load table [0 41943040 multipath 1 queue_if_no_path 1 rdac 2 1 service-time 0 2 1 8:224 1 65:0 1 service-time 0 2 1 8:64 1 8:160 1]</div><div>Jan 31 16:53:52 ovmsrv06 multipathd: 3600a0b8000299aa80000d08955014098: sdo - rdac checker reports path is up</div><div>Jan 31 16:53:52 ovmsrv06 multipathd: 8:224: reinstated</div><div>Jan 31 16:53:52 ovmsrv06 multipathd: 3600a0b8000299aa80000d08955014098: sdk - rdac checker reports path is ghost</div><div>Jan 31 16:53:52 ovmsrv06 multipathd: 8:160: reinstated</div><div>Jan 31 16:53:52 ovmsrv06 multipathd: 3600a0b8000299aa80000d08955014098: sdq - rdac checker reports path is up</div><div>Jan 31 16:53:52 ovmsrv06 multipathd: 65:0: reinstated</div></div><div><br></div><div>But in this case they are not related to the multipath device dedicated to the oVirt storage domain.</div><div>What makes me optimistic is the difference in these lines:</div><div><br></div><div>Before, I got:</div><div>Jan 31 10:27:47 ovmsrv06 multipathd: 3600a0b8000299aa80000d08955014098: load table [0 41943040 multipath 0 1 rdac 2 1 service-time 0 2 1 8:224 1 65:0 1 service-time 0 2 1 8:64 1 8:160 1]<br></div><div><br></div><div>Now I get:</div><div><div>Jan 31 16:53:52 ovmsrv06 multipathd: 3600a0b8000299aa80000d08955014098: load table [0 41943040 multipath 1 queue_if_no_path 1 rdac 2 1 service-time 0 2 1 8:224 1 65:0 1 service-time 0 2 1 8:64 1 8:160 1]</div></div><div><br></div><div>multipath 0 1 rdac <br></div><div>vs</div><div>multipath 1 queue_if_no_path 1 
rdac<br></div><div><br></div><div>Any confirmation?</div><div>Thanks in advance</div></div></div></div>
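<div>To make the "multipath 0 1 rdac" vs "multipath 1 queue_if_no_path 1 rdac" comparison concrete, here is a minimal Python sketch that splits the two "load table" strings into their leading fields. It assumes the usual device-mapper multipath table layout: start, length, the word "multipath", a feature word count followed by the feature words, then a hardware handler word count followed by the handler words. The name parse_multipath_table is hypothetical and only for illustration; it is not part of multipath-tools.</div>

```python
# Table strings copied from the "load table [...]" log lines above.
BEFORE = ("0 41943040 multipath 0 1 rdac 2 1 service-time 0 2 1 8:224 1 "
          "65:0 1 service-time 0 2 1 8:64 1 8:160 1")
AFTER = ("0 41943040 multipath 1 queue_if_no_path 1 rdac 2 1 service-time "
         "0 2 1 8:224 1 65:0 1 service-time 0 2 1 8:64 1 8:160 1")


def parse_multipath_table(table):
    """Return the leading fields of a dm-multipath table string.

    Assumed layout (hypothetical helper, not a multipath-tools API):
      <start> <length> multipath
      <n_feature_words> [feature words...]
      <n_handler_words> [hardware handler words...]
      <remaining path group description...>
    """
    # Tolerate the surrounding [ ] that multipathd prints in its log line.
    tokens = table.strip().strip("[]").split()
    if len(tokens) < 5 or tokens[2] != "multipath":
        raise ValueError("not a multipath table string")

    pos = 3
    n_features = int(tokens[pos])
    pos += 1
    features = tokens[pos:pos + n_features]
    pos += n_features

    n_handler = int(tokens[pos])
    pos += 1
    handler = tokens[pos:pos + n_handler]
    pos += n_handler

    return {
        "start": int(tokens[0]),
        "length": int(tokens[1]),
        "features": features,
        "hardware_handler": handler,
        "rest": tokens[pos:],  # path groups, left unparsed in this sketch
    }


if __name__ == "__main__":
    for label, table in (("before", BEFORE), ("after", AFTER)):
        parsed = parse_multipath_table(table)
        print(label, parsed["features"], parsed["hardware_handler"])
```

<div>Read this way, the old table carries no feature words and the new one carries queue_if_no_path, with rdac as the hardware handler in both. This only shows what the map was loaded with; it does not by itself prove which multipath.conf section the setting came from.</div>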