Re: [ovirt-users] VMs paused due to IO issues - Dell Equallogic controller failover

Tuesday, 4 October 2016

On Tue, Oct 4, 2016 at 7:03 PM, Michal Skrivanek
<michal.skrivanek(a)redhat.com> wrote:

...

 > On 4 Oct 2016, at 09:51, Gary Lloyd <g.lloyd(a)keele.ac.uk> wrote:
 >
 > Hi
 >
 > We have Ovirt 3.65 with a Dell Equallogic SAN and we use Direct Luns for
 > all our VMs.
 > At the weekend during early hours an Equallogic controller failed over
 > to its standby on one of our arrays and this caused about 20 of our VMs
 > to be paused due to IO problems.
 >
 > I have also noticed that this happens during Equallogic firmware
 > upgrades since we moved onto Ovirt 3.65.
 >
 > As recommended by Dell disk timeouts within the VMs are set to 60
 > seconds when they are hosted on an EqualLogic SAN.
 >
 > Is there any other timeout value that we can configure in vdsm.conf to
 > stop VMs from getting paused when a controller fails over?

 Not really. But things are not so different when you look at it from the
 guest's perspective. If the intention is to hide the fact that there is a
 problem, and the guest should just see a delay (instead of dealing with
 an error), then pausing and unpausing is the right behavior. From the
 guest's point of view this is just a delay.

 >
 > Also is there anything that we can tweak to automatically unpause the
 > VMs once connectivity with the arrays is re-established?

 That should happen when the storage domain monitoring detects an error and
 then reactivates (http://gerrit.ovirt.org/16244). It may be that since you
 have direct luns it’s not working with those... dunno, storage people
 should chime in I guess...

We don't monitor direct luns, only storage domains, so we do not support
resuming vms using direct luns.

Multipath does monitor all devices, so we could monitor device status
via multipath, and resume paused vms when a device moves from the faulty
state to the active state.
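
To make the idea concrete, here is a rough Python sketch of the decision
logic only. It is not existing vdsm code: the function names, the
device-to-state dictionaries, and the VM-to-device mapping are all
hypothetical, and the sketch assumes that something else collects the
device states from multipath and performs the actual resume.

```python
from typing import Dict, Iterable, List, Set


def recovered_devices(previous: Dict[str, str],
                      current: Dict[str, str]) -> Set[str]:
    """Return devices that moved from 'faulty' to 'active'.

    Both arguments are hypothetical snapshots mapping a device name to
    its state, as some multipath monitor might report them.
    """
    return {
        dev for dev, state in current.items()
        if state == "active" and previous.get(dev) == "faulty"
    }


def vms_to_resume(paused_vms: Dict[str, Iterable[str]],
                  previous: Dict[str, str],
                  current: Dict[str, str]) -> List[str]:
    """Return paused VMs that look safe to resume.

    A VM qualifies when at least one of its devices has just recovered
    and every device it uses is now active. This policy is an
    assumption of the sketch, not something the thread specifies.
    """
    recovered = recovered_devices(previous, current)
    result = []
    for vm, devices in sorted(paused_vms.items()):
        devices = set(devices)
        all_active = all(current.get(d) == "active" for d in devices)
        if devices & recovered and all_active:
            result.append(vm)
    return result


if __name__ == "__main__":
    previous = {"lun-a": "faulty", "lun-b": "active", "lun-c": "faulty"}
    current = {"lun-a": "active", "lun-b": "active", "lun-c": "faulty"}
    paused = {"vm1": ["lun-a"], "vm2": ["lun-a", "lun-c"], "vm3": ["lun-b"]}
    print(vms_to_resume(paused, previous, current))
```

Requiring all of a VM's devices to be active avoids resuming a VM that
would pause again immediately on another faulty lun.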

Maybe open an RFE for this?

Nir
