Re: [ovirt-users] Ovirt host activation and lvm looping with high CPU load trying to mount iSCSI storage

Thursday, 12 January 2017

On Thu, Jan 12, 2017 at 5:01 PM, Mark Greenall <m.greenall(a)iontrading.com>
wrote:

...
 Hi Yaniv,

 >> 1. There is no point in so many connections.

 >> 2. Certainly not the same portal - you really should have more.

 >> 3. Note that some go via bond1 - and some via 'default' interface. Is
 that intended?

 >> 4. Your multipath.conf is using rr_min_io - where it should
 use rr_min_io_rq most likely.

 We have a single 68TB Equallogic unit with 24 disks. Each Ovirt host has 2
 HBAs on the iSCSI network. We use Ovirt and the Cisco switches to create
 an LACP group with those 2 HBAs. I have always assumed that the two
 connections are one each from the HBAs (i.e. I should have two paths and
 two connections to each target).

While it's a bit of a religious war on what is preferred with iSCSI -
network-level bonding (LACP) or multipathing at the iSCSI level - I'm on the
multipathing side. The main reason is that with a bond you can easily end up
using just one of the physical links if the policy that distributes
connections between them is not set correctly. Remember that each
connection sticks to a single physical link, so it really depends on the
hash policy, and even then I'm not so sure. With iSCSI multipathing you have
more control, and path selection can also take queue depth, etc. into account.
(In your example, if you have SRC A -> DST 1 and SRC B -> DST 1 (as you
seem to have), both connections may end up on the same physical NIC.)
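
To illustrate how two iSCSI connections in a bond can land on the same
physical link, here is a toy Python model of transmit hashing. It is only a
sketch: the function names, MAC addresses, IPs and ports are made up, and the
hashes are simplified versions of layer2-style and layer3+4-style policies,
not the exact code that any particular kernel or switch runs.

```python
import ipaddress


def layer2_link(src_mac: str, dst_mac: str, n_links: int) -> int:
    """Toy layer2-style policy: XOR the last byte of each MAC, modulo link count."""
    src = int(src_mac.split(":")[-1], 16)
    dst = int(dst_mac.split(":")[-1], 16)
    return (src ^ dst) % n_links


def layer34_link(src_ip: str, dst_ip: str, src_port: int, dst_port: int,
                 n_links: int) -> int:
    """Toy layer3+4-style policy: mix ports and IPs, fold, modulo link count."""
    h = src_port ^ dst_port
    h ^= int(ipaddress.ip_address(src_ip)) ^ int(ipaddress.ip_address(dst_ip))
    h ^= h >> 16
    h ^= h >> 8
    return h % n_links


if __name__ == "__main__":
    # Hypothetical addresses: one bond (a single MAC and IP) talking to one portal.
    bond_mac, portal_mac = "52:54:00:aa:bb:01", "52:54:00:cc:dd:10"
    bond_ip, portal_ip = "10.0.0.10", "10.0.0.100"

    # Layer2 only sees the MACs, so every session to this portal uses one link.
    print("layer2 link:", layer2_link(bond_mac, portal_mac, 2))

    # Layer3+4 also sees the ports, so the result depends on the source port
    # each session happens to get.
    for src_port in (40000, 40001, 40002):
        link = layer34_link(bond_ip, portal_ip, src_port, 3260, 2)
        print("layer3+4, source port", src_port, "-> link", link)
```

In this model the layer2-style hash sends every session between the same pair
of MACs over one link, and even the layer3+4-style hash puts source ports
40000 and 40002 on the same link. That is the "even then - not so sure" part.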

...

 If we reduce the number of storage domains, we reduce the number of
 devices and therefore the number of LVM Physical volumes that appear in
 Linux, correct? At the moment each connection results in a Linux device
 which has its own queue. We have some guests with high IO loads on their
 device whilst others are low. All the storage domain / datastore sizing
 guides we found seem to imply it’s a trade-off between ease of management
 (i.e. not having millions of domains to manage), IO contention between
 guests on a single large storage domain / datastore and possible wasted
 space on storage domains. If you have further information on
 recommendations, I am more than willing to change things as this problem is
 making our environment somewhat unusable at the moment. I have hosts that I
 can’t bring online and therefore reduced resiliency in clusters. They used
 to work just fine but the environment has grown over the last year and we
 also upgraded the Ovirt version from 3.6 to 4.x. We certainly had other
 problems, but host activation wasn’t one of them and it’s a problem that’s
 driving me mad.

I would say that each path has its own device (and therefore its own
queue). So I'd argue that you may want to have (for example) 4 paths to
each LUN or perhaps more (8?). For example, with 2 NICs, each connecting to
two controllers that have 2 NICs each, you get no SPOF and a nice number
of paths.
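
As a rough sketch of the path arithmetic, the Python below enumerates the
paths to one LUN for a generic dual-controller array. The NIC, controller
and port names are hypothetical, and it assumes every host NIC logs in to
every controller port, which is the upper bound (8 paths). Logging in to one
port per controller from each NIC would give 4 instead.

```python
from itertools import product


def enumerate_paths(host_nics, controllers):
    """Return one (host NIC, controller, controller port) tuple per path."""
    portals = [(ctrl, port) for ctrl, ports in controllers.items() for port in ports]
    return [(nic, ctrl, port) for nic, (ctrl, port) in product(host_nics, portals)]


if __name__ == "__main__":
    # Hypothetical layout: 2 host NICs, 2 controllers with 2 ports each.
    host_nics = ["host-nic0", "host-nic1"]
    controllers = {"ctrl-A": ["a0", "a1"], "ctrl-B": ["b0", "b1"]}

    paths = enumerate_paths(host_nics, controllers)
    for path in paths:
        print(" -> ".join(path))
    print(len(paths), "paths to each LUN")
```

Each of those paths shows up as its own SCSI device with its own queue, and
multipath groups them under a single device for the LUN.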

BTW, perhaps some guests need a direct LUN?

...

 Thanks for the pointer on rr_min_io – I see that was for an older kernel.
 We had that set from a Dell guide. I’ve now removed that setting as it
 seems the default value has changed now anyway.

Depending on your storage, you may want to use rr_min_io_rq = 1 for latency
purposes.

...

 >> Unrelated, your engine.log is quite flooded with:

 >> 2017-01-11 15:07:46,085 WARN  [org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerObjectsBuilder] (DefaultQuartzScheduler9) [31a71bf5] Invalid or unknown guest architecture type '' received from guest agent

 >>

 >> Any idea what kind of guest you are running?

 Do you have any idea which guest that’s coming from? We pretty
 much exclusively have Linux (CentOS various versions) and Windows (various
 versions) as the guest OS.

Vinzenz - any idea?
Y.

...

 Thanks again,

 Mark
