[ovirt-users] problem with iSCSI/multipath.

Giorgio Bersano giorgio.bersano at gmail.com
Mon Jul 27 10:19:23 UTC 2015


Hi all.
We have an oVirt cluster that has been running happily in production
since the beginning of 2014.
It started as 3.3 beta and is now at version 3.4.4-1.el6.

Shared storage is provided by an HP P2000 G3 iSCSI MSA.
The storage server is fully redundant (2 controllers, dual-port disks,
4 iSCSI connections per controller) and so is the connectivity (two
switches, multiple Ethernet cards per server).

From now on let's talk only about iSCSI connectivity.
The two oldest servers have 2 NICs each; they were configured "by
hand", setting routes so that every iSCSI target can be reached from
every NIC.
On the "new" server we installed oVirt 3.5 to have a look at the
network configuration provided by oVirt.
In Data Center -> iSCSI Multipathing we defined an iSCSI Bond binding
together 3 of the server's NICs and the 8 NICs of the MSA.
The result is a system that has been functioning for months.

Recently we had to upgrade the storage firmware.
This procedure uploads the firmware to one of the MSA controllers and
then reboots it. If that succeeds, the same is repeated on the other
controller.
There is an impact on I/O performance, but there should be no
problems, as every "volume" on the MSA remains visible on other paths.

Well, that's the theory.
On the two "hand configured" hosts we had no significant problems.
On the 3.5 host, VMs started to migrate due to storage problems, then
the situation got worse and it took more than an hour to bring the
system back to a good operating level.

I am inclined to believe that the culprit is the server's routing
table. It seems to me that the one generated by oVirt is too simplistic
and prone to problems when connectivity is lost (as in our situation,
or when you have to reboot one of the switches).

Does anyone on this list have solid experience with a similar setup?

I have included below some background information.
I'm available to provide anything useful to further investigate the case.

TIA,
Giorgio.


-------------------
context information
-------------------

oVirt Compatibility Version: 3.4

two FUJITSU PRIMERGY RX300 S5 hosts
CPU:  Intel(R) Xeon(R) E5504 @ 2.00GHz  / Intel Nehalem Family
OS Version: RHEL - 6 - 6.el6.centos.12.2
Kernel Version: 2.6.32 - 504.16.2.el6.x86_64
KVM Version: 0.12.1.2 - 2.448.el6_6.2
LIBVIRT Version: libvirt-0.10.2-46.el6_6.6
VDSM Version: vdsm-4.14.17-0.el6
RAM: 40GB
mom-0.4.3-1.el6.noarch.rpm
ovirt-release34-1.0.3-1.noarch.rpm
qemu-img-rhev-0.12.1.2-2.448.el6_6.2.x86_64.rpm
qemu-kvm-rhev-0.12.1.2-2.448.el6_6.2.x86_64.rpm
qemu-kvm-rhev-tools-0.12.1.2-2.448.el6_6.2.x86_64.rpm
vdsm-4.14.17-0.el6.x86_64.rpm
vdsm-cli-4.14.17-0.el6.noarch.rpm
vdsm-hook-hostusb-4.14.17-0.el6.noarch.rpm
vdsm-hook-macspoof-4.14.17-0.el6.noarch.rpm
vdsm-python-4.14.17-0.el6.x86_64.rpm
vdsm-python-zombiereaper-4.14.17-0.el6.noarch.rpm
vdsm-xmlrpc-4.14.17-0.el6.noarch.rpm

# ip route list table all |grep 192.168.126.
192.168.126.87 dev eth4  table 4  proto kernel  scope link  src 192.168.126.65
192.168.126.86 dev eth4  table 4  proto kernel  scope link  src 192.168.126.65
192.168.126.81 dev eth4  table 4  proto kernel  scope link  src 192.168.126.65
192.168.126.80 dev eth4  table 4  proto kernel  scope link  src 192.168.126.65
192.168.126.77 dev eth4  table 4  proto kernel  scope link  src 192.168.126.65
192.168.126.0/24 dev eth4  table 4  proto kernel  scope link  src 192.168.126.65
192.168.126.0/24 dev eth3  proto kernel  scope link  src 192.168.126.64
192.168.126.0/24 dev eth4  proto kernel  scope link  src 192.168.126.65
192.168.126.85 dev eth3  table 3  proto kernel  scope link  src 192.168.126.64
192.168.126.84 dev eth3  table 3  proto kernel  scope link  src 192.168.126.64
192.168.126.83 dev eth3  table 3  proto kernel  scope link  src 192.168.126.64
192.168.126.82 dev eth3  table 3  proto kernel  scope link  src 192.168.126.64
192.168.126.76 dev eth3  table 3  proto kernel  scope link  src 192.168.126.64
192.168.126.0/24 dev eth3  table 3  proto kernel  scope link  src 192.168.126.64
broadcast 192.168.126.0 dev eth3  table local  proto kernel  scope link  src 192.168.126.64
broadcast 192.168.126.0 dev eth4  table local  proto kernel  scope link  src 192.168.126.65
local 192.168.126.65 dev eth4  table local  proto kernel  scope host  src 192.168.126.65
local 192.168.126.64 dev eth3  table local  proto kernel  scope host  src 192.168.126.64
broadcast 192.168.126.255 dev eth3  table local  proto kernel  scope link  src 192.168.126.64
broadcast 192.168.126.255 dev eth4  table local  proto kernel  scope link  src 192.168.126.65
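
For illustration only, the per-NIC tables (table 3 and table 4) in the
listing above could be built with commands along the following lines.
This is a minimal sketch, not the actual configuration of these hosts:
the helper name print_nic_table is hypothetical, and the "ip rule"
line is an assumption, since the rules that select each table do not
appear in the output above. The function only prints the commands, it
does not run them.

```shell
# Print (do not run) the commands for one per-NIC routing table.
# usage: print_nic_table DEV SRC_IP TABLE TARGET_IP...
print_nic_table() {
    dev=$1
    src=$2
    table=$3
    shift 3
    # One host route per iSCSI target, pinned to this NIC and source address.
    for target in "$@"; do
        echo "ip route add $target dev $dev src $src table $table"
    done
    # Assumption: a source-based rule sends traffic from this address
    # to the table; the rule is not shown in the original output.
    echo "ip rule add from $src table $table"
}

# Example: two of the targets that table 4 reaches through eth4.
print_nic_table eth4 192.168.126.65 4 192.168.126.80 192.168.126.81
```

Printing the commands first makes it easy to review them before
applying anything on a production host.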


one HP ProLiant DL560 Gen8 host
CPU:  Intel(R) Xeon(R) CPU E5-4610 v2 @ 2.30GHz / Intel SandyBridge Family
OS Version: RHEL - 6 - 6.el6.centos.12.2
Kernel Version: 2.6.32 - 504.16.2.el6.x86_64
KVM Version: 0.12.1.2 - 2.448.el6_6.2
LIBVIRT Version: libvirt-0.10.2-46.el6_6.6
VDSM Version: vdsm-4.16.14-0.el6
RAM: 256GB
mom-0.4.3-1.el6.noarch.rpm
ovirt-release35-002-1.noarch.rpm
qemu-img-rhev-0.12.1.2-2.448.el6_6.2.x86_64.rpm
qemu-kvm-rhev-0.12.1.2-2.448.el6_6.2.x86_64.rpm
qemu-kvm-rhev-tools-0.12.1.2-2.448.el6_6.2.x86_64.rpm
vdsm-4.16.14-0.el6.x86_64.rpm
vdsm-cli-4.16.14-0.el6.noarch.rpm
vdsm-hook-hostusb-4.16.14-0.el6.noarch.rpm
vdsm-hook-macspoof-4.16.14-0.el6.noarch.rpm
vdsm-jsonrpc-4.16.14-0.el6.noarch.rpm
vdsm-python-4.16.14-0.el6.noarch.rpm
vdsm-python-zombiereaper-4.16.14-0.el6.noarch.rpm
vdsm-xmlrpc-4.16.14-0.el6.noarch.rpm
vdsm-yajsonrpc-4.16.14-0.el6.noarch.rpm

# ip route list table all |grep 192.168.126.
192.168.126.0/24 dev p6p1  proto kernel  scope link  src 192.168.126.34
192.168.126.0/24 dev p3p1  proto kernel  scope link  src 192.168.126.33
192.168.126.0/24 dev em3  proto kernel  scope link  src 192.168.126.32
local 192.168.126.32 dev em3  table local  proto kernel  scope host  src 192.168.126.32
local 192.168.126.33 dev p3p1  table local  proto kernel  scope host  src 192.168.126.33
broadcast 192.168.126.0 dev p6p1  table local  proto kernel  scope link  src 192.168.126.34
broadcast 192.168.126.0 dev p3p1  table local  proto kernel  scope link  src 192.168.126.33
broadcast 192.168.126.0 dev em3  table local  proto kernel  scope link  src 192.168.126.32
local 192.168.126.34 dev p6p1  table local  proto kernel  scope host  src 192.168.126.34
broadcast 192.168.126.255 dev p6p1  table local  proto kernel  scope link  src 192.168.126.34
broadcast 192.168.126.255 dev p3p1  table local  proto kernel  scope link  src 192.168.126.33
broadcast 192.168.126.255 dev em3  table local  proto kernel  scope link  src 192.168.126.32
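
To make the difference between the two listings easier to see, the
routes can be counted per routing table and device. The following is a
minimal sketch; the function name summarise_routes is hypothetical. It
reads "ip route list table all" output on stdin, treats lines without
a "table" keyword as belonging to the main table, and skips the
"local" table.

```shell
# Count unicast routes per (table, device) from `ip route` output on stdin.
summarise_routes() {
    awk '
        {
            table = "main"; dev = ""
            # Pick up the values that follow the "table" and "dev" keywords.
            for (i = 1; i < NF; i++) {
                if ($i == "table") table = $(i + 1)
                if ($i == "dev")   dev   = $(i + 1)
            }
            # Ignore local/broadcast entries and lines without a device.
            if (table == "local" || dev == "") next
            count[table " " dev]++
        }
        END {
            for (k in count) print k, count[k]
        }
    ' | sort
}

# Intended use on a host:
#   ip route list table all | grep 192.168.126. | summarise_routes
```

On the two older hosts this should report entries for tables 3 and 4
in addition to the main table, while on the 3.5 host only main-table
entries are expected, one per interface.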


