----- Original Message -----
From: "Giorgio Bersano" <giorgio.bersano(a)gmail.com>
To: "users(a)ovirt.org" <Users(a)ovirt.org>
Sent: Monday, July 27, 2015 1:19:23 PM
Subject: [ovirt-users] problem with iSCSI/multipath.
Hi all.
We have an oVirt cluster in production that has been running happily
since the beginning of 2014.
It started as 3.3 beta and is now at version 3.4.4-1.el6.
Shared storage is provided by an HP P2000 G3 iSCSI MSA.
The storage server is fully redundant (2 controllers, dual port disks,
4 iSCSI connections per controller) and so is the connectivity (two
switches, multiple ethernet cards per server).
From now on let's talk only about iSCSI connectivity.
The two oldest servers have 2 NICs each; they have been configured "by
hand", setting routes so that every iSCSI target is reachable from
every NIC.
On the "new" server we installed oVirt 3.5 to have a look at the
network configuration provided by oVirt.
In Data Center -> iSCSI Multipathing we defined an iSCSI Bond binding
together 3 of the server's NICs and the 8 NICs of the MSA.
The result is a system that has been functioning for months.
Recently we had to do an upgrade of the storage firmware.
This activity uploads the firmware to one of the MSA controllers, then
reboots it. If that succeeds, the same is repeated on the other
controller.
There is an impact on the I/O performance but there should be no
problems as every "volume" on the MSA remains visible on other paths.
Well, that's the theory.
On the two "hand configured" hosts we had no significant problems.
On the 3.5 host, VMs started to migrate due to storage problems, then
the situation got worse and it took more than an hour to bring the
system back to a good operating level.
I am inclined to believe that the culprit is the server's routing
table. Seems to me that the oVirt generated one is too simplistic and
prone to problems in case of connectivity loss (as in our situation or
when you have to reboot one of the switches).
Does anyone on this list have solid experience with a similar setup?
I have included below some background information.
I'm available to provide anything useful to further investigate the case.
TIA,
Giorgio.
Hi Giorgio,
There were some issues related to iSCSI multipathing that, AFAIK, were
already fixed in versions later than 3.4.
I'm adding Sergey and Maor (the feature owners) so they can confirm
whether related fixes were made.
thanks,
Liron.
-------------------
context information
-------------------
oVirt Compatibility Version: 3.4
two FUJITSU PRIMERGY RX300 S5 hosts
CPU: Intel(R) Xeon(R) E5504 @ 2.00GHz / Intel Nehalem Family
OS Version: RHEL - 6 - 6.el6.centos.12.2
Kernel Version: 2.6.32 - 504.16.2.el6.x86_64
KVM Version: 0.12.1.2 - 2.448.el6_6.2
LIBVIRT Version: libvirt-0.10.2-46.el6_6.6
VDSM Version: vdsm-4.14.17-0.el6
RAM: 40GB
mom-0.4.3-1.el6.noarch.rpm
ovirt-release34-1.0.3-1.noarch.rpm
qemu-img-rhev-0.12.1.2-2.448.el6_6.2.x86_64.rpm
qemu-kvm-rhev-0.12.1.2-2.448.el6_6.2.x86_64.rpm
qemu-kvm-rhev-tools-0.12.1.2-2.448.el6_6.2.x86_64.rpm
vdsm-4.14.17-0.el6.x86_64.rpm
vdsm-cli-4.14.17-0.el6.noarch.rpm
vdsm-hook-hostusb-4.14.17-0.el6.noarch.rpm
vdsm-hook-macspoof-4.14.17-0.el6.noarch.rpm
vdsm-python-4.14.17-0.el6.x86_64.rpm
vdsm-python-zombiereaper-4.14.17-0.el6.noarch.rpm
vdsm-xmlrpc-4.14.17-0.el6.noarch.rpm
# ip route list table all |grep 192.168.126.
192.168.126.87 dev eth4 table 4 proto kernel scope link src 192.168.126.65
192.168.126.86 dev eth4 table 4 proto kernel scope link src 192.168.126.65
192.168.126.81 dev eth4 table 4 proto kernel scope link src 192.168.126.65
192.168.126.80 dev eth4 table 4 proto kernel scope link src 192.168.126.65
192.168.126.77 dev eth4 table 4 proto kernel scope link src 192.168.126.65
192.168.126.0/24 dev eth4 table 4 proto kernel scope link src 192.168.126.65
192.168.126.0/24 dev eth3 proto kernel scope link src 192.168.126.64
192.168.126.0/24 dev eth4 proto kernel scope link src 192.168.126.65
192.168.126.85 dev eth3 table 3 proto kernel scope link src 192.168.126.64
192.168.126.84 dev eth3 table 3 proto kernel scope link src 192.168.126.64
192.168.126.83 dev eth3 table 3 proto kernel scope link src 192.168.126.64
192.168.126.82 dev eth3 table 3 proto kernel scope link src 192.168.126.64
192.168.126.76 dev eth3 table 3 proto kernel scope link src 192.168.126.64
192.168.126.0/24 dev eth3 table 3 proto kernel scope link src 192.168.126.64
broadcast 192.168.126.0 dev eth3 table local proto kernel scope link src 192.168.126.64
broadcast 192.168.126.0 dev eth4 table local proto kernel scope link src 192.168.126.65
local 192.168.126.65 dev eth4 table local proto kernel scope host src 192.168.126.65
local 192.168.126.64 dev eth3 table local proto kernel scope host src 192.168.126.64
broadcast 192.168.126.255 dev eth3 table local proto kernel scope link src 192.168.126.64
broadcast 192.168.126.255 dev eth4 table local proto kernel scope link src 192.168.126.65
one HP ProLiant DL560 Gen8 host
CPU: Intel(R) Xeon(R) CPU E5-4610 v2 @ 2.30GHz / Intel SandyBridge Family
OS Version: RHEL - 6 - 6.el6.centos.12.2
Kernel Version: 2.6.32 - 504.16.2.el6.x86_64
KVM Version: 0.12.1.2 - 2.448.el6_6.2
LIBVIRT Version: libvirt-0.10.2-46.el6_6.6
VDSM Version: vdsm-4.16.14-0.el6
RAM: 256GB
mom-0.4.3-1.el6.noarch.rpm
ovirt-release35-002-1.noarch.rpm
qemu-img-rhev-0.12.1.2-2.448.el6_6.2.x86_64.rpm
qemu-kvm-rhev-0.12.1.2-2.448.el6_6.2.x86_64.rpm
qemu-kvm-rhev-tools-0.12.1.2-2.448.el6_6.2.x86_64.rpm
vdsm-4.16.14-0.el6.x86_64.rpm
vdsm-cli-4.16.14-0.el6.noarch.rpm
vdsm-hook-hostusb-4.16.14-0.el6.noarch.rpm
vdsm-hook-macspoof-4.16.14-0.el6.noarch.rpm
vdsm-jsonrpc-4.16.14-0.el6.noarch.rpm
vdsm-python-4.16.14-0.el6.noarch.rpm
vdsm-python-zombiereaper-4.16.14-0.el6.noarch.rpm
vdsm-xmlrpc-4.16.14-0.el6.noarch.rpm
vdsm-yajsonrpc-4.16.14-0.el6.noarch.rpm
# ip route list table all |grep 192.168.126.
192.168.126.0/24 dev p6p1 proto kernel scope link src 192.168.126.34
192.168.126.0/24 dev p3p1 proto kernel scope link src 192.168.126.33
192.168.126.0/24 dev em3 proto kernel scope link src 192.168.126.32
local 192.168.126.32 dev em3 table local proto kernel scope host src 192.168.126.32
local 192.168.126.33 dev p3p1 table local proto kernel scope host src 192.168.126.33
broadcast 192.168.126.0 dev p6p1 table local proto kernel scope link src 192.168.126.34
broadcast 192.168.126.0 dev p3p1 table local proto kernel scope link src 192.168.126.33
broadcast 192.168.126.0 dev em3 table local proto kernel scope link src 192.168.126.32
local 192.168.126.34 dev p6p1 table local proto kernel scope host src 192.168.126.34
broadcast 192.168.126.255 dev p6p1 table local proto kernel scope link src 192.168.126.34
broadcast 192.168.126.255 dev p3p1 table local proto kernel scope link src 192.168.126.33
broadcast 192.168.126.255 dev em3 table local proto kernel scope link src 192.168.126.32
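
The difference between the two route listings can be made explicit with
a small script. What follows is only a rough sketch, not a diagnosis: it
reads `ip route list table all` output and reports which interfaces
appear only in the main table, i.e. have no dedicated routing table.
The function names and the two sample strings (a few lines copied from
the listings in this message) are illustrative assumptions. Note also
that a dedicated table only takes effect together with matching
`ip rule` entries, which are not shown in this message.

```python
import re
from collections import defaultdict


def tables_by_device(route_text):
    """Map each interface to the set of non-local routing tables that mention it."""
    tables = defaultdict(set)
    for line in route_text.splitlines():
        dev = re.search(r"\bdev (\S+)", line)
        if not dev:
            continue  # blank line or wrapped continuation without a device
        table = re.search(r"\btable (\S+)", line)
        # "ip route" omits the table keyword for routes in the main table.
        name = table.group(1) if table else "main"
        if name != "local":
            tables[dev.group(1)].add(name)
    return dict(tables)


def devices_without_own_table(route_text):
    """Return the interfaces whose routes live only in the main table."""
    found = tables_by_device(route_text)
    return sorted(dev for dev, names in found.items() if names == {"main"})


# Sample lines taken from the hand-configured host (hypothetical variable name).
HAND_CONFIGURED = """\
192.168.126.87 dev eth4 table 4 proto kernel scope link src 192.168.126.65
192.168.126.0/24 dev eth4 table 4 proto kernel scope link src 192.168.126.65
192.168.126.0/24 dev eth3 proto kernel scope link src 192.168.126.64
192.168.126.0/24 dev eth4 proto kernel scope link src 192.168.126.65
192.168.126.85 dev eth3 table 3 proto kernel scope link src 192.168.126.64
local 192.168.126.65 dev eth4 table local proto kernel scope host src 192.168.126.65
"""

# Sample lines taken from the oVirt 3.5 host (hypothetical variable name).
OVIRT_35 = """\
192.168.126.0/24 dev p6p1 proto kernel scope link src 192.168.126.34
192.168.126.0/24 dev p3p1 proto kernel scope link src 192.168.126.33
192.168.126.0/24 dev em3 proto kernel scope link src 192.168.126.32
local 192.168.126.32 dev em3 table local proto kernel scope host src 192.168.126.32
"""

if __name__ == "__main__":
    print(devices_without_own_table(HAND_CONFIGURED))  # []
    print(devices_without_own_table(OVIRT_35))  # ['em3', 'p3p1', 'p6p1']
```

On the hand-configured host each iSCSI NIC has its own table (3 and 4)
in addition to the main one, while on the 3.5 host all three NICs share
only the main table. Whether that alone explains the failover problems
is still an open question.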