[ovirt-users] problem with iSCSI/multipath.
Liron Aravot
laravot at redhat.com
Mon Aug 17 11:22:55 UTC 2015
----- Original Message -----
> From: "Giorgio Bersano" <giorgio.bersano at gmail.com>
> To: "users at ovirt.org" <Users at ovirt.org>
> Sent: Monday, July 27, 2015 1:19:23 PM
> Subject: [ovirt-users] problem with iSCSI/multipath.
>
> Hi all.
> We have an oVirt cluster in production happily running from the
> beginning of 2014.
> It started as a 3.3 beta and is now version 3.4.4-1.el6.
>
> Shared storage provided by an HP P2000 G3 iSCSI MSA.
> The storage server is fully redundant (2 controllers, dual port disks,
> 4 iscsi connections per controller) and so is the connectivity (two
> switches, multiple ethernet cards per server).
>
> From now on, let's talk only about iSCSI connectivity.
> The two oldest servers have 2 NICs each; they have been configured "by
> hand", setting routes so that every iSCSI target is reachable from every NIC.
> On the "new" server we installed oVirt 3.5 to have a look at the
> network configuration provided by oVirt.
> In Data Center -> iSCSI Multipathing we defined an iSCSI Bond, binding
> together three of the server's NICs and the 8 NICs of the MSA.
> The result is a system that has been functioning for months.
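[Editorial note: the "by hand" routing described above is commonly done with per-NIC policy routing. A rough sketch, purely illustrative: the interface name, address, and table number below are hypothetical, not taken from the hosts in this thread, and the commands are echoed as a dry run rather than applied.]

```shell
# Hypothetical per-NIC policy routing sketch (dry run: commands are
# echoed, not executed; remove 'echo' to apply for real, as root).
# Giving each iSCSI NIC its own routing table keeps each session's
# traffic on its own port instead of collapsing onto one default route.
IFACE=eth3                 # illustrative iSCSI NIC
SRC=192.168.126.64         # illustrative address on that NIC
TABLE=3                    # illustrative per-NIC routing table number

# Route the iSCSI subnet out of this NIC, in this NIC's own table
echo ip route add 192.168.126.0/24 dev "$IFACE" src "$SRC" table "$TABLE"
# Make traffic sourced from this NIC's address use that table
echo ip rule add from "$SRC" table "$TABLE"
```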
>
> Recently we had to do an upgrade of the storage firmware.
> This activity uploads the firmware to one of the MSA controllers and
> then reboots it; if successful, the process is repeated on the other controller.
> There is an impact on the I/O performance but there should be no
> problems as every "volume" on the MSA remains visible on other paths.
>
> Well, that's the theory.
> On the two "hand configured" hosts we had no significant problems.
> On the 3.5 host, VMs started to migrate due to storage problems; then
> the situation got worse, and it took more than an hour to bring the
> system back to a good operating level.
>
> I am inclined to believe that the culprit is the server's routing
> table. It seems to me that the oVirt-generated one is too simplistic and
> prone to problems in case of connectivity loss (as in our situation, or
> when you have to reboot one of the switches).
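[Editorial note: one way to test this theory before the next maintenance window is to confirm that every portal answers through every iSCSI NIC, then compare what multipathd sees. The sketch below is hypothetical: NIC names and portal addresses are illustrative, and commands are echoed as a dry run rather than executed.]

```shell
# Hypothetical pre-maintenance path check (dry run: commands are
# echoed; remove 'echo' to run for real, as root).
NICS="eth3 eth4"                          # illustrative host iSCSI NICs
PORTALS="192.168.126.80 192.168.126.84"   # illustrative MSA portal IPs

# Every (NIC, portal) pair should answer; if not, losing one
# controller can take out more paths than expected.
for nic in $NICS; do
  for portal in $PORTALS; do
    echo ping -c 1 -I "$nic" "$portal"    # -I forces the source interface
  done
done

# Then inspect the paths the host actually has:
echo multipath -ll
echo iscsiadm -m session -P 3
```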
>
> Anyone on this list with strong experience with a similar setup?
>
> I have included below some background information.
> I'm available to provide anything useful to further investigate the case.
>
> TIA,
> Giorgio.
Hi Giorgio,
There were some issues related to iSCSI multipathing that were already solved in versions later than 3.4, AFAIK.
I'm adding Sergey and Maor (the feature owners) to confirm whether related fixes were made.
thanks,
Liron.
>
>
> -------------------
> context information
> -------------------
>
> oVirt Compatibility Version: 3.4
>
> two FUJITSU PRIMERGY RX300 S5 hosts
> CPU: Intel(R) Xeon(R) E5504 @ 2.00GHz / Intel Nehalem Family
> OS Version: RHEL - 6 - 6.el6.centos.12.2
> Kernel Version: 2.6.32 - 504.16.2.el6.x86_64
> KVM Version: 0.12.1.2 - 2.448.el6_6.2
> LIBVIRT Version: libvirt-0.10.2-46.el6_6.6
> VDSM Version: vdsm-4.14.17-0.el6
> RAM: 40GB
> mom-0.4.3-1.el6.noarch.rpm
> ovirt-release34-1.0.3-1.noarch.rpm
> qemu-img-rhev-0.12.1.2-2.448.el6_6.2.x86_64.rpm
> qemu-kvm-rhev-0.12.1.2-2.448.el6_6.2.x86_64.rpm
> qemu-kvm-rhev-tools-0.12.1.2-2.448.el6_6.2.x86_64.rpm
> vdsm-4.14.17-0.el6.x86_64.rpm
> vdsm-cli-4.14.17-0.el6.noarch.rpm
> vdsm-hook-hostusb-4.14.17-0.el6.noarch.rpm
> vdsm-hook-macspoof-4.14.17-0.el6.noarch.rpm
> vdsm-python-4.14.17-0.el6.x86_64.rpm
> vdsm-python-zombiereaper-4.14.17-0.el6.noarch.rpm
> vdsm-xmlrpc-4.14.17-0.el6.noarch.rpm
>
> # ip route list table all |grep 192.168.126.
> 192.168.126.87 dev eth4 table 4 proto kernel scope link src 192.168.126.65
> 192.168.126.86 dev eth4 table 4 proto kernel scope link src 192.168.126.65
> 192.168.126.81 dev eth4 table 4 proto kernel scope link src 192.168.126.65
> 192.168.126.80 dev eth4 table 4 proto kernel scope link src 192.168.126.65
> 192.168.126.77 dev eth4 table 4 proto kernel scope link src 192.168.126.65
> 192.168.126.0/24 dev eth4 table 4 proto kernel scope link src 192.168.126.65
> 192.168.126.0/24 dev eth3 proto kernel scope link src 192.168.126.64
> 192.168.126.0/24 dev eth4 proto kernel scope link src 192.168.126.65
> 192.168.126.85 dev eth3 table 3 proto kernel scope link src 192.168.126.64
> 192.168.126.84 dev eth3 table 3 proto kernel scope link src 192.168.126.64
> 192.168.126.83 dev eth3 table 3 proto kernel scope link src 192.168.126.64
> 192.168.126.82 dev eth3 table 3 proto kernel scope link src 192.168.126.64
> 192.168.126.76 dev eth3 table 3 proto kernel scope link src 192.168.126.64
> 192.168.126.0/24 dev eth3 table 3 proto kernel scope link src 192.168.126.64
> broadcast 192.168.126.0 dev eth3 table local proto kernel scope link src 192.168.126.64
> broadcast 192.168.126.0 dev eth4 table local proto kernel scope link src 192.168.126.65
> local 192.168.126.65 dev eth4 table local proto kernel scope host src 192.168.126.65
> local 192.168.126.64 dev eth3 table local proto kernel scope host src 192.168.126.64
> broadcast 192.168.126.255 dev eth3 table local proto kernel scope link src 192.168.126.64
> broadcast 192.168.126.255 dev eth4 table local proto kernel scope link src 192.168.126.65
>
>
> one HP ProLiant DL560 Gen8 host
> CPU: Intel(R) Xeon(R) CPU E5-4610 v2 @ 2.30GHz / Intel SandyBridge Family
> OS Version:RHEL - 6 - 6.el6.centos.12.2
> Kernel Version: 2.6.32 - 504.16.2.el6.x86_64
> KVM Version: 0.12.1.2 - 2.448.el6_6.2
> LIBVIRT Version: libvirt-0.10.2-46.el6_6.6
> VDSM Version: vdsm-4.16.14-0.el6
> RAM: 256GB
> mom-0.4.3-1.el6.noarch.rpm
> ovirt-release35-002-1.noarch.rpm
> qemu-img-rhev-0.12.1.2-2.448.el6_6.2.x86_64.rpm
> qemu-kvm-rhev-0.12.1.2-2.448.el6_6.2.x86_64.rpm
> qemu-kvm-rhev-tools-0.12.1.2-2.448.el6_6.2.x86_64.rpm
> vdsm-4.16.14-0.el6.x86_64.rpm
> vdsm-cli-4.16.14-0.el6.noarch.rpm
> vdsm-hook-hostusb-4.16.14-0.el6.noarch.rpm
> vdsm-hook-macspoof-4.16.14-0.el6.noarch.rpm
> vdsm-jsonrpc-4.16.14-0.el6.noarch.rpm
> vdsm-python-4.16.14-0.el6.noarch.rpm
> vdsm-python-zombiereaper-4.16.14-0.el6.noarch.rpm
> vdsm-xmlrpc-4.16.14-0.el6.noarch.rpm
> vdsm-yajsonrpc-4.16.14-0.el6.noarch.rpm
>
> # ip route list table all |grep 192.168.126.
> 192.168.126.0/24 dev p6p1 proto kernel scope link src 192.168.126.34
> 192.168.126.0/24 dev p3p1 proto kernel scope link src 192.168.126.33
> 192.168.126.0/24 dev em3 proto kernel scope link src 192.168.126.32
> local 192.168.126.32 dev em3 table local proto kernel scope host src 192.168.126.32
> local 192.168.126.33 dev p3p1 table local proto kernel scope host src 192.168.126.33
> broadcast 192.168.126.0 dev p6p1 table local proto kernel scope link src 192.168.126.34
> broadcast 192.168.126.0 dev p3p1 table local proto kernel scope link src 192.168.126.33
> broadcast 192.168.126.0 dev em3 table local proto kernel scope link src 192.168.126.32
> local 192.168.126.34 dev p6p1 table local proto kernel scope host src 192.168.126.34
> broadcast 192.168.126.255 dev p6p1 table local proto kernel scope link src 192.168.126.34
> broadcast 192.168.126.255 dev p3p1 table local proto kernel scope link src 192.168.126.33
> broadcast 192.168.126.255 dev em3 table local proto kernel scope link src 192.168.126.32
> _______________________________________________
> Users mailing list
> Users at ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>