[ovirt-users] iSCSI Multipathing -> host inactive
Yaniv Kaul
ykaul at redhat.com
Fri Aug 26 20:40:15 UTC 2016
On Fri, Aug 26, 2016 at 1:33 PM, InterNetX - Juergen Gotteswinter <
jg at internetx.com> wrote:
>
>
> On 25.08.2016 at 15:53, Yaniv Kaul wrote:
> >
> >
> > On Wed, Aug 24, 2016 at 6:15 PM, InterNetX - Juergen Gotteswinter
> > <juergen.gotteswinter at internetx.com> wrote:
> >
> > iSCSI & oVirt is an awful combination, no matter if multipathed or
> > bonded. It's always a gamble how long it will work and, when it
> > fails, why it failed.
> >
> >
> > I disagree. In most cases it's actually a lower-layer issue; usually,
> > in fact, multipathing was not configured (or not configured
> > correctly).
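For what it's worth, whether multipathing is actually in effect can be
verified on the host with the standard tools; a minimal check might be:

  # list active iSCSI sessions and the iface each one was created through
  iscsiadm -m session -P 1

  # show the multipath topology; each LUN should list one path per session
  multipath -ll

If a LUN shows only a single path there, the problem is below oVirt.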
> >
>
> Experience tells me it is as I said; this is something I have seen from
> 3.0 up to 3.6, in oVirt and, surprise, in RHEV. Both behave the same
> way. I am absolutely familiar with multipath configurations; iSCSI
> multipathing is in widespread use in our DC. But such problems seem to
> be an exclusive oVirt/RHEV feature.
>
I don't think the resentful tone is appropriate for the oVirt community
mailing list.
>
> >
> >
> > It's supersensitive to latency, and superfast at setting a host to
> > inactive because the engine thinks something is wrong with it. In
> > most cases there was no real reason for it.
> >
> >
> > Did you open bugs for those issues? I'm not aware of 'no real reason'
> > issues.
> >
>
> Support tickets for the RHEV installation. After support (even after
> massive escalation requests) kept telling me the same thing again and
> again, I gave up and we dropped the RHEV subscriptions and migrated the
> VMs to a different platform solution (still with an iSCSI backend).
> Problems gone.
>
I wish you peace of mind with your new platform solution.
From a (shallow!) search I've made of oVirt bugs, I could not find any
oVirt issue you've reported or commented on.
I am aware of the request to set rp_filter correctly for setups with
multiple interfaces in the same IP subnet.
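For the archives: that adjustment usually boils down to a couple of sysctl
entries; a minimal sketch, assuming the storage NICs are named enp9s0f0 and
enp9s0f1 as in the logs below (2 = loose reverse-path filtering, the file
name is only an example):

  # /etc/sysctl.d/90-iscsi-rpfilter.conf
  net.ipv4.conf.enp9s0f0.rp_filter = 2
  net.ipv4.conf.enp9s0f1.rp_filter = 2

  # apply without a reboot
  sysctl -p /etc/sysctl.d/90-iscsi-rpfilter.conf

Without this, strict reverse-path filtering may drop replies arriving on the
"wrong" interface when several NICs sit in the same subnet.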
>
>
> >
> >
> > We had this with several different hardware combinations: self-built
> > filers based on FreeBSD/Illumos & ZFS, an EqualLogic SAN, a Nexenta
> > filer.
> >
> > Been there, done that, won't do it again.
> >
> >
> > We've had good success and reliability with most enterprise-level
> > storage, such as EMC, NetApp and Dell filers.
> > When properly configured, of course.
> > Y.
> >
>
> Dell EqualLogic? I can hardly believe that, since oVirt/RHEV and the
> EqualLogic network configuration won't play nice together (EQL wants
> all interfaces in the same subnet), and the arrays only work as
> expected when their HIT Kit driver package is installed. Without it,
> path failover is like Russian roulette. But oVirt hates the HIT Kit, so
> this combo ends up in a huge mess, because oVirt changes the iSCSI
> configuration and so does the HIT Kit -> kaboom, host not available.
>
Thanks - I'll look into this specific storage.
I'm aware it's unique in some cases, but I don't have experience with it
specifically.
>
> There are several KB articles about this in the RHN, but no real
> solution.
>
>
> But, as you seem to imply between the lines, this must be the
> customer's misconfiguration. Yep, the typical support-killer answer.
> Same style as in the RHN tickets; I am done with this.
>
A funny sentence I read yesterday:
Schrodinger's backup: "the condition of any backup is unknown until a
restore is attempted."
In a sense, this is similar to a no-SPOF high-availability setup: in many
cases you don't know whether it works until it is needed.
There are simply many variables and components involved.
That was all I meant, nothing between the lines, and I apologize if I've
given you a different impression.
Y.
> Thanks.
>
> >
> >
> >
> > On 24.08.2016 at 16:04, Uwe Laverenz wrote:
> > > Hi Elad,
> > >
> > > thank you very much for clearing things up.
> > >
> > > Initiator/iface 'a' tries to connect to target 'b' and vice versa.
> > > As 'a' and 'b' are in completely separate networks, this can never
> > > work as long as there is no routing between the networks.
> > >
> > > So it seems the iSCSI bonding feature is not useful for my setup. I
> > > still wonder how and where this feature is supposed to be used?
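That observation matches what the bond does: every interface selected in the
iSCSI bond attempts to log in to every target in the bond. With the
addresses from the logs further down, that is roughly equivalent to the
following (a sketch, not the exact vdsm invocation):

  iscsiadm -m node -T iqn.2005-10.org.freenas.ctl:tgta -p 10.0.131.121:3260 -I enp9s0f0 --login
  iscsiadm -m node -T iqn.2005-10.org.freenas.ctl:tgtb -p 10.0.132.121:3260 -I enp9s0f0 --login
  iscsiadm -m node -T iqn.2005-10.org.freenas.ctl:tgta -p 10.0.131.121:3260 -I enp9s0f1 --login
  iscsiadm -m node -T iqn.2005-10.org.freenas.ctl:tgtb -p 10.0.132.121:3260 -I enp9s0f1 --login

With two isolated storage networks, the two cross-subnet logins can only
time out, which is why the bond reports connection issues; the feature seems
intended for setups where every selected interface can reach every portal
(a single subnet, or routed storage networks).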
> > >
> > > thank you,
> > > Uwe
> > >
> > > On 24.08.2016 at 15:35, Elad Ben Aharon wrote:
> > >> Thanks.
> > >>
> > >> You're getting an iSCSI connection timeout [1], [2]. It means the
> > >> host cannot connect to the targets from either iface enp9s0f1 or
> > >> iface enp9s0f0.
> > >>
> > >> This causes the host to lose its connection to the storage and, in
> > >> addition, its connection to the engine becomes inactive. Therefore
> > >> the host changes its status to Non-responsive [3] and, since it is
> > >> the SPM, the whole DC with all its storage domains becomes
> > >> inactive.
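This kind of timeout can also be reproduced by hand on the host, which helps
separate a plain networking problem from an oVirt one; a minimal sketch
using the values from [1] below, assuming the iscsiadm iface records created
by vdsm (named after the NICs) are still present:

  # is the portal reachable at all from that interface?
  ping -I enp9s0f0 -c 3 10.0.132.121

  # repeat the discovery and the exact login that vdsm attempted
  iscsiadm -m discovery -t sendtargets -p 10.0.132.121:3260 -I enp9s0f0
  iscsiadm -m node -T iqn.2005-10.org.freenas.ctl:tgtb -p 10.0.132.121:3260 -I enp9s0f0 --login

If the login times out here as well, the engine is only reporting the
failure, not causing it.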
> > >>
> > >>
> > >> vdsm.log:
> > >> [1]
> > >> Traceback (most recent call last):
> > >>   File "/usr/share/vdsm/storage/hsm.py", line 2400, in connectStorageServer
> > >>     conObj.connect()
> > >>   File "/usr/share/vdsm/storage/storageServer.py", line 508, in connect
> > >>     iscsi.addIscsiNode(self._iface, self._target, self._cred)
> > >>   File "/usr/share/vdsm/storage/iscsi.py", line 204, in addIscsiNode
> > >>     iscsiadm.node_login(iface.name, portalStr, target.iqn)
> > >>   File "/usr/share/vdsm/storage/iscsiadm.py", line 336, in node_login
> > >>     raise IscsiNodeError(rc, out, err)
> > >> IscsiNodeError: (8, ['Logging in to [iface: enp9s0f0, target:
> > >> iqn.2005-10.org.freenas.ctl:tgtb, portal: 10.0.132.121,3260]
> > >> (multiple)'], ['iscsiadm: Could not login to [iface: enp9s0f0, target:
> > >> iqn.2005-10.org.freenas.ctl:tgtb, portal: 10.0.132.121,3260].',
> > >> 'iscsiadm: initiator reported error (8 - connection timed out)',
> > >> 'iscsiadm: Could not log into all portals'])
> > >>
> > >>
> > >>
> > >> vdsm.log:
> > >> [2]
> > >> Traceback (most recent call last):
> > >>   File "/usr/share/vdsm/storage/hsm.py", line 2400, in connectStorageServer
> > >>     conObj.connect()
> > >>   File "/usr/share/vdsm/storage/storageServer.py", line 508, in connect
> > >>     iscsi.addIscsiNode(self._iface, self._target, self._cred)
> > >>   File "/usr/share/vdsm/storage/iscsi.py", line 204, in addIscsiNode
> > >>     iscsiadm.node_login(iface.name, portalStr, target.iqn)
> > >>   File "/usr/share/vdsm/storage/iscsiadm.py", line 336, in node_login
> > >>     raise IscsiNodeError(rc, out, err)
> > >> IscsiNodeError: (8, ['Logging in to [iface: enp9s0f1, target:
> > >> iqn.2005-10.org.freenas.ctl:tgta, portal: 10.0.131.121,3260]
> > >> (multiple)'], ['iscsiadm: Could not login to [iface: enp9s0f1, target:
> > >> iqn.2005-10.org.freenas.ctl:tgta, portal: 10.0.131.121,3260].',
> > >> 'iscsiadm: initiator reported error (8 - connection timed out)',
> > >> 'iscsiadm: Could not log into all portals'])
> > >>
> > >>
> > >> engine.log:
> > >> [3]
> > >>
> > >>
> > >> 2016-08-24 14:10:23,222 WARN
> > >> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
> > >> (default task-25) [15d1637f] Correlation ID: 15d1637f, Call Stack: null,
> > >> Custom Event ID: -1, Message: iSCSI bond 'iBond' was successfully
> > >> created in Data Center 'Default' but some of the hosts encountered
> > >> connection issues.
> > >>
> > >> 2016-08-24 14:10:23,208 INFO
> > >> [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStorageServerVDSCommand]
> > >> (org.ovirt.thread.pool-8-thread-25) [15d1637f] Command
> > >> 'org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStorageServerVDSCommand'
> > >> return value
> > >> 'ServerConnectionStatusReturnForXmlRpc:{status='StatusForXmlRpc
> > >> [code=5022, message=Message timeout which can be caused by communication
> > >> issues]'}
> > >>
> > >>
> > >>
> > >> On Wed, Aug 24, 2016 at 4:04 PM, Uwe Laverenz <uwe at laverenz.de>
> > >> wrote:
> > >>
> > >> Hi Elad,
> > >>
> > >> I sent you a download message.
> > >>
> > >> thank you,
> > >> Uwe