Re: [ovirt-users] iSCSI Multipathing -> host inactive

26 Aug 2016


On Fri, Aug 26, 2016 at 1:33 PM, InterNetX - Juergen Gotteswinter <jg@internetx.com> wrote:
...
On 25.08.2016 at 15:53, Yaniv Kaul wrote:
...
On Wed, Aug 24, 2016 at 6:15 PM, InterNetX - Juergen Gotteswinter
<juergen.gotteswinter@internetx.com> wrote:
iSCSI & oVirt is an awful combination, no matter if multipathed or
bonded. It's always gambling how long it will work, and when it fails,
why it failed.
I disagree. In most cases, it's actually a lower-layer issue. In most
cases, btw, it's because multipathing was not configured (or not
configured correctly).
Experience tells me it is like I said; this is something I have seen
from 3.0 up to 3.6. oVirt, and - surprise - RHEV. Both act the same
way. I am absolutely aware of multipath configurations; iSCSI multipathing
is in very widespread use in our DC. But such problems are an exclusive
oVirt/RHEV feature.
I don't think the resentful tone is appropriate for the oVirt community
mailing list.
...
...
It's supersensitive to latency, and superfast at setting a host to
inactive because the engine thinks something is wrong with it. In most
cases there was no real reason for it.
Did you open bugs for those issues? I'm not aware of 'no real reason'
issues.
Support tickets for the RHEV installation. After support (even after massive
escalation requests) kept telling me the same thing again and again, I gave up
and we dropped the RHEV subscriptions to migrate the VMs to a different
platform solution (still iSCSI backend). Problems gone.
I wish you peace of mind with your new platform solution.
...
From a (shallow!) search I've made on oVirt bugs, I could not find any
oVirt issue you've reported or commented on.
I am aware of the request to set rp_filter correctly for setups with
multiple interfaces in the same IP subnet.
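
To make the rp_filter point concrete, here is a minimal sketch that finds
interfaces sharing an IPv4 subnet and prints candidate sysctl lines for them.
The interface names and addresses are hypothetical. The choice of value 2
(the kernel's "loose" reverse-path mode) is an assumption; this thread does
not say which value is correct for a given setup.

```python
import ipaddress
from collections import defaultdict


def interfaces_sharing_subnet(addresses):
    """Group interface names by the IPv4 network their address belongs to.

    `addresses` maps an interface name to an address in CIDR notation.
    Only networks that contain more than one interface are returned.
    """
    by_network = defaultdict(list)
    for name, cidr in addresses.items():
        network = ipaddress.ip_interface(cidr).network
        by_network[network].append(name)
    return {
        str(network): sorted(names)
        for network, names in by_network.items()
        if len(names) > 1
    }


def loose_rp_filter_lines(addresses):
    """Build sysctl lines for every interface that shares a subnet.

    Value 2 (loose mode) is an assumption, not something stated in the thread.
    """
    lines = []
    for names in interfaces_sharing_subnet(addresses).values():
        for name in names:
            lines.append(f"net.ipv4.conf.{name}.rp_filter = 2")
    return sorted(lines)


if __name__ == "__main__":
    # Hypothetical host: two storage NICs in one subnet, one management NIC.
    example = {
        "eth0": "192.168.1.5/24",
        "eth1": "10.0.0.11/24",
        "eth2": "10.0.0.12/24",
    }
    for line in loose_rp_filter_lines(example):
        print(line)
```

The output is only a starting point for review; whether to apply it depends
on the storage vendor's guidance for the specific array.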
...
...
We had this in several different hardware combinations: self-built
filers based on FreeBSD/Illumos & ZFS, EqualLogic SAN, Nexenta filer.
Been there, done that, won't do again.
We've had good success and reliability with most enterprise level
storage, such as EMC, NetApp, Dell filers.
When properly configured, of course.
Y.
Dell EqualLogic? Can't really believe it, since oVirt/RHEV and the
EqualLogic network configuration won't play nicely together (EQL wants all
interfaces in the same subnet). And they only work as expected when
their HIT Kit driver package is installed. Without it, path failover is like
Russian roulette. But oVirt hates the HIT Kit, so this combo ends up in
a huge mess, because oVirt makes changes to iSCSI, as does the HIT Kit
-> Kaboom. Host not available.
Thanks - I'll look into this specific storage.
I'm aware it's unique in some cases, but I don't have experience with it
specifically.
...
There are several KB articles in the RHN, without a real solution.
But like you try to tell between the lines, this must be the customer's
misconfiguration. Yep, typical support killer answer. Same style as in
RHN tickets; I am done with this.
A funny sentence I read yesterday:
Schrodinger's backup: "the condition of any backup is unknown until restore
is attempted."

In a sense, this is similar to a no-SPOF high-availability setup - in many
cases you don't know it works well until it is needed.
There are simply many variables and components involved.
That was all I meant, nothing between the lines and I apologize if I've
given you a different impression.
Y.
...
Thanks.
...
On 24.08.2016 at 16:04, Uwe Laverenz wrote:
    > Hi Elad,
    >
    > thank you very much for clearing things up.
    >
    > Initiator/iface 'a' tries to connect target 'b' and vice versa. As 'a'
    > and 'b' are in completely separate networks this can never work as long
    > as there is no routing between the networks.
    >
    > So it seems the iSCSI-bonding feature is not useful for my setup. I
    > still wonder how and where this feature is supposed to be used?
    >
    > thank you,
    > Uwe
    >
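
The cross-connection problem Uwe describes can be illustrated with a small
sketch: if every iface is paired with every target, only the pairs that sit
in the same subnet can log in when there is no routing. The portal addresses
below come from the logs in this thread; the host-side addresses and the /24
prefix are assumptions made for illustration only.

```python
import ipaddress
from itertools import product


def classify_logins(ifaces, portals):
    """Pair every iface with every portal and flag same-subnet pairs.

    `ifaces` maps iface name -> host address in CIDR notation.
    `portals` maps target name -> portal IP address.
    Returns a list of (iface, target, reachable_without_routing) tuples.
    """
    results = []
    pairs = product(sorted(ifaces.items()), sorted(portals.items()))
    for (iface, cidr), (target, portal) in pairs:
        network = ipaddress.ip_interface(cidr).network
        reachable = ipaddress.ip_address(portal) in network
        results.append((iface, target, reachable))
    return results


if __name__ == "__main__":
    # Host addresses are hypothetical; portal IPs are taken from vdsm.log.
    ifaces = {"enp9s0f0": "10.0.131.100/24", "enp9s0f1": "10.0.132.100/24"}
    portals = {"tgta": "10.0.131.121", "tgtb": "10.0.132.121"}
    for iface, target, ok in classify_logins(ifaces, portals):
        status = "same subnet" if ok else "needs routing, login would time out"
        print(f"{iface} -> {target}: {status}")
```

Under these assumptions two of the four pairs can never connect, which
matches the two timeouts reported for enp9s0f0 -> tgtb and enp9s0f1 -> tgta.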
    > On 24.08.2016 at 15:35, Elad Ben Aharon wrote:
    >> Thanks.
    >>
    >> You're getting an iSCSI connection timeout [1], [2]. It means the host
    >> cannot connect to the targets from either iface: enp9s0f1 or iface:
    >> enp9s0f0.
    >>
    >> This causes the host to lose its connection to the storage and also,
    >> the connection to the engine becomes inactive. Therefore, the host
    >> changes its status to Non-responsive [3] and since it's the SPM, the
    >> whole DC, with all its storage domains, becomes inactive.
    >>
    >>
    >> vdsm.log:
    >> [1]
    >> Traceback (most recent call last):
    >>   File "/usr/share/vdsm/storage/hsm.py", line 2400, in connectStorageServer
    >>     conObj.connect()
    >>   File "/usr/share/vdsm/storage/storageServer.py", line 508, in connect
    >>     iscsi.addIscsiNode(self._iface, self._target, self._cred)
    >>   File "/usr/share/vdsm/storage/iscsi.py", line 204, in addIscsiNode
    >>     iscsiadm.node_login(iface.name, portalStr, target.iqn)
    >>   File "/usr/share/vdsm/storage/iscsiadm.py", line 336, in node_login
    >>     raise IscsiNodeError(rc, out, err)
    >> IscsiNodeError: (8, ['Logging in to [iface: enp9s0f0, target:
    >> iqn.2005-10.org.freenas.ctl:tgtb, portal: 10.0.132.121,3260]
    >> (multiple)'], ['iscsiadm: Could not login to [iface: enp9s0f0, target:
    >> iqn.2005-10.org.freenas.ctl:tgtb, portal: 10.0.132.121,3260].',
    >> 'iscsiadm: initiator reported error (8 - connection timed out)',
    >> 'iscsiadm: Could not log into all portals'])
    >>
    >>
    >>
    >> vdsm.log:
    >> [2]
    >> Traceback (most recent call last):
    >>   File "/usr/share/vdsm/storage/hsm.py", line 2400, in connectStorageServer
    >>     conObj.connect()
    >>   File "/usr/share/vdsm/storage/storageServer.py", line 508, in connect
    >>     iscsi.addIscsiNode(self._iface, self._target, self._cred)
    >>   File "/usr/share/vdsm/storage/iscsi.py", line 204, in addIscsiNode
    >>     iscsiadm.node_login(iface.name, portalStr, target.iqn)
    >>   File "/usr/share/vdsm/storage/iscsiadm.py", line 336, in node_login
    >>     raise IscsiNodeError(rc, out, err)
    >> IscsiNodeError: (8, ['Logging in to [iface: enp9s0f1, target:
    >> iqn.2005-10.org.freenas.ctl:tgta, portal: 10.0.131.121,3260]
    >> (multiple)'], ['iscsiadm: Could not login to [iface: enp9s0f1, target:
    >> iqn.2005-10.org.freenas.ctl:tgta, portal: 10.0.131.121,3260].',
    >> 'iscsiadm: initiator reported error (8 - connection timed out)',
    >> 'iscsiadm: Could not log into all portals'])
    >>
    >>
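
When there are many such tracebacks, the failing iface/target/portal
combinations can be pulled out of the iscsiadm error text with a regular
expression. This is a rough sketch that assumes the message format shown in
the two vdsm.log excerpts above; the function name and pattern are
illustrative and are not part of vdsm.

```python
import re

# Matches the "Could not login to [...]" fragment seen in the excerpts.
LOGIN_FAILURE = re.compile(
    r"Could not login to \[iface: (?P<iface>[^,\]]+), "
    r"target: (?P<target>[^,\]]+), "
    r"portal: (?P<host>[^,\]]+),(?P<port>\d+)\]"
)


def failed_logins(log_text):
    """Return (iface, target, portal_host, portal_port) for each failed login.

    Whitespace is collapsed first so that messages wrapped across lines
    (as in a mail archive) still match.
    """
    flat = " ".join(log_text.split())
    return [
        (m.group("iface"), m.group("target"), m.group("host"), int(m.group("port")))
        for m in LOGIN_FAILURE.finditer(flat)
    ]


if __name__ == "__main__":
    sample = """
    IscsiNodeError: (8, ['Logging in to [iface: enp9s0f0, target:
    iqn.2005-10.org.freenas.ctl:tgtb, portal: 10.0.132.121,3260]
    (multiple)'], ['iscsiadm: Could not login to [iface: enp9s0f0, target:
    iqn.2005-10.org.freenas.ctl:tgtb, portal: 10.0.132.121,3260].',
    'iscsiadm: initiator reported error (8 - connection timed out)'])
    """
    for entry in failed_logins(sample):
        print(entry)
```

Quote markers such as ">>" would need to be stripped from the text before
matching if the input is copied from a mail archive rather than from the log.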
    >> engine.log:
    >> [3]
    >>
    >>
    >> 2016-08-24 14:10:23,222 WARN
    >> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
    >> (default task-25) [15d1637f] Correlation ID: 15d1637f, Call Stack: null,
    >> Custom Event ID: -1, Message: iSCSI bond 'iBond' was successfully created
    >> in Data Center 'Default' but some of the hosts encountered connection
    >> issues.
    >>
    >>
    >>
    >> 2016-08-24 14:10:23,208 INFO
    >> [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStorageServerVDSCommand]
    >> (org.ovirt.thread.pool-8-thread-25) [15d1637f] Command
    >> 'org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStorageServerVDSCommand'
    >> return value '
    >> ServerConnectionStatusReturnForXmlRpc:{status='StatusForXmlRpc
    >> [code=5022, message=Message timeout which can be caused by communication
    >> issues]'}
    >>
    >>
    >>
    >> On Wed, Aug 24, 2016 at 4:04 PM, Uwe Laverenz <uwe@laverenz.de> wrote:
    >>
    >>     Hi Elad,
    >>
    >>     I sent you a download message.
    >>
    >>     thank you,
    >>     Uwe
    >>     _______________________________________________
    >>     Users mailing list
    >>     Users@ovirt.org
    >>     http://lists.ovirt.org/mailman/listinfo/users
    >>
    >>