
On Wed, Aug 24, 2016 at 6:15 PM, InterNetX - Juergen Gotteswinter <juergen.gotteswinter@internetx.com> wrote:
iSCSI & oVirt is an awful combination, no matter if multipathed or bonded. It's always a gamble how long it will work, and when it fails, why it failed.
It's supersensitive to latency,
Can you elaborate on this?
and superfast at setting a host to inactive because the engine thinks something is wrong with it.
Typically it takes at least 5 minutes of abnormal monitoring conditions before the engine will make a host non-operational. Is this really superfast?
In most cases there was no real reason for it.
I think the main issue was the mixing of storage monitoring and lvm refreshes, unneeded serialization of lvm commands, and bad locking on the engine side. The engine side was fixed in 3.6, and the vdsm side in 4.0. See https://bugzilla.redhat.com/1081962

In RHEL/CentOS 7.2, a lot of multipath-related issues were fixed, and the oVirt multipath configuration was fixed to prevent unwanted I/O queuing with some devices, which could lead to long delays and failures in many flows. However, I think our configuration is too extreme, and you may like to use the configuration in this patch: https://gerrit.ovirt.org/61281

I guess trying 4.0 may be too bleeding edge for you, but hopefully you will find that your iSCSI setup is much more reliable now. Please file bugs if you still have issues with 4.0.

Nir
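P.S. The unwanted I/O queuing mentioned above is controlled by multipath's no_path_retry option. A minimal sketch of a local override, with illustrative values only (not necessarily what the patch above proposes) and assuming your multipath build reads /etc/multipath/conf.d:

    # /etc/multipath/conf.d/local.conf -- illustrative values, see the
    # gerrit patch above for what is actually proposed
    defaults {
        # "fail" (or 0) fails I/O as soon as all paths are down, i.e. no queuing;
        # a small number such as 4 retries for a few polling intervals before
        # failing, instead of queuing forever ("queue_if_no_path").
        no_path_retry    4
    }

Reload the maps afterwards (multipath -r, or restart multipathd) so the setting takes effect.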
We had this with several different hardware combinations: self-built filers based on FreeBSD/Illumos & ZFS, an Equallogic SAN, and a Nexenta filer.
Been there, done that, won't do it again.
Am 24.08.2016 um 16:04 schrieb Uwe Laverenz:
Hi Elad,
thank you very much for clearing things up.
Initiator/iface 'a' tries to connect to target 'b' and vice versa. As 'a' and 'b' are in completely separate networks, this can never work as long as there is no routing between the networks.
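Put differently (interface names, targets and portals are taken from the logs quoted below; the subnet assignment is my assumption based on them), the bond effectively attempts the equivalent of these manual logins, and it is exactly the cross-network ones that time out:

    # assuming enp9s0f0 lives in 10.0.131.0/24 and enp9s0f1 in 10.0.132.0/24,
    # and that the node records already exist from discovery
    iscsiadm -m node -T iqn.2005-10.org.freenas.ctl:tgtb -p 10.0.132.121,3260 -I enp9s0f0 --login   # times out
    iscsiadm -m node -T iqn.2005-10.org.freenas.ctl:tgta -p 10.0.131.121,3260 -I enp9s0f1 --login   # times out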
So it seems the iSCSI-bonding feature is not useful for my setup. I still wonder how and where this feature is supposed to be used?
thank you, Uwe
Am 24.08.2016 um 15:35 schrieb Elad Ben Aharon:
Thanks.
You're getting an iSCSI connection timeout [1], [2]. It means the host cannot connect to the targets from either iface enp9s0f1 or iface enp9s0f0.
This causes the host to lose its connection to the storage, and in addition its connection to the engine becomes inactive. Therefore, the host changes its status to Non-responsive [3], and since it is the SPM, the whole DC, with all its storage domains, becomes inactive.
vdsm.log: [1]

    Traceback (most recent call last):
      File "/usr/share/vdsm/storage/hsm.py", line 2400, in connectStorageServer
        conObj.connect()
      File "/usr/share/vdsm/storage/storageServer.py", line 508, in connect
        iscsi.addIscsiNode(self._iface, self._target, self._cred)
      File "/usr/share/vdsm/storage/iscsi.py", line 204, in addIscsiNode
        iscsiadm.node_login(iface.name, portalStr, target.iqn)
      File "/usr/share/vdsm/storage/iscsiadm.py", line 336, in node_login
        raise IscsiNodeError(rc, out, err)
    IscsiNodeError: (8, ['Logging in to [iface: enp9s0f0, target: iqn.2005-10.org.freenas.ctl:tgtb, portal: 10.0.132.121,3260] (multiple)'], ['iscsiadm: Could not login to [iface: enp9s0f0, target: iqn.2005-10.org.freenas.ctl:tgtb, portal: 10.0.132.121,3260].', 'iscsiadm: initiator reported error (8 - connection timed out)', 'iscsiadm: Could not log into all portals'])
vdsm.log: [2]

    Traceback (most recent call last):
      File "/usr/share/vdsm/storage/hsm.py", line 2400, in connectStorageServer
        conObj.connect()
      File "/usr/share/vdsm/storage/storageServer.py", line 508, in connect
        iscsi.addIscsiNode(self._iface, self._target, self._cred)
      File "/usr/share/vdsm/storage/iscsi.py", line 204, in addIscsiNode
        iscsiadm.node_login(iface.name, portalStr, target.iqn)
      File "/usr/share/vdsm/storage/iscsiadm.py", line 336, in node_login
        raise IscsiNodeError(rc, out, err)
    IscsiNodeError: (8, ['Logging in to [iface: enp9s0f1, target: iqn.2005-10.org.freenas.ctl:tgta, portal: 10.0.131.121,3260] (multiple)'], ['iscsiadm: Could not login to [iface: enp9s0f1, target: iqn.2005-10.org.freenas.ctl:tgta, portal: 10.0.131.121,3260].', 'iscsiadm: initiator reported error (8 - connection timed out)', 'iscsiadm: Could not log into all portals'])
engine.log: [3]
2016-08-24 14:10:23,222 WARN [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-25) [15d1637f] Correlation ID: 15d1637f, Call Stack: null, Custom Event ID: -1, Message: iSCSI bond 'iBond' was successfully created in Data Center 'Default' but some of the hosts encountered connection issues.
2016-08-24 14:10:23,208 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStorageServerVDSCommand] (org.ovirt.thread.pool-8-thread-25) [15d1637f] Command 'org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStorageServerVDSCommand' return value 'ServerConnectionStatusReturnForXmlRpc:{status='StatusForXmlRpc [code=5022, message=Message timeout which can be caused by communication issues]'}
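A quick way to check whether each iface can actually reach the portal it is being pointed at, before retrying the login (interfaces and portal addresses are taken from the log lines above; adjust to your layout):

    ping -c 3 -I enp9s0f0 10.0.132.121   # expected to fail while there is no route between the two networks
    ping -c 3 -I enp9s0f1 10.0.131.121   # same check for the other pair
    iscsiadm -m discovery -t sendtargets -p 10.0.131.121:3260   # confirms the portal answers iSCSI at all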
On Wed, Aug 24, 2016 at 4:04 PM, Uwe Laverenz <uwe@laverenz.de> wrote:
Hi Elad,
I sent you a download message.
thank you, Uwe