On Mon, Apr 10, 2017 at 3:06 PM, Gianluca Cecchi <gianluca.cecchi@gmail.com> wrote:


On Mon, Apr 10, 2017 at 2:44 PM, Ondrej Svoboda <osvoboda@redhat.com> wrote:
Yes, this is what struck me about your situation. Will you be able to find relevant logs regarding multipath configuration, in which we would see when (or even why) the third connection was created on the first node, and only one connection on the second?

On Mon, Apr 10, 2017 at 2:17 PM, Gianluca Cecchi <gianluca.cecchi@gmail.com> wrote:
On Mon, Apr 10, 2017 at 2:12 PM, Ondrej Svoboda <osvoboda@redhat.com> wrote:
Gianluca,

I can see that the workaround you describe here (to complete multipath configuration in the CLI) fixes an inconsistency in observed iSCSI sessions. I think it is a shortcoming in oVirt that you had to resort to manual configuration. Could you file a bug about this? Ideally, following the bug template presented to you by Bugzilla, i.e. "Expected: two iSCSI sessions", "Got: three sessions on the first node ... one on the second node".

Edy, Martin, do you think you could help out here?

Thanks,
Ondra

Ok, this evening I'm going to open a bugzilla for that.
Please keep in mind that on the already configured node (where, before the node addition, there were two connections in place with multipath), the node addition actually generates a third connection, added to the existing two, using "default" as the iSCSI interface (clearly seen if I run "iscsiadm -m session -P1") ....
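A quick way to spot the extra session is to map each SID to its iface from the "iscsiadm -m session -P1" output. A minimal Python sketch; on a live host you would capture the output with subprocess, here the sample lines simply mirror the output quoted later in this thread:

```python
# Map each iSCSI session (SID) to the iface it is bound to, parsed from saved
# "iscsiadm -m session -P1" output. The unexpected third session shows up
# bound to the generic "default" iface.
p1_output = """\
Iface Name: p1p2
SID: 1
Iface Name: p1p1.100
SID: 2
Iface Name: default
SID: 3
"""

iface = None
sid_to_iface = {}
for line in p1_output.splitlines():
    if line.startswith("Iface Name:"):
        iface = line.split(":", 1)[1].strip()
    elif line.startswith("SID:"):
        sid_to_iface[int(line.split(":", 1)[1])] = iface

print(sid_to_iface)  # -> {1: 'p1p2', 2: 'p1p1.100', 3: 'default'}
```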

Gianluca
 


vdsm log of the already configured host is here for that day:

Installation/configuration of the second node happened between 11:30 AM and 1:30 PM on the 6th of April.

Around 12:29 you will find:

2017-04-06 12:29:05,832+0200 INFO  (jsonrpc/7) [dispatcher] Run and protect: getVGInfo, Return response: {'info': {'state': 'OK', 'vgsize': '1099108974592', 'name': '5ed04196-87f1-480e-9fee-9dd450a3b53b', 'vgfree': '182536110080', 'vgUUID': 'rIENae-3NLj-o4t8-GVuJ-ZKKb-ksTk-qBkMrE', 'pvlist': [{'vendorID': 'EQLOGIC', 'capacity': '1099108974592', 'fwrev': '0000', 'pe_alloc_count': '6829', 'vgUUID': 'rIENae-3NLj-o4t8-GVuJ-ZKKb-ksTk-qBkMrE', 'pathlist': [{'connection': '10.10.100.9', 'iqn': 'iqn.2001-05.com.equallogic:4-771816-e5d0dfb59-1c9b240297958d53-ovsd3910', 'portal': '1', 'port': '3260', 'initiatorname': 'p1p1.100'}, {'connection': '10.10.100.9', 'iqn': 'iqn.2001-05.com.equallogic:4-771816-e5d0dfb59-1c9b240297958d53-ovsd3910', 'portal': '1', 'port': '3260', 'initiatorname': 'p1p2'}, {'connection': '10.10.100.9', 'iqn': 'iqn.2001-05.com.equallogic:4-771816-e5d0dfb59-1c9b240297958d53-ovsd3910', 'portal': '1', 'port': '3260', 'initiatorname': 'default'}], 'pe_count': '8189', 'discard_max_bytes': 15728640, 'pathstatus': [{'type': 'iSCSI', 'physdev': 'sde', 'capacity': '1099526307840', 'state': 'active', 'lun': '0'}, {'type': 'iSCSI', 'physdev': 'sdf', 'capacity': '1099526307840', 'state': 'active', 'lun': '0'}, {'type': 'iSCSI', 'physdev': 'sdg', 'capacity': '1099526307840', 'state': 'active', 'lun': '0'}], 'devtype': 'iSCSI', 'discard_zeroes_data': 1, 'pvUUID': 'g9pjI0-oifQ-kz2O-0Afy-xdnx-THYD-eTWgqB', 'serial': 'SEQLOGIC_100E-00_64817197B5DFD0E5538D959702249B1C', 'GUID': '364817197b5dfd0e5538d959702249b1c', 'devcapacity': '1099526307840', 'productID': '100E-00'}], 'type': 3, 'attr': {'allocation': 'n', 'partial': '-', 'exported': '-', 'permission': 'w', 'clustered': '-', 'resizeable': 'z'}}} (logUtils:54)
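The interesting part of that getVGInfo response is the 'pathlist': three entries to the same portal, one per iSCSI iface, including the stray "default" one. A minimal Python sketch, with the entries trimmed to the relevant fields and copied from the log above:

```python
# The 'pathlist' entries from the getVGInfo response above, trimmed to the
# fields relevant here (same portal, same target, three different ifaces).
pathlist = [
    {'connection': '10.10.100.9', 'port': '3260', 'initiatorname': 'p1p1.100'},
    {'connection': '10.10.100.9', 'port': '3260', 'initiatorname': 'p1p2'},
    {'connection': '10.10.100.9', 'port': '3260', 'initiatorname': 'default'},
]

# The ifaces VDSM reports for the LUN; 'default' is the unexpected third path.
ifaces = [p['initiatorname'] for p in pathlist]
print(ifaces)  # -> ['p1p1.100', 'p1p2', 'default']
```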

and around 12:39 you will find:

2017-04-06 12:39:11,003+0200 ERROR (check/loop) [storage.Monitor] Error checking path /dev/5ed04196-87f1-480e-9fee-9dd450a3b53b/metadata (monitor:485)
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/monitor.py", line 483, in _pathChecked
    delay = result.delay()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/check.py", line 368, in delay
    raise exception.MiscFileReadException(self.path, self.rc, self.err)
MiscFileReadException: Internal file read failure: ('/dev/5ed04196-87f1-480e-9fee-9dd450a3b53b/metadata', 1, bytearray(b"/usr/bin/dd: error reading \'/dev/5ed04196-87f1-480e-9fee-9dd450a3b53b/metadata\': Input/output error\n0+0 records in\n0+0 records out\n0 bytes (0 B) copied, 0.000234164 s, 0.0 kB/s\n"))
2017-04-06 12:39:11,020+0200 INFO  (check/loop) [storage.Monitor] Domain 5ed04196-87f1-480e-9fee-9dd450a3b53b became INVALID (monitor:456)

which I think corresponds to the moment when I executed "iscsiadm -m session -u" and got the automatic remediation of the correctly defined paths.

Gianluca

So I come back to this thread because I have an "orthogonal" action with the same effect.

I already have in place the same 2 oVirt hosts using one 4 TB iSCSI LUN, with the configuration detailed at the beginning of the thread, which I resend:
http://lists.ovirt.org/pipermail/users/2017-March/080992.html

On them, the multipath access defined from inside oVirt looks like this at OS level:

[root@ov300 ~]# iscsiadm -m session -P 1
Target: iqn.2001-05.com.equallogic:4-771816-e5d0dfb59-1c9b240297958d53-ovsd3910 (non-flash)
Current Portal: 10.10.100.41:3260,1
Persistent Portal: 10.10.100.9:3260,1
**********
Interface:
**********
Iface Name: p1p2
Iface Transport: tcp
Iface Initiatorname: iqn.1994-05.com.redhat:f2d7fc1e2fc
Iface IPaddress: 10.10.100.88
Iface HWaddress: <empty>
Iface Netdev: p1p2
SID: 1
iSCSI Connection State: LOGGED IN
iSCSI Session State: LOGGED_IN
Internal iscsid Session State: NO CHANGE
Current Portal: 10.10.100.42:3260,1
Persistent Portal: 10.10.100.9:3260,1
**********
Interface:
**********
Iface Name: p1p1.100
Iface Transport: tcp
Iface Initiatorname: iqn.1994-05.com.redhat:f2d7fc1e2fc
Iface IPaddress: 10.10.100.87
Iface HWaddress: <empty>
Iface Netdev: p1p1.100
SID: 2
iSCSI Connection State: LOGGED IN
iSCSI Session State: LOGGED_IN
Internal iscsid Session State: NO CHANGE

So the network adapters used in the multipath config are p1p2 and p1p1.100.

Now I go and add a new storage domain: a 5 TB LUN on another Dell EqualLogic storage array with IP address 10.10.100.7, so on the same LAN as the existing one.

I go through
Storage --> New Domain
discover targets with the IP
only one path is detected
I connect and the storage domain is activated

But on both oVirt hosts I have only one path for this LUN, and at portal level the generic "default" Iface Name is used:

"iscsiadm -m session -P1" gives the 2 connections above, plus:

Target: iqn.2001-05.com.equallogic:0-8a0906-9d1c8500d-28e3c937b8d59521-ovsd3750 (non-flash)
Current Portal: 10.10.100.38:3260,1
Persistent Portal: 10.10.100.7:3260,1
**********
Interface:
**********
Iface Name: default
Iface Transport: tcp
Iface Initiatorname: iqn.1994-05.com.redhat:f2d7fc1e2fc
Iface IPaddress: 10.10.100.88
Iface HWaddress: <empty>
Iface Netdev: <empty>
SID: 3
iSCSI Connection State: LOGGED IN
iSCSI Session State: LOGGED_IN
Internal iscsid Session State: NO CHANGE


I could now retry putting the host in maintenance and then activating it, to see if it would then use iscsi1 and iscsi2, but instead I proceed manually.

From web gui
Datacenter --> MyDC
iSCSI Multipathing

iscsi1 --> edit
I see that only the 10.10.100.9 (4 TB storage) target is selected
--> I select the second one too (10.10.100.7) --> OK

No tasks are shown in the web GUI, but at OS level I see this session added:

Current Portal: 10.10.100.37:3260,1
Persistent Portal: 10.10.100.7:3260,1
**********
Interface:
**********
Iface Name: p1p1.100
Iface Transport: tcp
Iface Initiatorname: iqn.1994-05.com.redhat:f2d7fc1e2fc
Iface IPaddress: 10.10.100.87
Iface HWaddress: <empty>
Iface Netdev: p1p1.100
SID: 4
iSCSI Connection State: LOGGED IN
iSCSI Session State: LOGGED_IN
Internal iscsid Session State: NO CHANGE


The same for iscsi2 --> edit
only the 10.10.100.9 (4 TB storage) target is selected
--> I select the second one too (10.10.100.7) --> OK

Now I see a total of 3 connections for this LUN (the initial "default" one, plus the iscsi1 and iscsi2 ones):

Target: iqn.2001-05.com.equallogic:0-8a0906-9d1c8500d-28e3c937b8d59521-ovsd3750 (non-flash)
Current Portal: 10.10.100.38:3260,1
Persistent Portal: 10.10.100.7:3260,1
**********
Interface:
**********
Iface Name: default
Iface Transport: tcp
Iface Initiatorname: iqn.1994-05.com.redhat:f2d7fc1e2fc
Iface IPaddress: 10.10.100.88
Iface HWaddress: <empty>
Iface Netdev: <empty>
SID: 3
iSCSI Connection State: LOGGED IN
iSCSI Session State: LOGGED_IN
Internal iscsid Session State: NO CHANGE

**********
Interface:
**********
Iface Name: p1p2
Iface Transport: tcp
Iface Initiatorname: iqn.1994-05.com.redhat:f2d7fc1e2fc
Iface IPaddress: 10.10.100.88
Iface HWaddress: <empty>
Iface Netdev: p1p2
SID: 5
iSCSI Connection State: LOGGED IN
iSCSI Session State: LOGGED_IN
Internal iscsid Session State: NO CHANGE
Current Portal: 10.10.100.37:3260,1
Persistent Portal: 10.10.100.7:3260,1
**********
Interface:
**********
Iface Name: p1p1.100
Iface Transport: tcp
Iface Initiatorname: iqn.1994-05.com.redhat:f2d7fc1e2fc
Iface IPaddress: 10.10.100.87
Iface HWaddress: <empty>
Iface Netdev: p1p1.100
SID: 4
iSCSI Connection State: LOGGED IN
iSCSI Session State: LOGGED_IN
Internal iscsid Session State: NO CHANGE


[root@ov300 ~]# iscsiadm -m session
tcp: [1] 10.10.100.9:3260,1 iqn.2001-05.com.equallogic:4-771816-e5d0dfb59-1c9b240297958d53-ovsd3910 (non-flash)
tcp: [2] 10.10.100.9:3260,1 iqn.2001-05.com.equallogic:4-771816-e5d0dfb59-1c9b240297958d53-ovsd3910 (non-flash)
tcp: [3] 10.10.100.7:3260,1 iqn.2001-05.com.equallogic:0-8a0906-9d1c8500d-28e3c937b8d59521-ovsd3750 (non-flash)
tcp: [4] 10.10.100.7:3260,1 iqn.2001-05.com.equallogic:0-8a0906-9d1c8500d-28e3c937b8d59521-ovsd3750 (non-flash)
tcp: [5] 10.10.100.7:3260,1 iqn.2001-05.com.equallogic:0-8a0906-9d1c8500d-28e3c937b8d59521-ovsd3750 (non-flash)
[root@ov300 ~]# 
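Counting sessions per target from that "iscsiadm -m session" output makes the asymmetry obvious: two sessions for the old LUN, three for the new one (the extra being the "default" SID 3). A small Python sketch over the lines quoted above:

```python
from collections import Counter

# Lines copied from the "iscsiadm -m session" output above.
sessions = """\
tcp: [1] 10.10.100.9:3260,1 iqn.2001-05.com.equallogic:4-771816-e5d0dfb59-1c9b240297958d53-ovsd3910 (non-flash)
tcp: [2] 10.10.100.9:3260,1 iqn.2001-05.com.equallogic:4-771816-e5d0dfb59-1c9b240297958d53-ovsd3910 (non-flash)
tcp: [3] 10.10.100.7:3260,1 iqn.2001-05.com.equallogic:0-8a0906-9d1c8500d-28e3c937b8d59521-ovsd3750 (non-flash)
tcp: [4] 10.10.100.7:3260,1 iqn.2001-05.com.equallogic:0-8a0906-9d1c8500d-28e3c937b8d59521-ovsd3750 (non-flash)
tcp: [5] 10.10.100.7:3260,1 iqn.2001-05.com.equallogic:0-8a0906-9d1c8500d-28e3c937b8d59521-ovsd3750 (non-flash)
""".splitlines()

# Whitespace-split field 3 of each line is the target IQN.
per_target = Counter(line.split()[3] for line in sessions)
for iqn, n in per_target.items():
    print(iqn, n)
```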

In the end I manually remove, on both nodes, the wrong "default" session created by the web GUI:

[root@ov300 ~]# iscsiadm -m session -r 3 -u
Logging out of session [sid: 3, target: iqn.2001-05.com.equallogic:0-8a0906-9d1c8500d-28e3c937b8d59521-ovsd3750, portal: 10.10.100.7,3260]
Logout of [sid: 3, target: iqn.2001-05.com.equallogic:0-8a0906-9d1c8500d-28e3c937b8d59521-ovsd3750, portal: 10.10.100.7,3260] successful.
[root@ov300 ~]# 
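To repeat that cleanup on both nodes without hunting SIDs by eye, one could generate the logout commands from the "-P1" output. A dry-run Python sketch (it only prints the commands, actually running them is left to the operator; the sample lines are from the output above):

```python
# Dry-run sketch: generate "iscsiadm -m session -r <SID> -u" logout commands
# for every session bound to the generic "default" iface, parsed from saved
# "iscsiadm -m session -P1" output. Nothing is executed here.
p1_output = """\
Iface Name: default
SID: 3
Iface Name: p1p2
SID: 5
Iface Name: p1p1.100
SID: 4
"""

iface = None
commands = []
for line in p1_output.splitlines():
    if line.startswith("Iface Name:"):
        iface = line.split(":", 1)[1].strip()
    elif line.startswith("SID:") and iface == "default":
        sid = line.split(":", 1)[1].strip()
        commands.append("iscsiadm -m session -r %s -u" % sid)

for cmd in commands:
    print(cmd)  # -> iscsiadm -m session -r 3 -u
```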

Is this the expected workflow, or, when multipathing is configured at the oVirt level, should oVirt have directly configured the new LUN in a multipathed way?

I still have to verify that, if I put one host in maintenance and then activate it, everything goes as expected for both LUNs.

Thanks,
Gianluca