On Mon, Apr 10, 2017 at 3:06 PM, Gianluca Cecchi <gianluca.cecchi(a)gmail.com>
wrote:
On Mon, Apr 10, 2017 at 2:44 PM, Ondrej Svoboda <osvoboda(a)redhat.com>
wrote:
> Yes, this is what struck me about your situation. Will you be able to
> find relevant logs regarding multipath configuration, in which we would see
> when (or even why) the third connection was created on the first node, and
> only one connection on the second?
>
> On Mon, Apr 10, 2017 at 2:17 PM, Gianluca Cecchi <
> gianluca.cecchi(a)gmail.com> wrote:
>
>> On Mon, Apr 10, 2017 at 2:12 PM, Ondrej Svoboda <osvoboda(a)redhat.com>
>> wrote:
>>
>>> Gianluca,
>>>
>>> I can see that the workaround you describe here (to complete multipath
>>> configuration in CLI) fixes an inconsistency in observed iSCSI sessions. I
>>> think it is a shortcoming in oVirt that you had to resort to manual
>>> configuration. Could you file a bug about this? Ideally, following the bug
>>> template presented to you by Bugzilla, i.e. "Expected: two iSCSI sessions",
>>> "Got: one on the first node ... one on the second node".
>>>
>>> Edy, Martin, do you think you could help out here?
>>>
>>> Thanks,
>>> Ondra
>>>
>>
>> Ok, this evening I'm going to open a bugzilla for that.
>> Please keep in mind that on the already configured node (where before
>> node addition there were two connections in place with multipath), actually
>> the node addition generates a third connection, added to the existing two,
>> using "default" as iSCSI interface (clearly seen if I run
"iscsiadm -m
>> session -P1") ....
>>
>> Gianluca
>>
>>
>
>
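As an aside, a quick way to spot a session bound to the generic "default"
iface, like the third one mentioned above, is to list only the target and
iface name of each session:

# one 'Target:' line per target, one 'Iface Name:' line per session
iscsiadm -m session -P 1 | grep -E 'Target:|Iface Name:'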
vdsm log of the already configured host is here for that day:
https://drive.google.com/file/d/0BwoPbcrMv8mvQzdCUmtIT1NOT2c/view?usp=sharing
Installation / configuration of the second node happened between 11:30 AM
and 01:30 PM on the 6th of April.
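If you want to jump straight to the relevant entries (a rough sketch,
assuming you saved the file as vdsm.log; traceback continuation lines carry
no timestamp and will not match):

# storage-related entries in the 11:30-13:30 window of that day
grep '2017-04-06 1[123]:' vdsm.log | grep -E 'getVGInfo|storage.Monitor'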
Around 12:29 you will find:
2017-04-06 12:29:05,832+0200 INFO (jsonrpc/7) [dispatcher] Run and protect: getVGInfo, Return response:
{'info': {'state': 'OK', 'vgsize': '1099108974592', 'name': '5ed04196-87f1-480e-9fee-9dd450a3b53b',
'vgfree': '182536110080', 'vgUUID': 'rIENae-3NLj-o4t8-GVuJ-ZKKb-ksTk-qBkMrE',
'pvlist': [{'vendorID': 'EQLOGIC', 'capacity': '1099108974592', 'fwrev': '0000',
'pe_alloc_count': '6829', 'vgUUID': 'rIENae-3NLj-o4t8-GVuJ-ZKKb-ksTk-qBkMrE',
'pathlist': [
{'connection': '10.10.100.9', 'iqn': 'iqn.2001-05.com.equallogic:4-771816-e5d0dfb59-1c9b240297958d53-ovsd3910', 'portal': '1', 'port': '3260', 'initiatorname': 'p1p1.100'},
{'connection': '10.10.100.9', 'iqn': 'iqn.2001-05.com.equallogic:4-771816-e5d0dfb59-1c9b240297958d53-ovsd3910', 'portal': '1', 'port': '3260', 'initiatorname': 'p1p2'},
{'connection': '10.10.100.9', 'iqn': 'iqn.2001-05.com.equallogic:4-771816-e5d0dfb59-1c9b240297958d53-ovsd3910', 'portal': '1', 'port': '3260', 'initiatorname': 'default'}],
'pe_count': '8189', 'discard_max_bytes': 15728640,
'pathstatus': [
{'type': 'iSCSI', 'physdev': 'sde', 'capacity': '1099526307840', 'state': 'active', 'lun': '0'},
{'type': 'iSCSI', 'physdev': 'sdf', 'capacity': '1099526307840', 'state': 'active', 'lun': '0'},
{'type': 'iSCSI', 'physdev': 'sdg', 'capacity': '1099526307840', 'state': 'active', 'lun': '0'}],
'devtype': 'iSCSI', 'discard_zeroes_data': 1, 'pvUUID': 'g9pjI0-oifQ-kz2O-0Afy-xdnx-THYD-eTWgqB',
'serial': 'SEQLOGIC_100E-00_64817197B5DFD0E5538D959702249B1C', 'GUID': '364817197b5dfd0e5538d959702249b1c',
'devcapacity': '1099526307840', 'productID': '100E-00'}],
'type': 3, 'attr': {'allocation': 'n', 'partial': '-', 'exported': '-', 'permission': 'w', 'clustered': '-', 'resizeable': 'z'}}} (logUtils:54)
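Note the three 'initiatorname' values in the pathlist above: p1p1.100, p1p2
and the extra "default". To tally them across the whole log (again assuming
the file is saved as vdsm.log):

# count how often each initiatorname appears in getVGInfo responses
grep -o "'initiatorname': '[^']*'" vdsm.log | sort | uniq -c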
and around 12:39 you will find:
2017-04-06 12:39:11,003+0200 ERROR (check/loop) [storage.Monitor] Error checking path /dev/5ed04196-87f1-480e-9fee-9dd450a3b53b/metadata (monitor:485)
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/monitor.py", line 483, in _pathChecked
    delay = result.delay()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/check.py", line 368, in delay
    raise exception.MiscFileReadException(self.path, self.rc, self.err)
MiscFileReadException: Internal file read failure: ('/dev/5ed04196-87f1-480e-9fee-9dd450a3b53b/metadata', 1, bytearray(b"/usr/bin/dd: error reading \'/dev/5ed04196-87f1-480e-9fee-9dd450a3b53b/metadata\': Input/output error\n0+0 records in\n0+0 records out\n0 bytes (0 B) copied, 0.000234164 s, 0.0 kB/s\n"))
2017-04-06 12:39:11,020+0200 INFO (check/loop) [storage.Monitor] Domain 5ed04196-87f1-480e-9fee-9dd450a3b53b became INVALID (monitor:456)
That, I think, corresponds to the moment when I executed "iscsiadm -m
session -u" and the correctly defined paths were automatically re-established.
Gianluca
So I come back here because I have an "orthogonal" action with the same
effect.
I already have in place the same 2 oVirt hosts, using one 4 TB iSCSI LUN,
with the configuration detailed at the beginning of the thread, which I
resend here.
On them I have multipath access, defined from inside oVirt, that looks like
this at OS level:
[root@ov300 ~]# iscsiadm -m session -P 1
Target:
iqn.2001-05.com.equallogic:4-771816-e5d0dfb59-1c9b240297958d53-ovsd3910
(non-flash)
Current Portal: 10.10.100.41:3260,1
Persistent Portal: 10.10.100.9:3260,1
**********
Interface:
**********
Iface Name: p1p2
Iface Transport: tcp
Iface Initiatorname: iqn.1994-05.com.redhat:f2d7fc1e2fc
Iface IPaddress: 10.10.100.88
Iface HWaddress: <empty>
Iface Netdev: p1p2
SID: 1
iSCSI Connection State: LOGGED IN
iSCSI Session State: LOGGED_IN
Internal iscsid Session State: NO CHANGE
Current Portal: 10.10.100.42:3260,1
Persistent Portal: 10.10.100.9:3260,1
**********
Interface:
**********
Iface Name: p1p1.100
Iface Transport: tcp
Iface Initiatorname: iqn.1994-05.com.redhat:f2d7fc1e2fc
Iface IPaddress: 10.10.100.87
Iface HWaddress: <empty>
Iface Netdev: p1p1.100
SID: 2
iSCSI Connection State: LOGGED IN
iSCSI Session State: LOGGED_IN
Internal iscsid Session State: NO CHANGE
So the network adapters used in the multipath config are p1p2 and p1p1.100.
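As far as I understand, these are iface records that oVirt creates in the
open-iscsi database when you define the iSCSI bond; the manual equivalent
(shown only for illustration) would be:

# create an iface record and bind it to the p1p2 network device
iscsiadm -m iface -I p1p2 --op new
iscsiadm -m iface -I p1p2 --op update -n iface.net_ifacename -v p1p2
# list all defined iface records
iscsiadm -m iface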
Now I go and add a new storage domain: a 5 TB LUN on another Dell EqualLogic
storage array with IP address 10.10.100.7, on the same LAN as the existing one.
I go through
Storage -> New Domain
discover targets with that IP
only one path is detected
I connect and the storage domain is activated
But on both oVirt hosts I have only one path for this LUN, and at portal
level the generic "default" Iface Name is used:
"iscsiadm -m session -P1" gives the two connections above, plus:
Target:
iqn.2001-05.com.equallogic:0-8a0906-9d1c8500d-28e3c937b8d59521-ovsd3750
(non-flash)
Current Portal: 10.10.100.38:3260,1
Persistent Portal: 10.10.100.7:3260,1
**********
Interface:
**********
Iface Name: default
Iface Transport: tcp
Iface Initiatorname: iqn.1994-05.com.redhat:f2d7fc1e2fc
Iface IPaddress: 10.10.100.88
Iface HWaddress: <empty>
Iface Netdev: <empty>
SID: 3
iSCSI Connection State: LOGGED IN
iSCSI Session State: LOGGED_IN
Internal iscsid Session State: NO CHANGE
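What I would have expected (my assumption about the intended behavior) is a
discovery bound to the two bonded ifaces, so that the subsequent login
creates one session per iface; in plain iscsiadm terms, something like:

# discover the new portal through both bonded ifaces, creating one node record per iface
iscsiadm -m discovery -t sendtargets -p 10.10.100.7:3260 -I p1p1.100 -I p1p2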
I could now retry putting the host into maintenance and then activating it,
to see whether it then uses iscsi1 and iscsi2, but instead I proceed manually:
Datacenter --> MyDC
iSCSI Multipathing
iscsi1 -> edit
I see that only the 10.10.100.9 target (the 4 TB storage) is selected
--> I select the second one too (10.10.100.7) --> OK
No tasks are shown in the web GUI, but at OS level I see this session added:
Current Portal: 10.10.100.37:3260,1
Persistent Portal: 10.10.100.7:3260,1
**********
Interface:
**********
Iface Name: p1p1.100
Iface Transport: tcp
Iface Initiatorname: iqn.1994-05.com.redhat:f2d7fc1e2fc
Iface IPaddress: 10.10.100.87
Iface HWaddress: <empty>
Iface Netdev: p1p1.100
SID: 4
iSCSI Connection State: LOGGED IN
iSCSI Session State: LOGGED_IN
Internal iscsid Session State: NO CHANGE
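The plain-iscsiadm equivalent of what this edit seems to have triggered (a
sketch; it assumes a node record for that iface already exists, e.g. from a
discovery run with -I p1p1.100):

# log in to the new target through the p1p1.100 iface
iscsiadm -m node -T iqn.2001-05.com.equallogic:0-8a0906-9d1c8500d-28e3c937b8d59521-ovsd3750 -p 10.10.100.7:3260 -I p1p1.100 --login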
The same for iscsi2 --> edit
only the 10.10.100.9 target (the 4 TB storage) is selected
--> I select the second one too (10.10.100.7) --> OK
Now I see a total of 3 connections for this LUN (the initial "default" one
plus the iscsi1 and iscsi2 ones):
Target:
iqn.2001-05.com.equallogic:0-8a0906-9d1c8500d-28e3c937b8d59521-ovsd3750
(non-flash)
Current Portal: 10.10.100.38:3260,1
Persistent Portal: 10.10.100.7:3260,1
**********
Interface:
**********
Iface Name: default
Iface Transport: tcp
Iface Initiatorname: iqn.1994-05.com.redhat:f2d7fc1e2fc
Iface IPaddress: 10.10.100.88
Iface HWaddress: <empty>
Iface Netdev: <empty>
SID: 3
iSCSI Connection State: LOGGED IN
iSCSI Session State: LOGGED_IN
Internal iscsid Session State: NO CHANGE
**********
Interface:
**********
Iface Name: p1p2
Iface Transport: tcp
Iface Initiatorname: iqn.1994-05.com.redhat:f2d7fc1e2fc
Iface IPaddress: 10.10.100.88
Iface HWaddress: <empty>
Iface Netdev: p1p2
SID: 5
iSCSI Connection State: LOGGED IN
iSCSI Session State: LOGGED_IN
Internal iscsid Session State: NO CHANGE
Current Portal: 10.10.100.37:3260,1
Persistent Portal: 10.10.100.7:3260,1
**********
Interface:
**********
Iface Name: p1p1.100
Iface Transport: tcp
Iface Initiatorname: iqn.1994-05.com.redhat:f2d7fc1e2fc
Iface IPaddress: 10.10.100.87
Iface HWaddress: <empty>
Iface Netdev: p1p1.100
SID: 4
iSCSI Connection State: LOGGED IN
iSCSI Session State: LOGGED_IN
Internal iscsid Session State: NO CHANGE
[root@ov300 ~]# iscsiadm -m session
tcp: [1] 10.10.100.9:3260,1
iqn.2001-05.com.equallogic:4-771816-e5d0dfb59-1c9b240297958d53-ovsd3910
(non-flash)
tcp: [2] 10.10.100.9:3260,1
iqn.2001-05.com.equallogic:4-771816-e5d0dfb59-1c9b240297958d53-ovsd3910
(non-flash)
tcp: [3] 10.10.100.7:3260,1
iqn.2001-05.com.equallogic:0-8a0906-9d1c8500d-28e3c937b8d59521-ovsd3750
(non-flash)
tcp: [4] 10.10.100.7:3260,1
iqn.2001-05.com.equallogic:0-8a0906-9d1c8500d-28e3c937b8d59521-ovsd3750
(non-flash)
tcp: [5] 10.10.100.7:3260,1
iqn.2001-05.com.equallogic:0-8a0906-9d1c8500d-28e3c937b8d59521-ovsd3750
(non-flash)
[root@ov300 ~]#
At the end I manually remove the wrong "default" session created by the web
GUI, on both nodes:
[root@ov300 ~]# iscsiadm -m session -r 3 -u
Logging out of session [sid: 3, target:
iqn.2001-05.com.equallogic:0-8a0906-9d1c8500d-28e3c937b8d59521-ovsd3750,
portal: 10.10.100.7,3260]
Logout of [sid: 3, target:
iqn.2001-05.com.equallogic:0-8a0906-9d1c8500d-28e3c937b8d59521-ovsd3750,
portal: 10.10.100.7,3260] successful.
[root@ov300 ~]#
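To keep the "default" record from being re-used at the next login or reboot
(my assumption about where that session comes from), the corresponding node
record can be deleted as well:

# remove the node record bound to the generic "default" iface
iscsiadm -m node -T iqn.2001-05.com.equallogic:0-8a0906-9d1c8500d-28e3c937b8d59521-ovsd3750 -p 10.10.100.7:3260 -I default --op delete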
Is this the expected workflow, or, with multipathing configured at oVirt
level, should oVirt have directly configured the new LUN with both paths?
I still have to verify that, if I put one host into maintenance and then
activate it, everything goes as expected for both LUNs.
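The check I have in mind after that maintenance/activate cycle is simply, on
both hosts:

# expect one session per bonded iface for each target, and no "default" one
iscsiadm -m session
# expect two active paths per LUN
multipath -ll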
Thanks,
Gianluca