[ovirt-users] iSCSI storage domain and multipath when adding node
Gianluca Cecchi
gianluca.cecchi at gmail.com
Tue Jun 27 09:17:35 UTC 2017
On Mon, Apr 10, 2017 at 3:06 PM, Gianluca Cecchi <gianluca.cecchi at gmail.com>
wrote:
>
>
> On Mon, Apr 10, 2017 at 2:44 PM, Ondrej Svoboda <osvoboda at redhat.com>
> wrote:
>
>> Yes, this is what struck me about your situation. Will you be able to
>> find relevant logs regarding multipath configuration, in which we would see
>> when (or even why) the third connection was created on the first node, and
>> only one connection on the second?
>>
>> On Mon, Apr 10, 2017 at 2:17 PM, Gianluca Cecchi <
>> gianluca.cecchi at gmail.com> wrote:
>>
>>> On Mon, Apr 10, 2017 at 2:12 PM, Ondrej Svoboda <osvoboda at redhat.com>
>>> wrote:
>>>
>>>> Gianluca,
>>>>
>>>> I can see that the workaround you describe here (to complete multipath
>>>> configuration in CLI) fixes an inconsistency in observed iSCSI sessions. I
>>>> think it is a shortcoming in oVirt that you had to resort to manual
>>>> configuration. Could you file a bug about this? Ideally, following the bug
>>>> template presented to you by Bugzilla, i.e. "Expected: two iSCSI sessions",
>>>> "Got: one the first node ... one the second node".
>>>>
>>>> Edy, Martin, do you think you could help out here?
>>>>
>>>> Thanks,
>>>> Ondra
>>>>
>>>
>>> Ok, this evening I'm going to open a Bugzilla report for that.
>>> Please keep in mind that on the already configured node (where, before
>>> the node addition, there were two connections in place with multipath),
>>> the node addition actually generates a third connection, added to the
>>> existing two, using "default" as the iSCSI interface (clearly seen if I
>>> run "iscsiadm -m session -P1") ....
>>>
>>> Gianluca
>>>
>>>
>>
>>
> The vdsm log of the already configured host for that day is here:
> https://drive.google.com/file/d/0BwoPbcrMv8mvQzdCUmtIT1NOT2c/view?usp=sharing
>
> Installation / configuration of the second node happened between 11:30 AM
> and 1:30 PM on the 6th of April.
>
> Around 12:29 you will find:
>
> 2017-04-06 12:29:05,832+0200 INFO (jsonrpc/7) [dispatcher] Run and
> protect: getVGInfo, Return response: {'info': {'state': 'OK', 'vgsize':
> '1099108974592', 'name': '5ed04196-87f1-480e-9fee-9dd450a3b53b',
> 'vgfree': '182536110080', 'vgUUID': 'rIENae-3NLj-o4t8-GVuJ-ZKKb-ksTk-qBkMrE',
> 'pvlist': [{'vendorID': 'EQLOGIC', 'capacity': '1099108974592', 'fwrev':
> '0000', 'pe_alloc_count': '6829', 'vgUUID': 'rIENae-3NLj-o4t8-GVuJ-ZKKb-ksTk-qBkMrE',
> 'pathlist': [{'connection': '10.10.100.9', 'iqn':
> 'iqn.2001-05.com.equallogic:4-771816-e5d0dfb59-1c9b240297958d53-ovsd3910',
> 'portal': '1', 'port': '3260', 'initiatorname': 'p1p1.100'}, {'connection':
> '10.10.100.9', 'iqn': 'iqn.2001-05.com.equallogic:4-771816-e5d0dfb59-1c9b240297958d53-ovsd3910',
> 'portal': '1', 'port': '3260', 'initiatorname': 'p1p2'}, {'connection':
> '10.10.100.9', 'iqn': 'iqn.2001-05.com.equallogic:4-771816-e5d0dfb59-1c9b240297958d53-ovsd3910',
> 'portal': '1', 'port': '3260', 'initiatorname': 'default'}], 'pe_count':
> '8189', 'discard_max_bytes': 15728640, 'pathstatus': [{'type': 'iSCSI',
> 'physdev': 'sde', 'capacity': '1099526307840', 'state': 'active', 'lun':
> '0'}, {'type': 'iSCSI', 'physdev': 'sdf', 'capacity': '1099526307840',
> 'state': 'active', 'lun': '0'}, {'type': 'iSCSI', 'physdev': 'sdg',
> 'capacity': '1099526307840', 'state': 'active', 'lun': '0'}], 'devtype':
> 'iSCSI', 'discard_zeroes_data': 1, 'pvUUID': 'g9pjI0-oifQ-kz2O-0Afy-xdnx-THYD-eTWgqB',
> 'serial': 'SEQLOGIC_100E-00_64817197B5DFD0E5538D959702249B1C', 'GUID': '
> 364817197b5dfd0e5538d959702249b1c', 'devcapacity': '1099526307840',
> 'productID': '100E-00'}], 'type': 3, 'attr': {'allocation': 'n', 'partial':
> '-', 'exported': '-', 'permission': 'w', 'clustered': '-', 'resizeable':
> 'z'}}} (logUtils:54)
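>
> (The pathlist above is the telltale sign: three entries for the same
> portal, with initiatorname p1p1.100, p1p2 and the unwanted "default". A
> quick, hypothetical way to count them straight from the log, assuming the
> default vdsm.log location:
>
> grep -o "'initiatorname': '[^']*'" /var/log/vdsm/vdsm.log | sort | uniq -c
>
> would show the stray "default" entries alongside the two expected ones.)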
>
> and around 12:39 you will find:
>
> 2017-04-06 12:39:11,003+0200 ERROR (check/loop) [storage.Monitor] Error
> checking path /dev/5ed04196-87f1-480e-9fee-9dd450a3b53b/metadata
> (monitor:485)
> Traceback (most recent call last):
>   File "/usr/share/vdsm/storage/monitor.py", line 483, in _pathChecked
>     delay = result.delay()
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/check.py", line 368, in delay
>     raise exception.MiscFileReadException(self.path, self.rc, self.err)
> MiscFileReadException: Internal file read failure:
> ('/dev/5ed04196-87f1-480e-9fee-9dd450a3b53b/metadata', 1,
> bytearray(b"/usr/bin/dd: error reading \'/dev/5ed04196-87f1-480e-9fee-9dd450a3b53b/metadata\':
> Input/output error\n0+0 records in\n0+0 records out\n0 bytes (0 B) copied,
> 0.000234164 s, 0.0 kB/s\n"))
> 2017-04-06 12:39:11,020+0200 INFO (check/loop) [storage.Monitor] Domain
> 5ed04196-87f1-480e-9fee-9dd450a3b53b became INVALID (monitor:456)
>
> which I think corresponds to the moment when I executed "iscsiadm -m
> session -u", after which the correctly defined paths were automatically
> re-established.
>
> Gianluca
>
So I come back to this thread because I have an "orthogonal" action with
the same effect.
I already have in place the same 2 oVirt hosts using one 4TB iSCSI LUN,
with the configuration detailed at the beginning of the thread, which I
resend:
http://lists.ovirt.org/pipermail/users/2017-March/080992.html
On these hosts the multipath access defined from inside oVirt looks like
this at the OS level:
[root@ov300 ~]# iscsiadm -m session -P 1
Target: iqn.2001-05.com.equallogic:4-771816-e5d0dfb59-1c9b240297958d53-ovsd3910 (non-flash)
    Current Portal: 10.10.100.41:3260,1
    Persistent Portal: 10.10.100.9:3260,1
        **********
        Interface:
        **********
        Iface Name: p1p2
        Iface Transport: tcp
        Iface Initiatorname: iqn.1994-05.com.redhat:f2d7fc1e2fc
        Iface IPaddress: 10.10.100.88
        Iface HWaddress: <empty>
        Iface Netdev: p1p2
        SID: 1
        iSCSI Connection State: LOGGED IN
        iSCSI Session State: LOGGED_IN
        Internal iscsid Session State: NO CHANGE
    Current Portal: 10.10.100.42:3260,1
    Persistent Portal: 10.10.100.9:3260,1
        **********
        Interface:
        **********
        Iface Name: p1p1.100
        Iface Transport: tcp
        Iface Initiatorname: iqn.1994-05.com.redhat:f2d7fc1e2fc
        Iface IPaddress: 10.10.100.87
        Iface HWaddress: <empty>
        Iface Netdev: p1p1.100
        SID: 2
        iSCSI Connection State: LOGGED IN
        iSCSI Session State: LOGGED_IN
        Internal iscsid Session State: NO CHANGE
So the network adapters used in the multipath config are p1p2 and p1p1.100.
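(To double-check the iface bindings on the host, a quick sketch using plain
iscsiadm, nothing oVirt-specific assumed:

iscsiadm -m iface | grep -E "p1p1.100|p1p2"

which should list the two iface records created for these NICs.)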
Now I go and add a new storage domain: a 5TB LUN on another Dell EqualLogic
storage array with IP address 10.10.100.7, so on the same LAN as the
existing one.
I go through:
Storage --> New Domain --> discover targets with the IP
Only one path is detected; I connect, and the storage domain is activated.
But on both oVirt hosts I have only one path for this LUN, and at the
portal level the generic "default" Iface Name is used:
iscsiadm -m session -P1 gives the two connections above, plus:
Target: iqn.2001-05.com.equallogic:0-8a0906-9d1c8500d-28e3c937b8d59521-ovsd3750 (non-flash)
    Current Portal: 10.10.100.38:3260,1
    Persistent Portal: 10.10.100.7:3260,1
        **********
        Interface:
        **********
        Iface Name: default
        Iface Transport: tcp
        Iface Initiatorname: iqn.1994-05.com.redhat:f2d7fc1e2fc
        Iface IPaddress: 10.10.100.88
        Iface HWaddress: <empty>
        Iface Netdev: <empty>
        SID: 3
        iSCSI Connection State: LOGGED IN
        iSCSI Session State: LOGGED_IN
        Internal iscsid Session State: NO CHANGE
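(For reference, the manual equivalent of creating properly bound sessions
for the new target would be something like the following, assuming the
p1p1.100 and p1p2 iface records already exist:

iscsiadm -m discovery -t sendtargets -p 10.10.100.7:3260 -I p1p1.100 -I p1p2
iscsiadm -m node -T iqn.2001-05.com.equallogic:0-8a0906-9d1c8500d-28e3c937b8d59521-ovsd3750 -I p1p1.100 -l
iscsiadm -m node -T iqn.2001-05.com.equallogic:0-8a0906-9d1c8500d-28e3c937b8d59521-ovsd3750 -I p1p2 -l

but here I try to let oVirt do it through its iSCSI Multipathing feature
instead.)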
I could now try to put the host into maintenance and then activate it
again, to see whether it then uses iscsi1 and iscsi2, but instead I proceed
manually.
From the web GUI:
Datacenter --> MyDC --> iSCSI Multipathing --> iscsi1 --> Edit
I see that only the 10.10.100.9 target (the 4TB storage) is selected
--> I select the second one too (10.10.100.7) --> OK
No tasks are shown in the web GUI, but on the OS side I see this session
added:
    Current Portal: 10.10.100.37:3260,1
    Persistent Portal: 10.10.100.7:3260,1
        **********
        Interface:
        **********
        Iface Name: p1p1.100
        Iface Transport: tcp
        Iface Initiatorname: iqn.1994-05.com.redhat:f2d7fc1e2fc
        Iface IPaddress: 10.10.100.87
        Iface HWaddress: <empty>
        Iface Netdev: p1p1.100
        SID: 4
        iSCSI Connection State: LOGGED IN
        iSCSI Session State: LOGGED_IN
        Internal iscsid Session State: NO CHANGE
The same for iscsi2 --> Edit:
again only the 10.10.100.9 target (the 4TB storage) is selected
--> I select the second one too (10.10.100.7) --> OK
Now I see a total of 3 connections for this LUN (the initial "default" one
plus the iscsi1 and iscsi2 ones):
Target: iqn.2001-05.com.equallogic:0-8a0906-9d1c8500d-28e3c937b8d59521-ovsd3750 (non-flash)
    Current Portal: 10.10.100.38:3260,1
    Persistent Portal: 10.10.100.7:3260,1
        **********
        Interface:
        **********
        Iface Name: default
        Iface Transport: tcp
        Iface Initiatorname: iqn.1994-05.com.redhat:f2d7fc1e2fc
        Iface IPaddress: 10.10.100.88
        Iface HWaddress: <empty>
        Iface Netdev: <empty>
        SID: 3
        iSCSI Connection State: LOGGED IN
        iSCSI Session State: LOGGED_IN
        Internal iscsid Session State: NO CHANGE
        **********
        Interface:
        **********
        Iface Name: p1p2
        Iface Transport: tcp
        Iface Initiatorname: iqn.1994-05.com.redhat:f2d7fc1e2fc
        Iface IPaddress: 10.10.100.88
        Iface HWaddress: <empty>
        Iface Netdev: p1p2
        SID: 5
        iSCSI Connection State: LOGGED IN
        iSCSI Session State: LOGGED_IN
        Internal iscsid Session State: NO CHANGE
    Current Portal: 10.10.100.37:3260,1
    Persistent Portal: 10.10.100.7:3260,1
        **********
        Interface:
        **********
        Iface Name: p1p1.100
        Iface Transport: tcp
        Iface Initiatorname: iqn.1994-05.com.redhat:f2d7fc1e2fc
        Iface IPaddress: 10.10.100.87
        Iface HWaddress: <empty>
        Iface Netdev: p1p1.100
        SID: 4
        iSCSI Connection State: LOGGED IN
        iSCSI Session State: LOGGED_IN
        Internal iscsid Session State: NO CHANGE
[root@ov300 ~]# iscsiadm -m session
tcp: [1] 10.10.100.9:3260,1 iqn.2001-05.com.equallogic:4-771816-e5d0dfb59-1c9b240297958d53-ovsd3910 (non-flash)
tcp: [2] 10.10.100.9:3260,1 iqn.2001-05.com.equallogic:4-771816-e5d0dfb59-1c9b240297958d53-ovsd3910 (non-flash)
tcp: [3] 10.10.100.7:3260,1 iqn.2001-05.com.equallogic:0-8a0906-9d1c8500d-28e3c937b8d59521-ovsd3750 (non-flash)
tcp: [4] 10.10.100.7:3260,1 iqn.2001-05.com.equallogic:0-8a0906-9d1c8500d-28e3c937b8d59521-ovsd3750 (non-flash)
tcp: [5] 10.10.100.7:3260,1 iqn.2001-05.com.equallogic:0-8a0906-9d1c8500d-28e3c937b8d59521-ovsd3750 (non-flash)
[root@ov300 ~]#
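(To confirm which SID belongs to the stray "default" iface before logging
it out, a quick filter is:

iscsiadm -m session -P1 | grep -E "Iface Name:|SID:"

here it is SID 3, the session whose Iface Name is "default".)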
At the end I manually remove, on both nodes, the wrong "default" session
created by the web GUI:
[root@ov300 ~]# iscsiadm -m session -r 3 -u
Logging out of session [sid: 3, target: iqn.2001-05.com.equallogic:0-8a0906-9d1c8500d-28e3c937b8d59521-ovsd3750, portal: 10.10.100.7,3260]
Logout of [sid: 3, target: iqn.2001-05.com.equallogic:0-8a0906-9d1c8500d-28e3c937b8d59521-ovsd3750, portal: 10.10.100.7,3260] successful.
[root@ov300 ~]#
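(Presumably, to keep the "default" session from being re-established at the
next login, the corresponding node record would also need deleting, e.g.:

iscsiadm -m node -T iqn.2001-05.com.equallogic:0-8a0906-9d1c8500d-28e3c937b8d59521-ovsd3750 -I default -o delete

but I have not verified whether vdsm recreates it.)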
Is this the expected workflow, or, when iSCSI multipathing is configured at
the oVirt level, should oVirt have directly configured the new LUN with
multiple paths?
I still have to verify that, if I put one host into maintenance and then
activate it again, everything goes as expected for both LUNs.
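(When I do, I plan to verify with something like:

multipath -ll

expecting two active paths per LUN, one through p1p1.100 and one through
p1p2.)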
Thanks,
Gianluca