[ovirt-users] Recovering iSCSI domain (Was: Changing iSCSI LUN host IP and changing master domain)
Trey Dockendorf
treydock at gmail.com
Tue Oct 21 20:59:23 UTC 2014
Somehow my NFS domain got to be master again. I went into the database and
updated the connections for NFS, and I noticed that once I updated the IP
for the iSCSI entry in the "storage_server_connections" table, the interface
kept moving "(master)" between the iSCSI and NFS domains...very odd.
I ran these commands and now NFS is up.
update storage_server_connections set connection='10.0.0.10:/tank/ovirt/data'
where id='a89fa66b-8737-4bb8-a089-d9067f61b58a';
update storage_server_connections set
connection='10.0.0.10:/tank/ovirt/import_export'
where id='521a8477-9e88-4f2d-96e2-d3667ec407df';
update storage_server_connections set
connection='192.168.202.245:/tank/ovirt/iso'
where id='fb55cfea-c7ef-49f2-b77f-16ddd2de0f7a';
update storage_server_connections set connection='10.0.0.10' where
id='d6da7fbf-5056-44a7-9fc8-e76a1ff9f525';
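
(For anyone repeating this: rather than hand-typing each statement, the
updates can be generated from an id-to-connection list and reviewed before
being piped into psql on the engine database. A hypothetical helper sketch,
using two of the pairs from above:)

```shell
# Sketch: emit the UPDATE statements from an "id connection" list so they
# can be reviewed before being piped into psql on the engine database.
gen_updates() {
    while read -r id conn; do
        printf "update storage_server_connections set connection='%s' where id='%s';\n" \
            "$conn" "$id"
    done
}

sql="$(gen_updates <<'EOF'
a89fa66b-8737-4bb8-a089-d9067f61b58a 10.0.0.10:/tank/ovirt/data
521a8477-9e88-4f2d-96e2-d3667ec407df 10.0.0.10:/tank/ovirt/import_export
EOF
)"
printf '%s\n' "$sql"
```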
Once I activated the NFS master domain all my other domains went to active,
including iSCSI.
My concern now is whether the iSCSI domain is usable. The API path at
"/api/storagedomains/4eeb8415-c912-44bf-b482-2673849705c9/storageconnections"
shows
<storage_connections/>
If I go to edit the iSCSI domain and check the LUN, the warning I get is
this:
This operation might be unrecoverable and destructive!
The following LUNs are already in use:
- 1IET_00010001 (Used by VG: 3nxXNr-bIHu-9YS5-Kfzc-A2Na-sMhb-jihwdt)
That alone makes me very hesitant to approve the operation. I could use
some wisdom on whether this is safe or not.
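
(One read-only sanity check before touching anything: compare the VG UUID in
the warning against the VG that LVM reports on the multipath device, e.g.
with `sudo pvs --noheadings -o pv_name,vg_uuid`. A sketch follows; the
sample line stands in for live pvs output, which can't be run here. If the
UUIDs match, the LUN already carries the domain's VG, which is a reason NOT
to approve a destructive re-initialization:)

```shell
# Sketch: compare the VG UUID from the GUI warning with the VG that LVM
# reports on the LUN. The sample line stands in for real output of
# `sudo pvs --noheadings -o pv_name,vg_uuid`.
warn_vg='3nxXNr-bIHu-9YS5-Kfzc-A2Na-sMhb-jihwdt'
pvs_out='/dev/mapper/1IET_00010001  3nxXNr-bIHu-9YS5-Kfzc-A2Na-sMhb-jihwdt'

if printf '%s\n' "$pvs_out" | grep -q "$warn_vg"; then
    verdict='LUN already holds this VG: existing domain data, do not re-initialize'
else
    verdict='VG not found on LUN'
fi
echo "$verdict"
```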
Thanks,
- Trey
On Tue, Oct 21, 2014 at 3:17 PM, Trey Dockendorf <treydock at gmail.com> wrote:
> John,
>
> Thanks for the reply. The Discover function in the GUI works...it's once I
> try to log in (clicking the arrow next to the target) that things just hang
> indefinitely.
>
> # iscsiadm -m session
> tcp: [2] 10.0.0.10:3260,1
> iqn.2014-04.edu.tamu.brazos.vmstore1:ovirt-data_iscsi
>
> # iscsiadm -m node
> 10.0.0.10:3260,1 iqn.2014-04.edu.tamu.brazos.vmstore1:ovirt-data_iscsi
>
> # multipath -ll
> 1IET_00010001 dm-3 IET,VIRTUAL-DISK
> size=500G features='0' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=active
> `- 8:0:0:1 sdd 8:48 active ready running
> 1ATA_WDC_WD5003ABYZ-011FA0_WD-WMAYP0DNSAEZ dm-2 ATA,WDC WD5003ABYZ-0
> size=466G features='0' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=active
> `- 3:0:0:0 sdc 8:32 active ready running
>
> The first entry, 1IET_00010001, is the iSCSI LUN.
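
(A quick way to pair each multipath map with its size when eyeballing
longer `multipath -ll` output; the heredoc below is a stand-in for the
live output shown above:)

```shell
# Sketch: pair each multipath map name with the size= on the following
# line. The heredoc stands in for live `multipath -ll` output.
maps="$(awk '
    /dm-[0-9]+/ { name = $1 }
    /^size=/    { split($1, a, "="); print name, a[2] }
' <<'EOF'
1IET_00010001 dm-3 IET,VIRTUAL-DISK
size=500G features='0' hwhandler='0' wp=rw
1ATA_WDC_WD5003ABYZ-011FA0_WD-WMAYP0DNSAEZ dm-2 ATA,WDC WD5003ABYZ-0
size=466G features='0' hwhandler='0' wp=rw
EOF
)"
printf '%s\n' "$maps"
```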
>
> The log when I click the arrow in the interface for the target is this:
>
> Thread-14::DEBUG::2014-10-21
> 15:12:49,900::BindingXMLRPC::251::vds::(wrapper) client [192.168.202.99]
> flowID [7177dafe]
> Thread-14::DEBUG::2014-10-21
> 15:12:49,901::task::595::TaskManager.Task::(_updateState)
> Task=`01d8d01e-8bfd-4764-890f-2026fdeb78d9`::moving from state init ->
> state preparing
> Thread-14::INFO::2014-10-21
> 15:12:49,901::logUtils::44::dispatcher::(wrapper) Run and protect:
> connectStorageServer(domType=3,
> spUUID='00000000-0000-0000-0000-000000000000', conList=[{'connection':
> '10.0.0.10', 'iqn': 'iqn.2014-04.edu.tamu.brazos.)
> Thread-14::DEBUG::2014-10-21
> 15:12:49,902::iscsiadm::92::Storage.Misc.excCmd::(_runCmd) '/usr/bin/sudo
> -n /sbin/iscsiadm -m node -T
> iqn.2014-04.edu.tamu.brazos.vmstore1:ovirt-data_iscsi -I default -p
> 10.0.0.10:3260,1 --op=new' (cwd None)
> Thread-14::DEBUG::2014-10-21
> 15:12:56,684::iscsiadm::92::Storage.Misc.excCmd::(_runCmd) SUCCESS: <err> =
> ''; <rc> = 0
> Thread-14::DEBUG::2014-10-21
> 15:12:56,685::iscsiadm::92::Storage.Misc.excCmd::(_runCmd) '/usr/bin/sudo
> -n /sbin/iscsiadm -m node -T
> iqn.2014-04.edu.tamu.brazos.vmstore1:ovirt-data_iscsi -I default -p
> 10.0.0.10:3260,1 -l' (cwd None)
> Thread-14::DEBUG::2014-10-21
> 15:12:56,711::iscsiadm::92::Storage.Misc.excCmd::(_runCmd) SUCCESS: <err> =
> ''; <rc> = 0
> Thread-14::DEBUG::2014-10-21
> 15:12:56,711::iscsiadm::92::Storage.Misc.excCmd::(_runCmd) '/usr/bin/sudo
> -n /sbin/iscsiadm -m node -T
> iqn.2014-04.edu.tamu.brazos.vmstore1:ovirt-data_iscsi -I default -p
> 10.0.0.10:3260,1 -n node.startup -v manual --op)
> Thread-14::DEBUG::2014-10-21
> 15:12:56,767::iscsiadm::92::Storage.Misc.excCmd::(_runCmd) SUCCESS: <err> =
> ''; <rc> = 0
> Thread-14::DEBUG::2014-10-21
> 15:12:56,767::lvm::373::OperationMutex::(_reloadvgs) Operation 'lvm reload
> operation' got the operation mutex
> Thread-14::DEBUG::2014-10-21
> 15:12:56,768::lvm::296::Storage.Misc.excCmd::(cmd) '/usr/bin/sudo -n
> /sbin/lvm vgs --config " devices { preferred_names = [\\"^/dev/mapper/\\"]
> ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3)
> Thread-14::DEBUG::2014-10-21
> 15:12:56,968::lvm::296::Storage.Misc.excCmd::(cmd) SUCCESS: <err> = ' No
> volume groups found\n'; <rc> = 0
> Thread-14::DEBUG::2014-10-21
> 15:12:56,969::lvm::415::OperationMutex::(_reloadvgs) Operation 'lvm reload
> operation' released the operation mutex
> Thread-14::DEBUG::2014-10-21
> 15:12:56,974::hsm::2352::Storage.HSM::(__prefetchDomains) Found SD uuids: ()
> Thread-14::DEBUG::2014-10-21
> 15:12:56,974::hsm::2408::Storage.HSM::(connectStorageServer) knownSDs: {}
> Thread-14::INFO::2014-10-21
> 15:12:56,974::logUtils::47::dispatcher::(wrapper) Run and protect:
> connectStorageServer, Return response: {'statuslist': [{'status': 0, 'id':
> '00000000-0000-0000-0000-000000000000'}]}
> Thread-14::DEBUG::2014-10-21
> 15:12:56,974::task::1185::TaskManager.Task::(prepare)
> Task=`01d8d01e-8bfd-4764-890f-2026fdeb78d9`::finished: {'statuslist':
> [{'status': 0, 'id': '00000000-0000-0000-0000-000000000000'}]}
> Thread-14::DEBUG::2014-10-21
> 15:12:56,975::task::595::TaskManager.Task::(_updateState)
> Task=`01d8d01e-8bfd-4764-890f-2026fdeb78d9`::moving from state preparing ->
> state finished
> Thread-14::DEBUG::2014-10-21
> 15:12:56,975::resourceManager::940::ResourceManager.Owner::(releaseAll)
> Owner.releaseAll requests {} resources {}
> Thread-14::DEBUG::2014-10-21
> 15:12:56,975::resourceManager::977::ResourceManager.Owner::(cancelAll)
> Owner.cancelAll requests {}
> Thread-14::DEBUG::2014-10-21
> 15:12:56,975::task::990::TaskManager.Task::(_decref)
> Task=`01d8d01e-8bfd-4764-890f-2026fdeb78d9`::ref 0 aborting False
> Thread-13::DEBUG::2014-10-21
> 15:13:18,281::task::595::TaskManager.Task::(_updateState)
> Task=`8674b6b0-5e4c-4f0c-8b6b-c5fa5fef6126`::moving from state init ->
> state preparing
> Thread-13::INFO::2014-10-21
> 15:13:18,281::logUtils::44::dispatcher::(wrapper) Run and protect:
> repoStats(options=None)
> Thread-13::INFO::2014-10-21
> 15:13:18,282::logUtils::47::dispatcher::(wrapper) Run and protect:
> repoStats, Return response: {}
> Thread-13::DEBUG::2014-10-21
> 15:13:18,282::task::1185::TaskManager.Task::(prepare)
> Task=`8674b6b0-5e4c-4f0c-8b6b-c5fa5fef6126`::finished: {}
> Thread-13::DEBUG::2014-10-21
> 15:13:18,282::task::595::TaskManager.Task::(_updateState)
> Task=`8674b6b0-5e4c-4f0c-8b6b-c5fa5fef6126`::moving from state preparing ->
> state finished
> Thread-13::DEBUG::2014-10-21
> 15:13:18,282::resourceManager::940::ResourceManager.Owner::(releaseAll)
> Owner.releaseAll requests {} resources {}
> Thread-13::DEBUG::2014-10-21
> 15:13:18,282::resourceManager::977::ResourceManager.Owner::(cancelAll)
> Owner.cancelAll requests {}
> Thread-13::DEBUG::2014-10-21
> 15:13:18,283::task::990::TaskManager.Task::(_decref)
> Task=`8674b6b0-5e4c-4f0c-8b6b-c5fa5fef6126`::ref 0 aborting False
>
> The lines prefixed with "Thread-13" just repeat over and over, changing
> only the Task value.
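
(Blanking out the rotating Task UUID makes it easy to confirm the log
really is the same message repeating. A sketch; the heredoc lines stand in
for real vdsm.log entries:)

```shell
# Sketch: blank the Task UUID so repeating vdsm.log lines collapse under
# `sort | uniq -c`. The heredoc stands in for real vdsm.log lines.
uniq_lines="$(sed 's/Task=`[0-9a-f-]*`/Task=`...`/' <<'EOF' | sort | uniq -c | sed 's/^ *//'
Task=`8674b6b0-5e4c-4f0c-8b6b-c5fa5fef6126`::finished: {}
Task=`ebcd8e0a-54b1-43d2-92a2-ed9fd62d00fa`::finished: {}
EOF
)"
printf '%s\n' "$uniq_lines"
```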
>
> I'm unsure what can be done to restore things. The iSCSI connection is good
> and I'm able to see the logical volumes:
>
> # lvscan
> ACTIVE '/dev/4eeb8415-c912-44bf-b482-2673849705c9/metadata'
> [512.00 MiB] inherit
> ACTIVE '/dev/4eeb8415-c912-44bf-b482-2673849705c9/leases'
> [2.00 GiB] inherit
> ACTIVE '/dev/4eeb8415-c912-44bf-b482-2673849705c9/ids'
> [128.00 MiB] inherit
> ACTIVE '/dev/4eeb8415-c912-44bf-b482-2673849705c9/inbox'
> [128.00 MiB] inherit
> ACTIVE '/dev/4eeb8415-c912-44bf-b482-2673849705c9/outbox'
> [128.00 MiB] inherit
> ACTIVE '/dev/4eeb8415-c912-44bf-b482-2673849705c9/master'
> [1.00 GiB] inherit
> inactive
> '/dev/4eeb8415-c912-44bf-b482-2673849705c9/aced9726-5a28-4d52-96f5-89553ba770af'
> [100.00 GiB] inherit
> inactive
> '/dev/4eeb8415-c912-44bf-b482-2673849705c9/87bf28aa-be25-4a93-9b23-f70bfd8accc0'
> [1.00 GiB] inherit
> inactive
> '/dev/4eeb8415-c912-44bf-b482-2673849705c9/27256587-bf87-4519-89e7-260e13697de3'
> [20.00 GiB] inherit
> inactive
> '/dev/4eeb8415-c912-44bf-b482-2673849705c9/ac2cb7f9-1df9-43dc-9fda-8a9958ef970f'
> [20.00 GiB] inherit
> inactive
> '/dev/4eeb8415-c912-44bf-b482-2673849705c9/d8c41f05-006a-492b-8e5f-101c4e113b28'
> [100.00 GiB] inherit
> inactive
> '/dev/4eeb8415-c912-44bf-b482-2673849705c9/83f17e9b-183e-4bad-ada5-bcef1c5c8e6a'
> [20.00 GiB] inherit
> inactive
> '/dev/4eeb8415-c912-44bf-b482-2673849705c9/cf79052e-b4ef-4bda-96dc-c53b7c2acfb5'
> [20.00 GiB] inherit
> ACTIVE '/dev/vg_ovirtnode02/lv_swap' [46.59 GiB] inherit
> ACTIVE '/dev/vg_ovirtnode02/lv_root' [418.53 GiB] inherit
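
(Tallying ACTIVE vs. inactive LVs per VG is a quick health check on output
like the above. A sketch; the heredoc lines stand in for live `lvscan`
output, and note that on oVirt hosts the image LVs are typically activated
on demand by VDSM, so inactive entries are not by themselves a problem:)

```shell
# Sketch: tally ACTIVE vs inactive LVs from `lvscan` output. The heredoc
# stands in for live output; inactive image LVs are normal when no VM is
# using them, since VDSM activates them on demand.
counts="$(awk '{ n[$1]++ } END { printf "ACTIVE=%d inactive=%d\n", n["ACTIVE"], n["inactive"] }' <<'EOF'
ACTIVE '/dev/4eeb8415-c912-44bf-b482-2673849705c9/metadata' [512.00 MiB] inherit
inactive '/dev/4eeb8415-c912-44bf-b482-2673849705c9/aced9726-5a28-4d52-96f5-89553ba770af' [100.00 GiB] inherit
inactive '/dev/4eeb8415-c912-44bf-b482-2673849705c9/87bf28aa-be25-4a93-9b23-f70bfd8accc0' [1.00 GiB] inherit
EOF
)"
echo "$counts"
```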
>
> Thanks,
> - Trey
>
>
>
> On Tue, Oct 21, 2014 at 2:49 PM, Sandra Taylor <jtt77777 at gmail.com> wrote:
>
>> Hi Trey,
>> Sorry for your trouble.
>> Don't know if I can help but I run iscsi here as my primary domain so
>> I've had some experience with it.
>> I don't know the answer to the master domain question.
>>
>> Does iSCSI show connected using iscsiadm -m session and -m node?
>> In the vdsm log there should be the iscsiadm commands that were
>> executed to connect.
>> Does multipath -ll show anything?
>>
>> -John
>>
>> On Tue, Oct 21, 2014 at 3:18 PM, Trey Dockendorf <treydock at gmail.com>
>> wrote:
>> > I was able to get iSCSI over TCP working...but now the task of adding
>> > the LUN to the GUI has been stuck at the "spinning" icon for about 20
>> > minutes.
>> >
>> > I see these entries in vdsm.log over and over with the Task value
>> > changing:
>> >
>> > Thread-14::DEBUG::2014-10-21
>> > 14:16:50,086::task::595::TaskManager.Task::(_updateState)
>> > Task=`ebcd8e0a-54b1-43d2-92a2-ed9fd62d00fa`::moving from state init ->
>> > state preparing
>> > Thread-14::INFO::2014-10-21
>> > 14:16:50,086::logUtils::44::dispatcher::(wrapper) Run and protect:
>> > repoStats(options=None)
>> > Thread-14::INFO::2014-10-21
>> > 14:16:50,086::logUtils::47::dispatcher::(wrapper) Run and protect:
>> > repoStats, Return response: {}
>> > Thread-14::DEBUG::2014-10-21
>> > 14:16:50,087::task::1185::TaskManager.Task::(prepare)
>> > Task=`ebcd8e0a-54b1-43d2-92a2-ed9fd62d00fa`::finished: {}
>> > Thread-14::DEBUG::2014-10-21
>> > 14:16:50,087::task::595::TaskManager.Task::(_updateState)
>> > Task=`ebcd8e0a-54b1-43d2-92a2-ed9fd62d00fa`::moving from state
>> > preparing -> state finished
>> > Thread-14::DEBUG::2014-10-21
>> > 14:16:50,087::resourceManager::940::ResourceManager.Owner::(releaseAll)
>> > Owner.releaseAll requests {} resources {}
>> > Thread-14::DEBUG::2014-10-21
>> > 14:16:50,087::resourceManager::977::ResourceManager.Owner::(cancelAll)
>> > Owner.cancelAll requests {}
>> > Thread-14::DEBUG::2014-10-21
>> > 14:16:50,087::task::990::TaskManager.Task::(_decref)
>> > Task=`ebcd8e0a-54b1-43d2-92a2-ed9fd62d00fa`::ref 0 aborting False
>> >
>> > What can I do to get my storage back online? Right now my iSCSI domain
>> > is master (something I did not want), which is odd considering the NFS
>> > data domain was added as master when I set up oVirt. Nothing will come
>> > back until I get the master domain online, and I'm unsure what to do now.
>> >
>> > Thanks,
>> > - Trey
>> >
>> > On Tue, Oct 21, 2014 at 12:58 PM, Trey Dockendorf <treydock at gmail.com>
>> > wrote:
>> >>
>> >> I had a catastrophic failure of the IB switch that was used by all my
>> >> storage domains. I had one data domain that was NFS and one that was
>> >> iSCSI. I managed to get the iSCSI LUN detached using the docs [1], but
>> >> now I noticed that somehow my master domain went from the NFS domain to
>> >> the iSCSI domain and I'm unable to switch them back.
>> >>
>> >> How does one change the master? Right now I am having issues getting
>> >> iSCSI over TCP to work, so I'm sort of stuck with 30 VMs down and an
>> >> entire cluster inaccessible.
>> >>
>> >> Thanks,
>> >> - Trey
>> >>
>> >> [1] http://www.ovirt.org/Features/Manage_Storage_Connections
>> >
>> >
>> >
>> > _______________________________________________
>> > Users mailing list
>> > Users at ovirt.org
>> > http://lists.ovirt.org/mailman/listinfo/users
>> >
>>
>
>