We actually got it working. Both of us were tired from working late, so
it took us a while to notice that the missing storage domain was actually
one of the NFS exports. After removing our NFS ISO domain and NFS export
domain, everything came up.
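In case it helps the next person searching the archives: the UUID of the domain vdsm cannot find is right there in the log, so a quick pipeline narrows things down. This is only a sketch; the inlined sample line stands in for a real /var/log/vdsm/vdsm.log, whose path may differ on your install.

```shell
# Extract the UUID(s) of storage domains vdsm reports as missing.
# On a live node you would feed in /var/log/vdsm/vdsm.log* instead of
# the sample line below (taken from the log excerpt in this thread).
sample="StorageDomainDoesNotExist: Storage domain does not exist: (u'6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50',)"

missing=$(printf '%s\n' "$sample" \
  | grep 'StorageDomainDoesNotExist' \
  | grep -oE '[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}' \
  | sort -u)

echo "$missing"
```

Once you have the UUID, you can match it against the domains listed in the engine UI and against each NFS export and LUN the node is supposed to see.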
Thanks for the update.
Coincidentally, we have a patch upstream that should ignore the ISO and
export domains in such situations (
Dan
On 9/22/13 6:08 AM, Ayal Baron wrote:
> If I understand correctly you have a storage domain which is built of
> multiple (at least 2) LUNs.
> One of these LUNs seems to be missing
> (Wy3Ymi-J7bJ-hVxg-sg3L-F5Gv-MQmz-Utwv7z is an LVM PV UUID).
> It looks like you are either not fully connected to the storage server
> (missing a connection), or the LUN mapping in LIO has changed, or the
> CHAP password has changed.
>
> LVM is able to report the LVs since the PV which contains the metadata is
> still accessible (which is also why you see the VG and why LVM knows that
> the Wy3Ymi... device is missing).
>
> Can you compress and attach *all* of the vdsm.log* files?
>
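As a side note to the missing-PV diagnosis above: LVM marks such a VG as "partial" with a `p` in the fourth position of its attr string (the `build` VG's `wz-pn-` later in this thread is exactly that). A rough way to spot it, sketched here against inlined sample output since `sudo vgs --noheadings -o vg_name,vg_attr` needs a live system:

```shell
# List VGs whose attr string carries the 'p' (partial) flag, i.e. VGs
# with one or more missing PVs. The sample mimics the two-column output
# of: sudo vgs --noheadings -o vg_name,vg_attr
vgs_output='b358e46b-635b-4c0e-8e73-0a494602e21d wz--n-
build wz-pn-
fedora wz--n-'

partial=$(printf '%s\n' "$vgs_output" \
  | awk 'substr($2, 4, 1) == "p" { print $1 }')

echo "$partial"
```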
> ----- Original Message -----
>> Hi Dan, it looks like one of the domains is missing:
>>
>> 6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50
>>
>> Is any target missing (disconnected, faulty, or otherwise unreachable)?
>>
>> --
>> Federico
>>
>> ----- Original Message -----
>>> From: "Dan Ferris" <dferris(a)prometheusresearch.com>
>>> To: users(a)ovirt.org
>>> Sent: Friday, September 20, 2013 4:01:06 AM
>>> Subject: [Users] Unable to attach to storage domain (Ovirt 3.2)
>>>
>>> Hi,
>>>
>>> This is my first post to the list. I am happy to say that we have been
>>> using oVirt for 6 months with a few bumps, but it has mostly been OK.
>>>
>>> Until tonight that is...
>>>
>>> I had to do maintenance that required rebooting both of our hypervisor
>>> nodes. Both of them run Fedora 18 and have been happy for months.
>>> After rebooting them tonight, they will not attach to the storage. If
>>> it matters, the storage is a server running LIO with a Fibre Channel
>>> target.
>>>
>>> Vdsm log:
>>>
>>> Thread-22::DEBUG::2013-09-19 21:57:09,392::misc::84::Storage.Misc.excCmd::(<lambda>) '/usr/bin/dd iflag=direct if=/dev/b358e46b-635b-4c0e-8e73-0a494602e21d/metadata bs=4096 count=1' (cwd None)
>>> Thread-22::DEBUG::2013-09-19 21:57:09,400::misc::84::Storage.Misc.excCmd::(<lambda>) SUCCESS: <err> = '1+0 records in\n1+0 records out\n4096 bytes (4.1 kB) copied, 0.000547161 s, 7.5 MB/s\n'; <rc> = 0
>>> Thread-23::DEBUG::2013-09-19 21:57:16,587::lvm::368::OperationMutex::(_reloadvgs) Operation 'lvm reload operation' got the operation mutex
>>> Thread-23::DEBUG::2013-09-19 21:57:16,587::misc::84::Storage.Misc.excCmd::(<lambda>) u'/usr/bin/sudo -n /sbin/lvm vgs --config " devices { preferred_names = [\\"^/dev/mapper/\\"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 filter = [ \\"a%360014055193f840cb3743f9befef7aa3%\\", \\"r%.*%\\" ] } global { locking_type=1 prioritise_write_locks=1 wait_for_locks=1 } backup { retain_min = 50 retain_days = 0 } " --noheadings --units b --nosuffix --separator | -o uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free 6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50' (cwd None)
>>> Thread-23::DEBUG::2013-09-19 21:57:16,643::misc::84::Storage.Misc.excCmd::(<lambda>) FAILED: <err> = ' Volume group "6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50" not found\n'; <rc> = 5
>>> Thread-23::WARNING::2013-09-19 21:57:16,649::lvm::373::Storage.LVM::(_reloadvgs) lvm vgs failed: 5 [] [' Volume group "6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50" not found']
>>> Thread-23::DEBUG::2013-09-19 21:57:16,649::lvm::397::OperationMutex::(_reloadvgs) Operation 'lvm reload operation' released the operation mutex
>>> Thread-23::ERROR::2013-09-19 21:57:16,650::domainMonitor::208::Storage.DomainMonitorThread::(_monitorDomain) Error while collecting domain 6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50 monitoring information
>>> Traceback (most recent call last):
>>>   File "/usr/share/vdsm/storage/domainMonitor.py", line 182, in _monitorDomain
>>>     self.domain = sdCache.produce(self.sdUUID)
>>>   File "/usr/share/vdsm/storage/sdc.py", line 97, in produce
>>>     domain.getRealDomain()
>>>   File "/usr/share/vdsm/storage/sdc.py", line 52, in getRealDomain
>>>     return self._cache._realProduce(self._sdUUID)
>>>   File "/usr/share/vdsm/storage/sdc.py", line 121, in _realProduce
>>>     domain = self._findDomain(sdUUID)
>>>   File "/usr/share/vdsm/storage/sdc.py", line 152, in _findDomain
>>>     raise se.StorageDomainDoesNotExist(sdUUID)
>>> StorageDomainDoesNotExist: Storage domain does not exist: (u'6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50',)
>>>
>>> vgs output (note that I don't know what device
>>> Wy3Ymi-J7bJ-hVxg-sg3L-F5Gv-MQmz-Utwv7z is):
>>>
>>> [root@node01 vdsm]# vgs
>>>   Couldn't find device with uuid Wy3Ymi-J7bJ-hVxg-sg3L-F5Gv-MQmz-Utwv7z.
>>>   VG                                   #PV #LV #SN Attr   VSize   VFree
>>>   b358e46b-635b-4c0e-8e73-0a494602e21d   1  39   0 wz--n-   8.19t  5.88t
>>>   build                                  2   2   0 wz-pn- 299.75g 16.00m
>>>   fedora                                 1   3   0 wz--n- 557.88g      0
>>>
>>> lvs output:
>>>
>>> [root@node01 vdsm]# lvs
>>>   Couldn't find device with uuid Wy3Ymi-J7bJ-hVxg-sg3L-F5Gv-MQmz-Utwv7z.
>>>   LV                                   VG                                   Attr       LSize    Pool Origin Data%  Move Log Copy%  Convert
>>>   0b8cca47-313f-48da-84f2-154810790d5a b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----   40.00g
>>>   0f6f7572-8797-4d84-831b-87dbc4e1aa48 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  100.00g
>>>   19a1473f-c375-411f-9a02-c6054b9a28d2 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----   50.00g
>>>   221144dc-51dc-46ae-9399-c0b8e030f38a b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----   40.00g
>>>   2386932f-5f68-46e1-99a4-e96c944ac21b b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----   40.00g
>>>   3e027010-931b-43d6-9c9f-eeeabbdcd47a b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----    2.00g
>>>   4257ccc2-94d5-4d71-b21a-c188acbf7ca1 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  200.00g
>>>   4979b2a4-04aa-46a1-be0d-f10be0a1f587 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  100.00g
>>>   4e1b8a1a-1704-422b-9d79-60f15e165cb7 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----   40.00g
>>>   70bce792-410f-479f-8e04-a2a4093d3dfb b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  100.00g
>>>   791f6bda-c7eb-4d90-84c1-d7e33e73de62 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  100.00g
>>>   818ad6bc-8da2-4099-b38a-8c5b52f69e32 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  120.00g
>>>   861c9c44-fdeb-43cd-8e5c-32c00ce3cd3d b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  100.00g
>>>   86b69521-14db-43d1-801f-9d21f0a0e00f b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----   40.00g
>>>   8a578e50-683d-47c3-af41-c7e508d493e8 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----   40.00g
>>>   90463a7a-ecd4-4838-bc91-adccf99d9997 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----   40.00g
>>>   9170a33d-3bdf-4c15-8e6b-451622c8093b b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----   80.00g
>>>   964b9e32-c1ee-4152-a05b-0c43815f5ea6 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  100.00g
>>>   975c0a26-f699-4351-bd27-dd7621eac6bd b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  100.00g
>>>   9ec24f39-8b32-4247-bfb4-4b7f2cf86d9d b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----   40.00g
>>>   a4f303bf-6e89-43c3-a801-046920cb24d6 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----   40.00g
>>>   a7115874-0f1c-4f43-ab3a-a6026ad99013 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  100.00g
>>>   b1fb5597-a3bb-4e4b-b73f-d1752cc576cb b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----   40.00g
>>>   bc28d7c6-a14b-4398-8166-ac2f25b17312 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----   40.00g
>>>   bc72da88-f5fd-4f53-9c2c-af2fcd14d117 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  100.00g
>>>   c2c1ba71-c938-4d71-876a-1bba89a5d8a9 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  100.00g
>>>   c54eb342-b79b-45fe-8117-aab7137f5f9d b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  100.00g
>>>   c892f8b5-fadc-4774-a355-32655512a462 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----   40.00g
>>>   c9f636ce-efed-495d-9a29-cfaac1f289d3 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  100.00g
>>>   cb657c62-44c8-43dd-8ea2-cbf5927cff72 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----   40.00g
>>>   cdba05ac-5f68-4213-827b-3d3518c67251 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----   40.00g
>>>   d98f3bfc-55b0-44f7-8a39-cb0920762cba b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  100.00g
>>>   e0708bc4-19df-4d48-a0e7-682070634dea b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----   40.00g
>>>   ids                                  b358e46b-635b-4c0e-8e73-0a494602e21d -wi-ao---  128.00m
>>>   inbox                                b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  128.00m
>>>   leases                               b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----    2.00g
>>>   master                               b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----    1.00g
>>>   metadata                             b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  512.00m
>>>   outbox                               b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  128.00m
>>>   root                                 build                                -wi-----p  298.74g
>>>   swap_1                               build                                -wi-----p 1020.00m
>>>   home                                 fedora                               -wi-ao---  503.88g
>>>   root                                 fedora                               -wi-ao---   50.00g
>>>   swap                                 fedora                               -wi-ao---    4.00g
>>>
>>>
>>> The strange thing is that I can see all of the LVM volumes for the VMs,
>>> and both servers see the storage just fine. This error has me baffled,
>>> and it's a total showstopper since the cluster will not come back up.
>>>
>>> If anyone can help, that would be very appreciated.
>>>
>>> Thanks,
>>>
>>> Dan
>>> _______________________________________________
>>> Users mailing list
>>> Users(a)ovirt.org
>>> http://lists.ovirt.org/mailman/listinfo/users
>>>
>> _______________________________________________
>> Users mailing list
>> Users(a)ovirt.org
>> http://lists.ovirt.org/mailman/listinfo/users
>>