[Users] Unable to attach to storage domain (Ovirt 3.2)

Ayal Baron abaron at redhat.com
Sun Sep 22 13:52:41 UTC 2013



----- Original Message -----
> We actually got it working.  Both of us were tired from working late, so
> it took us a while to notice that the missing storage domain was actually
> one of the NFS exports.  After removing our NFS ISO domain and NFS export
> domain, everything came up.
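> 
> In case it helps anyone hitting the same symptom, the NFS domains can be
> sanity-checked from each hypervisor with something like the following (a
> sketch; the server name and export paths are hypothetical, substitute your
> own):
> 
>   # list what the NFS server is exporting
>   showmount -e nfs-server.example.com
> 
>   # try mounting the ISO/export domain by hand
>   mkdir -p /mnt/nfs-test
>   mount -t nfs nfs-server.example.com:/exports/iso /mnt/nfs-test
>   umount /mnt/nfs-test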
> 

Thanks for the update.
Coincidentally, we have a patch upstream that should ignore the ISO and export domains in such situations (http://gerrit.ovirt.org/#/c/17986/), which would obviate the need for you to deactivate them.

> Dan
> 
> On 9/22/13 6:08 AM, Ayal Baron wrote:
> > If I understand correctly, you have a storage domain which is built of
> > multiple (at least 2) LUNs.
> > One of these LUNs seems to be missing
> > (Wy3Ymi-J7bJ-hVxg-sg3L-F5Gv-MQmz-Utwv7z is an LVM PV UUID).
> > It looks like you are either not fully connected to the storage server
> > (missing a connection), the LUN mapping in LIO has changed, or the
> > CHAP password has changed.
> >
> > LVM is able to report the LVs since the PV which contains the metadata is
> > still accessible (which is also why you see the VG and why LVM knows that
> > the Wy3Ymi... device is missing).
> >
> > Can you compress and attach *all* of the vdsm.log* files?
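> >
> > Something like this should do it (assuming the default vdsm log location,
> > /var/log/vdsm):
> >
> >   tar czf vdsm-logs.tar.gz /var/log/vdsm/vdsm.log*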
> >
> > ----- Original Message -----
> >> Hi Dan, it looks like one of the domains is missing:
> >>
> >> 6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50
> >>
> >> Is any target missing (disconnected, faulty, or otherwise unreachable)?
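> >>
> >> Since the storage is a Fibre Channel target, a couple of quick checks on
> >> the host can confirm it is still visible (a sketch, not oVirt-specific):
> >>
> >>   # each FC HBA port should report "Online"
> >>   cat /sys/class/fc_host/host*/port_state
> >>
> >>   # confirm the expected LUNs (by size/WWN) show up as block devices
> >>   lsblk -o NAME,SIZE,WWN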
> >>
> >> --
> >> Federico
> >>
> >> ----- Original Message -----
> >>> From: "Dan Ferris" <dferris at prometheusresearch.com>
> >>> To: users at ovirt.org
> >>> Sent: Friday, September 20, 2013 4:01:06 AM
> >>> Subject: [Users] Unable to attach to storage domain (Ovirt 3.2)
> >>>
> >>> Hi,
> >>>
> >>> This is my first post to the list.  I am happy to say that we have been
> >>> using oVirt for 6 months with a few bumps, but it's mostly been OK.
> >>>
> >>> Until tonight, that is...
> >>>
> >>> I had to do a maintenance that required rebooting both of our hypervisor
> >>> nodes.  Both of them run Fedora 18 and have been happy for months.
> >>> After rebooting them tonight, they will not attach to the storage.  If
> >>> it matters, the storage is a server running LIO with a Fibre Channel
> >>> target.
> >>>
> >>> Vdsm log:
> >>>
> >>> Thread-22::DEBUG::2013-09-19
> >>> 21:57:09,392::misc::84::Storage.Misc.excCmd::(<lambda>) '/usr/bin/dd
> >>> iflag=direct if=/dev/b358e46b-635b-4c0e-8e73-0a494602e21d/metadata
> >>> bs=4096 count=1' (cwd None)
> >>> Thread-22::DEBUG::2013-09-19
> >>> 21:57:09,400::misc::84::Storage.Misc.excCmd::(<lambda>) SUCCESS: <err> =
> >>> '1+0 records in\n1+0 records out\n4096 bytes (4.1 kB) copied,
> >>> 0.000547161 s, 7.5 MB/s\n'; <rc> = 0
> >>> Thread-23::DEBUG::2013-09-19
> >>> 21:57:16,587::lvm::368::OperationMutex::(_reloadvgs) Operation 'lvm
> >>> reload operation' got the operation mutex
> >>> Thread-23::DEBUG::2013-09-19
> >>> 21:57:16,587::misc::84::Storage.Misc.excCmd::(<lambda>) u'/usr/bin/sudo
> >>> -n /sbin/lvm vgs --config " devices { preferred_names =
> >>> [\\"^/dev/mapper/\\"] ignore_suspended_devices=1 write_cache_state=0
> >>> disable_after_error_count=3 filter = [
> >>> \\"a%360014055193f840cb3743f9befef7aa3%\\", \\"r%.*%\\" ] }  global {
> >>> locking_type=1  prioritise_write_locks=1  wait_for_locks=1 }  backup {
> >>> retain_min = 50  retain_days = 0 } " --noheadings --units b --nosuffix
> >>> --separator | -o
> >>> uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free
> >>> 6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50' (cwd None)
> >>> Thread-23::DEBUG::2013-09-19
> >>> 21:57:16,643::misc::84::Storage.Misc.excCmd::(<lambda>) FAILED: <err> =
> >>> '  Volume group "6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50" not found\n';
> >>> <rc> = 5
> >>> Thread-23::WARNING::2013-09-19
> >>> 21:57:16,649::lvm::373::Storage.LVM::(_reloadvgs) lvm vgs failed: 5 []
> >>> ['  Volume group "6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50" not found']
> >>> Thread-23::DEBUG::2013-09-19
> >>> 21:57:16,649::lvm::397::OperationMutex::(_reloadvgs) Operation 'lvm
> >>> reload operation' released the operation mutex
> >>> Thread-23::ERROR::2013-09-19
> >>> 21:57:16,650::domainMonitor::208::Storage.DomainMonitorThread::(_monitorDomain)
> >>> Error while collecting domain 6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50
> >>> monitoring information
> >>> Traceback (most recent call last):
> >>>     File "/usr/share/vdsm/storage/domainMonitor.py", line 182, in
> >>> _monitorDomain
> >>>       self.domain = sdCache.produce(self.sdUUID)
> >>>     File "/usr/share/vdsm/storage/sdc.py", line 97, in produce
> >>>       domain.getRealDomain()
> >>>     File "/usr/share/vdsm/storage/sdc.py", line 52, in getRealDomain
> >>>       return self._cache._realProduce(self._sdUUID)
> >>>     File "/usr/share/vdsm/storage/sdc.py", line 121, in _realProduce
> >>>       domain = self._findDomain(sdUUID)
> >>>     File "/usr/share/vdsm/storage/sdc.py", line 152, in _findDomain
> >>>       raise se.StorageDomainDoesNotExist(sdUUID)
> >>> StorageDomainDoesNotExist: Storage domain does not exist:
> >>> (u'6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50',)
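> >>>
> >>> To reproduce the failing lookup by hand, the vgs call from the log can
> >>> be rerun directly (a sketch; the filter WWID is copied from the log
> >>> above and only admits that one multipath device):
> >>>
> >>>   /sbin/lvm vgs \
> >>>     --config 'devices { filter = [ "a%360014055193f840cb3743f9befef7aa3%", "r%.*%" ] }' \
> >>>     6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50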
> >>>
> >>> vgs output (note that I don't know what the device
> >>> Wy3Ymi-J7bJ-hVxg-sg3L-F5Gv-MQmz-Utwv7z is):
> >>>
> >>> [root at node01 vdsm]# vgs
> >>>   Couldn't find device with uuid Wy3Ymi-J7bJ-hVxg-sg3L-F5Gv-MQmz-Utwv7z.
> >>>   VG                                   #PV #LV #SN Attr   VSize   VFree
> >>>   b358e46b-635b-4c0e-8e73-0a494602e21d   1  39   0 wz--n-   8.19t   5.88t
> >>>   build                                  2   2   0 wz-pn- 299.75g  16.00m
> >>>   fedora                                 1   3   0 wz--n- 557.88g       0
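> >>>
> >>> Side note: the 'p' in the Attr column marks a partial VG, i.e. one with
> >>> a missing PV.  To map the missing PV UUID to its VG, something like the
> >>> following should work:
> >>>
> >>>   vgs -o vg_name,vg_attr,vg_uuid
> >>>   pvs -o pv_name,pv_uuid,vg_name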
> >>>
> >>> lvs output:
> >>>
> >>> [root at node01 vdsm]# lvs
> >>>   Couldn't find device with uuid Wy3Ymi-J7bJ-hVxg-sg3L-F5Gv-MQmz-Utwv7z.
> >>>   LV                                   VG                                   Attr      LSize    Pool Origin Data%  Move Log Copy%  Convert
> >>>   0b8cca47-313f-48da-84f2-154810790d5a b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----   40.00g
> >>>   0f6f7572-8797-4d84-831b-87dbc4e1aa48 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  100.00g
> >>>   19a1473f-c375-411f-9a02-c6054b9a28d2 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----   50.00g
> >>>   221144dc-51dc-46ae-9399-c0b8e030f38a b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----   40.00g
> >>>   2386932f-5f68-46e1-99a4-e96c944ac21b b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----   40.00g
> >>>   3e027010-931b-43d6-9c9f-eeeabbdcd47a b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----    2.00g
> >>>   4257ccc2-94d5-4d71-b21a-c188acbf7ca1 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  200.00g
> >>>   4979b2a4-04aa-46a1-be0d-f10be0a1f587 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  100.00g
> >>>   4e1b8a1a-1704-422b-9d79-60f15e165cb7 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----   40.00g
> >>>   70bce792-410f-479f-8e04-a2a4093d3dfb b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  100.00g
> >>>   791f6bda-c7eb-4d90-84c1-d7e33e73de62 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  100.00g
> >>>   818ad6bc-8da2-4099-b38a-8c5b52f69e32 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  120.00g
> >>>   861c9c44-fdeb-43cd-8e5c-32c00ce3cd3d b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  100.00g
> >>>   86b69521-14db-43d1-801f-9d21f0a0e00f b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----   40.00g
> >>>   8a578e50-683d-47c3-af41-c7e508d493e8 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----   40.00g
> >>>   90463a7a-ecd4-4838-bc91-adccf99d9997 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----   40.00g
> >>>   9170a33d-3bdf-4c15-8e6b-451622c8093b b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----   80.00g
> >>>   964b9e32-c1ee-4152-a05b-0c43815f5ea6 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  100.00g
> >>>   975c0a26-f699-4351-bd27-dd7621eac6bd b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  100.00g
> >>>   9ec24f39-8b32-4247-bfb4-4b7f2cf86d9d b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----   40.00g
> >>>   a4f303bf-6e89-43c3-a801-046920cb24d6 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----   40.00g
> >>>   a7115874-0f1c-4f43-ab3a-a6026ad99013 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  100.00g
> >>>   b1fb5597-a3bb-4e4b-b73f-d1752cc576cb b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----   40.00g
> >>>   bc28d7c6-a14b-4398-8166-ac2f25b17312 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----   40.00g
> >>>   bc72da88-f5fd-4f53-9c2c-af2fcd14d117 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  100.00g
> >>>   c2c1ba71-c938-4d71-876a-1bba89a5d8a9 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  100.00g
> >>>   c54eb342-b79b-45fe-8117-aab7137f5f9d b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  100.00g
> >>>   c892f8b5-fadc-4774-a355-32655512a462 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----   40.00g
> >>>   c9f636ce-efed-495d-9a29-cfaac1f289d3 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  100.00g
> >>>   cb657c62-44c8-43dd-8ea2-cbf5927cff72 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----   40.00g
> >>>   cdba05ac-5f68-4213-827b-3d3518c67251 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----   40.00g
> >>>   d98f3bfc-55b0-44f7-8a39-cb0920762cba b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  100.00g
> >>>   e0708bc4-19df-4d48-a0e7-682070634dea b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----   40.00g
> >>>   ids                                  b358e46b-635b-4c0e-8e73-0a494602e21d -wi-ao---  128.00m
> >>>   inbox                                b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  128.00m
> >>>   leases                               b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----    2.00g
> >>>   master                               b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----    1.00g
> >>>   metadata                             b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  512.00m
> >>>   outbox                               b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  128.00m
> >>>   root                                 build                                -wi-----p  298.74g
> >>>   swap_1                               build                                -wi-----p 1020.00m
> >>>   home                                 fedora                               -wi-ao---  503.88g
> >>>   root                                 fedora                               -wi-ao---   50.00g
> >>>   swap                                 fedora                               -wi-ao---    4.00g
> >>>
> >>>
> >>> The strange thing is that I can see all of the LVM volumes for the VMs.
> >>> Both servers see the storage just fine.  This error has me baffled, and
> >>> it's a total showstopper since the cluster will not come back up.
> >>>
> >>> If anyone can help, that would be very much appreciated.
> >>>
> >>> Thanks,
> >>>
> >>> Dan
> >>> _______________________________________________
> >>> Users mailing list
> >>> Users at ovirt.org
> >>> http://lists.ovirt.org/mailman/listinfo/users
> >>>
> 


