Hi,
This is my first post to the list. I am happy to say that we have been
using oVirt for six months with a few bumps, but it's mostly been OK.
Until tonight that is...
I had to do maintenance that required rebooting both of our hypervisor
nodes. Both of them run Fedora 18 and have been happy for months.
After rebooting them tonight, they will not attach to the storage. If
it matters, the storage is a server running LIO with a Fibre Channel target.
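Both hosts do still see the LUN itself; a quick way to confirm that is
something like the following (the WWID is the one vdsm's LVM filter uses,
it shows up in the log further down, so treat the exact path as
illustrative):

# check the multipath maps and that the LUN's multipath device is present
multipath -ll
lsblk /dev/mapper/360014055193f840cb3743f9befef7aa3
# confirm LVM still sees a PV label on that device
pvs /dev/mapper/360014055193f840cb3743f9befef7aa3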
Vdsm log:
Thread-22::DEBUG::2013-09-19
21:57:09,392::misc::84::Storage.Misc.excCmd::(<lambda>) '/usr/bin/dd
iflag=direct if=/dev/b358e46b-635b-4c0e-8e73-0a494602e21d/metadata
bs=4096 count=1' (cwd None)
Thread-22::DEBUG::2013-09-19
21:57:09,400::misc::84::Storage.Misc.excCmd::(<lambda>) SUCCESS: <err> =
'1+0 records in\n1+0 records out\n4096 bytes (4.1 kB) copied,
0.000547161 s, 7.5 MB/s\n'; <rc> = 0
Thread-23::DEBUG::2013-09-19
21:57:16,587::lvm::368::OperationMutex::(_reloadvgs) Operation 'lvm
reload operation' got the operation mutex
Thread-23::DEBUG::2013-09-19
21:57:16,587::misc::84::Storage.Misc.excCmd::(<lambda>) u'/usr/bin/sudo
-n /sbin/lvm vgs --config " devices { preferred_names =
[\\"^/dev/mapper/\\"] ignore_suspended_devices=1 write_cache_state=0
disable_after_error_count=3 filter = [
\\"a%360014055193f840cb3743f9befef7aa3%\\", \\"r%.*%\\" ] } global {
locking_type=1 prioritise_write_locks=1 wait_for_locks=1 } backup {
retain_min = 50 retain_days = 0 } " --noheadings --units b --nosuffix
--separator | -o
uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free
6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50' (cwd None)
Thread-23::DEBUG::2013-09-19
21:57:16,643::misc::84::Storage.Misc.excCmd::(<lambda>) FAILED: <err> =
' Volume group "6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50" not found\n';
<rc> = 5
Thread-23::WARNING::2013-09-19
21:57:16,649::lvm::373::Storage.LVM::(_reloadvgs) lvm vgs failed: 5 []
[' Volume group "6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50" not found']
Thread-23::DEBUG::2013-09-19
21:57:16,649::lvm::397::OperationMutex::(_reloadvgs) Operation 'lvm
reload operation' released the operation mutex
Thread-23::ERROR::2013-09-19
21:57:16,650::domainMonitor::208::Storage.DomainMonitorThread::(_monitorDomain)
Error while collecting domain 6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50
monitoring information
Traceback (most recent call last):
File "/usr/share/vdsm/storage/domainMonitor.py", line 182, in
_monitorDomain
self.domain = sdCache.produce(self.sdUUID)
File "/usr/share/vdsm/storage/sdc.py", line 97, in produce
domain.getRealDomain()
File "/usr/share/vdsm/storage/sdc.py", line 52, in getRealDomain
return self._cache._realProduce(self._sdUUID)
File "/usr/share/vdsm/storage/sdc.py", line 121, in _realProduce
domain = self._findDomain(sdUUID)
File "/usr/share/vdsm/storage/sdc.py", line 152, in _findDomain
raise se.StorageDomainDoesNotExist(sdUUID)
StorageDomainDoesNotExist: Storage domain does not exist:
(u'6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50',)
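In case it helps, the failing call can be reproduced by hand to separate
a filter problem from a genuinely missing VG. This is only a sketch (the
quoting of vdsm's --config string may need adjusting):

# the same query vdsm runs, restricted to the multipath device in its filter
/sbin/lvm vgs --config 'devices { filter = [ "a%360014055193f840cb3743f9befef7aa3%", "r%.*%" ] }' \
    6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50
# the same VG with the default (unfiltered) config, to see if it exists anywhere at all
vgs -o vg_uuid,vg_name 6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50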
vgs output (note that I don't know what the device with UUID
Wy3Ymi-J7bJ-hVxg-sg3L-F5Gv-MQmz-Utwv7z is; a way to track it down is
sketched just after the output):
[root@node01 vdsm]# vgs
Couldn't find device with uuid Wy3Ymi-J7bJ-hVxg-sg3L-F5Gv-MQmz-Utwv7z.
VG                                   #PV #LV #SN Attr   VSize   VFree
b358e46b-635b-4c0e-8e73-0a494602e21d   1  39   0 wz--n-   8.19t  5.88t
build                                  2   2   0 wz-pn- 299.75g 16.00m
fedora                                 1   3   0 wz--n- 557.88g      0
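A rough way to work out which VG that missing PV UUID belongs to would be
to search the LVM metadata backups and compare against the PVs that are
present (sketch; these are the stock LVM backup locations):

# which VG's saved metadata references the missing PV UUID?
grep -l Wy3Ymi-J7bJ-hVxg-sg3L-F5Gv-MQmz-Utwv7z /etc/lvm/backup/* /etc/lvm/archive/*
# list the PVs that are visible, with their UUIDs and owning VGs
pvs -o pv_name,pv_uuid,vg_name

The 'p' in the 'wz-pn-' attr string above is LVM's partial flag, so my
reading is that the missing PV belongs to the local 'build' VG rather
than to the oVirt domain, but I could be wrong about that.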
lvs output:
[root@node01 vdsm]# lvs
Couldn't find device with uuid Wy3Ymi-J7bJ-hVxg-sg3L-F5Gv-MQmz-Utwv7z.
LV                                   VG                                   Attr      LSize    Pool Origin Data%  Move Log Copy%  Convert
0b8cca47-313f-48da-84f2-154810790d5a b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  40.00g
0f6f7572-8797-4d84-831b-87dbc4e1aa48 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a---- 100.00g
19a1473f-c375-411f-9a02-c6054b9a28d2 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  50.00g
221144dc-51dc-46ae-9399-c0b8e030f38a b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  40.00g
2386932f-5f68-46e1-99a4-e96c944ac21b b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  40.00g
3e027010-931b-43d6-9c9f-eeeabbdcd47a b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----   2.00g
4257ccc2-94d5-4d71-b21a-c188acbf7ca1 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a---- 200.00g
4979b2a4-04aa-46a1-be0d-f10be0a1f587 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a---- 100.00g
4e1b8a1a-1704-422b-9d79-60f15e165cb7 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  40.00g
70bce792-410f-479f-8e04-a2a4093d3dfb b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a---- 100.00g
791f6bda-c7eb-4d90-84c1-d7e33e73de62 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a---- 100.00g
818ad6bc-8da2-4099-b38a-8c5b52f69e32 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a---- 120.00g
861c9c44-fdeb-43cd-8e5c-32c00ce3cd3d b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a---- 100.00g
86b69521-14db-43d1-801f-9d21f0a0e00f b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  40.00g
8a578e50-683d-47c3-af41-c7e508d493e8 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  40.00g
90463a7a-ecd4-4838-bc91-adccf99d9997 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  40.00g
9170a33d-3bdf-4c15-8e6b-451622c8093b b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  80.00g
964b9e32-c1ee-4152-a05b-0c43815f5ea6 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a---- 100.00g
975c0a26-f699-4351-bd27-dd7621eac6bd b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a---- 100.00g
9ec24f39-8b32-4247-bfb4-4b7f2cf86d9d b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  40.00g
a4f303bf-6e89-43c3-a801-046920cb24d6 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  40.00g
a7115874-0f1c-4f43-ab3a-a6026ad99013 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a---- 100.00g
b1fb5597-a3bb-4e4b-b73f-d1752cc576cb b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  40.00g
bc28d7c6-a14b-4398-8166-ac2f25b17312 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  40.00g
bc72da88-f5fd-4f53-9c2c-af2fcd14d117 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a---- 100.00g
c2c1ba71-c938-4d71-876a-1bba89a5d8a9 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a---- 100.00g
c54eb342-b79b-45fe-8117-aab7137f5f9d b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a---- 100.00g
c892f8b5-fadc-4774-a355-32655512a462 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  40.00g
c9f636ce-efed-495d-9a29-cfaac1f289d3 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a---- 100.00g
cb657c62-44c8-43dd-8ea2-cbf5927cff72 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  40.00g
cdba05ac-5f68-4213-827b-3d3518c67251 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  40.00g
d98f3bfc-55b0-44f7-8a39-cb0920762cba b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a---- 100.00g
e0708bc4-19df-4d48-a0e7-682070634dea b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  40.00g
ids                                  b358e46b-635b-4c0e-8e73-0a494602e21d -wi-ao--- 128.00m
inbox                                b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a---- 128.00m
leases                               b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----   2.00g
master                               b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----   1.00g
metadata                             b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a---- 512.00m
outbox                               b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a---- 128.00m
root                                 build                                -wi-----p 298.74g
swap_1                               build                                -wi-----p 1020.00m
home                                 fedora                               -wi-ao--- 503.88g
root                                 fedora                               -wi-ao---  50.00g
swap                                 fedora                               -wi-ao---   4.00g
The strange thing is that I can see all of the LVM volumes for the VMs,
and both servers see the storage just fine. This error has me baffled,
and it's a total showstopper since the cluster will not come back up.
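What I can't reconcile is that vdsm is looking for storage domain
6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50, while the VG that is actually
present is b358e46b-635b-4c0e-8e73-0a494602e21d. A rough way to check
which domain the on-disk VG really is would be something like this
(sketch only; the VG tags are just where I understand vdsm keeps
block-domain metadata, so treat that part as an assumption):

# VG UUIDs plus any tags on the VGs (vdsm block-domain metadata, as I understand it)
vgs -o vg_name,vg_uuid,vg_tags
# peek at the start of the domain's metadata LV for the UUIDs it references
dd if=/dev/b358e46b-635b-4c0e-8e73-0a494602e21d/metadata bs=4096 count=4 2>/dev/null | strings | head -50

My guess is that 6cf7e7e9 is a second storage domain whose VG isn't being
found at all, rather than a problem with b358e46b itself, but I'd
appreciate a sanity check on that.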
If anyone can help, it would be greatly appreciated.
Thanks,
Dan