Hi Roman and Nir,
On 2021-09-16 13:42, Roman Bednar wrote:
Hi Nicolas,
You can try to recover VG metadata from a backup or archive which lvm
automatically creates by default.
1) To list all available backups for a given VG:
# vgcfgrestore --list Usi3y8-S4eq-EXtl-FA58-MA3K-b4vE-4d9SCp
Select the latest one that sounds right, something with a description
along the lines of "Created *before* lvremove".
You might want to select something older than the latest one, as lvm
also creates a backup *after* running some commands.
You were right. There actually *are* VG backups; I was specifying an
incorrect ID.
So the correct command would return:
# vgcfgrestore --list 219fa16f-13c9-44e4-a07d-a40c0a7fe206
[...]
File: /etc/lvm/archive/219fa16f-13c9-44e4-a07d-a40c0a7fe206_00202-1152107223.vg
VG name: 219fa16f-13c9-44e4-a07d-a40c0a7fe206
Description: Created *before* executing 'vgs --noheading --nosuffix
--units b -o +vg_uuid,vg_extent_size'
Backup Time: Sat Sep 11 03:41:25 2021
[...]
That one seems ok.
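As an extra sanity check, I suppose one could also grep the chosen
archive for the affected image (assuming its ID shows up as an LV name
or tag in the plain-text metadata), e.g.:

# grep -c '927f423a-6689-4ddb-8fda-b3375c3bbca3' \
    /etc/lvm/archive/219fa16f-13c9-44e4-a07d-a40c0a7fe206_00202-1152107223.vg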
2) Find the UUID of your broken PV (the filter might not be needed,
depending on your local conf):
# pvs -o pv_name,pv_uuid --config='devices/filter = ["a|.*|"]' \
    /dev/mapper/36001405063455cf7cd74c20bc06e9304
As I understand it, the PV won't be listed by the 'pvs' command; this is
just a matter of finding the associated VG. The command above doesn't
list any PV associated with the VG from step 1, it just complains that
the PV cannot be read.
# pvs -o pv_name,pv_uuid --config='devices/filter = ["a|.*|"]'
/dev/mapper/36001405063455cf7cd74c20bc06e9304
/dev/mapper/36001405063455cf7cd74c20bc06e9304: Checksum error at
offset 2198927383040
Couldn't read volume group metadata from
/dev/mapper/36001405063455cf7cd74c20bc06e9304.
Metadata location on /dev/mapper/36001405063455cf7cd74c20bc06e9304 at
2198927383040 has invalid summary for VG.
Failed to read metadata summary from
/dev/mapper/36001405063455cf7cd74c20bc06e9304
Failed to scan VG from /dev/mapper/36001405063455cf7cd74c20bc06e9304
No physical volume label read from
/dev/mapper/36001405063455cf7cd74c20bc06e9304.
So the associated PV ID is: 36001405063455cf7cd74c20bc06e9304
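By the way, the archive file itself seems to record the LVM UUID of
that PV too (assuming the usual layout of LVM metadata backups, where
each entry in the physical_volumes section carries an 'id' and a
'device' line), which may be what pvcreate --uuid expects rather than
the multipath WWID:

# grep -B2 '36001405063455cf7cd74c20bc06e9304' \
    /etc/lvm/archive/219fa16f-13c9-44e4-a07d-a40c0a7fe206_00202-1152107223.vg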
3) Create a new PV on a different partition or disk (/dev/sdX) using
the UUID found in the previous step and the restorefile option:
# pvcreate --uuid <ID_OF_BROKEN_PV> --restorefile <PATH_TO_BACKUP_FILE> \
    <EMPTY_DISK>
I have a question here. As I understand it, pvcreate will restore the
correct metadata on <EMPTY_DISK>. Then how do you restore that metadata
on the broken storage domain, so other hosts can see the right
information as well? Or is this just a step to recover data on
<EMPTY_DISK> and then reattach the disks to the affected VMs?
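From the vgcfgrestore man page my guess is that, after the pvcreate,
the metadata would be written back from the same archive with something
along these lines (just my assumption, not part of your steps):

# vgcfgrestore -f /etc/lvm/archive/219fa16f-13c9-44e4-a07d-a40c0a7fe206_00202-1152107223.vg \
    219fa16f-13c9-44e4-a07d-a40c0a7fe206

but I'd like to confirm before touching anything.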
Thanks so much.
4) Try to display the VG:
# vgdisplay Usi3y8-S4eq-EXtl-FA58-MA3K-b4vE-4d9SCp
-Roman
On Thu, Sep 16, 2021 at 1:47 PM <nicolas(a)devels.es> wrote:
> I can also see...
>
> kvmr03:~# lvs | grep 927f423a-6689-4ddb-8fda-b3375c3bbca3
> /dev/mapper/36001405063455cf7cd74c20bc06e9304: Checksum error at
> offset 2198927383040
> Couldn't read volume group metadata from
> /dev/mapper/36001405063455cf7cd74c20bc06e9304.
> Metadata location on
> /dev/mapper/36001405063455cf7cd74c20bc06e9304 at
> 2198927383040 has invalid summary for VG.
> Failed to read metadata summary from
> /dev/mapper/36001405063455cf7cd74c20bc06e9304
> Failed to scan VG from
> /dev/mapper/36001405063455cf7cd74c20bc06e9304
>
> Seems to me like metadata from that VG has been corrupted. Is there
> a way to recover?
>
> On 2021-09-16 11:19, nicolas(a)devels.es wrote:
>> The most relevant log snippet I have found is the following. I assume
>> it cannot scan the Storage Domain, but I'm unsure why, as the storage
>> domain backend is up and running.
>>
>> 2021-09-16 11:16:58,884+0100 WARN (monitor/219fa16) [storage.LVM]
>> Command ['/usr/sbin/lvm', 'vgs', '--config', 'devices {
>> preferred_names=["^/dev/mapper/"] ignore_suspended_devices=1
>> write_cache_state=0 disable_after_error_count=3
filter=["a|^/dev/mapper/36001405063455cf7cd74c20bc06e9304$|^/dev/mapper/360014056481868b09dd4d05bee5b4185$|^/dev/mapper/360014057d9d4bc57df046888b8d8b6eb$|^/dev/mapper/360014057e612d2079b649d5b539e5f6a$|^/dev/mapper/360014059b49883b502a4fa9b81add3e4$|^/dev/mapper/36001405acece27e83b547e3a873b19e2$|^/dev/mapper/36001405dc03f6be1b8c42219e8912fbd$|^/dev/mapper/36001405f3ab584afde347d3a8855baf0$|^/dev/mapper/3600c0ff00052a0fe013ec65f01000000$|^/dev/mapper/3600c0ff00052a0fe033ec65f01000000$|^/dev/mapper/3600c0ff00052a0fe1b40c65f01000000$|^/dev/mapper/3600c0ff00052a0fe2294c75f01000000$|^/dev/mapper/3600c0ff00052a0fe2394c75f01000000$|^/dev/mapper/3600c0ff00052a0fe2494c75f01000000$|^/dev/mapper/3600c0ff00052a0fe2594c75f01000000$|^/dev/mapper/3600c0ff00052a0fe2694c75f01000000$|^/dev/mapper/3600c0ff00052a0fee293c75f01000000$|^/dev/mapper/3600c0ff00052a0fee493c75f01000000$|^/dev/mapper/3600c0ff00064835b628d306101000000$|^/dev/mapper/3600c0ff00064835b628d306103000000$|^/dev/mapper/3600c0ff000648
35b628d306105000000$|^/dev/mapper/3600c0ff00064835b638d306101000000$|^/dev/mapper/3600c0ff00064835b638d306103000000$|^/dev/mapper/3600c0ff00064835b638d306105000000$|^/dev/mapper/3600c0ff00064835b638d306107000000$|^/dev/mapper/3600c0ff00064835b638d306109000000$|^/dev/mapper/3600c0ff00064835b638d30610b000000$|^/dev/mapper/3600c0ff00064835cb98f306101000000$|^/dev/mapper/3600c0ff00064835cb98f306103000000$|^/dev/mapper/3600c0ff00064835cb98f306105000000$|^/dev/mapper/3600c0ff00064835cb98f306107000000$|^/dev/mapper/3600c0ff00064835cb98f306109000000$|^/dev/mapper/3600c0ff00064835cba8f306101000000$|^/dev/mapper/3600c0ff00064835cba8f306103000000$|^/dev/mapper/3600c0ff00064835cba8f306105000000$|^/dev/mapper/3600c0ff00064835cba8f306107000000$|^/dev/mapper/3634b35410019574796dcb0e300000007$|^/dev/mapper/3634b35410019574796dcdffc00000008$|^/dev/mapper/3634b354100195747999c2dc500000003$|^/dev/mapper/3634b354100195747999c3c4a00000004$|^/dev/mapper/3634b3541001957479c2b9c6400000001$|^/dev/mapper/3634
>>> b3541001957479c2baba500000002$|", "r|.*|"] } global { locking_type=4
>>> prioritise_write_locks=1 wait_for_locks=1 use_lvmetad=0 } backup {
>>> retain_min=50 retain_days=0 }', '--noheadings', '--units', 'b',
>>> '--nosuffix', '--separator', '|', '--ignoreskippedcluster', '-o',
>>> 'uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count,pv_name',
>>> '--select', 'vg_name = 219fa16f-13c9-44e4-a07d-a40c0a7fe206']
>>> succeeded with warnings: ['
>>> /dev/mapper/36001405063455cf7cd74c20bc06e9304: Checksum error at
>>> offset 2198927383040', " Couldn't read volume group metadata from
>>> /dev/mapper/36001405063455cf7cd74c20bc06e9304.", ' Metadata location
>>> on /dev/mapper/36001405063455cf7cd74c20bc06e9304 at 2198927383040 has
>>> invalid summary for VG.', ' Failed to read metadata summary from
>>> /dev/mapper/36001405063455cf7cd74c20bc06e9304', ' Failed to scan VG
>>> from /dev/mapper/36001405063455cf7cd74c20bc06e9304'] (lvm:462)
>>> 2021-09-16 11:16:58,909+0100 ERROR (monitor/219fa16) [storage.Monitor]
>>> Setting up monitor for 219fa16f-13c9-44e4-a07d-a40c0a7fe206 failed
>>> (monitor:330)
>>> Traceback (most recent call last):
>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py",
>>>     line 327, in _setupLoop
>>>     self._setupMonitor()
>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py",
>>>     line 349, in _setupMonitor
>>>     self._produceDomain()
>>>   File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 159,
>>>     in wrapper
>>>     value = meth(self, *a, **kw)
>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py",
>>>     line 367, in _produceDomain
>>>     self.domain = sdCache.produce(self.sdUUID)
>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py",
>>>     line 110, in produce
>>>     domain.getRealDomain()
>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py",
>>>     line 51, in getRealDomain
>>>     return self._cache._realProduce(self._sdUUID)
>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py",
>>>     line 134, in _realProduce
>>>     domain = self._findDomain(sdUUID)
>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py",
>>>     line 151, in _findDomain
>>>     return findMethod(sdUUID)
>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py",
>>>     line 176, in _findUnfetchedDomain
>>>     raise se.StorageDomainDoesNotExist(sdUUID)
>>> StorageDomainDoesNotExist: Storage domain does not exist:
>>> (u'219fa16f-13c9-44e4-a07d-a40c0a7fe206',)
>>>
>>>
>>> On 2021-09-16 08:28, Vojtech Juranek wrote:
>>>> On Wednesday, 15 September 2021 14:52:27 CEST nicolas(a)devels.es wrote:
>>>>> Hi,
>>>>>
>>>>> We're running oVirt 4.3.8 and we recently had an oVirt crash after
>>>>> moving too many disks between storage domains.
>>>>>
>>>>> Concretely, one of the Storage Domains reports status "Unknown",
>>>>> and "Total/Free/Guaranteed free spaces" are "[N/A]".
>>>>>
>>>>> After trying to activate it in the Data Center we see messages like
>>>>> these from all of the hosts:
>>>>>
>>>>>      VDSM hostX command GetVGInfoVDS failed: Volume Group does not
>>>>>      exist: (u'vg_uuid: Usi3y8-S4eq-EXtl-FA58-MA3K-b4vE-4d9SCp',)
>>>>>
>>>>> I tried putting the Storage Domain in maintenance and it fails with
>>>>> messages like:
>>>>>
>>>>>      Storage Domain iaasb13 (Data Center KVMRojo) was deactivated by
>>>>>      system because it's not visible by any of the hosts.
>>>>>      Failed to update OVF disks 8661acd1-d1c4-44a0-a4d4-ddee834844e9,
>>>>>      OVF data isn't updated on those OVF stores (Data Center KVMRojo,
>>>>>      Storage Domain iaasb13).
>>>>>      Failed to update VMs/Templates OVF data for Storage Domain
>>>>>      iaasb13 in Data Center KVMRojo.
>>>>>
>>>>> I'm sure the storage domain backend is up and running, and the LUN
>>>>> is being exported.
>>>>>
>>>>> Any hints on how I can debug this problem and restore the Storage
>>>>> Domain?
>>>>
>>>> I'd suggest to ssh to any of the hosts in the given data center and
>>>> investigate manually whether the device is visible to the host (e.g.
>>>> using lsblk) and eventually check /var/log/messages to determine
>>>> where the problem could be.
>>>>
>>>>
>>>>> Thanks.