So, I've made several attempts to restore the metadata.
In my last e-mail I said in step 2 that the PV ID is
36001405063455cf7cd74c20bc06e9304, which is incorrect.
I'm trying to find out the PV UUID by running "pvs -o pv_name,pv_uuid
--config='devices/filter = ["a|.*|"]'
/dev/mapper/36001405063455cf7cd74c20bc06e9304". However, it shows no PV
UUID. All I get from the command output is:
# pvs -o pv_name,pv_uuid --config='devices/filter = ["a|.*|"]'
/dev/mapper/36001405063455cf7cd74c20bc06e9304
/dev/mapper/360014057b367e3a53b44ab392ae0f25f: Checksum error at
offset 2198927383040
Couldn't read volume group metadata from
/dev/mapper/360014057b367e3a53b44ab392ae0f25f.
Metadata location on /dev/mapper/360014057b367e3a53b44ab392ae0f25f at
2198927383040 has invalid summary for VG.
Failed to read metadata summary from
/dev/mapper/360014057b367e3a53b44ab392ae0f25f
Failed to scan VG from /dev/mapper/360014057b367e3a53b44ab392ae0f25f
/dev/mapper/360014057b367e3a53b44ab392ae0f25f: Checksum error at
offset 2198927383040
Couldn't read volume group metadata from
/dev/mapper/360014057b367e3a53b44ab392ae0f25f.
Metadata location on /dev/mapper/360014057b367e3a53b44ab392ae0f25f at
2198927383040 has invalid summary for VG.
Failed to read metadata summary from
/dev/mapper/360014057b367e3a53b44ab392ae0f25f
Failed to scan VG from /dev/mapper/360014057b367e3a53b44ab392ae0f25f
Failed to find device "/dev/mapper/36001405063455cf7cd74c20bc06e9304".
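Incidentally, the offset in those checksum errors is telling: 2198927383040
bytes is just short of 2 TiB, i.e. at the very tail of the device. If this PV
was created with a second metadata area at the end of the disk (an assumption;
many setups keep only the copy at the start), it would be that end-of-device
copy that is corrupt. A quick check of the arithmetic:

```shell
# Convert the failing byte offset from the LVM errors into GiB; it lands
# just below 2048 GiB (= 2 TiB), i.e. at the end of a 2 TiB LUN:
awk 'BEGIN { printf "%.2f\n", 2198927383040 / (1024 * 1024 * 1024) }'
# → 2047.91
```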
I tried running a bare "vgcfgrestore
219fa16f-13c9-44e4-a07d-a40c0a7fe206" command, which returned:
# vgcfgrestore 219fa16f-13c9-44e4-a07d-a40c0a7fe206
/dev/mapper/360014057b367e3a53b44ab392ae0f25f: Checksum error at
offset 2198927383040
Couldn't read volume group metadata from
/dev/mapper/360014057b367e3a53b44ab392ae0f25f.
Metadata location on /dev/mapper/360014057b367e3a53b44ab392ae0f25f at
2198927383040 has invalid summary for VG.
Failed to read metadata summary from
/dev/mapper/360014057b367e3a53b44ab392ae0f25f
Failed to scan VG from /dev/mapper/360014057b367e3a53b44ab392ae0f25f
Couldn't find device with uuid Q3xkre-25cg-L3Do-aeMD-iLem-wOHh-fb8fzb.
Cannot restore Volume Group 219fa16f-13c9-44e4-a07d-a40c0a7fe206 with
1 PVs marked as missing.
Restore failed.
It seems the PV is missing; however, I assume the PV UUID (from the
output above) is Q3xkre-25cg-L3Do-aeMD-iLem-wOHh-fb8fzb.
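That assumption can be cross-checked against the archive file itself: LVM
archive files are plain text, and each pv section records both "id" (the PV
UUID) and "device" (the path the PV was last seen on). A sketch that pulls
those pairs out — the here-document below is a trimmed, hypothetical excerpt
standing in for the real archive file, not actual content from it:

```shell
# Extract (PV UUID, last-known device) pairs from LVM archive text.
# Against the real file you would redirect from
# /etc/lvm/archive/<VG>_....vg instead of the sample here-document.
awk -F'"' '/id =/ { id = $2 } /device =/ { print id, $2 }' <<'EOF'
pv0 {
        id = "Q3xkre-25cg-L3Do-aeMD-iLem-wOHh-fb8fzb"
        device = "/dev/mapper/360014057b367e3a53b44ab392ae0f25f"
}
EOF
# → Q3xkre-25cg-L3Do-aeMD-iLem-wOHh-fb8fzb /dev/mapper/360014057b367e3a53b44ab392ae0f25f
```

Running the same awk against the real archive file would show which device
path the missing UUID was last tied to, which may explain why it cannot be
found now.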
So I tried running:
# pvcreate --uuid Q3xkre-25cg-L3Do-aeMD-iLem-wOHh-fb8fzb --restore
/etc/lvm/archive/219fa16f-13c9-44e4-a07d-a40c0a7fe206_00200-1084769199.vg
/dev/sdb1
Couldn't find device with uuid Q3xkre-25cg-L3Do-aeMD-iLem-wOHh-fb8fzb.
/dev/mapper/360014057b367e3a53b44ab392ae0f25f: Checksum error at
offset 2198927383040
Couldn't read volume group metadata from
/dev/mapper/360014057b367e3a53b44ab392ae0f25f.
Metadata location on /dev/mapper/360014057b367e3a53b44ab392ae0f25f at
2198927383040 has invalid summary for VG.
Failed to read metadata summary from
/dev/mapper/360014057b367e3a53b44ab392ae0f25f
Failed to scan VG from /dev/mapper/360014057b367e3a53b44ab392ae0f25f
Device /dev/sdb1 excluded by a filter.
Either the PV UUID is not the one I specified, or the system can't find
it (or both).
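One more way to check which UUID the on-disk label actually carries (assuming
the label itself is still readable): dump the PV label sector and re-group the
raw UUID. The dd line is a sketch based on the usual LVM2 on-disk layout
(label in sector 1 — that placement is an assumption, not verified on this
system); the sed step just re-inserts the dashes that are not stored on disk:

```shell
# Read-only dump of the PV label sector (sketch; commented out because
# the sector placement is assumed):
#   dd if=/dev/mapper/36001405063455cf7cd74c20bc06e9304 bs=512 skip=1 count=1 2>/dev/null | strings
# The UUID appears there as 32 characters without dashes; re-group it
# into LVM's 6-4-4-4-4-4-6 display form to compare with the UUID that
# vgcfgrestore complained about:
raw='Q3xkre25cgL3DoaeMDiLemwOHhfb8fzb'
echo "$raw" | sed -E 's/^(.{6})(.{4})(.{4})(.{4})(.{4})(.{4})(.{6})$/\1-\2-\3-\4-\5-\6-\7/'
# → Q3xkre-25cg-L3Do-aeMD-iLem-wOHh-fb8fzb
```

Note also that the final pvcreate error ("Device /dev/sdb1 excluded by a
filter") suggests the same --config filter override used with pvs may be
needed for pvcreate as well.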
On 2021-09-20 09:21, nicolas@devels.es wrote:
> Hi Roman and Nir,
>
> On 2021-09-16 13:42, Roman Bednar wrote:
>> Hi Nicolas,
>>
>> You can try to recover VG metadata from a backup or archive which lvm
>> automatically creates by default.
>>
>> 1) To list all available backups for given VG:
>>
>> #vgcfgrestore --list Usi3y8-S4eq-EXtl-FA58-MA3K-b4vE-4d9SCp
>>
>> Select the most recent one that sounds right, something with a
>> description along the lines of "Created *before* lvremove".
>> You might want to select one older than the latest, as lvm also does
>> a backup *after* running some commands.
>>
>
> You were right. There actually *are* backups; I was specifying an
> incorrect ID.
>
> So the correct command would return:
>
> # vgcfgrestore --list 219fa16f-13c9-44e4-a07d-a40c0a7fe206
> [...]
>
> File: /etc/lvm/archive/219fa16f-13c9-44e4-a07d-a40c0a7fe206_00202-1152107223.vg
> VG name: 219fa16f-13c9-44e4-a07d-a40c0a7fe206
> Description: Created *before* executing 'vgs --noheading --nosuffix
> --units b -o +vg_uuid,vg_extent_size'
> Backup Time: Sat Sep 11 03:41:25 2021
> [...]
>
> That one seems ok.
>
>> 2) Find UUID of your broken PV (filter might not be needed, depends on
>> your local conf):
>>
>> #pvs -o pv_name,pv_uuid --config='devices/filter = ["a|.*|"]'
>> /dev/mapper/36001405063455cf7cd74c20bc06e9304
>>
>
> As I understand it, the PV won't be listed in the 'pvs' output; this
> is just a matter of finding the associated VG. The command above won't
> list a PV associated with the VG from step 1, it just complains that
> the PV cannot be read.
>
> # pvs -o pv_name,pv_uuid --config='devices/filter = ["a|.*|"]'
> /dev/mapper/36001405063455cf7cd74c20bc06e9304
> /dev/mapper/36001405063455cf7cd74c20bc06e9304: Checksum error at
> offset 2198927383040
> Couldn't read volume group metadata from
> /dev/mapper/36001405063455cf7cd74c20bc06e9304.
> Metadata location on /dev/mapper/36001405063455cf7cd74c20bc06e9304
> at 2198927383040 has invalid summary for VG.
> Failed to read metadata summary from
> /dev/mapper/36001405063455cf7cd74c20bc06e9304
> Failed to scan VG from /dev/mapper/36001405063455cf7cd74c20bc06e9304
> No physical volume label read from
> /dev/mapper/36001405063455cf7cd74c20bc06e9304.
>
> So, the associated PV ID is: 36001405063455cf7cd74c20bc06e9304
>
>> 3) Create a new PV on a different partition or disk (/dev/sdX) using
>> the UUID found in previous step and restorefile option:
>>
>> #pvcreate --uuid <ID_OF_BROKEN_PV> --restorefile <PATH_TO_BACKUP_FILE>
>> <EMPTY_DISK>
>>
>
> I have a question here. As I understand it, pvcreate will restore the
> correct metadata on <EMPTY_DISK>. Then how do you restore that
> metadata on the broken storage domain, so that other hosts can see the
> right information as well? Or is this just a step to recover the data
> on <EMPTY_DISK> and then reattach the disks to the affected VMs?
>
> Thanks so much.
>
>> 4) Try to display the VG:
>>
>> # vgdisplay Usi3y8-S4eq-EXtl-FA58-MA3K-b4vE-4d9SCp
>>
>> -Roman
>>
>> On Thu, Sep 16, 2021 at 1:47 PM <nicolas@devels.es> wrote:
>>
>>> I can also see...
>>>
>>> kvmr03:~# lvs | grep 927f423a-6689-4ddb-8fda-b3375c3bbca3
>>> /dev/mapper/36001405063455cf7cd74c20bc06e9304: Checksum error at
>>> offset 2198927383040
>>> Couldn't read volume group metadata from
>>> /dev/mapper/36001405063455cf7cd74c20bc06e9304.
>>> Metadata location on
>>> /dev/mapper/36001405063455cf7cd74c20bc06e9304 at
>>> 2198927383040 has invalid summary for VG.
>>> Failed to read metadata summary from
>>> /dev/mapper/36001405063455cf7cd74c20bc06e9304
>>> Failed to scan VG from
>>> /dev/mapper/36001405063455cf7cd74c20bc06e9304
>>>
>>> Seems to me like the metadata of that VG has been corrupted. Is
>>> there a way to recover?
>>>
>>> El 2021-09-16 11:19, nicolas@devels.es escribió:
>>>> The most relevant log snippet I have found is the following. I
>>>> assume it cannot scan the Storage Domain, but I'm unsure why, as
>>>> the storage domain backend is up and running.
>>>>
>>>> 2021-09-16 11:16:58,884+0100 WARN (monitor/219fa16) [storage.LVM]
>>>> Command ['/usr/sbin/lvm', 'vgs', '--config', 'devices {
>>>> preferred_names=["^/dev/mapper/"] ignore_suspended_devices=1
>>>> write_cache_state=0 disable_after_error_count=3
>>>> filter=["a|^/dev/mapper/36001405063455cf7cd74c20bc06e9304$|^/dev/mapper/360014056481868b09dd4d05bee5b4185$|^/dev/mapper/360014057d9d4bc57df046888b8d8b6eb$|^/dev/mapper/360014057e612d2079b649d5b539e5f6a$|^/dev/mapper/360014059b49883b502a4fa9b81add3e4$|^/dev/mapper/36001405acece27e83b547e3a873b19e2$|^/dev/mapper/36001405dc03f6be1b8c42219e8912fbd$|^/dev/mapper/36001405f3ab584afde347d3a8855baf0$|^/dev/mapper/3600c0ff00052a0fe013ec65f01000000$|^/dev/mapper/3600c0ff00052a0fe033ec65f01000000$|^/dev/mapper/3600c0ff00052a0fe1b40c65f01000000$|^/dev/mapper/3600c0ff00052a0fe2294c75f01000000$|^/dev/mapper/3600c0ff00052a0fe2394c75f01000000$|^/dev/mapper/3600c0ff00052a0fe2494c75f01000000$|^/dev/mapper/3600c0ff00052a0fe2594c75f01000000$|^/dev/mapper/3600c0ff00052a0fe2694c75f01000000$|^/dev/mapper/3600c0ff00052a0fee293c75f01000000$|^/dev/mapper/3600c0ff00052a0fee493c75f01000000$|^/dev/mapper/3600c0ff00064835b628d306101000000$|^/dev/mapper/3600c0ff00064835b628d306103000000$|^/dev/mapper/3600c0ff00064835b628d306105000000$|^/dev/mapper/3600c0ff00064835b638d306101000000$|^/dev/mapper/3600c0ff00064835b638d306103000000$|^/dev/mapper/3600c0ff00064835b638d306105000000$|^/dev/mapper/3600c0ff00064835b638d306107000000$|^/dev/mapper/3600c0ff00064835b638d306109000000$|^/dev/mapper/3600c0ff00064835b638d30610b000000$|^/dev/mapper/3600c0ff00064835cb98f306101000000$|^/dev/mapper/3600c0ff00064835cb98f306103000000$|^/dev/mapper/3600c0ff00064835cb98f306105000000$|^/dev/mapper/3600c0ff00064835cb98f306107000000$|^/dev/mapper/3600c0ff00064835cb98f306109000000$|^/dev/mapper/3600c0ff00064835cba8f306101000000$|^/dev/mapper/3600c0ff00064835cba8f306103000000$|^/dev/mapper/3600c0ff00064835cba8f306105000000$|^/dev/mapper/3600c0ff00064835cba8f306107000000$|^/dev/mapper/3634b35410019574796dcb0e300000007$|^/dev/mapper/3634b35410019574796dcdffc00000008$|^/dev/mapper/3634b354100195747999c2dc500000003$|^/dev/mapper/3634b354100195747999c3c4a00000004$|^/dev/mapper/3634b3541001957479c2b9c6400000001$|^/dev/mapper/3634b3541001957479c2baba500000002$|", "r|.*|"] } global {
>>>> locking_type=4 prioritise_write_locks=1 wait_for_locks=1
>>>> use_lvmetad=0 } backup { retain_min=50 retain_days=0 }',
>>>> '--noheadings', '--units', 'b', '--nosuffix', '--separator', '|',
>>>> '--ignoreskippedcluster', '-o',
>>>> 'uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count,pv_name',
>>>> '--select', 'vg_name = 219fa16f-13c9-44e4-a07d-a40c0a7fe206']
>>>> succeeded with warnings: ['
>>>> /dev/mapper/36001405063455cf7cd74c20bc06e9304: Checksum error at
>>>> offset 2198927383040', " Couldn't read volume group metadata from
>>>> /dev/mapper/36001405063455cf7cd74c20bc06e9304.", ' Metadata
>>>> location on /dev/mapper/36001405063455cf7cd74c20bc06e9304 at
>>>> 2198927383040 has invalid summary for VG.', ' Failed to read
>>>> metadata summary from /dev/mapper/36001405063455cf7cd74c20bc06e9304',
>>>> ' Failed to scan VG from
>>>> /dev/mapper/36001405063455cf7cd74c20bc06e9304'] (lvm:462)
>>>> 2021-09-16 11:16:58,909+0100 ERROR (monitor/219fa16)
>>>> [storage.Monitor]
>>>> Setting up monitor for 219fa16f-13c9-44e4-a07d-a40c0a7fe206 failed
>>>> (monitor:330)
>>>> Traceback (most recent call last):
>>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py", line 327, in _setupLoop
>>>>     self._setupMonitor()
>>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py", line 349, in _setupMonitor
>>>>     self._produceDomain()
>>>>   File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 159, in wrapper
>>>>     value = meth(self, *a, **kw)
>>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py", line 367, in _produceDomain
>>>>     self.domain = sdCache.produce(self.sdUUID)
>>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 110, in produce
>>>>     domain.getRealDomain()
>>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 51, in getRealDomain
>>>>     return self._cache._realProduce(self._sdUUID)
>>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 134, in _realProduce
>>>>     domain = self._findDomain(sdUUID)
>>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 151, in _findDomain
>>>>     return findMethod(sdUUID)
>>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 176, in _findUnfetchedDomain
>>>>     raise se.StorageDomainDoesNotExist(sdUUID)
>>>> StorageDomainDoesNotExist: Storage domain does not exist:
>>>> (u'219fa16f-13c9-44e4-a07d-a40c0a7fe206',)
>>>>
>>>>
>>>> On 2021-09-16 08:28, Vojtech Juranek wrote:
>>>>> On Wednesday, 15 September 2021 14:52:27 CEST nicolas@devels.es
>>>>> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> We're running oVirt 4.3.8 and we recently had an oVirt crash
>>>>>> after moving too many disks between storage domains.
>>>>>>
>>>>>> Concretely, one of the Storage Domains reports status "Unknown",
>>>>>> "Total/Free/Guaranteed free spaces" are "[N/A]".
>>>>>>
>>>>>> After trying to activate it in the Data Center we see messages
>>>>>> like these from all of the hosts:
>>>>>>
>>>>>>    VDSM hostX command GetVGInfoVDS failed: Volume Group does not
>>>>>>    exist: (u'vg_uuid: Usi3y8-S4eq-EXtl-FA58-MA3K-b4vE-4d9SCp',)
>>>>>>
>>>>>> I tried putting the Storage Domain in maintenance and it fails
>>>>>> with messages like:
>>>>>>
>>>>>>    Storage Domain iaasb13 (Data Center KVMRojo) was deactivated
>>>>>>    by system because it's not visible by any of the hosts.
>>>>>>    Failed to update OVF disks
>>>>>>    8661acd1-d1c4-44a0-a4d4-ddee834844e9, OVF data isn't updated
>>>>>>    on those OVF stores (Data Center KVMRojo, Storage Domain
>>>>>>    iaasb13).
>>>>>>    Failed to update VMs/Templates OVF data for Storage Domain
>>>>>>    iaasb13 in Data Center KVMRojo.
>>>>>>
>>>>>> I'm sure the storage domain backend is up and running, and the
>>>>>> LUN is being exported.
>>>>>>
>>>>>> Any hints how can I debug this problem and restore the Storage
>>>>>> Domain?
>>>>>
>>>>> I'd suggest sshing to any of the hosts in the given data center
>>>>> and investigating manually whether the device is visible to the
>>>>> host (e.g. using lsblk), and then checking /var/log/messages to
>>>>> determine where the problem could be.
>>>>>
>>>>>
>>>>>> Thanks.
>>>>>> _______________________________________________
>>>>>> Users mailing list -- users@ovirt.org
>>>>>> To unsubscribe send an email to users-leave@ovirt.org
>>>>>> Privacy Statement: https://www.ovirt.org/privacy-policy.html
>>>>>> oVirt Code of Conduct:
>>>>>> https://www.ovirt.org/community/about/community-guidelines/
>>>>>> List Archives:
>>>>>> https://lists.ovirt.org/archives/list/users@ovirt.org/message/UNXKR7HRCRDTTWLEYO6FFM4WOLD6YATW/