The lvm2 packages on all machines are already the version Nir
suggested. I tried the pvck command, but it reported that the PV UUID
did not exist. I finally ended up forcibly removing the storage domain.
Thanks anyway.
On 2021-09-20 15:13, Roman Bednar wrote:
Did you update the packages as suggested by Nir? If so and it still
does not work, maybe try the pvck recovery that Nir described too.
If that still does not work, consider filing a bug against lvm and
providing the output of the failing command(s) with the -vvvv option
in the description or as an attachment. Perhaps there is a better way
or a known workaround.
-Roman
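
(A note for anyone who does file that lvm bug: capturing the debug
output is usually just a matter of re-running the failing command with
-vvvv and redirecting stderr to a file, for example, roughly:

  # pvs -o pv_name,pv_uuid --config='devices/filter = ["a|.*|"]' -vvvv \
        /dev/mapper/36001405063455cf7cd74c20bc06e9304 2> /tmp/pvs-vvvv.log

and attaching /tmp/pvs-vvvv.log, an illustrative path, to the report.)
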
On Mon, Sep 20, 2021 at 2:22 PM <nicolas(a)devels.es> wrote:
> So, I've made several attempts to restore the metadata.
>
> In my last e-mail I said in step 2 that the PV ID is:
> 36001405063455cf7cd74c20bc06e9304, which is incorrect.
>
> I'm trying to find the PV UUID by running "pvs -o pv_name,pv_uuid
> --config='devices/filter = ["a|.*|"]'
> /dev/mapper/36001405063455cf7cd74c20bc06e9304". However, it shows no
> PV UUID. All I get from the command output is:
>
> # pvs -o pv_name,pv_uuid --config='devices/filter = ["a|.*|"]'
> /dev/mapper/36001405063455cf7cd74c20bc06e9304
> /dev/mapper/360014057b367e3a53b44ab392ae0f25f: Checksum error at
> offset 2198927383040
> Couldn't read volume group metadata from
> /dev/mapper/360014057b367e3a53b44ab392ae0f25f.
> Metadata location on
> /dev/mapper/360014057b367e3a53b44ab392ae0f25f at
> 2198927383040 has invalid summary for VG.
> Failed to read metadata summary from
> /dev/mapper/360014057b367e3a53b44ab392ae0f25f
> Failed to scan VG from
> /dev/mapper/360014057b367e3a53b44ab392ae0f25f
> /dev/mapper/360014057b367e3a53b44ab392ae0f25f: Checksum error at
> offset 2198927383040
> Couldn't read volume group metadata from
> /dev/mapper/360014057b367e3a53b44ab392ae0f25f.
> Metadata location on
> /dev/mapper/360014057b367e3a53b44ab392ae0f25f at
> 2198927383040 has invalid summary for VG.
> Failed to read metadata summary from
> /dev/mapper/360014057b367e3a53b44ab392ae0f25f
> Failed to scan VG from
> /dev/mapper/360014057b367e3a53b44ab392ae0f25f
> Failed to find device
> "/dev/mapper/36001405063455cf7cd74c20bc06e9304".
>
> I tried running a bare "vgcfgrestore
> 219fa16f-13c9-44e4-a07d-a40c0a7fe206" command, which returned:
>
> # vgcfgrestore 219fa16f-13c9-44e4-a07d-a40c0a7fe206
> /dev/mapper/360014057b367e3a53b44ab392ae0f25f: Checksum error at
> offset 2198927383040
> Couldn't read volume group metadata from
> /dev/mapper/360014057b367e3a53b44ab392ae0f25f.
> Metadata location on
> /dev/mapper/360014057b367e3a53b44ab392ae0f25f at
> 2198927383040 has invalid summary for VG.
> Failed to read metadata summary from
> /dev/mapper/360014057b367e3a53b44ab392ae0f25f
> Failed to scan VG from
> /dev/mapper/360014057b367e3a53b44ab392ae0f25f
> Couldn't find device with uuid
> Q3xkre-25cg-L3Do-aeMD-iLem-wOHh-fb8fzb.
> Cannot restore Volume Group 219fa16f-13c9-44e4-a07d-a40c0a7fe206
> with
> 1 PVs marked as missing.
> Restore failed.
>
> It seems that the PV is missing; however, I assume the PV UUID (from
> the output above) is Q3xkre-25cg-L3Do-aeMD-iLem-wOHh-fb8fzb.
>
> So I tried running:
>
> # pvcreate --uuid Q3xkre-25cg-L3Do-aeMD-iLem-wOHh-fb8fzb --restore
> /etc/lvm/archive/219fa16f-13c9-44e4-a07d-a40c0a7fe206_00200-1084769199.vg
> /dev/sdb1
> Couldn't find device with uuid
> Q3xkre-25cg-L3Do-aeMD-iLem-wOHh-fb8fzb.
> /dev/mapper/360014057b367e3a53b44ab392ae0f25f: Checksum error at
> offset 2198927383040
> Couldn't read volume group metadata from
> /dev/mapper/360014057b367e3a53b44ab392ae0f25f.
> Metadata location on
> /dev/mapper/360014057b367e3a53b44ab392ae0f25f at
> 2198927383040 has invalid summary for VG.
> Failed to read metadata summary from
> /dev/mapper/360014057b367e3a53b44ab392ae0f25f
> Failed to scan VG from
> /dev/mapper/360014057b367e3a53b44ab392ae0f25f
> Device /dev/sdb1 excluded by a filter.
>
> Either the PV UUID is not the one I specified, or the system can't
> find
> it (or both).
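
(About the "Device /dev/sdb1 excluded by a filter" message at the end:
that usually means either that lvm.conf's devices/filter rejects the
device, or that the disk still carries an old signature. A rough sketch
of what could be checked, assuming the same filter override syntax works
for pvcreate as it does for pvs above:

  # wipefs /dev/sdb1
  # pvcreate --config 'devices/filter = ["a|.*|"]' \
        --uuid Q3xkre-25cg-L3Do-aeMD-iLem-wOHh-fb8fzb \
        --restorefile /etc/lvm/archive/219fa16f-13c9-44e4-a07d-a40c0a7fe206_00200-1084769199.vg \
        /dev/sdb1

wipefs without options only lists signatures and does not erase
anything; --config is a common option accepted by pvcreate like the
other lvm commands.)
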
>
> On 2021-09-20 09:21, nicolas(a)devels.es wrote:
>> Hi Roman and Nir,
>>
>> On 2021-09-16 13:42, Roman Bednar wrote:
>>> Hi Nicolas,
>>>
>>> You can try to recover the VG metadata from a backup or archive which
>>> lvm automatically creates by default.
>>>
>>> 1) To list all available backups for a given VG:
>>>
>>> #vgcfgrestore --list Usi3y8-S4eq-EXtl-FA58-MA3K-b4vE-4d9SCp
>>>
>>> Select the latest one which sounds right, something with a description
>>> along the lines of "Created *before* lvremove".
>>> You might want to select something older than the latest, as lvm also
>>> does a backup *after* running some commands.
>>>
>>
>> You were right. There actually *are* backups; I was specifying an
>> incorrect ID.
>>
>> So the correct command would return:
>>
>> # vgcfgrestore --list 219fa16f-13c9-44e4-a07d-a40c0a7fe206
>> [...]
>>
>> File:
>> /etc/lvm/archive/219fa16f-13c9-44e4-a07d-a40c0a7fe206_00202-1152107223.vg
>> VG name: 219fa16f-13c9-44e4-a07d-a40c0a7fe206
>> Description: Created *before* executing 'vgs --noheading --nosuffix
>> --units b -o +vg_uuid,vg_extent_size'
>> Backup Time: Sat Sep 11 03:41:25 2021
>> [...]
>>
>> That one seems ok.
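
(The archive file itself records which PV UUID and device path the VG
expects, so it may be worth a quick look before going further. Something
like:

  # grep -A3 'pv0 {' \
        /etc/lvm/archive/219fa16f-13c9-44e4-a07d-a40c0a7fe206_00202-1152107223.vg

should show an 'id = "..."' line with the PV UUID and a 'device = ...'
line with the path the metadata was last written for; that id is the
value to compare against whatever pvs and vgcfgrestore complain about.)
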
>>
>>> 2) Find the UUID of your broken PV (the filter might not be needed,
>>> depending on your local conf):
>>>
>>> #pvs -o pv_name,pv_uuid --config='devices/filter = ["a|.*|"]'
>>> /dev/mapper/36001405063455cf7cd74c20bc06e9304
>>>
>>
>> As I understand it, the PV won't be listed by the 'pvs' command; this
>> is just a matter of finding the associated VG. The command above won't
>> list a PV associated with the VG from step 1, it just complains that
>> the PV cannot be read.
>>
>> # pvs -o pv_name,pv_uuid --config='devices/filter = ["a|.*|"]'
>> /dev/mapper/36001405063455cf7cd74c20bc06e9304
>> /dev/mapper/36001405063455cf7cd74c20bc06e9304: Checksum error at
>> offset 2198927383040
>> Couldn't read volume group metadata from
>> /dev/mapper/36001405063455cf7cd74c20bc06e9304.
>> Metadata location on /dev/mapper/36001405063455cf7cd74c20bc06e9304
>> at 2198927383040 has invalid summary for VG.
>> Failed to read metadata summary from
>> /dev/mapper/36001405063455cf7cd74c20bc06e9304
>> Failed to scan VG from /dev/mapper/36001405063455cf7cd74c20bc06e9304
>> No physical volume label read from
>> /dev/mapper/36001405063455cf7cd74c20bc06e9304.
>>
>> So, associated PV ID is: 36001405063455cf7cd74c20bc06e9304
>>
>>> 3) Create a new PV on a different partition or disk (/dev/sdX) using
>>> the UUID found in the previous step and the restorefile option:
>>>
>>> #pvcreate --uuid <ID_OF_BROKEN_PV> --restorefile <PATH_TO_BACKUP_FILE>
>>> <EMPTY_DISK>
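
(A note for anyone following the thread later: in the generic LVM
recovery procedure this step is normally followed by restoring the VG
metadata from the same backup file and then re-checking the VG, roughly:

  # vgcfgrestore -f <PATH_TO_BACKUP_FILE> 219fa16f-13c9-44e4-a07d-a40c0a7fe206
  # vgs 219fa16f-13c9-44e4-a07d-a40c0a7fe206

This is the stock lvm sequence, not oVirt-specific advice; on a shared
oVirt storage domain you would still want to coordinate which host runs
it.)
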
>>>
>>
>> I have a question here. As I understand it, pvcreate will restore the
>> correct metadata on <EMPTY_DISK>. Then how do you restore that
>> metadata on the broken storage domain, so other hosts can see the
>> right information as well? Or is this just a step to recover data on
>> <EMPTY_DISK> and then reattach the disks to the affected VMs?
>>
>> Thanks so much.
>>
>>> 4) Try to display the VG:
>>>
>>> # vgdisplay Usi3y8-S4eq-EXtl-FA58-MA3K-b4vE-4d9SCp
>>>
>>> -Roman
>>>
>>> On Thu, Sep 16, 2021 at 1:47 PM <nicolas(a)devels.es> wrote:
>>>
>>>> I can also see...
>>>>
>>>> kvmr03:~# lvs | grep 927f423a-6689-4ddb-8fda-b3375c3bbca3
>>>> /dev/mapper/36001405063455cf7cd74c20bc06e9304: Checksum error at
>>>> offset 2198927383040
>>>> Couldn't read volume group metadata from
>>>> /dev/mapper/36001405063455cf7cd74c20bc06e9304.
>>>> Metadata location on
>>>> /dev/mapper/36001405063455cf7cd74c20bc06e9304 at
>>>> 2198927383040 has invalid summary for VG.
>>>> Failed to read metadata summary from
>>>> /dev/mapper/36001405063455cf7cd74c20bc06e9304
>>>> Failed to scan VG from
>>>> /dev/mapper/36001405063455cf7cd74c20bc06e9304
>>>>
>>>> It seems to me that the metadata of that VG has been corrupted. Is
>>>> there a way to recover?
>>>>
>>>> On 2021-09-16 11:19, nicolas(a)devels.es wrote:
>>>>> The most relevant log snippet I have found is the following. I
>>>>> assume it cannot scan the Storage Domain, but I'm unsure why, as
>>>>> the storage domain backend is up and running.
>>>>>
>>>>> 2021-09-16 11:16:58,884+0100 WARN (monitor/219fa16) [storage.LVM]
>>>>> Command ['/usr/sbin/lvm', 'vgs', '--config', 'devices {
>>>>> preferred_names=["^/dev/mapper/"] ignore_suspended_devices=1
>>>>> write_cache_state=0 disable_after_error_count=3
filter=["a|^/dev/mapper/36001405063455cf7cd74c20bc06e9304$|^/dev/mapper/360014056481868b09dd4d05bee5b4185$|^/dev/mapper/360014057d9d4bc57df046888b8d8b6eb$|^/dev/mapper/360014057e612d2079b649d5b539e5f6a$|^/dev/mapper/360014059b49883b502a4fa9b81add3e4$|^/dev/mapper/36001405acece27e83b547e3a873b19e2$|^/dev/mapper/36001405dc03f6be1b8c42219e8912fbd$|^/dev/mapper/36001405f3ab584afde347d3a8855baf0$|^/dev/mapper/3600c0ff00052a0fe013ec65f01000000$|^/dev/mapper/3600c0ff00052a0fe033ec65f01000000$|^/dev/mapper/3600c0ff00052a0fe1b40c65f01000000$|^/dev/mapper/3600c0ff00052a0fe2294c75f01000000$|^/dev/mapper/3600c0ff00052a0fe2394c75f01000000$|^/dev/mapper/3600c0ff00052a0fe2494c75f01000000$|^/dev/mapper/3600c0ff00052a0fe2594c75f01000000$|^/dev/mapper/3600c0ff00052a0fe2694c75f01000000$|^/dev/mapper/3600c0ff00052a0fee293c75f01000000$|^/dev/mapper/3600c0ff00052a0fee493c75f01000000$|^/dev/mapper/3600c0ff00064835b628d306101000000$|^/dev/mapper/3600c0ff00064835b628d306103000000$|^/dev/mapper/3600c0ff000648
35b628d306105000000$|^/dev/mapper/3600c0ff00064835b638d306101000000$|^/dev/mapper/3600c0ff00064835b638d306103000000$|^/dev/mapper/3600c0ff00064835b638d306105000000$|^/dev/mapper/3600c0ff00064835b638d306107000000$|^/dev/mapper/3600c0ff00064835b638d306109000000$|^/dev/mapper/3600c0ff00064835b638d30610b000000$|^/dev/mapper/3600c0ff00064835cb98f306101000000$|^/dev/mapper/3600c0ff00064835cb98f306103000000$|^/dev/mapper/3600c0ff00064835cb98f306105000000$|^/dev/mapper/3600c0ff00064835cb98f306107000000$|^/dev/mapper/3600c0ff00064835cb98f306109000000$|^/dev/mapper/3600c0ff00064835cba8f306101000000$|^/dev/mapper/3600c0ff00064835cba8f306103000000$|^/dev/mapper/3600c0ff00064835cba8f306105000000$|^/dev/mapper/3600c0ff00064835cba8f306107000000$|^/dev/mapper/3634b35410019574796dcb0e300000007$|^/dev/mapper/3634b35410019574796dcdffc00000008$|^/dev/mapper/3634b354100195747999c2dc500000003$|^/dev/mapper/3634b354100195747999c3c4a00000004$|^/dev/mapper/3634b3541001957479c2b9c6400000001$|^/dev/mapper/3634
>>>>> b3541001957479c2baba500000002$|", "r|.*|"] } global {
>>>>> locking_type=4 prioritise_write_locks=1 wait_for_locks=1
>>>>> use_lvmetad=0 } backup { retain_min=50 retain_days=0 }',
>>>>> '--noheadings', '--units', 'b', '--nosuffix', '--separator', '|',
>>>>> '--ignoreskippedcluster', '-o',
'uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count,pv_name',
>>>>> '--select', 'vg_name = 219fa16f-13c9-44e4-a07d-a40c0a7fe206']
>>>>> succeeded with warnings: ['
>>>>> /dev/mapper/36001405063455cf7cd74c20bc06e9304: Checksum error at
>>>>> offset 2198927383040', " Couldn't read volume group metadata from
>>>>> /dev/mapper/36001405063455cf7cd74c20bc06e9304.", ' Metadata location
>>>>> on /dev/mapper/36001405063455cf7cd74c20bc06e9304 at 2198927383040
>>>>> has invalid summary for VG.', ' Failed to read metadata summary from
>>>>> /dev/mapper/36001405063455cf7cd74c20bc06e9304', ' Failed to scan VG
>>>>> from /dev/mapper/36001405063455cf7cd74c20bc06e9304'] (lvm:462)
>>>>>> 2021-09-16 11:16:58,909+0100 ERROR (monitor/219fa16) [storage.Monitor]
>>>>>> Setting up monitor for 219fa16f-13c9-44e4-a07d-a40c0a7fe206 failed
>>>>>> (monitor:330)
>>>>>> Traceback (most recent call last):
>>>>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py",
>>>>>> line 327, in _setupLoop
>>>>>>     self._setupMonitor()
>>>>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py",
>>>>>> line 349, in _setupMonitor
>>>>>>     self._produceDomain()
>>>>>>   File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 159,
>>>>>> in wrapper
>>>>>>     value = meth(self, *a, **kw)
>>>>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py",
>>>>>> line 367, in _produceDomain
>>>>>>     self.domain = sdCache.produce(self.sdUUID)
>>>>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py",
>>>>>> line 110, in produce
>>>>>>     domain.getRealDomain()
>>>>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py",
>>>>>> line 51, in getRealDomain
>>>>>>     return self._cache._realProduce(self._sdUUID)
>>>>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py",
>>>>>> line 134, in _realProduce
>>>>>>     domain = self._findDomain(sdUUID)
>>>>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py",
>>>>>> line 151, in _findDomain
>>>>>>     return findMethod(sdUUID)
>>>>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py",
>>>>>> line 176, in _findUnfetchedDomain
>>>>>>     raise se.StorageDomainDoesNotExist(sdUUID)
>>>>>> StorageDomainDoesNotExist: Storage domain does not exist:
>>>>>> (u'219fa16f-13c9-44e4-a07d-a40c0a7fe206',)
>>>>>>
>>>>>>
>>>>>> On 2021-09-16 08:28, Vojtech Juranek wrote:
>>>>>>> On Wednesday, 15 September 2021 14:52:27 CEST
>>>>>>> nicolas(a)devels.es wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> We're running oVirt 4.3.8 and we recently had an oVirt crash
>>>>>>>> after moving too many disks between storage domains.
>>>>>>>>
>>>>>>>> Concretely, one of the Storage Domains reports status "Unknown",
>>>>>>>> and "Total/Free/Guaranteed free space" is "[N/A]".
>>>>>>>>
>>>>>>>> After trying to activate it in the Domain Center we see messages
>>>>>>>> like these from all of the hosts:
>>>>>>>>
>>>>>>>> VDSM hostX command GetVGInfoVDS failed: Volume Group does not
>>>>>>>> exist: (u'vg_uuid: Usi3y8-S4eq-EXtl-FA58-MA3K-b4vE-4d9SCp',)
>>>>>>>>
>>>>>>>> I tried putting the Storage Domain in maintenance and it fails
>>>>>>>> with messages like:
>>>>>>>>
>>>>>>>> Storage Domain iaasb13 (Data Center KVMRojo) was deactivated by
>>>>>>>> system because it's not visible by any of the hosts.
>>>>>>>> Failed to update OVF disks 8661acd1-d1c4-44a0-a4d4-ddee834844e9,
>>>>>>>> OVF data isn't updated on those OVF stores (Data Center KVMRojo,
>>>>>>>> Storage Domain iaasb13).
>>>>>>>> Failed to update VMs/Templates OVF data for Storage Domain
>>>>>>>> iaasb13 in Data Center KVMRojo.
>>>>>>>>
>>>>>>>> I'm sure the storage domain backend is up and running, and the
>>>>>>>> LUN is being exported.
>>>>>>>>
>>>>>>>> Any hints on how I can debug this problem and restore the
>>>>>>>> Storage Domain?
>>>>>>>
>>>>>>> I'd suggest sshing to any of the hosts in the given data center
>>>>>>> and investigating manually whether the device is visible to the
>>>>>>> host (e.g. using lsblk), and also checking /var/log/messages to
>>>>>>> determine where the problem could be.
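
(On an oVirt host that check usually boils down to something like:

  # lsblk /dev/mapper/36001405063455cf7cd74c20bc06e9304
  # multipath -ll 36001405063455cf7cd74c20bc06e9304
  # grep -i 36001405063455cf7cd74c20bc06e9304 /var/log/messages | tail -n 20

i.e. confirming that the LUN and all of its paths are visible before
suspecting the LVM metadata. The WWID above is the one from this thread;
adjust it to your own device.)
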
>>>>>>>
>>>>>>>
>>>>>>>> Thanks.