Cannot activate a Storage Domain after an oVirt crash

Hi,

We're running oVirt 4.3.8 and we recently had an oVirt crash after moving too many disks between storage domains.

Concretely, one of the Storage Domains reports status "Unknown", and the "Total/Free/Guaranteed free space" fields show "[N/A]".

After trying to activate it in the Data Center we see messages like these from all of the hosts:

  VDSM hostX command GetVGInfoVDS failed: Volume Group does not exist: (u'vg_uuid: Usi3y8-S4eq-EXtl-FA58-MA3K-b4vE-4d9SCp',)

I tried putting the Storage Domain into maintenance and it fails with messages like:

  Storage Domain iaasb13 (Data Center KVMRojo) was deactivated by system because it's not visible by any of the hosts.
  Failed to update OVF disks 8661acd1-d1c4-44a0-a4d4-ddee834844e9, OVF data isn't updated on those OVF stores (Data Center KVMRojo, Storage Domain iaasb13).
  Failed to update VMs/Templates OVF data for Storage Domain iaasb13 in Data Center KVMRojo.

I'm sure the storage domain backend is up and running and the LUN is being exported.

Any hints on how I can debug this problem and restore the Storage Domain?

Thanks.

On Wednesday, 15 September 2021 14:52:27 CEST nicolas@devels.es wrote:
> [...]
> Any hints how can I debug this problem and restore the Storage Domain?
I'd suggest sshing to one of the hosts in the given data center and investigating manually whether the device is visible to the host (e.g. using lsblk), and eventually checking /var/log/messages to determine where the problem could be.
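For example, a minimal sketch of such a check, using the multipath device name that shows up in the vdsm logs below (which device to look at depends on the storage domain):

# lsblk /dev/mapper/36001405063455cf7cd74c20bc06e9304
# multipath -ll 36001405063455cf7cd74c20bc06e9304
# grep 36001405063455cf7cd74c20bc06e9304 /var/log/messages | tail -n 50

The first two show whether the LUN and its multipath map are present on the host; the grep looks for SCSI/multipath errors around the time of the crash.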

The most relevant log snippet I have found is the following. I assume it cannot scan the Storage Domain, but I'm unsure why, as the storage domain backend is up and running. 021-09-16 11:16:58,884+0100 WARN (monitor/219fa16) [storage.LVM] Command ['/usr/sbin/lvm', 'vgs', '--config', 'devices { preferred_names=["^/dev/mapper/"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 filter=["a|^/dev/mapper/36001405063455cf7cd74c20bc06e9304$|^/dev/mapper/360014056481868b09dd4d05bee5b4185$|^/dev/mapper/360014057d9d4bc57df046888b8d8b6eb$|^/dev/mapper/360014057e612d2079b649d5b539e5f6a$|^/dev/mapper/360014059b49883b502a4fa9b81add3e4$|^/dev/mapper/36001405acece27e83b547e3a873b19e2$|^/dev/mapper/36001405dc03f6be1b8c42219e8912fbd$|^/dev/mapper/36001405f3ab584afde347d3a8855baf0$|^/dev/mapper/3600c0ff00052a0fe013ec65f01000000$|^/dev/mapper/3600c0ff00052a0fe033ec65f01000000$|^/dev/mapper/3600c0ff00052a0fe1b40c65f01000000$|^/dev/mapper/3600c0ff00052a0fe2294c75f01000000$|^/dev/mapper/3600c0ff00052a0fe2394c75f01000000$|^/dev/mapper/3600c0ff00052a0fe2494c75f01000000$|^/dev/mapper/3600c0ff00052a0fe2594c75f01000000$|^/dev/mapper/3600c0ff00052a0fe2694c75f01000000$|^/dev/mapper/3600c0ff00052a0fee293c75f01000000$|^/dev/mapper/3600c0ff00052a0fee493c75f01000000$|^/dev/mapper/3600c0ff00064835b628d306101000000$|^/dev/mapper/3600c0ff00064835b628d306103000000$|^/dev/mapper/3600c0ff000648 35b628d306105000000$|^/dev/mapper/3600c0ff00064835b638d306101000000$|^/dev/mapper/3600c0ff00064835b638d306103000000$|^/dev/mapper/3600c0ff00064835b638d306105000000$|^/dev/mapper/3600c0ff00064835b638d306107000000$|^/dev/mapper/3600c0ff00064835b638d306109000000$|^/dev/mapper/3600c0ff00064835b638d30610b000000$|^/dev/mapper/3600c0ff00064835cb98f306101000000$|^/dev/mapper/3600c0ff00064835cb98f306103000000$|^/dev/mapper/3600c0ff00064835cb98f306105000000$|^/dev/mapper/3600c0ff00064835cb98f306107000000$|^/dev/mapper/3600c0ff00064835cb98f306109000000$|^/dev/mapper/3600c0ff00064835cba8f306101000000$|^/dev/mapper/3600c0ff00064835cba8f306103000000$|^/dev/mapper/3600c0ff00064835cba8f306105000000$|^/dev/mapper/3600c0ff00064835cba8f306107000000$|^/dev/mapper/3634b35410019574796dcb0e300000007$|^/dev/mapper/3634b35410019574796dcdffc00000008$|^/dev/mapper/3634b354100195747999c2dc500000003$|^/dev/mapper/3634b354100195747999c3c4a00000004$|^/dev/mapper/3634b3541001957479c2b9c6400000001$|^/dev/mapper/3634 b3541001957479c2baba500000002$|", "r|.*|"] } global { locking_type=4 prioritise_write_locks=1 wait_for_locks=1 use_lvmetad=0 } backup { retain_min=50 retain_days=0 }', '--noheadings', '--units', 'b', '--nosuffix', '--separator', '|', '--ignoreskippedcluster', '-o', 'uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count,pv_name', '--select', 'vg_name = 219fa16f-13c9-44e4-a07d-a40c0a7fe206'] succeeded with warnings: [' /dev/mapper/36001405063455cf7cd74c20bc06e9304: Checksum error at offset 2198927383040', " Couldn't read volume group metadata from /dev/mapper/36001405063455cf7cd74c20bc06e9304.", ' Metadata location on /dev/mapper/36001405063455cf7cd74c20bc06e9304 at 2198927383040 has invalid summary for VG.', ' Failed to read metadata summary from /dev/mapper/36001405063455cf7cd74c20bc06e9304', ' Failed to scan VG from /dev/mapper/36001405063455cf7cd74c20bc06e9304'] (lvm:462) 2021-09-16 11:16:58,909+0100 ERROR (monitor/219fa16) [storage.Monitor] Setting up monitor for 219fa16f-13c9-44e4-a07d-a40c0a7fe206 failed (monitor:330) Traceback (most recent call last): File 
"/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py", line 327, in _setupLoop self._setupMonitor() File "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py", line 349, in _setupMonitor self._produceDomain() File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 159, in wrapper value = meth(self, *a, **kw) File "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py", line 367, in _produceDomain self.domain = sdCache.produce(self.sdUUID) File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 110, in produce domain.getRealDomain() File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 51, in getRealDomain return self._cache._realProduce(self._sdUUID) File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 134, in _realProduce domain = self._findDomain(sdUUID) File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 151, in _findDomain return findMethod(sdUUID) File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 176, in _findUnfetchedDomain raise se.StorageDomainDoesNotExist(sdUUID) StorageDomainDoesNotExist: Storage domain does not exist: (u'219fa16f-13c9-44e4-a07d-a40c0a7fe206',) El 2021-09-16 08:28, Vojtech Juranek escribió:

I can also see...

kvmr03:~# lvs | grep 927f423a-6689-4ddb-8fda-b3375c3bbca3
  /dev/mapper/36001405063455cf7cd74c20bc06e9304: Checksum error at offset 2198927383040
  Couldn't read volume group metadata from /dev/mapper/36001405063455cf7cd74c20bc06e9304.
  Metadata location on /dev/mapper/36001405063455cf7cd74c20bc06e9304 at 2198927383040 has invalid summary for VG.
  Failed to read metadata summary from /dev/mapper/36001405063455cf7cd74c20bc06e9304
  Failed to scan VG from /dev/mapper/36001405063455cf7cd74c20bc06e9304

Seems to me like metadata from that VG has been corrupted. Is there a way to recover?

Hi Nicolas,

You can try to recover the VG metadata from a backup or archive, which lvm automatically creates by default.

1) List all available backups for the given VG:

   # vgcfgrestore --list Usi3y8-S4eq-EXtl-FA58-MA3K-b4vE-4d9SCp

Select the latest one that sounds right, something with a description along the lines of "Created *before* lvremove". You might want to select something older than the latest, as lvm also makes a backup *after* running some commands.

2) Find the UUID of your broken PV (the filter might not be needed, depending on your local configuration):

   # pvs -o pv_name,pv_uuid --config='devices/filter = ["a|.*|"]' /dev/mapper/36001405063455cf7cd74c20bc06e9304

3) Create a new PV on a different partition or disk (/dev/sdX) using the UUID found in the previous step and the restorefile option:

   # pvcreate --uuid <ID_OF_BROKEN_PV> --restorefile <PATH_TO_BACKUP_FILE> <EMPTY_DISK>

4) Try to display the VG:

   # vgdisplay Usi3y8-S4eq-EXtl-FA58-MA3K-b4vE-4d9SCp

-Roman
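Put together, a rough sketch of the whole sequence, using the device and VG name that appear in the vdsm log above; /dev/sdx and the archive file name are only placeholders, and the final vgcfgrestore step is the usual follow-up described in the LVM documentation rather than part of the steps listed above:

# vgcfgrestore --list 219fa16f-13c9-44e4-a07d-a40c0a7fe206
# pvs -o pv_name,pv_uuid --config='devices/filter = ["a|.*|"]' /dev/mapper/36001405063455cf7cd74c20bc06e9304
# pvcreate --uuid <PV_UUID> --restorefile /etc/lvm/archive/<CHOSEN_BACKUP>.vg /dev/sdx
# vgcfgrestore -f /etc/lvm/archive/<CHOSEN_BACKUP>.vg 219fa16f-13c9-44e4-a07d-a40c0a7fe206
# vgdisplay 219fa16f-13c9-44e4-a07d-a40c0a7fe206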

Hi Roman,

Unfortunately, step 1 returns nothing:

kvmr03:~# vgcfgrestore --list Usi3y8-S4eq-EXtl-FA58-MA3K-b4vE-4d9SCp
  No archives found in /etc/lvm/archive

I tried several hosts and none of them has a copy.

Is there any other way to get a backup of the VG?

Make sure the VG name is correct; vgcfgrestore won't complain if the name is wrong.

You can also check whether backups are enabled on the hosts, just to be sure:

# lvmconfig --typeconfig current | egrep "backup|archive"
backup {
        backup=1
        backup_dir="/etc/lvm/backup"
        archive=1
        archive_dir="/etc/lvm/archive"

If the backups are not available, I'm afraid there's not much you can do at this point.
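If the listing keeps coming back empty, the archive and backup directories can also be checked directly; a small sketch, assuming the default paths shown above and the VG name from the vdsm log:

# ls -lt /etc/lvm/archive/ | head
# grep -l 219fa16f-13c9-44e4-a07d-a40c0a7fe206 /etc/lvm/archive/*.vg /etc/lvm/backup/* 2>/dev/null

The grep lists any archive or backup file that mentions the VG, regardless of the file name.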

On Thu, Sep 16, 2021 at 4:20 PM Roman Bednar <rbednar@redhat.com> wrote:
> If the backups are not available I'm afraid there's not much you can do at this point.
Actually you can, since you may still have good metadata on the PV. There are 2 metadata areas, and when one of them is corrupt, you can restore the metadata from the other. The metadata areas also contain the history of the metadata, so even if both metadata areas are corrupted, you can extract an older metadata version from one of the areas.

If you build lvm from upstream source on the host, you can extract the metadata from the PV using pvck --dump and repair the PV using pvck --repair with the dumped metadata.

But the most important thing is to upgrade - this is a known issue in 4.3.8. You need to upgrade to 4.3.11, providing vdsm >= 4.30.50 and lvm2 >= 2.02.187-6.

The related bug: https://bugzilla.redhat.com/1849595

Nir
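A rough sketch of that pvck workflow, assuming an lvm2 build new enough to provide --dump/--repair (the exact options and settings should be checked against the local pvck man page):

Dump the most recent metadata copy, if it is still readable:

# pvck --dump metadata -f /tmp/md_current.txt /dev/mapper/36001405063455cf7cd74c20bc06e9304

Search a metadata area for all copies, including older versions (mda_num=2 selects the second area):

# pvck --dump metadata_search --settings "mda_num=2" -f /tmp/md_all.txt /dev/mapper/36001405063455cf7cd74c20bc06e9304

After picking a good version out of the dump, write it back:

# pvck --repair -f /tmp/md_good.txt /dev/mapper/36001405063455cf7cd74c20bc06e9304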

Hi Roman and Nir,

On 2021-09-16 13:42, Roman Bednar wrote:
Hi Nicolas,
You can try to recover VG metadata from a backup or archive which lvm automatically creates by default.
1) To list all available backups for given VG:
#vgcfgrestore --list Usi3y8-S4eq-EXtl-FA58-MA3K-b4vE-4d9SCp
Select the latest one that sounds right, something with a description along the lines of "Created *before* lvremove". You might want to pick something older than the most recent entry, as lvm also makes a backup *after* running some commands.
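For example, assuming the default archive_dir (/etc/lvm/archive) and that the archive files are readable, something like this prints each candidate's description and creation time so it is easier to spot the copy that predates the corruption:

# grep -H -e 'description' -e 'creation_time' /etc/lvm/archive/219fa16f-13c9-44e4-a07d-a40c0a7fe206_*.vg

The description records the command that was about to run when the archive was taken.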
You were right. There actually *are* VG metadata backups; I was specifying an incorrect ID. So the correct command returns:

# vgcfgrestore --list 219fa16f-13c9-44e4-a07d-a40c0a7fe206
[...]
  File:         /etc/lvm/archive/219fa16f-13c9-44e4-a07d-a40c0a7fe206_00202-1152107223.vg
  VG name:      219fa16f-13c9-44e4-a07d-a40c0a7fe206
  Description:  Created *before* executing 'vgs --noheading --nosuffix --units b -o +vg_uuid,vg_extent_size'
  Backup Time:  Sat Sep 11 03:41:25 2021
[...]

That one seems ok.
2) Find UUID of your broken PV (filter might not be needed, depends on your local conf):
#pvs -o pv_name,pv_uuid --config='devices/filter = ["a|.*|"]' /dev/mapper/36001405063455cf7cd74c20bc06e9304
As I understand it, the PV won't be listed by the 'pvs' command; this is just a matter of finding the associated VG. The command above won't list a PV associated with the VG from step 1, it just complains that the PV cannot be read:

# pvs -o pv_name,pv_uuid --config='devices/filter = ["a|.*|"]' /dev/mapper/36001405063455cf7cd74c20bc06e9304
  /dev/mapper/36001405063455cf7cd74c20bc06e9304: Checksum error at offset 2198927383040
  Couldn't read volume group metadata from /dev/mapper/36001405063455cf7cd74c20bc06e9304.
  Metadata location on /dev/mapper/36001405063455cf7cd74c20bc06e9304 at 2198927383040 has invalid summary for VG.
  Failed to read metadata summary from /dev/mapper/36001405063455cf7cd74c20bc06e9304
  Failed to scan VG from /dev/mapper/36001405063455cf7cd74c20bc06e9304
  No physical volume label read from /dev/mapper/36001405063455cf7cd74c20bc06e9304.

So, the associated PV ID is: 36001405063455cf7cd74c20bc06e9304
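If the on-disk label is too damaged for pvs to report anything, the UUID the VG expects for that PV can also be read from the chosen archive file itself: each pvN section records the PV's id and the device path it was last seen on (the exact layout can vary slightly between lvm2 versions). A quick way to show it, assuming the _00202 archive selected above:

# grep -A 3 'pv[0-9]* {' /etc/lvm/archive/219fa16f-13c9-44e4-a07d-a40c0a7fe206_00202-1152107223.vg

This should print an id = "..." and device = "/dev/mapper/..." pair for every PV in the VG.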
3) Create a new PV on a different partition or disk (/dev/sdX) using the UUID found in previous step and restorefile option:
#pvcreate --uuid <ID_OF_BROKEN_PV> --restorefile <PATH_TO_BACKUP_FILE> <EMPTY_DISK>
I have a question here. As I understand it, pvcreate will restore the correct metadata on <EMPTY_DISK>. Then how do you restore that metadata on the broken storage domain, so other hosts can see the right information as well? Or is this just a step to recover data on <EMPTY_DISK> and then reattach the disks on the affected VMs? Thanks so much.
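For what it's worth, the procedure described in the LVM documentation is usually run against the *original* (damaged) device rather than a spare disk: with --uuid and --restorefile, pvcreate rewrites only the PV label and metadata areas so that they match the backup and leaves the data areas alone, which, as far as I can tell, is what lets the other hosts see the repaired VG on the shared LUN again. Restoring onto a separate empty disk would only give you a copy of the metadata to inspect; it would not by itself repair the VG the storage domain lives on. A rough sketch, assuming the backup file and PV UUID are the right ones, no host is using the domain, and you accept the risk (or have a block-level backup):

# pvcreate --uuid <ID_OF_BROKEN_PV> --restorefile <PATH_TO_BACKUP_FILE> /dev/mapper/36001405063455cf7cd74c20bc06e9304
# vgcfgrestore -f <PATH_TO_BACKUP_FILE> 219fa16f-13c9-44e4-a07d-a40c0a7fe206
# vgs 219fa16f-13c9-44e4-a07d-a40c0a7fe206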
4) Try to display the VG:
# vgdisplay Usi3y8-S4eq-EXtl-FA58-MA3K-b4vE-4d9SCp
-Roman

So, I've made several attempts to restore the metadata.

In my last e-mail I said in step 2 that the PV ID is 36001405063455cf7cd74c20bc06e9304, which is incorrect.

I'm trying to find out the PV UUID by running "pvs -o pv_name,pv_uuid --config='devices/filter = ["a|.*|"]' /dev/mapper/36001405063455cf7cd74c20bc06e9304". However, it shows no PV UUID. All I get from the command output is:

# pvs -o pv_name,pv_uuid --config='devices/filter = ["a|.*|"]' /dev/mapper/36001405063455cf7cd74c20bc06e9304
  /dev/mapper/360014057b367e3a53b44ab392ae0f25f: Checksum error at offset 2198927383040
  Couldn't read volume group metadata from /dev/mapper/360014057b367e3a53b44ab392ae0f25f.
  Metadata location on /dev/mapper/360014057b367e3a53b44ab392ae0f25f at 2198927383040 has invalid summary for VG.
  Failed to read metadata summary from /dev/mapper/360014057b367e3a53b44ab392ae0f25f
  Failed to scan VG from /dev/mapper/360014057b367e3a53b44ab392ae0f25f
  /dev/mapper/360014057b367e3a53b44ab392ae0f25f: Checksum error at offset 2198927383040
  Couldn't read volume group metadata from /dev/mapper/360014057b367e3a53b44ab392ae0f25f.
  Metadata location on /dev/mapper/360014057b367e3a53b44ab392ae0f25f at 2198927383040 has invalid summary for VG.
  Failed to read metadata summary from /dev/mapper/360014057b367e3a53b44ab392ae0f25f
  Failed to scan VG from /dev/mapper/360014057b367e3a53b44ab392ae0f25f
  Failed to find device "/dev/mapper/36001405063455cf7cd74c20bc06e9304".

I tried running a bare "vgcfgrestore 219fa16f-13c9-44e4-a07d-a40c0a7fe206" command, which returned:

# vgcfgrestore 219fa16f-13c9-44e4-a07d-a40c0a7fe206
  /dev/mapper/360014057b367e3a53b44ab392ae0f25f: Checksum error at offset 2198927383040
  Couldn't read volume group metadata from /dev/mapper/360014057b367e3a53b44ab392ae0f25f.
  Metadata location on /dev/mapper/360014057b367e3a53b44ab392ae0f25f at 2198927383040 has invalid summary for VG.
  Failed to read metadata summary from /dev/mapper/360014057b367e3a53b44ab392ae0f25f
  Failed to scan VG from /dev/mapper/360014057b367e3a53b44ab392ae0f25f
  Couldn't find device with uuid Q3xkre-25cg-L3Do-aeMD-iLem-wOHh-fb8fzb.
  Cannot restore Volume Group 219fa16f-13c9-44e4-a07d-a40c0a7fe206 with 1 PVs marked as missing.
  Restore failed.

It seems that the PV is missing; however, I assume the PV UUID (from the output above) is Q3xkre-25cg-L3Do-aeMD-iLem-wOHh-fb8fzb.

So I tried running:

# pvcreate --uuid Q3xkre-25cg-L3Do-aeMD-iLem-wOHh-fb8fzb --restore /etc/lvm/archive/219fa16f-13c9-44e4-a07d-a40c0a7fe206_00200-1084769199.vg /dev/sdb1
  Couldn't find device with uuid Q3xkre-25cg-L3Do-aeMD-iLem-wOHh-fb8fzb.
  /dev/mapper/360014057b367e3a53b44ab392ae0f25f: Checksum error at offset 2198927383040
  Couldn't read volume group metadata from /dev/mapper/360014057b367e3a53b44ab392ae0f25f.
  Metadata location on /dev/mapper/360014057b367e3a53b44ab392ae0f25f at 2198927383040 has invalid summary for VG.
  Failed to read metadata summary from /dev/mapper/360014057b367e3a53b44ab392ae0f25f
  Failed to scan VG from /dev/mapper/360014057b367e3a53b44ab392ae0f25f
  Device /dev/sdb1 excluded by a filter.

Either the PV UUID is not the one I specified, or the system can't find it (or both).
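A couple of guesses that might help here. "Device /dev/sdb1 excluded by a filter" usually means either that the host's lvm.conf filter/global_filter rejects the device (oVirt hosts typically ship a restrictive filter) or that the device still carries an old signature; also, the full option name is --restorefile, in case the installed lvm2 does not accept the abbreviated --restore. A minimal sketch to check both, assuming /dev/sdb1 really is the intended target and that only devices/filter (not a global_filter) is in the way:

# grep -E '^[[:space:]]*(filter|global_filter)' /etc/lvm/lvm.conf
# wipefs /dev/sdb1
# pvcreate --config 'devices/filter=["a|.*|"]' --uuid Q3xkre-25cg-L3Do-aeMD-iLem-wOHh-fb8fzb --restorefile /etc/lvm/archive/219fa16f-13c9-44e4-a07d-a40c0a7fe206_00200-1084769199.vg /dev/sdb1

Without -a, wipefs only lists existing signatures and does not erase anything.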

Did you update the packages as suggested by Nir? If so and it still does not work, maybe try the pvck recovery that Nir described too.

If that still does not work, consider filing a bug against lvm and providing the failing command(s) output with the -vvvv option in the description or as an attachment. Perhaps there is a better way or a known workaround.

-Roman
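A sketch of the kind of data such a report could include (the failing commands may differ in your case; lvmdump is the collector shipped with lvm2, and -m asks it to also gather on-disk metadata where supported):

# pvs -o pv_name,pv_uuid -vvvv /dev/mapper/36001405063455cf7cd74c20bc06e9304 > pvs-debug.log 2>&1
# vgcfgrestore -vvvv 219fa16f-13c9-44e4-a07d-a40c0a7fe206 > vgcfgrestore-debug.log 2>&1
# lvmdump -m

lvmdump bundles the lvm configuration, logs and metadata into a tarball that can be attached to the bug.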

The lvm2 packages on all machines are already at the version Nir suggested. I tried the pvck command, but it said the PV UUID did not exist.

I finally ended up forcibly removing the storage domain. Thanks anyway.

On 2021-09-20 15:13, Roman Bednar wrote:
Did you update the packages as suggested by Nir? If so and it still does not work, maybe try the pvck recovery that Nir described too.
If that still does not work consider filing a bug for lvm and providing a failing command(s) output with -vvvv option in the description or attachment. Perhaps there is a better way or a known workaround.
-Roman
participants (4):
- nicolas@devels.es
- Nir Soffer
- Roman Bednar
- Vojtech Juranek