[ovirt-users] Storage domain issue
Nathanaël Blanchet
blanchet at abes.fr
Mon Mar 23 16:13:40 UTC 2015
Thank you for reporting this issue, because I am hitting exactly the same one: FC
storage domain, and from time to time many of my hosts (15) become
unavailable without any apparent action on them.
The error message is: storage domain is unavailable. It is a disaster
when power management is enabled, because the hosts all reboot at the same time
and all VMs go down without migrating.
It has happened to me twice, and the second time was less painful
because I had deactivated power management.
It may be a serious issue, because the hosts stay reachable and the LUN still
looks fine when running an lvs command.
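For what it's worth, this is roughly how I check it on an affected host (plain
LVM tooling, nothing oVirt-specific; the output columns are just the ones I
happen to look at):

# list logical volumes together with their backing devices, to confirm
# that the FC LUN is still visible to LVM on the "unavailable" host
lvs -o vg_name,lv_name,lv_size,devices

The VGs backed by the FC LUN are still listed there even while the engine
reports the storage domain as unavailable.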
The workaround in this case is to restart the engine (restarting vdsm does not
help); after that, all the hosts come back up (a command sketch follows the
environment details below).
* el6 engine on a separate KVM
* both el7 and el6 hosts are involved
* ovirt 3.5.1 and vdsm 4.16.10-8
* 2 FC datacenters on two remote sites managed by the same engine; both
are impacted
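For reference, the restart mentioned above is only the engine service on the
engine VM, nothing on the hosts (a minimal sketch, assuming the standard
service name shipped by the ovirt-engine package on el6):

# on the el6 engine VM: restart the engine; the FC domains and the
# hosts then come back up on their own, without touching vdsm
service ovirt-engine restart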
On 23/03/2015 16:54, Jonas Israelsson wrote:
> Greetings.
>
> Running oVirt 3.5 with a mix of NFS and FC Storage.
>
> Engine running on a separate KVM VM and Node installed with a pre 3.5
> ovirt-node "ovirt-node-iso-3.5.0.ovirt35.20140912.el6 (Edited)"
>
> I had some problems with my FC storage where the LUNs became
> unavailable to my oVirt host for a while. Everything is now up and running
> and those LUNs are accessible by the host again. The NFS domains go
> back online but the FC ones do not.
>
> Thread-22::DEBUG::2015-03-23
> 14:53:02,706::lvm::290::Storage.Misc.excCmd::(cmd) /usr/bin/sudo -n
> /sbin/lvm vgs --config ' devices { preferred_names = ["^/dev/mapper/"]
> ignore_suspended_devices=1 write_cache_state=0
> disable_after_error_count=3 obtain_device_list_from_udev=0 filter = [
> '\''r|.*|'\'' ] } global { locking_type=1 prioritise_write_locks=1
> wait_for_locks=1 use_lvmetad=0 } backup { retain_min = 50
> retain_days = 0 } ' --noheadings --units b --nosuffix --separator '|'
> --ignoreskippedcluster -o
> uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count,pv_name
> 29f9b165-3674-4384-a1d4-7aa87d923d56 (cwd None)
>
> Thread-24::DEBUG::2015-03-23
> 14:53:02,981::lvm::290::Storage.Misc.excCmd::(cmd) FAILED: <err> = '
> Volume group "29f9b165-3674-4384-a1d4-7aa87d923d56" not found\n
> Skipping volume group 29f9b165-3674-4384-a1d4-7aa87d923d56\n'; <rc> = 5
>
> Thread-24::WARNING::2015-03-23
> 14:53:02,986::lvm::372::Storage.LVM::(_reloadvgs) lvm vgs failed: 5 []
> [' Volume group "29f9b165-3674-4384-a1d4-7aa87d923d56" not found', '
> Skipping volume group 29f9b165-3674-4384-a1d4-7aa87d923d56']
>
>
> Running the command above manually does indeed give the same output:
>
> # /sbin/lvm vgs --config ' devices { preferred_names =
> ["^/dev/mapper/"] ignore_suspended_devices=1 write_cache_state=0
> disable_after_error_count=3 obtain_device_list_from_udev=0 filter = [
> '\''r|.*|'\'' ] } global { locking_type=1 prioritise_write_locks=1
> wait_for_locks=1 use_lvmetad=0 } backup { retain_min = 50
> retain_days = 0 } ' --noheadings --units b --nosuffix --separator '|'
> --ignoreskippedcluster -o
> uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count,pv_name
> 29f9b165-3674-4384-a1d4-7aa87d923d56
>
> Volume group "29f9b165-3674-4384-a1d4-7aa87d923d56" not found
> Skipping volume group 29f9b165-3674-4384-a1d4-7aa87d923d56
>
> What puzzles me is that those volume groups do exist.
>
> lvm vgs
>   VG                                   #PV #LV #SN Attr   VSize   VFree
>   22cf06d1-faca-4e17-ac78-d38b7fc300b1   1  13   0 wz--n- 999.62g 986.50g
>   29f9b165-3674-4384-a1d4-7aa87d923d56   1   8   0 wz--n-  99.62g  95.50g
>   HostVG                                 1   4   0 wz--n-  13.77g  52.00m
>
>
> --- Volume group ---
> VG Name 29f9b165-3674-4384-a1d4-7aa87d923d56
> System ID
> Format lvm2
> Metadata Areas 2
> Metadata Sequence No 20
> VG Access read/write
> VG Status resizable
> MAX LV 0
> Cur LV 8
> Open LV 0
> Max PV 0
> Cur PV 1
> Act PV 1
> VG Size 99.62 GiB
> PE Size 128.00 MiB
> Total PE 797
> Alloc PE / Size 33 / 4.12 GiB
> Free PE / Size 764 / 95.50 GiB
> VG UUID aAoOcw-d9YB-y9gP-Tp4M-S0UE-Aqpx-y6Z2Uk
>
> lvm vgs --config ' devices { preferred_names = ["^/dev/mapper/"]
> ignore_suspended_devices=1 write_cache_state=0
> disable_after_error_count=3 obtain_device_list_from_udev=0 } global {
> locking_type=1 prioritise_write_locks=1 wait_for_locks=1
> use_lvmetad=0 } backup { retain_min = 50 retain_days = 0 } '
> --noheadings --units b --nosuffix --separator '|'
> --ignoreskippedcluster -o
> uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count,pv_name
> 29f9b165-3674-4384-a1d4-7aa87d923d56
>
>
> aAoOcw-d9YB-y9gP-Tp4M-S0UE-Aqpx-y6Z2Uk|29f9b165-3674-4384-a1d4-7aa87d923d56|wz--n-|106971529216|102542344192|134217728|797|764|MDT_LEASETIMESEC=60,MDT_CLASS=Data,MDT_VERSION=3,MDT_SDUUID=29f9b165-3674-4384-a1d4-7aa87d923d56,MDT_PV0=pv:36001405c94d80be2ed0482c91a1841b8&44&uuid:muHcYl-sobG-3LyY-jjfg-3fGf-1cHO-uDk7da&44&pestart:0&44&pecount:797&44&mapoffset:0,MDT_LEASERETRIES=3,MDT_VGUUID=aAoOcw-d9YB-y9gP-Tp4M-S0UE-Aqpx-y6Z2Uk,MDT_IOOPTIMEOUTSEC=10,MDT_LOCKRENEWALINTERVALSEC=5,MDT_PHYBLKSIZE=512,MDT_LOGBLKSIZE=512,MDT_TYPE=FCP,MDT_LOCKPOLICY=,MDT_DESCRIPTION=Master,RHAT_storage_domain,MDT_POOL_SPM_ID=-1,MDT_POOL_DESCRIPTION=Elementary,MDT_POOL_SPM_LVER=-1,MDT_POOL_UUID=8c3c5df9-e8ff-4313-99c9-385b6c7d896b,MDT_MASTER_VERSION=10,MDT_POOL_DOMAINS=22cf06d1-faca-4e17-ac78-d38b7fc300b1:Active&44&c434ab5a-9d21-42eb-ba1b-dbd716ba3ed1:Active&44&96e62d18-652d-401a-b4b5-b54ecefa331c:Active&44&29f9b165-3674-4384-a1d4-7aa87d923d56:Active&44&1a0d3e5a-d2ad-4829-8ebd-ad3ff5463062:Active,MDT__SH
>
> A_CKSUM=7ea9af890755d96563cb7a736f8e3f46ea986f67,MDT_ROLE=Regular|134217728|67103744|8|1|/dev/sda
>
>
>
> [root@patty vdsm]# vdsClient -s 0 getStorageDomainsList   (returns only
> the NFS domains)
> c434ab5a-9d21-42eb-ba1b-dbd716ba3ed1
> 1a0d3e5a-d2ad-4829-8ebd-ad3ff5463062
> a8fd9df0-48f2-40a2-88d4-7bf47fef9b07
>
>
> engine=# select id,storage,storage_name,storage_domain_type from storage_domain_static ;
>                   id                  |                storage                 |      storage_name      | storage_domain_type
> --------------------------------------+----------------------------------------+------------------------+---------------------
>  072fbaa1-08f3-4a40-9f34-a5ca22dd1d74 | ceab03af-7220-4d42-8f5c-9b557f5d29af   | ovirt-image-repository | 4
>  1a0d3e5a-d2ad-4829-8ebd-ad3ff5463062 | 6564a0b2-2f92-48de-b986-e92de7e28885   | ISO                    | 2
>  c434ab5a-9d21-42eb-ba1b-dbd716ba3ed1 | bb54b2b8-00a2-4b84-a886-d76dd70c3cb0   | Export                 | 3
>  22cf06d1-faca-4e17-ac78-d38b7fc300b1 | e43eRZ-HACv-YscJ-KNZh-HVwe-tAd2-0oGNHh | Hinken                 | 1 <---- 'GONE'
>  29f9b165-3674-4384-a1d4-7aa87d923d56 | aAoOcw-d9YB-y9gP-Tp4M-S0UE-Aqpx-y6Z2Uk | Master                 | 1 <---- 'GONE'
>  a8fd9df0-48f2-40a2-88d4-7bf47fef9b07 | 0299ca61-d68e-4282-b6c3-f6e14aef2688   | NFS-DATA               | 0
>
> When I manually try to activate one of the above domains, the
> following is written to engine.log:
>
> 2015-03-23 16:37:27,193 INFO
> [org.ovirt.engine.core.bll.storage.SyncLunsInfoForBlockStorageDomainCommand]
> (org.ovirt.thread.pool-8-thread-42) [5f2bcbf9] Running command:
> SyncLunsInfoForBlockStorageDomainCommand internal: true. Entities
> affected : ID: 29f9b165-3674-4384-a1d4-7aa87d923d56 Type: Storage
> 2015-03-23 16:37:27,202 INFO
> [org.ovirt.engine.core.vdsbroker.vdsbroker.GetVGInfoVDSCommand]
> (org.ovirt.thread.pool-8-thread-42) [5f2bcbf9] START,
> GetVGInfoVDSCommand(HostName = patty.elemementary.se, HostId =
> 38792a69-76f3-46d8-8620-9d4b9a5ec21f,
> VGID=aAoOcw-d9YB-y9gP-Tp4M-S0UE-Aqpx-y6Z2Uk), log id: 6e6f6792
> 2015-03-23 16:37:27,404 ERROR
> [org.ovirt.engine.core.vdsbroker.vdsbroker.GetVGInfoVDSCommand]
> (org.ovirt.thread.pool-8-thread-28) [3258de6d] Failed in GetVGInfoVDS
> method
> 2015-03-23 16:37:27,404 INFO
> [org.ovirt.engine.core.vdsbroker.vdsbroker.GetVGInfoVDSCommand]
> (org.ovirt.thread.pool-8-thread-28) [3258de6d] Command
> org.ovirt.engine.core.vdsbroker.vdsbroker.GetVGInfoVDSCommand return
> value
>
> OneVGReturnForXmlRpc [mStatus=StatusForXmlRpc [mCode=506,
> mMessage=Volume Group does not exist: (u'vg_uuid:
> aAoOcw-d9YB-y9gP-Tp4M-S0UE-Aqpx-y6Z2Uk',)]]
>
> 2015-03-23 16:37:27,406 INFO
> [org.ovirt.engine.core.vdsbroker.vdsbroker.GetVGInfoVDSCommand]
> (org.ovirt.thread.pool-8-thread-28) [3258de6d] HostName =
> patty.elemementary.se
> 2015-03-23 16:37:27,407 ERROR
> [org.ovirt.engine.core.vdsbroker.vdsbroker.GetVGInfoVDSCommand]
> (org.ovirt.thread.pool-8-thread-28) [3258de6d] Command
> GetVGInfoVDSCommand(HostName = patty.elemementary.se, HostId =
> 38792a69-76f3-46d8-8620-9d4b9a5ec21f,
> VGID=aAoOcw-d9YB-y9gP-Tp4M-S0UE-Aqpx-y6Z2Uk) execution failed.
> Exception: VDSErrorException: VDSGenericException: VDSErrorException:
> Failed to GetVGInfoVDS, error = Volume Group does not exist:
> (u'vg_uuid: aAoOcw-d9YB-y9gP-Tp4M-S0UE-Aqpx-y6Z2Uk',), code = 506
> 2015-03-23 16:37:27,409 INFO
> [org.ovirt.engine.core.vdsbroker.vdsbroker.GetVGInfoVDSCommand]
> (org.ovirt.thread.pool-8-thread-28) [3258de6d] FINISH,
> GetVGInfoVDSCommand, log id: 2edb7c0d
> 2015-03-23 16:37:27,410 ERROR
> [org.ovirt.engine.core.bll.storage.SyncLunsInfoForBlockStorageDomainCommand]
> (org.ovirt.thread.pool-8-thread-28) [3258de6d] Command
> org.ovirt.engine.core.bll.storage.SyncLunsInfoForBlockStorageDomainCommand
> throw Vdc Bll exception. With error message VdcBLLException:
> org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException:
> VDSGenericException: VDSErrorException: Failed to GetVGInfoVDS, error
> = Volume Group does not exist: (u'vg_uuid:
> aAoOcw-d9YB-y9gP-Tp4M-S0UE-Aqpx-y6Z2Uk',), code = 506 (Failed with
> error VolumeGroupDoesNotExist and code 506)
> 2015-03-23 16:37:27,413 INFO
> [org.ovirt.engine.core.vdsbroker.irsbroker.ActivateStorageDomainVDSCommand]
> (org.ovirt.thread.pool-8-thread-28) [3258de6d] START,
> ActivateStorageDomainVDSCommand( storagePoolId =
> 8c3c5df9-e8ff-4313-99c9-385b6c7d896b, ignoreFailoverLimit = false,
> storageDomainId = 29f9b165-3674-4384-a1d4-7aa87d923d56), log id: 795253ee
> 2015-03-23 16:37:27,482 ERROR
> [org.ovirt.engine.core.vdsbroker.vdsbroker.GetVGInfoVDSCommand]
> (org.ovirt.thread.pool-8-thread-42) [5f2bcbf9] Failed in GetVGInfoVDS
> method
> 2015-03-23 16:37:27,482 INFO
> [org.ovirt.engine.core.vdsbroker.vdsbroker.GetVGInfoVDSCommand]
> (org.ovirt.thread.pool-8-thread-42) [5f2bcbf9] Command
> org.ovirt.engine.core.vdsbroker.vdsbroker.GetVGInfoVDSCommand return
> value
> OneVGReturnForXmlRpc [mStatus=StatusForXmlRpc [mCode=506,
> mMessage=Volume Group does not exist: (u'vg_uuid:
> aAoOcw-d9YB-y9gP-Tp4M-S0UE-Aqpx-y6Z2Uk',)]]
>
>
> Could someone (pretty please with sugar on top) point me in the right
> direction ?
>
> Brgds Jonas
>
> _______________________________________________
> Users mailing list
> Users at ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users