I was experimenting with some Gluster performance options on my storage
volume and managed to crash it.
After recovery (and heal), my storage domain won't come up.
I'm getting these errors in vdsm.log:
Thread-24::WARNING::2014-03-13 15:51:18,741::lvm::391::Storage.LVM::(_reloadvgs) lvm vgs failed: 5 [] [' Volume group "47df1e9f-6a83-46b9-a7ec-9568abd10d1a" not found', ' Skipping volume group 47df1e9f-6a83-46b9-a7ec-9568abd10d1a']
Thread-24::DEBUG::2014-03-13 15:51:18,741::lvm::428::OperationMutex::(_reloadvgs) Operation 'lvm reload operation' released the operation mutex
Thread-24::ERROR::2014-03-13 15:51:18,748::sdc::143::Storage.StorageDomainCache::(_findDomain) domain 47df1e9f-6a83-46b9-a7ec-9568abd10d1a not found
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/sdc.py", line 141, in _findDomain
    dom = findMethod(sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 171, in _findUnfetchedDomain
    raise se.StorageDomainDoesNotExist(sdUUID)
StorageDomainDoesNotExist: Storage domain does not exist: ('47df1e9f-6a83-46b9-a7ec-9568abd10d1a',)
Thread-24::ERROR::2014-03-13 15:51:18,748::domainMonitor::231::Storage.DomainMonitorThread::(_monitorDomain) Error while collecting domain 47df1e9f-6a83-46b9-a7ec-9568abd10d1a monitoring information
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/domainMonitor.py", line 196, in _monitorDomain
    self.domain = sdCache.produce(self.sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 98, in produce
    domain.getRealDomain()
  File "/usr/share/vdsm/storage/sdc.py", line 52, in getRealDomain
    return self._cache._realProduce(self._sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 122, in _realProduce
    domain = self._findDomain(sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 141, in _findDomain
    dom = findMethod(sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 171, in _findUnfetchedDomain
    raise se.StorageDomainDoesNotExist(sdUUID)
StorageDomainDoesNotExist: Storage domain does not exist: ('47df1e9f-6a83-46b9-a7ec-9568abd10d1a',)
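To see which storage domain UUIDs vdsm is still looking for, the `_findDomain` errors above can be tallied with a few lines of Python. This is only a rough sketch: `missing_domains` is a hypothetical helper (not part of vdsm), and the pattern assumes the "domain <uuid> not found" wording shown in the excerpt.

```python
import re
from collections import Counter

# Hypothetical helper for this thread, not part of vdsm.
# Assumes the "(_findDomain) domain <uuid> not found" wording from the excerpt;
# \s+ tolerates the line wrapping that mail clients add to pasted logs.
UUID = r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
MISSING_DOMAIN = re.compile(
    r"\(_findDomain\)\s+domain\s+(" + UUID + r")\s+not\s+found"
)


def missing_domains(log_text):
    """Count how often each storage domain UUID is reported as not found."""
    return Counter(MISSING_DOMAIN.findall(log_text))


# Sample taken from the excerpt above (wrapped as it was pasted).
sample = (
    "Thread-24::ERROR::2014-03-13\n"
    "15:51:18,748::sdc::143::Storage.StorageDomainCache::(_findDomain) domain\n"
    "47df1e9f-6a83-46b9-a7ec-9568abd10d1a not found\n"
)

for uuid, count in missing_domains(sample).most_common():
    print(count, uuid)
```

To run it against a real log, read the file contents (on these hosts presumably /var/log/vdsm/vdsm.log, which is an assumption) and pass the text to `missing_domains`. If only the destroyed ISO domain shows up, the host is still being asked to monitor that one domain.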
I destroyed the storage domain through the UI
(47df1e9f-6a83-46b9-a7ec-9568abd10d1a was an ISO domain) but still get the
same errors. I've also tried re-installing the hosts, with no change. Is
there a local cache that vdsm builds, and might still be referencing, that I
can clear out?
The db shows only one SD, which is my gluster SD:
engine=# select * from storage_domain_static;
                  id                  |               storage                | storage_name | storage_domain_type | storage_type | storage_domain_format_type |         _create_date          |         _update_date          | recoverable | last_time_used_as_master | storage_description | storage_comment
--------------------------------------+--------------------------------------+--------------+---------------------+--------------+----------------------------+-------------------------------+-------------------------------+-------------+--------------------------+---------------------+-----------------
 0de3b516-6c74-4ad8-8958-d3f571ceda8d | e329ad38-b5d8-47a3-ac01-3dfdd5193032 | rep2         |                   0 |            7 | 3                          | 2014-02-23 16:20:42.327492-05 | 2014-02-23 16:20:49.094751-05 | t           |                        0 |                     |
(1 row)
engine=# select * from storage_domain_dynamic;
                  id                  | available_disk_size | used_disk_size |         _update_date
--------------------------------------+---------------------+----------------+-------------------------------
 0de3b516-6c74-4ad8-8958-d3f571ceda8d |                1887 |            160 | 2014-03-12 01:59:52.258384-04
(1 row)
[root@ovirt001 vdsm]# vdsClient -s 0 getStorageDomainsList
[root@ovirt001 vdsm]#
engine.log repeats the following:
TaskStatusListReturnForXmlRpc [mStatus=StatusForXmlRpc [mCode=654, mMessage=Not SPM]]
2014-03-13 16:02:03,538 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetAllTasksStatusesVDSCommand] (DefaultQuartzScheduler_Worker-21) [4f649953] HostName = ovirt001
2014-03-13 16:02:03,538 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetAllTasksStatusesVDSCommand] (DefaultQuartzScheduler_Worker-21) [4f649953] Command HSMGetAllTasksStatusesVDS execution failed. Exception: IRSNonOperationalException: IRSGenericException: IRSErrorException: IRSNonOperationalException: Not SPM
2014-03-13 16:02:03,539 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (DefaultQuartzScheduler_Worker-21) [4f649953] FINISH, SpmStopVDSCommand, log id: 41fe5668
2014-03-13 16:02:03,539 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-21) [4f649953] spm stop succeeded, continuing with spm selection
2014-03-13 16:02:03,560 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-21) [4f649953] starting spm on vds ovirt002, storage pool IT, prevId 1, LVER 7
2014-03-13 16:02:03,561 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] (DefaultQuartzScheduler_Worker-21) [4f649953] START, SpmStartVDSCommand(HostName = ovirt002, HostId = fac716fc-baff-46fe-9323-fd581a26a983, storagePoolId = 8da661c0-8125-4efb-851e-c9320d268578, prevId=1, prevLVER=7, storagePoolFormatType=V3, recoveryMode=Manual, SCSIFencing=true), log id: 53385b6b
2014-03-13 16:02:03,572 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] (DefaultQuartzScheduler_Worker-21) [4f649953] spmStart polling started: taskId = 0fb302dd-32ff-4442-aa98-5eecaf571da4
2014-03-13 16:02:05,598 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetTaskStatusVDSCommand] (DefaultQuartzScheduler_Worker-21) [4f649953] Failed in HSMGetTaskStatusVDS method
2014-03-13 16:02:05,599 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetTaskStatusVDSCommand] (DefaultQuartzScheduler_Worker-21) [4f649953] Error code AcquireHostIdFailure and error message VDSGenericException: VDSErrorException: Failed to HSMGetTaskStatusVDS, error = Cannot acquire host id
2014-03-13 16:02:05,599 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] (DefaultQuartzScheduler_Worker-21) [4f649953] spmStart polling ended: taskId = 0fb302dd-32ff-4442-aa98-5eecaf571da4 task status = finished
2014-03-13 16:02:05,600 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] (DefaultQuartzScheduler_Worker-21) [4f649953] Start SPM Task failed - result: cleanSuccess, message: VDSGenericException: VDSErrorException: Failed to HSMGetTaskStatusVDS, error = Cannot acquire host id
2014-03-13 16:02:05,606 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] (DefaultQuartzScheduler_Worker-21) [4f649953] spmStart polling ended, spm status: Free
2014-03-13 16:02:05,607 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMClearTaskVDSCommand] (DefaultQuartzScheduler_Worker-21) [4f649953] START, HSMClearTaskVDSCommand(HostName = ovirt002, HostId = fac716fc-baff-46fe-9323-fd581a26a983, taskId=0fb302dd-32ff-4442-aa98-5eecaf571da4), log id: 6e65ac77
2014-03-13 16:02:05,613 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMClearTaskVDSCommand] (DefaultQuartzScheduler_Worker-21) [4f649953] FINISH, HSMClearTaskVDSCommand, log id: 6e65ac77
2014-03-13 16:02:05,613 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] (DefaultQuartzScheduler_Worker-21) [4f649953] FINISH, SpmStartVDSCommand, return: org.ovirt.engine.core.common.businessentities.SpmStatusResult@798917ef, log id: 53385b6b
2014-03-13 16:02:05,615 INFO [org.ovirt.engine.core.bll.storage.SetStoragePoolStatusCommand] (DefaultQuartzScheduler_Worker-21) [17828a3f] Running command: SetStoragePoolStatusCommand internal: true. Entities affected : ID: 8da661c0-8125-4efb-851e-c9320d268578 Type: StoragePool
2014-03-13 16:02:05,656 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-21) [17828a3f] IrsBroker::Failed::GetStoragePoolInfoVDS due to: IrsSpmStartFailedException: IRSGenericException: IRSErrorException: SpmStart failed
2014-03-13 16:02:15,674 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetAllTasksStatusesVDSCommand] (DefaultQuartzScheduler_Worker-39) [7b210b3d] Command org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetAllTasksStatusesVDSCommand return value
TaskStatusListReturnForXmlRpc [mStatus=StatusForXmlRpc [mCode=654, mMessage=Not SPM]]
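Since the same cycle repeats, it can help to count the ERROR entries per engine component instead of reading the whole file. The following is a minimal sketch, assuming the "timestamp LEVEL [class] ..." layout seen in the engine.log excerpt; `errors_by_class` is a hypothetical helper name, not an oVirt tool.

```python
import re
from collections import Counter

# Hypothetical helper for this thread, not an oVirt tool.
# Assumes entries start with "YYYY-MM-DD HH:MM:SS,mmm LEVEL [fully.qualified.Class]"
# as in the excerpt; \s+ tolerates line wrapping in pasted logs.
ERROR_ENTRY = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}\s+ERROR\s+\[([\w.]+)\]",
    re.MULTILINE,
)


def errors_by_class(log_text):
    """Count ERROR entries, keyed by the short class name that logged them."""
    return Counter(
        name.rsplit(".", 1)[-1] for name in ERROR_ENTRY.findall(log_text)
    )


# Two entries from the excerpt above: one ERROR, one INFO.
sample = (
    "2014-03-13 16:02:05,598 ERROR\n"
    "[org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetTaskStatusVDSCommand]\n"
    "(DefaultQuartzScheduler_Worker-21) [4f649953] Failed in HSMGetTaskStatusVDS\n"
    "method\n"
    "2014-03-13 16:02:05,606 INFO\n"
    "[org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand]\n"
    "(DefaultQuartzScheduler_Worker-21) [4f649953] spmStart polling ended, spm\n"
    "status: Free\n"
)

for name, count in errors_by_class(sample).most_common():
    print(count, name)
```

Run over the full engine.log text, this gives a quick picture of whether the failures are all in the SPM start path (as the "Cannot acquire host id" entries suggest) or spread across other commands as well.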
Any help appreciated. vdsm.log from both hosts and engine.log attached.
Steve Dainard
IT Infrastructure Manager, Miovision