<div dir="ltr">I was playing with some gluster performance options on my storage volume and managed to crash it.<div><br></div><div>On recovery (and heal) my storage domain won't come up.</div><div><br></div><div>I'm getting these errors in vdsm.log:</div>
<div><br></div><div><div>Thread-24::WARNING::2014-03-13 15:51:18,741::lvm::391::Storage.LVM::(_reloadvgs) lvm vgs failed: 5 [] [' Volume group "47df1e9f-6a83-46b9-a7ec-9568abd10d1a" not found', ' Skipping volume group 47df1e9f-6a83-46b9-a7ec-9568abd10d1a']</div>
<div>Thread-24::DEBUG::2014-03-13 15:51:18,741::lvm::428::OperationMutex::(_reloadvgs) Operation 'lvm reload operation' released the operation mutex</div><div>Thread-24::ERROR::2014-03-13 15:51:18,748::sdc::143::Storage.StorageDomainCache::(_findDomain) domain 47df1e9f-6a83-46b9-a7ec-9568abd10d1a not found</div>
<div>Traceback (most recent call last):</div><div> File "/usr/share/vdsm/storage/sdc.py", line 141, in _findDomain</div><div> dom = findMethod(sdUUID)</div><div> File "/usr/share/vdsm/storage/sdc.py", line 171, in _findUnfetchedDomain</div>
<div> raise se.StorageDomainDoesNotExist(sdUUID)</div><div>StorageDomainDoesNotExist: Storage domain does not exist: ('47df1e9f-6a83-46b9-a7ec-9568abd10d1a',)</div><div>Thread-24::ERROR::2014-03-13 15:51:18,748::domainMonitor::231::Storage.DomainMonitorThread::(_monitorDomain) Error while collecting domain 47df1e9f-6a83-46b9-a7ec-9568abd10d1a monitoring information</div>
<div>Traceback (most recent call last):</div><div> File "/usr/share/vdsm/storage/domainMonitor.py", line 196, in _monitorDomain</div><div> self.domain = sdCache.produce(self.sdUUID)</div><div> File "/usr/share/vdsm/storage/sdc.py", line 98, in produce</div>
<div> domain.getRealDomain()</div><div> File "/usr/share/vdsm/storage/sdc.py", line 52, in getRealDomain</div><div> return self._cache._realProduce(self._sdUUID)</div><div> File "/usr/share/vdsm/storage/sdc.py", line 122, in _realProduce</div>
<div> domain = self._findDomain(sdUUID)</div><div> File "/usr/share/vdsm/storage/sdc.py", line 141, in _findDomain</div><div> dom = findMethod(sdUUID)</div><div> File "/usr/share/vdsm/storage/sdc.py", line 171, in _findUnfetchedDomain</div>
<div> raise se.StorageDomainDoesNotExist(sdUUID)</div><div>StorageDomainDoesNotExist: Storage domain does not exist: ('47df1e9f-6a83-46b9-a7ec-9568abd10d1a',)</div><div><br></div><div>I destroyed the storage domain through the UI (47df1e9f-6a83-46b9-a7ec-9568abd10d1a was an ISO domain), but I still get the same errors. Re-installing the hosts made no difference. Does vdsm build a local cache that it might still be referencing, and if so, can I clear it out?</div>
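<div><br></div><div>To check whether 47df1e9f-6a83-46b9-a7ec-9568abd10d1a is the only domain vdsm is complaining about, here is a minimal sketch that tallies the "not found" records. It assumes every record uses the '::'-separated layout seen in the excerpt above (thread, level, timestamp, module, line, logger, message). The names parse_vdsm_line and missing_domains are made up for illustration and are not part of vdsm.</div>

```python
import re
from collections import Counter

# UUID form used for storage domains in the log excerpt.
UUID_RE = re.compile(r"[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}")
# Assumption: these are the only level names that appear in vdsm.log.
LEVELS = ("DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL")


def parse_vdsm_line(line):
    """Split one vdsm.log record into its '::'-separated header fields.

    Returns None for continuation lines such as traceback frames.
    """
    parts = line.rstrip("\n").split("::", 6)
    if len(parts) != 7 or parts[1] not in LEVELS:
        return None
    keys = ("thread", "level", "timestamp", "module", "lineno", "logger", "message")
    return dict(zip(keys, parts))


def missing_domains(lines):
    """Count WARNING/ERROR records that report a UUID as 'not found'."""
    counts = Counter()
    for line in lines:
        record = parse_vdsm_line(line)
        if record is None or record["level"] not in ("WARNING", "ERROR"):
            continue
        if "not found" in record["message"]:
            # A UUID repeated within one record is counted once.
            counts.update(set(UUID_RE.findall(record["message"])))
    return counts
```

<div>Usage would look like missing_domains(open("/var/log/vdsm/vdsm.log")), assuming the log is in its usual location on the host. If only one UUID shows up, the errors all point at the destroyed ISO domain.</div>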
<div><br></div><div>I'm only showing one SD in the db which is my gluster SD:</div><div><br></div><div><div>engine=# select * from storage_domain_static;</div><div> id | storage | storage_name | storage_domain_type | storage_type | storage_domain_format_type | _create_date | _update_date | recoverable | last_time_used_as_master | storage_description | storage_comment </div>
<div>--------------------------------------+--------------------------------------+--------------+---------------------+--------------+----------------------------+-------------------------------+-------------------------------+-------------+--------------------------+---------------------+-----------------</div>
<div> 0de3b516-6c74-4ad8-8958-d3f571ceda8d | e329ad38-b5d8-47a3-ac01-3dfdd5193032 | rep2 | 0 | 7 | 3 | 2014-02-23 16:20:42.327492-05 | 2014-02-23 16:20:49.094751-05 | t | 0 | | </div>
<div>(1 row)</div></div><div><br></div><div><div>engine=# select * from storage_domain_;</div><div>storage_domain_dynamic storage_domain_file_repos storage_domain_static storage_domain_static_view </div>
<div>engine=# select * from storage_domain_dynamic;</div><div> id | available_disk_size | used_disk_size | _update_date </div><div>--------------------------------------+---------------------+----------------+-------------------------------</div>
<div> 0de3b516-6c74-4ad8-8958-d3f571ceda8d | 1887 | 160 | 2014-03-12 01:59:52.258384-04</div><div>(1 row)</div></div><div><br></div><div><br></div><div><div>[root@ovirt001 vdsm]# vdsClient -s 0 getStorageDomainsList</div>
<div><br></div><div>[root@ovirt001 vdsm]# </div></div><div><br></div><div>engine.log keeps repeating the following:</div><div><br></div><div><div>TaskStatusListReturnForXmlRpc [mStatus=StatusForXmlRpc [mCode=654, mMessage=Not SPM]]</div>
<div><br></div><div>2014-03-13 16:02:03,538 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetAllTasksStatusesVDSCommand] (DefaultQuartzScheduler_Worker-21) [4f649953] HostName = ovirt001</div><div>2014-03-13 16:02:03,538 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetAllTasksStatusesVDSCommand] (DefaultQuartzScheduler_Worker-21) [4f649953] Command HSMGetAllTasksStatusesVDS execution failed. Exception: IRSNonOperationalException: IRSGenericException: IRSErrorException: IRSNonOperationalException: Not SPM</div>
<div>2014-03-13 16:02:03,539 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (DefaultQuartzScheduler_Worker-21) [4f649953] FINISH, SpmStopVDSCommand, log id: 41fe5668</div><div>2014-03-13 16:02:03,539 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-21) [4f649953] spm stop succeeded, continuing with spm selection</div>
<div>2014-03-13 16:02:03,560 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-21) [4f649953] starting spm on vds ovirt002, storage pool IT, prevId 1, LVER 7</div><div>2014-03-13 16:02:03,561 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] (DefaultQuartzScheduler_Worker-21) [4f649953] START, SpmStartVDSCommand(HostName = ovirt002, HostId = fac716fc-baff-46fe-9323-fd581a26a983, storagePoolId = 8da661c0-8125-4efb-851e-c9320d268578, prevId=1, prevLVER=7, storagePoolFormatType=V3, recoveryMode=Manual, SCSIFencing=true), log id: 53385b6b</div>
<div>2014-03-13 16:02:03,572 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] (DefaultQuartzScheduler_Worker-21) [4f649953] spmStart polling started: taskId = 0fb302dd-32ff-4442-aa98-5eecaf571da4</div>
<div>2014-03-13 16:02:05,598 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetTaskStatusVDSCommand] (DefaultQuartzScheduler_Worker-21) [4f649953] Failed in HSMGetTaskStatusVDS method</div><div>2014-03-13 16:02:05,599 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetTaskStatusVDSCommand] (DefaultQuartzScheduler_Worker-21) [4f649953] Error code AcquireHostIdFailure and error message VDSGenericException: VDSErrorException: Failed to HSMGetTaskStatusVDS, error = Cannot acquire host id</div>
<div>2014-03-13 16:02:05,599 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] (DefaultQuartzScheduler_Worker-21) [4f649953] spmStart polling ended: taskId = 0fb302dd-32ff-4442-aa98-5eecaf571da4 task status = finished</div>
<div>2014-03-13 16:02:05,600 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] (DefaultQuartzScheduler_Worker-21) [4f649953] Start SPM Task failed - result: cleanSuccess, message: VDSGenericException: VDSErrorException: Failed to HSMGetTaskStatusVDS, error = Cannot acquire host id</div>
<div>2014-03-13 16:02:05,606 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] (DefaultQuartzScheduler_Worker-21) [4f649953] spmStart polling ended, spm status: Free</div><div>2014-03-13 16:02:05,607 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMClearTaskVDSCommand] (DefaultQuartzScheduler_Worker-21) [4f649953] START, HSMClearTaskVDSCommand(HostName = ovirt002, HostId = fac716fc-baff-46fe-9323-fd581a26a983, taskId=0fb302dd-32ff-4442-aa98-5eecaf571da4), log id: 6e65ac77</div>
<div>2014-03-13 16:02:05,613 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMClearTaskVDSCommand] (DefaultQuartzScheduler_Worker-21) [4f649953] FINISH, HSMClearTaskVDSCommand, log id: 6e65ac77</div><div>2014-03-13 16:02:05,613 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] (DefaultQuartzScheduler_Worker-21) [4f649953] FINISH, SpmStartVDSCommand, return: org.ovirt.engine.core.common.businessentities.SpmStatusResult@798917ef, log id: 53385b6b</div>
<div>2014-03-13 16:02:05,615 INFO [org.ovirt.engine.core.bll.storage.SetStoragePoolStatusCommand] (DefaultQuartzScheduler_Worker-21) [17828a3f] Running command: SetStoragePoolStatusCommand internal: true. Entities affected : ID: 8da661c0-8125-4efb-851e-c9320d268578 Type: StoragePool</div>
<div>2014-03-13 16:02:05,656 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-21) [17828a3f] IrsBroker::Failed::GetStoragePoolInfoVDS due to: IrsSpmStartFailedException: IRSGenericException: IRSErrorException: SpmStart failed</div>
<div>2014-03-13 16:02:15,674 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetAllTasksStatusesVDSCommand] (DefaultQuartzScheduler_Worker-39) [7b210b3d] Command org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetAllTasksStatusesVDSCommand return value </div>
<div> </div><div>TaskStatusListReturnForXmlRpc [mStatus=StatusForXmlRpc [mCode=654, mMessage=Not SPM]]</div></div><div><br></div><div><br></div><div>Any help appreciated. vdsm.log from both hosts and engine.log attached.</div>
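<div><br></div><div>To see which commands fail within each repetition of the engine.log loop above, here is a rough sketch that counts ERROR records per logging class. The record layout (timestamp, level, [class], (thread), optional [correlation id], message) is inferred from the excerpt only. The name errors_by_command is hypothetical and not part of oVirt.</div>

```python
import re
from collections import Counter

# Layout inferred from the excerpt:
# timestamp, level, [logger class], (thread), optional [correlation id], message.
ENGINE_LINE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+(\w+)\s+"
    r"\[([\w.$]+)\]\s+\(([^)]*)\)\s+(?:\[(\w+)\]\s+)?(.*)$"
)


def errors_by_command(lines):
    """Count ERROR records per logging class (short name, package stripped)."""
    counts = Counter()
    for line in lines:
        match = ENGINE_LINE.match(line)
        if match is None:
            continue  # continuation line, e.g. a wrapped return value
        _timestamp, level, logger, _thread, _corr_id, _message = match.groups()
        if level == "ERROR":
            counts[logger.rsplit(".", 1)[-1]] += 1
    return counts
```

<div>For example, errors_by_command(open("/var/log/ovirt-engine/engine.log")), assuming the default log location on the engine host. For the excerpt above, the errors would fall under HSMGetAllTasksStatusesVDSCommand, HSMGetTaskStatusVDSCommand, SpmStartVDSCommand and IrsBrokerCommand.</div>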
<div><br></div><div><br></div><div><div dir="ltr"><span style="font-family:arial,sans-serif;font-size:16px"><strong>Steve Dainard </strong></span><span style="font-size:12px"></span><br>
<span style="font-family:arial,sans-serif;font-size:12px">IT Infrastructure Manager<br>
<a href="http://miovision.com/" target="_blank">Miovision</a> | <em>Rethink Traffic</em><br><br>
</span>
</div></div>
</div></div>