<div dir="ltr">I was experimenting with some gluster performance options on my storage volume and managed to crash it.<div><br></div><div>After recovery (and heal), my storage domain won&#39;t come up.</div><div><br></div><div>I&#39;m getting these errors in vdsm.log:</div>
<div><br></div><div><div>Thread-24::WARNING::2014-03-13 15:51:18,741::lvm::391::Storage.LVM::(_reloadvgs) lvm vgs failed: 5 [] [&#39;  Volume group &quot;47df1e9f-6a83-46b9-a7ec-9568abd10d1a&quot; not found&#39;, &#39;  Skipping volume group 47df1e9f-6a83-46b9-a7ec-9568abd10d1a&#39;]</div>
<div>Thread-24::DEBUG::2014-03-13 15:51:18,741::lvm::428::OperationMutex::(_reloadvgs) Operation &#39;lvm reload operation&#39; released the operation mutex</div><div>Thread-24::ERROR::2014-03-13 15:51:18,748::sdc::143::Storage.StorageDomainCache::(_findDomain) domain 47df1e9f-6a83-46b9-a7ec-9568abd10d1a not found</div>
<div>Traceback (most recent call last):</div><div>  File &quot;/usr/share/vdsm/storage/sdc.py&quot;, line 141, in _findDomain</div><div>    dom = findMethod(sdUUID)</div><div>  File &quot;/usr/share/vdsm/storage/sdc.py&quot;, line 171, in _findUnfetchedDomain</div>
<div>    raise se.StorageDomainDoesNotExist(sdUUID)</div><div>StorageDomainDoesNotExist: Storage domain does not exist: (&#39;47df1e9f-6a83-46b9-a7ec-9568abd10d1a&#39;,)</div><div>Thread-24::ERROR::2014-03-13 15:51:18,748::domainMonitor::231::Storage.DomainMonitorThread::(_monitorDomain) Error while collecting domain 47df1e9f-6a83-46b9-a7ec-9568abd10d1a monitoring information</div>
<div>Traceback (most recent call last):</div><div>  File &quot;/usr/share/vdsm/storage/domainMonitor.py&quot;, line 196, in _monitorDomain</div><div>    self.domain = sdCache.produce(self.sdUUID)</div><div>  File &quot;/usr/share/vdsm/storage/sdc.py&quot;, line 98, in produce</div>
<div>    domain.getRealDomain()</div><div>  File &quot;/usr/share/vdsm/storage/sdc.py&quot;, line 52, in getRealDomain</div><div>    return self._cache._realProduce(self._sdUUID)</div><div>  File &quot;/usr/share/vdsm/storage/sdc.py&quot;, line 122, in _realProduce</div>
<div>    domain = self._findDomain(sdUUID)</div><div>  File &quot;/usr/share/vdsm/storage/sdc.py&quot;, line 141, in _findDomain</div><div>    dom = findMethod(sdUUID)</div><div>  File &quot;/usr/share/vdsm/storage/sdc.py&quot;, line 171, in _findUnfetchedDomain</div>
<div>    raise se.StorageDomainDoesNotExist(sdUUID)</div><div>StorageDomainDoesNotExist: Storage domain does not exist: (&#39;47df1e9f-6a83-46b9-a7ec-9568abd10d1a&#39;,)</div><div><br></div><div>I destroyed the storage domain through the UI (47df1e9f-6a83-46b9-a7ec-9568abd10d1a was an ISO domain) but still get the same errors. Re-installing the hosts made no difference. Is there a local cache that vdsm builds and might still be referencing, which I can clear out?</div>
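<div><br></div><div>To check whether the missing ISO domain is the only UUID vdsm is complaining about, the log lines above can be tallied with a short script. This is a rough sketch, not a vdsm tool: the names <code>missing_domains</code> and <code>SAMPLE</code> are made up for illustration, and the field layout (thread, level, timestamp, module, line, logger, message, separated by <code>::</code>) is assumed from the lines quoted above.</div>

```python
import re
from collections import Counter

# Matches storage-domain style UUIDs such as 47df1e9f-6a83-46b9-a7ec-9568abd10d1a.
UUID_RE = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
)


def missing_domains(lines):
    """Count WARNING/ERROR log lines per UUID mentioned in the message.

    Assumes the '::'-separated layout seen in the quoted vdsm.log lines;
    traceback lines and anything else without that layout are skipped.
    """
    counts = Counter()
    for line in lines:
        parts = line.rstrip("\n").split("::", 6)
        if len(parts) < 7 or parts[1] not in ("WARNING", "ERROR"):
            continue
        # A UUID repeated within one message is counted once for that line.
        for uuid in set(UUID_RE.findall(parts[6])):
            counts[uuid] += 1
    return counts


# Sample input copied from the vdsm.log excerpt in this message.
SAMPLE = """\
Thread-24::WARNING::2014-03-13 15:51:18,741::lvm::391::Storage.LVM::(_reloadvgs) lvm vgs failed: 5 [] ['  Volume group "47df1e9f-6a83-46b9-a7ec-9568abd10d1a" not found', '  Skipping volume group 47df1e9f-6a83-46b9-a7ec-9568abd10d1a']
Thread-24::DEBUG::2014-03-13 15:51:18,741::lvm::428::OperationMutex::(_reloadvgs) Operation 'lvm reload operation' released the operation mutex
Thread-24::ERROR::2014-03-13 15:51:18,748::sdc::143::Storage.StorageDomainCache::(_findDomain) domain 47df1e9f-6a83-46b9-a7ec-9568abd10d1a not found
Traceback (most recent call last):
StorageDomainDoesNotExist: Storage domain does not exist: ('47df1e9f-6a83-46b9-a7ec-9568abd10d1a',)
Thread-24::ERROR::2014-03-13 15:51:18,748::domainMonitor::231::Storage.DomainMonitorThread::(_monitorDomain) Error while collecting domain 47df1e9f-6a83-46b9-a7ec-9568abd10d1a monitoring information
"""

if __name__ == "__main__":
    for uuid, n in missing_domains(SAMPLE.splitlines()).most_common():
        print(uuid, n)
```

<div>Run against a full vdsm.log (for example <code>missing_domains(open(path))</code>, with <code>path</code> pointing at the log file), any UUID other than the destroyed ISO domain showing up in the result would suggest the problem is not limited to that one domain.</div>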
<div><br></div><div>I&#39;m only showing one SD in the db which is my gluster SD:</div><div><br></div><div><div>engine=# select * from storage_domain_static;</div>
<div>                  id                  |               storage                | storage_name | storage_domain_type | storage_type | storage_domain_format_type |         _create_date          |         _update_date          | recoverable | last_time_used_as_master | storage_description | storage_comment </div>
<div>--------------------------------------+--------------------------------------+--------------+---------------------+--------------+----------------------------+-------------------------------+-------------------------------+-------------+--------------------------+---------------------+-----------------</div>
<div> 0de3b516-6c74-4ad8-8958-d3f571ceda8d | e329ad38-b5d8-47a3-ac01-3dfdd5193032 | rep2         |                   0 |            7 | 3                          | 2014-02-23 16:20:42.327492-05 | 2014-02-23 16:20:49.094751-05 | t           |                        0 |                     | </div>
<div>(1 row)</div></div><div><br></div><div>
<div>engine=# select * from storage_domain_dynamic;</div><div>                  id                  | available_disk_size | used_disk_size |         _update_date          </div><div>--------------------------------------+---------------------+----------------+-------------------------------</div>
<div> 0de3b516-6c74-4ad8-8958-d3f571ceda8d |                1887 |            160 | 2014-03-12 01:59:52.258384-04</div><div>(1 row)</div></div><div><br></div><div><br></div><div><div>[root@ovirt001 vdsm]# vdsClient -s 0 getStorageDomainsList</div>
<div><br></div><div>[root@ovirt001 vdsm]# </div></div><div><br></div><div>engine.log keeps repeating the following:</div><div><br></div><div><div>TaskStatusListReturnForXmlRpc [mStatus=StatusForXmlRpc [mCode=654, mMessage=Not SPM]]</div>
<div><br></div><div>2014-03-13 16:02:03,538 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetAllTasksStatusesVDSCommand] (DefaultQuartzScheduler_Worker-21) [4f649953] HostName = ovirt001</div><div>2014-03-13 16:02:03,538 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetAllTasksStatusesVDSCommand] (DefaultQuartzScheduler_Worker-21) [4f649953] Command HSMGetAllTasksStatusesVDS execution failed. Exception: IRSNonOperationalException: IRSGenericException: IRSErrorException: IRSNonOperationalException: Not SPM</div>
<div>2014-03-13 16:02:03,539 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (DefaultQuartzScheduler_Worker-21) [4f649953] FINISH, SpmStopVDSCommand, log id: 41fe5668</div><div>2014-03-13 16:02:03,539 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-21) [4f649953] spm stop succeeded, continuing with spm selection</div>
<div>2014-03-13 16:02:03,560 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-21) [4f649953] starting spm on vds ovirt002, storage pool IT, prevId 1, LVER 7</div><div>2014-03-13 16:02:03,561 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] (DefaultQuartzScheduler_Worker-21) [4f649953] START, SpmStartVDSCommand(HostName = ovirt002, HostId = fac716fc-baff-46fe-9323-fd581a26a983, storagePoolId = 8da661c0-8125-4efb-851e-c9320d268578, prevId=1, prevLVER=7, storagePoolFormatType=V3, recoveryMode=Manual, SCSIFencing=true), log id: 53385b6b</div>
<div>2014-03-13 16:02:03,572 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] (DefaultQuartzScheduler_Worker-21) [4f649953] spmStart polling started: taskId = 0fb302dd-32ff-4442-aa98-5eecaf571da4</div>
<div>2014-03-13 16:02:05,598 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetTaskStatusVDSCommand] (DefaultQuartzScheduler_Worker-21) [4f649953] Failed in HSMGetTaskStatusVDS method</div><div>2014-03-13 16:02:05,599 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetTaskStatusVDSCommand] (DefaultQuartzScheduler_Worker-21) [4f649953] Error code AcquireHostIdFailure and error message VDSGenericException: VDSErrorException: Failed to HSMGetTaskStatusVDS, error = Cannot acquire host id</div>
<div>2014-03-13 16:02:05,599 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] (DefaultQuartzScheduler_Worker-21) [4f649953] spmStart polling ended: taskId = 0fb302dd-32ff-4442-aa98-5eecaf571da4 task status = finished</div>
<div>2014-03-13 16:02:05,600 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] (DefaultQuartzScheduler_Worker-21) [4f649953] Start SPM Task failed - result: cleanSuccess, message: VDSGenericException: VDSErrorException: Failed to HSMGetTaskStatusVDS, error = Cannot acquire host id</div>
<div>2014-03-13 16:02:05,606 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] (DefaultQuartzScheduler_Worker-21) [4f649953] spmStart polling ended, spm status: Free</div><div>2014-03-13 16:02:05,607 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMClearTaskVDSCommand] (DefaultQuartzScheduler_Worker-21) [4f649953] START, HSMClearTaskVDSCommand(HostName = ovirt002, HostId = fac716fc-baff-46fe-9323-fd581a26a983, taskId=0fb302dd-32ff-4442-aa98-5eecaf571da4), log id: 6e65ac77</div>
<div>2014-03-13 16:02:05,613 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMClearTaskVDSCommand] (DefaultQuartzScheduler_Worker-21) [4f649953] FINISH, HSMClearTaskVDSCommand, log id: 6e65ac77</div><div>2014-03-13 16:02:05,613 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] (DefaultQuartzScheduler_Worker-21) [4f649953] FINISH, SpmStartVDSCommand, return: org.ovirt.engine.core.common.businessentities.SpmStatusResult@798917ef, log id: 53385b6b</div>
<div>2014-03-13 16:02:05,615 INFO  [org.ovirt.engine.core.bll.storage.SetStoragePoolStatusCommand] (DefaultQuartzScheduler_Worker-21) [17828a3f] Running command: SetStoragePoolStatusCommand internal: true. Entities affected :  ID: 8da661c0-8125-4efb-851e-c9320d268578 Type: StoragePool</div>
<div>2014-03-13 16:02:05,656 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-21) [17828a3f] IrsBroker::Failed::GetStoragePoolInfoVDS due to: IrsSpmStartFailedException: IRSGenericException: IRSErrorException: SpmStart failed</div>
<div>2014-03-13 16:02:15,674 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetAllTasksStatusesVDSCommand] (DefaultQuartzScheduler_Worker-39) [7b210b3d] Command org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetAllTasksStatusesVDSCommand return value </div>
<div> </div><div>TaskStatusListReturnForXmlRpc [mStatus=StatusForXmlRpc [mCode=654, mMessage=Not SPM]]</div></div><div><br></div><div><br></div><div>Any help appreciated. vdsm.log from both hosts and engine.log attached.</div>
<div><br></div><div><br></div><div><div dir="ltr"><span style="font-family:arial,sans-serif;font-size:16px"><strong>Steve Dainard </strong></span><span style="font-size:12px"></span><br>
<span style="font-family:arial,sans-serif;font-size:12px">IT Infrastructure Manager<br>
<a href="http://miovision.com/" target="_blank">Miovision</a> | <em>Rethink Traffic</em><br><br>
</span>
</div></div>
</div></div>