<div dir="ltr"><div>I narrowed it down to the commit where the originally reported issue crept in:<br><table class="" style="padding:8px 4px;border-spacing:0px;color:rgb(0,0,0);font-family:monospace;font-size:small;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;background-color:rgb(255,255,255)">
<tbody><tr><td style="padding:2px 5px;font-size:12px;vertical-align:top">commit</td><td class="" style="padding:2px 5px;font-size:12px;vertical-align:top;font-family:monospace">fc3a44f71d2ef202cff18d7203b9e4165b546621</td>
</tr></tbody></table>Building and testing with this commit or any subsequent commit reproduces the original issue.<br><br></div><div>- DHC<br></div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Wed, Jan 23, 2013 at 3:56 PM, Dead Horse <span dir="ltr"><<a href="mailto:deadhorseconsulting@gmail.com" target="_blank">deadhorseconsulting@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div><div>Indeed, reverting to an older vdsm clears up the above issue. However, the issue I now see is:<br>
Thread-18::ERROR::2013-01-23 15:50:42,885::task::833::TaskManager.Task::(_setError) Task=`08709e68-bcbc-40d8-843a-d69d4df40ac6`::Unexpected error<div class="im"><br>
Traceback (most recent call last):<br></div> File "/usr/share/vdsm/storage/task.py", line 840, in _run<br> return fn(*args, **kargs)<br> File "/usr/share/vdsm/logUtils.py", line 42, in wrapper<br>
res = f(*args, **kwargs)<br>
File "/usr/share/vdsm/storage/hsm.py", line 923, in connectStoragePool<br> masterVersion, options)<br> File "/usr/share/vdsm/storage/hsm.py", line 970, in _connectStoragePool<br> res = pool.connect(hostID, scsiKey, msdUUID, masterVersion)<br>
File "/usr/share/vdsm/storage/sp.py", line 643, in connect<br> self.__rebuild(msdUUID=msdUUID, masterVersion=masterVersion)<br> File "/usr/share/vdsm/storage/sp.py", line 1167, in __rebuild<br> self.masterDomain = self.getMasterDomain(msdUUID=msdUUID, masterVersion=masterVersion)<br>
File "/usr/share/vdsm/storage/sp.py", line 1506, in getMasterDomain<br> raise se.StoragePoolMasterNotFound(self.spUUID, msdUUID)<br>StoragePoolMasterNotFound: Cannot find master domain: 'spUUID=f90a0d1c-06ca-11e2-a05b-00151712f280, msdUUID=67534cca-1327-462a-b455-a04464084b31'<br>
Thread-18::DEBUG::2013-01-23 15:50:42,887::task::852::TaskManager.Task::(_run) Task=`08709e68-bcbc-40d8-843a-d69d4df40ac6`::Task._run: 08709e68-bcbc-40d8-843a-d69d4df40ac6 ('f90a0d1c-06ca-11e2-a05b-00151712f280', 2, 'f90a0d1c-06ca-11e2-a05b-00151712f280', '67534cca-1327-462a-b455-a04464084b31', 433) {} failed - stopping task<br>
<br></div>This is with vdsm built from <br><table style="border-spacing:0px;text-indent:0px;letter-spacing:normal;font-variant:normal;text-align:start;font-style:normal;font-weight:normal;padding:8px 4px;line-height:normal;text-transform:none;font-size:small;white-space:normal;font-family:monospace;word-spacing:0px">
<tbody><tr><td style="padding:2px 5px;font-size:12px;vertical-align:top">commit</td><td style="padding:2px 5px;font-size:12px;vertical-align:top;font-family:monospace">25a2d8572ad32352227c98a86631300fbd6523c1</td>
</tr></tbody></table><br></div>- DHC<br></div><div class="HOEnZb"><div class="h5"><div class="gmail_extra"><br><br><div class="gmail_quote">On Wed, Jan 23, 2013 at 10:44 AM, Dead Horse <span dir="ltr"><<a href="mailto:deadhorseconsulting@gmail.com" target="_blank">deadhorseconsulting@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div><div>VDSM was built from:<br>commit 166138e37e75767b32227746bb671b1dab9cdd5e<br><br></div>Attached is the full vdsm log<br>
<br></div>I should also note that, from the engine's perspective, the master storage domain appears as locked and the others as unknown.<br>
</div><div><div><div class="gmail_extra"><br><br><div class="gmail_quote">On Wed, Jan 23, 2013 at 2:49 AM, Dan Kenigsberg <span dir="ltr"><<a href="mailto:danken@redhat.com" target="_blank">danken@redhat.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div>On Tue, Jan 22, 2013 at 04:02:24PM -0600, Dead Horse wrote:<br>
> Any ideas on this one? (from VDSM log):<br>
> Thread-25::DEBUG::2013-01-22<br>
> 15:35:29,065::BindingXMLRPC::914::vds::(wrapper) client [3.57.111.30]::call<br>
> getCapabilities with () {}<br>
> Thread-25::ERROR::2013-01-22 15:35:29,113::netinfo::159::root::(speed)<br>
> cannot read ib0 speed<br>
> Traceback (most recent call last):<br>
> File "/usr/lib64/python2.6/site-packages/vdsm/netinfo.py", line 155, in<br>
> speed<br>
> s = int(file('/sys/class/net/%s/speed' % dev).read())<br>
> IOError: [Errno 22] Invalid argument<br>
><br>
> Causes VDSM to fail to attach storage<br>
<br>
</div>I doubt that this is the cause of the failure, as vdsm has always<br>
reported "0" for ib devices, and still does.<br>
<br>
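<div>For illustration only, the fallback behaviour described above (log the failed read, then report a speed of 0) could be sketched roughly as follows. This is not the actual vdsm netinfo code; the function name read_link_speed and the sysfs_root parameter are hypothetical and exist only to keep the example self-contained.</div>
<pre>
```python
import logging
import os


def read_link_speed(dev, sysfs_root="/sys/class/net"):
    """Return the link speed of dev as an integer, or 0 if it cannot be read.

    Hypothetical helper for illustration; not the vdsm implementation.
    """
    path = os.path.join(sysfs_root, dev, "speed")
    try:
        with open(path) as f:
            # int() tolerates the trailing newline that sysfs files carry.
            return int(f.read())
    except (OSError, ValueError):
        # Some devices (ib0 in the log above) fail the read with
        # "[Errno 22] Invalid argument"; log it and fall back to 0.
        logging.exception("cannot read %s speed", dev)
        return 0
```
</pre>
<div>With a fallback like this, the traceback in the log is noise from statistics collection rather than a fatal error, which is consistent with looking for the culprit elsewhere.</div>
<br>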
Does a former version work with your Engine?<br>
Could you share more of your vdsm.log? I suppose the culprit lies in<br>
one of the storage-related commands, not in statistics retrieval.<br>
<div><div><br>
><br>
> Engine side sees:<br>
> ERROR [org.ovirt.engine.core.bll.storage.NFSStorageHelper]<br>
> (QuartzScheduler_Worker-96) [553ef26e] The connection with details<br>
> 192.168.0.1:/ovirt/ds failed because of error code 100 and error message<br>
> is: general exception<br>
> 2013-01-22 15:35:30,160 INFO<br>
> [org.ovirt.engine.core.bll.SetNonOperationalVdsCommand]<br>
> (QuartzScheduler_Worker-96) [1ab78378] Running command:<br>
> SetNonOperationalVdsCommand internal: true. Entities affected : ID:<br>
> 8970b3fe-1faf-11e2-bc1f-00151712f280 Type: VDS<br>
> 2013-01-22 15:35:30,200 INFO<br>
> [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand]<br>
> (QuartzScheduler_Worker-96) [1ab78378] START,<br>
> SetVdsStatusVDSCommand(HostName = kezan, HostId =<br>
> 8970b3fe-1faf-11e2-bc1f-00151712f280, status=NonOperational,<br>
> nonOperationalReason=STORAGE_DOMAIN_UNREACHABLE), log id: 4af5c4cd<br>
> 2013-01-22 15:35:30,211 INFO<br>
> [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand]<br>
> (QuartzScheduler_Worker-96) [1ab78378] FINISH, SetVdsStatusVDSCommand, log<br>
> id: 4af5c4cd<br>
> 2013-01-22 15:35:30,242 ERROR<br>
> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]<br>
> (QuartzScheduler_Worker-96) [1ab78378] Try to add duplicate audit log<br>
> values with the same name. Type: VDS_SET_NONOPERATIONAL_DOMAIN. Value:<br>
> storagepoolname<br>
><br>
> Engine = latest master<br>
> VDSM = latest master<br>
<br>
</div></div>Since "latest master" is an unstable reference by definition, I'm sure<br>
that History would thank you if you post the exact version (git hash?)<br>
of the code.<br>
<br>
> node = el6<br>
<br>
</blockquote></div><br></div>
</div></div></blockquote></div><br></div>
</div></div></blockquote></div><br></div>