<div dir="ltr"><a href="https://bugzilla.redhat.com/show_bug.cgi?id=1443913">Done</a></div><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Apr 20, 2017 at 11:32 AM, Yaniv Kaul <span dir="ltr"><<a href="mailto:ykaul@redhat.com" target="_blank">ykaul@redhat.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">No, that's not the issue. <div>I've seen it happen a few times.</div><div><br></div><div>1. It always happens with the ISO domain (which we don't use anyway in o-s-t)</div><div>2. Apparently, only one host is asking for a mount:</div><div> authenticated mount request from <a href="http://192.168.201.4:713" target="_blank">192.168.201.4:713</a> for /exports/nfs/iso (/exports/nfs/iso)<br></div><div><br></div><div>(/var/log/messages of the NFS server)</div><div><br></div><div>And indeed, you can see in [1] that host1 made the request and all is well on it.</div><div><br></div><div>However, there are connection issues with host0, which cause connectStorageServer() to time out:</div><div><div>From [2]:</div><div></div></div><div><br></div><div><span style="color:rgb(0,0,0)">2017-04-19 18:58:58,465-04 DEBUG [org.ovirt.vdsm.jsonrpc.<wbr>client.internal.<wbr>ResponseWorker] (ResponseWorker) [] Message received: {"jsonrpc":"2.0","error":{"<wbr>code":"lago-basic-suite-<wbr>master-host0:192912448","<wbr>message":"Vds timeout occured"},"id":null}</span><br></div><div><pre style="color:rgb(0,0,0)">2017-04-19 18:58:58,475-04 ERROR [org.ovirt.engine.core.dal.<wbr>dbbroker.auditloghandling.<wbr>AuditLogDirector] (org.ovirt.thread.pool-7-<wbr>thread-37) [755b908a] EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,<wbr>802), Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VDSM lago-basic-suite-master-host0 command ConnectStorageServerVDS failed: Message timeout which can be caused by communication issues
2017-04-19 18:58:58,475-04 INFO [org.ovirt.engine.core.<wbr>vdsbroker.vdsbroker.<wbr>ConnectStorageServerVDSCommand<wbr>] (org.ovirt.thread.pool-7-<wbr>thread-37) [755b908a] Command 'org.ovirt.engine.core.<wbr>vdsbroker.vdsbroker.<wbr>ConnectStorageServerVDSCommand<wbr>' return value '
ServerConnectionStatusReturn:{<wbr>status='Status [code=5022, message=Message timeout which can be caused by communication issues]'}
</pre></div><div><br></div><div>I wonder why, but in /var/log/messages [3], I'm seeing:</div><div><div>Apr 19 18:56:58 lago-basic-suite-master-host0 journal: vdsm Executor WARN Worker blocked: <Worker name=jsonrpc/3 running <Task <JsonRpcTask {'params': {u'connectionParams': [{u'id': u'4ca8fc84-d872-4a7f-907f-<wbr>9445bda7b6d1', u'connection': u'192.168.201.3:/exports/nfs/<wbr>share1', u'iqn': u'', u'user': u'', u'tpgt': u'1', u'protocol_version': u'4.2', u'password': '********', u'port': u''}], u'storagepoolID': u'00000000-0000-0000-0000-<wbr>000000000000', u'domainType': 1}, 'jsonrpc': '2.0', 'method': u'StoragePool.<wbr>connectStorageServer', 'id': u'057da9c2-1e67-4c2f-9511-<wbr>7d9de250386b'} at 0x2f44110> timeout=60, duration=60 at 0x2f44310> task#=9 at 0x2ac11d0></div></div><div>...</div><div><br></div><div><br></div><div>3. Also, there is still the infamous "unable to update response" issue.</div><div><br></div><div><pre style="color:rgb(0,0,0)">{"jsonrpc":"2.0","method":"<wbr>Host.ping","params":{},"id":"<wbr>7cb6052f-c732-4f7c-bd2d-<wbr>e48c2ae1f5e0"}
2017-04-19 18:54:27,843-04 DEBUG [org.ovirt.vdsm.jsonrpc.<wbr>client.reactors.stomp.<wbr>StompCommonClient] (org.ovirt.thread.pool-7-<wbr>thread-15) [62d198cc] Message sent: SEND
destination:jms.topic.vdsm_<wbr>requests
content-length:94
ovirtCorrelationId:62d198cc
reply-to:jms.topic.vdsm_<wbr>responses
<JsonRpcRequest id: "7cb6052f-c732-4f7c-bd2d-<wbr>e48c2ae1f5e0", method: Host.ping, params: {}>
2017-04-19 18:54:27,885-04 DEBUG [org.ovirt.vdsm.jsonrpc.<wbr>client.reactors.stomp.impl.<wbr>Message] (org.ovirt.thread.pool-7-<wbr>thread-16) [1f9aac13] SEND
ovirtCorrelationId:1f9aac13
destination:jms.topic.vdsm_<wbr>requests
reply-to:jms.topic.vdsm_<wbr>responses
content-length:94</pre><pre style="color:rgb(0,0,0)">...</pre><pre style="color:rgb(0,0,0)"><pre>{"jsonrpc": "2.0", "id": "7cb6052f-c732-4f7c-bd2d-<wbr>e48c2ae1f5e0", "result": true}
2017-04-19 18:54:32,132-04 DEBUG [org.ovirt.vdsm.jsonrpc.<wbr>client.internal.<wbr>ResponseWorker] (ResponseWorker) [] Message received: {"jsonrpc": "2.0", "id": "7cb6052f-c732-4f7c-bd2d-<wbr>e48c2ae1f5e0", "result": true}
2017-04-19 18:54:32,133-04 ERROR [org.ovirt.vdsm.jsonrpc.<wbr>client.JsonRpcClient] (ResponseWorker) [] Not able to update response for "7cb6052f-c732-4f7c-bd2d-<wbr>e48c2ae1f5e0"</pre><pre><br></pre><pre>Would be nice to understand why.</pre><pre><br></pre><pre><br></pre><pre>4. Lastly, MOM is not running. Why?</pre><pre><br></pre></pre></div><div><div>Please open a bug with the details from item #2 above.</div><div>Y.</div></div><div><br></div><div><br></div><div>[1] <a href="http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/6403/artifact/exported-artifacts/basic-suit-master-el7/test_logs/basic-suite-master/post-002_bootstrap.py/lago-basic-suite-master-host1/_var_log/vdsm/supervdsm.log" target="_blank">http://jenkins.ovirt.org/<wbr>job/test-repo_ovirt_<wbr>experimental_master/6403/<wbr>artifact/exported-artifacts/<wbr>basic-suit-master-el7/test_<wbr>logs/basic-suite-master/post-<wbr>002_bootstrap.py/lago-basic-<wbr>suite-master-host1/_var_log/<wbr>vdsm/supervdsm.log</a></div><div>[2] <a href="http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/6403/artifact/exported-artifacts/basic-suit-master-el7/test_logs/basic-suite-master/post-002_bootstrap.py/lago-basic-suite-master-engine/_var_log/ovirt-engine/engine.log" target="_blank">http://jenkins.ovirt.org/<wbr>job/test-repo_ovirt_<wbr>experimental_master/6403/<wbr>artifact/exported-artifacts/<wbr>basic-suit-master-el7/test_<wbr>logs/basic-suite-master/post-<wbr>002_bootstrap.py/lago-basic-<wbr>suite-master-engine/_var_log/<wbr>ovirt-engine/engine.log</a></div><div>[3] <a href="http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/6403/artifact/exported-artifacts/basic-suit-master-el7/test_logs/basic-suite-master/post-002_bootstrap.py/lago-basic-suite-master-host0/_var_log/messages" 
target="_blank">http://jenkins.ovirt.org/<wbr>job/test-repo_ovirt_<wbr>experimental_master/6403/<wbr>artifact/exported-artifacts/<wbr>basic-suit-master-el7/test_<wbr>logs/basic-suite-master/post-<wbr>002_bootstrap.py/lago-basic-<wbr>suite-master-host0/_var_log/<wbr>messages</a></div><div><br></div><div><br></div><div><br></div><div><br></div></div><div class="gmail_extra"><br><div class="gmail_quote"><div><div class="h5">On Thu, Apr 20, 2017 at 9:27 AM, Gil Shinar <span dir="ltr"><<a href="mailto:gshinar@redhat.com" target="_blank">gshinar@redhat.com</a>></span> wrote:<br></div></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div class="h5"><div dir="ltr">Test failed: add_secondary_storage_domains<br>Link to suspected patches:<br>Link to Job: <a href="http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/6403" target="_blank">http://jenkins.ovirt.org/<wbr>job/test-repo_ovirt_experiment<wbr>al_master/6403</a><br>Link to all logs: <a href="http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/6403/artifact/exported-artifacts/basic-suit-master-el7/test_logs/basic-suite-master/post-002_bootstrap.py" target="_blank">http://jenkins.ovirt.org<wbr>/job/test-repo_ovirt_experimen<wbr>tal_master/6403/artifact/<wbr>exported-artifacts/basic-suit-<wbr>master-el7/test_logs/basic-<wbr>suite-master/post-002_<wbr>bootstrap.py</a><pre style="color:rgb(0,0,0)"><br></pre>Error seems to be:<br><b>2017-04-19 18:58:58,774-0400 ERROR (jsonrpc/2) [storage.TaskManager.Task] (Task='8f9699ed-cc2f-434b-aa1e<wbr>-b3c8ff30324a') Unexpected error (task:871)<br>Traceback (most recent call last):<br> File "/usr/lib/python2.7/site-packa<wbr>ges/vdsm/storage/task.py", line 878, in _run<br> return fn(*args, **kargs)<br> File "/usr/lib/python2.7/site-packa<wbr>ges/vdsm/logUtils.py", line 52, in wrapper<br> res = f(*args, **kwargs)<br> File "/usr/share/vdsm/storage/hsm.p<wbr>y", line 2709, in getStorageDomainInfo<br> 
dom = self.validateSdUUID(sdUUID)<br> File "/usr/share/vdsm/storage/hsm.p<wbr>y", line 298, in validateSdUUID<br> sdDom = sdCache.produce(sdUUID=sdUUID)<br> File "/usr/share/vdsm/storage/sdc.p<wbr>y", line 112, in produce<br> domain.getRealDomain()<br> File "/usr/share/vdsm/storage/sdc.p<wbr>y", line 53, in getRealDomain<br> return self._cache._realProduce(self.<wbr>_sdUUID)<br> File "/usr/share/vdsm/storage/sdc.p<wbr>y", line 136, in _realProduce<br> domain = self._findDomain(sdUUID)<br> File "/usr/share/vdsm/storage/sdc.p<wbr>y", line 153, in _findDomain<br> return findMethod(sdUUID)<br> File "/usr/share/vdsm/storage/sdc.p<wbr>y", line 178, in _findUnfetchedDomain<br> raise se.StorageDomainDoesNotExist(s<wbr>dUUID)<br>StorageDomainDoesNotExist: Storage domain does not exist: (u'ac3bbc93-26ba-4ea8-8e76-c5b<wbr>761f01931',)<br>2017-04-19 18:58:58,777-0400 INFO (jsonrpc/2) [storage.TaskManager.Task] (Task='8f9699ed-cc2f-434b-aa1e<wbr>-b3c8ff30324a') aborting: Task is aborted: 'Storage domain does not exist' - code 358 (task:1176)<br>2017-04-19 18:58:58,777-0400 ERROR (jsonrpc/2) [storage.Dispatcher] {'status': {'message': "Storage domain does not exist: (u'ac3bbc93-26ba-4ea8-8e76-c5b<wbr>761f01931',)", 'code': 358}} (dispatcher:78)</b></div>
<br></div></div>______________________________<wbr>_________________<br>
Devel mailing list<br>
<a href="mailto:Devel@ovirt.org" target="_blank">Devel@ovirt.org</a><br>
<a href="http://lists.ovirt.org/mailman/listinfo/devel" rel="noreferrer" target="_blank">http://lists.ovirt.org/mailman<wbr>/listinfo/devel</a><br></blockquote></div><br></div>
</blockquote></div><br></div>