No, that's not the issue.
I've seen it happen a few times.

1. It is always with the ISO domain (which we don't use anyway in o-s-t).

2. Apparently, only one host is asking for a mount:

authenticated mount request from 192.168.201.4:713 for /exports/nfs/iso (/exports/nfs/iso)

(/var/log/messages of the NFS server)

And indeed, you can see in [1] that host1 made the request and all is well on it.
However, there are connection issues with host0 which cause a timeout to connectStorageServer():

From [2]:

2017-04-19 18:58:58,465-04 DEBUG [org.ovirt.vdsm.jsonrpc.client.internal.ResponseWorker] (ResponseWorker) [] Message received: {"jsonrpc":"2.0","error":{"code":"lago-basic-suite-master-host0:192912448","message":"Vds timeout occured"},"id":null}
2017-04-19 18:58:58,475-04 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-7-thread-37) [755b908a] EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VDSM lago-basic-suite-master-host0 command ConnectStorageServerVDS failed: Message timeout which can be caused by communication issues
2017-04-19 18:58:58,475-04 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStorageServerVDSCommand] (org.ovirt.thread.pool-7-thread-37) [755b908a] Command 'org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStorageServerVDSCommand' return value 'ServerConnectionStatusReturn:{status='Status [code=5022, message=Message timeout which can be caused by communication issues]'}

I wonder why, but in /var/log/messages [3], I'm seeing:

Apr 19 18:56:58 lago-basic-suite-master-host0 journal: vdsm Executor WARN Worker blocked: <Worker name=jsonrpc/3 running <Task <JsonRpcTask {'params': {u'connectionParams': [{u'id': u'4ca8fc84-d872-4a7f-907f-9445bda7b6d1', u'connection': u'192.168.201.3:/exports/nfs/share1', u'iqn': u'', u'user': u'', u'tpgt': u'1', u'protocol_version': u'4.2', u'password': '********', u'port': u''}], u'storagepoolID': u'00000000-0000-0000-0000-000000000000', u'domainType': 1}, 'jsonrpc': '2.0', 'method': u'StoragePool.connectStorageServer', 'id': u'057da9c2-1e67-4c2f-9511-7d9de250386b'} at 0x2f44110> timeout=60, duration=60 at 0x2f44310> task#=9 at 0x2ac11d0>
...

3. Also, there is still the infamous "unable to update response" issue.

{"jsonrpc":"2.0","method":"Host.ping","params":{},"id":"7cb6052f-c732-4f7c-bd2d-e48c2ae1f5e0"}
2017-04-19 18:54:27,843-04 DEBUG [org.ovirt.vdsm.jsonrpc.client.reactors.stomp.StompCommonClient] (org.ovirt.thread.pool-7-thread-15) [62d198cc] Message sent: SEND destination:jms.topic.vdsm_requests content-length:94 ovirtCorrelationId:62d198cc reply-to:jms.topic.vdsm_responses <JsonRpcRequest id: "7cb6052f-c732-4f7c-bd2d-e48c2ae1f5e0", method: Host.ping, params: {}>
2017-04-19 18:54:27,885-04 DEBUG [org.ovirt.vdsm.jsonrpc.client.reactors.stomp.impl.Message] (org.ovirt.thread.pool-7-thread-16) [1f9aac13] SEND ovirtCorrelationId:1f9aac13 destination:jms.topic.vdsm_requests reply-to:jms.topic.vdsm_responses content-length:94
...
{"jsonrpc": "2.0", "id": "7cb6052f-c732-4f7c-bd2d-e48c2ae1f5e0", "result": true}
2017-04-19 18:54:32,132-04 DEBUG [org.ovirt.vdsm.jsonrpc.client.internal.ResponseWorker] (ResponseWorker) [] Message received: {"jsonrpc": "2.0", "id": "7cb6052f-c732-4f7c-bd2d-e48c2ae1f5e0", "result": true}
2017-04-19 18:54:32,133-04 ERROR [org.ovirt.vdsm.jsonrpc.client.JsonRpcClient] (ResponseWorker) [] Not able to update response for "7cb6052f-c732-4f7c-bd2d-e48c2ae1f5e0"

It would be nice to understand why.

4. Lastly, MOM is not running. Why?

Please open a bug with the details from item #2 above.
Y.

On Thu, Apr 20, 2017 at 9:27 AM, Gil Shinar <gshinar@redhat.com> wrote:

Test failed: add_secondary_storage_domains
Link to suspected patches:
Link to Job: http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/6403
Link to all logs: http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/6403/artifact/exported-artifacts/basic-suit-master-el7/test_logs/basic-suite-master/post-002_bootstrap.py

Error seems to be:
2017-04-19 18:58:58,774-0400 ERROR (jsonrpc/2) [storage.TaskManager.Task] (Task='8f9699ed-cc2f-434b-aa1e-b3c8ff30324a') Unexpected error (task:871)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 878, in _run
    return fn(*args, **kargs)
  File "/usr/lib/python2.7/site-packages/vdsm/logUtils.py", line 52, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 2709, in getStorageDomainInfo
    dom = self.validateSdUUID(sdUUID)
  File "/usr/share/vdsm/storage/hsm.py", line 298, in validateSdUUID
    sdDom = sdCache.produce(sdUUID=sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 112, in produce
    domain.getRealDomain()
  File "/usr/share/vdsm/storage/sdc.py", line 53, in getRealDomain
    return self._cache._realProduce(self._sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 136, in _realProduce
    domain = self._findDomain(sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 153, in _findDomain
    return findMethod(sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 178, in _findUnfetchedDomain
    raise se.StorageDomainDoesNotExist(sdUUID)
StorageDomainDoesNotExist: Storage domain does not exist: (u'ac3bbc93-26ba-4ea8-8e76-c5b761f01931',)
2017-04-19 18:58:58,777-0400 INFO (jsonrpc/2) [storage.TaskManager.Task] (Task='8f9699ed-cc2f-434b-aa1e-b3c8ff30324a') aborting: Task is aborted: 'Storage domain does not exist' - code 358 (task:1176)
2017-04-19 18:58:58,777-0400 ERROR (jsonrpc/2) [storage.Dispatcher] {'status': {'message': "Storage domain does not exist: (u'ac3bbc93-26ba-4ea8-8e76-c5b761f01931',)", 'code': 358}} (dispatcher:78)
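
A minimal sketch, not part of the original thread: one way to see how long the engine waited for the Host.ping response quoted in item 3 is to pair the "Message sent" and "Message received" records by JSON-RPC id. The names parse_ts, response_latencies and SAMPLE are hypothetical. The sketch assumes each engine-side log record sits on a single line, as in the excerpts above once the spaces introduced by mail wrapping are removed. It only measures the delay and does not explain why the response could not be updated.

```python
import re
from datetime import datetime

# Leading timestamp of the engine-side log lines quoted in the thread,
# e.g. "2017-04-19 18:54:27,843-04" (the UTC offset is ignored here).
ENGINE_TS = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(\d{3})")

# The JSON-RPC ids in these logs are UUIDs.
UUID = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
)


def parse_ts(line):
    """Return the leading timestamp of a log line, or None if absent."""
    m = ENGINE_TS.match(line)
    if not m:
        return None
    base = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
    return base.replace(microsecond=int(m.group(2)) * 1000)


def response_latencies(lines):
    """Map request id -> seconds between 'Message sent' and 'Message received'."""
    sent, latency = {}, {}
    for line in lines:
        ts = parse_ts(line)
        ids = UUID.findall(line)
        if ts is None or not ids:
            continue
        if "Message sent" in line:
            # Keep the first send time seen for this id.
            sent.setdefault(ids[0], ts)
        elif "Message received" in line and ids[0] in sent:
            latency[ids[0]] = (ts - sent[ids[0]]).total_seconds()
    return latency


# Records taken from item 3 of the thread; the STOMP headers of the
# "Message sent" record are omitted here for brevity.
SAMPLE = [
    '2017-04-19 18:54:27,843-04 DEBUG '
    '[org.ovirt.vdsm.jsonrpc.client.reactors.stomp.StompCommonClient] '
    '(org.ovirt.thread.pool-7-thread-15) [62d198cc] Message sent: SEND '
    '<JsonRpcRequest id: "7cb6052f-c732-4f7c-bd2d-e48c2ae1f5e0", '
    'method: Host.ping, params: {}>',
    '2017-04-19 18:54:32,132-04 DEBUG '
    '[org.ovirt.vdsm.jsonrpc.client.internal.ResponseWorker] '
    '(ResponseWorker) [] Message received: {"jsonrpc": "2.0", '
    '"id": "7cb6052f-c732-4f7c-bd2d-e48c2ae1f5e0", "result": true}',
    '2017-04-19 18:54:32,133-04 ERROR '
    '[org.ovirt.vdsm.jsonrpc.client.JsonRpcClient] (ResponseWorker) [] '
    'Not able to update response for '
    '"7cb6052f-c732-4f7c-bd2d-e48c2ae1f5e0"',
]

if __name__ == "__main__":
    for req_id, seconds in response_latencies(SAMPLE).items():
        print(req_id, seconds)
```

Running the same function over the full engine-side log would list every request that did get a response, which may help tell whether the "Not able to update response" errors line up with unusually slow replies.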