On Sun, Mar 18, 2018 at 4:21 PM Yedidyah Bar David <didi@redhat.com> wrote:
On Sun, Mar 18, 2018 at 2:48 PM, Yedidyah Bar David <didi@redhat.com> wrote:
> On Sun, Mar 18, 2018 at 1:45 PM, Yedidyah Bar David <didi@redhat.com> wrote:
>> On Sun, Mar 18, 2018 at 11:20 AM, <jenkins@jenkins.phx.ovirt.org> wrote:
>>> Project: http://jenkins.ovirt.org/job/ovirt-system-tests_he-basic-suite-4.1/
>>> Build: http://jenkins.ovirt.org/job/ovirt-system-tests_he-basic-suite-4.1/223/
>>> Build Number: 223
>>> Build Status: Still Failing
>>> Triggered By: Started by timer
>>
>> It was broken by:
>>
>> [1] https://gerrit.ovirt.org/88483
>>
>> It should be fixed by:
>>
>> [2] https://bugzilla.redhat.com/show_bug.cgi?id=1554283
>>
>> [2] is in MODIFIED; I have no idea about its status beyond that.
>>
>> I didn't intend to merge [1] before [2] was fixed; I'm not sure whether
>> Sandro didn't notice my comment there or thought that [2] was already fixed.
>>
>> If it's annoying we can revert [1] and re-merge when [2] is fixed.
>
> Talked with Gal and pushed this, which should hopefully fix it:
>
> https://gerrit.ovirt.org/89136
It indeed seems to fix [1][2]:
13:13:24 # he_get_shared_config:
13:13:26 # he_get_shared_config: Success (in 0:00:01)
13:13:26 # sleep:
13:15:26 # sleep: Success (in 0:02:00)
13:15:26 # add_he_hosts:
13:16:18 # add_he_hosts: Success (in 0:00:52)
13:16:18 # he_check_ha_agent:
13:16:19 # he_check_ha_agent: Success (in 0:00:00)
But it later fails:
13:16:20 # add_secondary_storage_domains:
13:19:30 Error while running thread
13:19:30 Traceback (most recent call last):
13:19:30   File "/usr/lib/python2.7/site-packages/lago/utils.py", line 58, in _ret_via_queue
13:19:30     queue.put({'return': func()})
13:19:30   File "/home/jenkins/workspace/ovirt-system-tests_master_check-patch-el7-x86_64/ovirt-system-tests/he-basic-suite-4.1/test-scenarios/002_bootstrap.py", line 491, in add_nfs_storage_domain
13:19:30     add_generic_nfs_storage_domain(prefix, SD_NFS_NAME, SD_NFS_HOST_NAME, SD_NFS_PATH)
13:19:30   File "/home/jenkins/workspace/ovirt-system-tests_master_check-patch-el7-x86_64/ovirt-system-tests/he-basic-suite-4.1/test-scenarios/002_bootstrap.py", line 496, in add_generic_nfs_storage_domain
13:19:30     add_generic_nfs_storage_domain_4(prefix, sd_nfs_name, nfs_host_name, mount_path, sd_format, sd_type, nfs_version)
13:19:30   File "/home/jenkins/workspace/ovirt-system-tests_master_check-patch-el7-x86_64/ovirt-system-tests/he-basic-suite-4.1/test-scenarios/002_bootstrap.py", line 552, in add_generic_nfs_storage_domain_4
13:19:30     _add_storage_domain_4(api, p)
13:19:30   File "/home/jenkins/workspace/ovirt-system-tests_master_check-patch-el7-x86_64/ovirt-system-tests/he-basic-suite-4.1/test-scenarios/002_bootstrap.py", line 466, in _add_storage_domain_4
13:19:30     id=sd.id,
13:19:30   File "/usr/lib64/python2.7/site-packages/ovirtsdk4/services.py", line 2219, in add
13:19:30     return self._internal_add(storage_domain, headers, query, wait)
13:19:30   File "/usr/lib64/python2.7/site-packages/ovirtsdk4/service.py", line 223, in _internal_add
13:19:30     return future.wait() if wait else future
13:19:30   File "/usr/lib64/python2.7/site-packages/ovirtsdk4/service.py", line 53, in wait
13:19:30     return self._code(response)
13:19:30   File "/usr/lib64/python2.7/site-packages/ovirtsdk4/service.py", line 220, in callback
13:19:30     self._check_fault(response)
13:19:30   File "/usr/lib64/python2.7/site-packages/ovirtsdk4/service.py", line 123, in _check_fault
13:19:30     self._raise_error(response, body)
13:19:30   File "/usr/lib64/python2.7/site-packages/ovirtsdk4/service.py", line 109, in _raise_error
13:19:30     raise error
13:19:30 Error: Fault reason is "Operation Failed". Fault detail is "[Storage domain cannot be reached. Please ensure it is accessible from the host(s).]". HTTP response code is 400.
13:19:30 Error while running thread
I'll retrigger now, but perhaps someone from storage wants to check.
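For reference, the failing step boils down to an ovirtsdk4 call roughly like the sketch below. The engine URL, credentials, host name, data center name and NFS export are placeholders, not the suite's actual values, so read it as an illustration of the API involved rather than the suite's code:

import ovirtsdk4 as sdk
import ovirtsdk4.types as types

# Placeholder connection details; the suite uses its own engine and credentials.
connection = sdk.Connection(
    url='https://engine.example.com/ovirt-engine/api',
    username='admin@internal',
    password='password',
    insecure=True,
)

# Add the NFS data domain; this is the add() call that raises the
# "Storage domain cannot be reached" fault (HTTP 400) in the traceback above.
sds_service = connection.system_service().storage_domains_service()
sd = sds_service.add(
    types.StorageDomain(
        name='second-nfs',
        type=types.StorageDomainType.DATA,
        host=types.Host(name='host-0'),
        storage=types.HostStorage(
            type=types.StorageType.NFS,
            address='storage.example.com',
            path='/exports/nfs/share1',
        ),
    ),
)

# Attach the new domain to the data center by id (the 'id=sd.id' frame
# in the traceback).
dcs_service = connection.system_service().data_centers_service()
dc = dcs_service.list(search='name=Default')[0]
attached_sds_service = dcs_service.data_center_service(dc.id).storage_domains_service()
attached_sds_service.add(types.StorageDomain(id=sd.id))

connection.close()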
The vdsm log has [3]:
2018-03-18 08:59:52,918-0400 INFO (jsonrpc/3) [jsonrpc.JsonRpcServer] RPC call Host.getHardwareInfo succeeded in 0.00 seconds (__init__:539)
2018-03-18 08:59:52,924-0400 INFO (jsonrpc/4) [vdsm.api] START prepareImage(sdUUID=u'424e809d-b7ad-4ed5-b6a1-5426d373a5d2', spUUID=u'8509c64f-cdd7-4713-9e09-a79d90ba26ed', imgUUID=u'718090ba-b36e-45cc-bcd6-597c16a766b9', leafUUID=u'23ee843b-20e0-4afe-98dc-e165334ac710', allowIllegal=False) from=::1,36020, task_id=92aeabf2-084a-4c98-a6e5-03efae38e7b3 (api:46)
2018-03-18 08:59:52,928-0400 INFO (jsonrpc/4) [vdsm.api] FINISH prepareImage error=Volume does not exist: (u'23ee843b-20e0-4afe-98dc-e165334ac710',) from=::1,36020, task_id=92aeabf2-084a-4c98-a6e5-03efae38e7b3 (api:50)
2018-03-18 08:59:52,928-0400 ERROR (jsonrpc/4) [storage.TaskManager.Task] (Task='92aeabf2-084a-4c98-a6e5-03efae38e7b3') Unexpected error (task:872)
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 879, in _run
    return fn(*args, **kargs)
  File "<string>", line 2, in prepareImage
  File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 48, in method
    ret = func(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 3137, in prepareImage
    raise se.VolumeDoesNotExist(leafUUID)
VolumeDoesNotExist: Volume does not exist: (u'23ee843b-20e0-4afe-98dc-e165334ac710',)
Immediately after this failed prepare, we see:
2018-03-18 08:59:52,941-0400 INFO (jsonrpc/7) [vdsm.api] START createVolume(sdUUID=u'424e809d-b7ad-4ed5-b6a1-5426d373a5d2', spUUID=u'8509c64f-cdd7-4713-9e09-a79d90ba26ed', imgUUID=u'718090ba-b36e-45cc-bcd6-597c16a766b9', size=u'1048576', volFormat=5, preallocate=1, diskType=2, volUUID=u'23ee843b-20e0-4afe-98dc-e165334ac710', desc=u'hosted-engine.lockspace', srcImgUUID=u'00000000-0000-0000-0000-000000000000', srcVolUUID=u'00000000-0000-0000-0000-000000000000', initialSize=None) from=::1,36020, task_id=c4c73314-ee10-419a-8265-b8b87763e99f (api:46)
So that volume really did not exist when it was prepared, which means it is a client error, not a vdsm error. A non-existing volume is considered an expected error in this call and will be logged with INFO log level.
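Assuming the flow is "try to prepare, create on VolumeDoesNotExist", here is a minimal hypothetical sketch of what the client seems to be doing; the cli object, method names and exception class are placeholders for illustration, not the real vdsm or hosted-engine client API:

class VolumeDoesNotExist(Exception):
    """Placeholder for the error vdsm returns when the volume is missing."""


def ensure_lockspace_volume(cli, sd_uuid, sp_uuid, img_uuid, vol_uuid):
    """Prepare the hosted-engine lockspace volume, creating it first if missing."""
    try:
        # First attempt: prepare the volume (the failed prepareImage in the log).
        return cli.prepare_image(sd_uuid, sp_uuid, img_uuid, vol_uuid)
    except VolumeDoesNotExist:
        # Expected on first run: the volume does not exist yet, so create it
        # (the createVolume call seen a few milliseconds later in the log)
        # and then prepare it again.
        cli.create_volume(sd_uuid, sp_uuid, img_uuid, vol_uuid,
                          size=1048576, desc='hosted-engine.lockspace')
        return cli.prepare_image(sd_uuid, sp_uuid, img_uuid, vol_uuid)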