[ovirt-devel] Failures in OST (4.0/master) ( was error msg from Jenkins )

Nir Soffer nsoffer at redhat.com
Sun Nov 20 16:37:16 UTC 2016


On Sun, Nov 20, 2016 at 6:30 PM, Eyal Edri <eedri at redhat.com> wrote:
> Renaming title and adding devel.
>
> On Sun, Nov 20, 2016 at 2:36 PM, Piotr Kliczewski <pkliczew at redhat.com>
> wrote:
>>
>> The last failure seems to be storage related.
>>
>> @Nir please take a look.
>>
>> Here is engine side error:
>>
>> 2016-11-20 05:54:59,605 DEBUG
>> [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStoragePoolVDSCommand]
>> (default task-5) [59fc0074] Exception:
>> org.ovirt.engine.core.vdsbroker.irsbroker.IRSNoMasterDomainException:
>> IRSGenericException: IRSErrorException: IRSNoMasterDomainException: Cannot
>> find master domain: u'spUUID=1ca141f1-b64d-4a52-8861-05c7de2a72b2,
>> msdUUID=7d4bf750-4fb8-463f-bbb0-92156c47306e'
>>
>> and here is vdsm:
>>
>> jsonrpc.Executor/5::ERROR::2016-11-20
>> 05:54:56,331::multipath::95::Storage.Multipath::(resize_devices) Could not
>> resize device 360014052749733c7b8248628637b990f
>> Traceback (most recent call last):
>>   File "/usr/share/vdsm/storage/multipath.py", line 93, in resize_devices
>>     _resize_if_needed(guid)
>>   File "/usr/share/vdsm/storage/multipath.py", line 101, in
>> _resize_if_needed
>>     for slave in devicemapper.getSlaves(name)]
>>   File "/usr/share/vdsm/storage/multipath.py", line 158, in getDeviceSize
>>     bs, phyBs = getDeviceBlockSizes(devName)
>>   File "/usr/share/vdsm/storage/multipath.py", line 150, in
>> getDeviceBlockSizes
>>     "queue", "logical_block_size")).read())
>> IOError: [Errno 2] No such file or directory:
>> '/sys/block/sdb/queue/logical_block_size'

Please open a bug for this. This is an expected situation (it can happen while
the device is being scanned), and we should be able to cope with it.
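
A minimal sketch of what coping with it could look like, assuming the only
failure to tolerate is the sysfs entry being missing (ENOENT) while the device
is rescanned. The helper name read_logical_block_size and the sys_block
parameter are hypothetical, used here for illustration only; this is not the
actual code in multipath.py.

```python
import errno
import os


def read_logical_block_size(dev_name, sys_block="/sys/block"):
    """Return the logical block size of dev_name as an int.

    Returns None if the sysfs entry does not exist, which can happen
    while the device is being scanned. Any other error is re-raised.
    (Hypothetical helper, not part of vdsm.)
    """
    path = os.path.join(sys_block, dev_name, "queue", "logical_block_size")
    try:
        with open(path) as f:
            return int(f.read())
    except IOError as e:
        # Only a missing file is expected here; do not hide other failures.
        if e.errno != errno.ENOENT:
            raise
        return None
```

A caller such as the resize flow could then skip a device for which this
returns None and try again on the next scan, instead of logging a traceback.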

Adding Fred who worked on this area.

Nir

> We now see a different error in master [1], which also indicates the hosts
> are in a problematic state (failing the 'assign_hosts_network_label' test):
>
> status: 409
> reason: Conflict
> detail: Cannot add Label. Operation can be performed only when Host status
> is  Maintenance, Up, NonOperational.
> -------------------- >> begin captured logging << --------------------
>
>
> [1]
> http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/3506/testReport/junit/(root)/006_network_by_label/assign_hosts_network_label/
>
>
>>
>>
>>
>> On Sun, Nov 20, 2016 at 12:50 PM, Eyal Edri <eedri at redhat.com> wrote:
>>>
>>>
>>>
>>> On Sun, Nov 20, 2016 at 1:42 PM, Yaniv Kaul <ykaul at redhat.com> wrote:
>>>>
>>>>
>>>>
>>>> On Sun, Nov 20, 2016 at 1:30 PM, Yaniv Kaul <ykaul at redhat.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On Sun, Nov 20, 2016 at 1:18 PM, Eyal Edri <eedri at redhat.com> wrote:
>>>>>>
>>>>>> The test fails to run a VM because no hosts are in UP state(?) [1]. I'm
>>>>>> not sure it is related to the triggering patch [2].
>>>>>>
>>>>>> status: 400
>>>>>> reason: Bad Request
>>>>>> detail: There are no hosts to use. Check that the cluster contains at
>>>>>> least one host in Up state.
>>>>>>
>>>>>> Thoughts? Shouldn't we fail the test earlier if hosts are not UP?
>>>>>
>>>>>
>>>>> Yes. It's more likely that we are picking the wrong host or so, but who
>>>>> knows - where are the engine and VDSM logs?
>>>>
>>>>
>>>> A simple grep on the engine.log[1] finds several unrelated issues I'm
>>>> not sure are reported, it's despairing to even begin...
>>>> That being said, I don't see the issue there. We may need better logging
>>>> on the API level, to see what is being sent. Is it consistent?
>>>
>>>
>>> It just failed now for the first time; I didn't see it before.
>>>
>>>>
>>>> Y.
>>>>
>>>>
>>>> [1]
>>>> http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_4.0/3015/artifact/exported-artifacts/basic_suite_4.0.sh-el7/exported-artifacts/test_logs/basic-suite-4.0/post-004_basic_sanity.py/lago-basic-suite-4-0-engine/_var_log_ovirt-engine/engine.log
>>>>>
>>>>> Y.
>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> [1]
>>>>>> http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_4.0/3015/testReport/junit/(root)/004_basic_sanity/vm_run/
>>>>>> [2]
>>>>>> http://jenkins.ovirt.org/job/ovirt-engine_4.0_build-artifacts-el7-x86_64/1535/changes#detail
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Sun, Nov 20, 2016 at 1:00 PM, <jenkins at jenkins.phx.ovirt.org>
>>>>>> wrote:
>>>>>>>
>>>>>>> Build:
>>>>>>> http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_4.0/3015/,
>>>>>>> Build Number: 3015,
>>>>>>> Build Status: FAILURE
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Eyal Edri
>>>>>> Associate Manager
>>>>>> RHV DevOps
>>>>>> EMEA ENG Virtualization R&D
>>>>>> Red Hat Israel
>>>>>>
>>>>>> phone: +972-9-7692018
>>>>>> irc: eedri (on #tlv #rhev-dev #rhev-integ)
>>>>>
>>>>>
>>>>
>>>
>>
>>
>


