[ovirt-devel] [OST Failure Report][oVirt master][2017-04-11] add_secondary_storage_domains

Barak Korren bkorren at redhat.com
Wed Apr 12 06:24:43 UTC 2017


This seems to have been the Secondary SD issue I reported on Sunday:
http://lists.ovirt.org/pipermail/devel/2017-April/030139.html

That was the case up until build #6307, where AddHost started failing:
http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/6307
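
To tell whether a given run hit the storage problem rather than the AddHost
failure, the engine log can be scanned for hosts that were moved to
non-operational. The following is only a minimal sketch, not part of OST: the
function name and return shape are made up for illustration, the patterns are
taken from the messages in Piotr's excerpt below, and the input is assumed to
be the raw engine.log text with one entry per line.

```python
import re

# Message shapes copied from the quoted engine.log excerpt (assumption: the
# engine keeps emitting them in this form).
CONNECT_FAILED = re.compile(r"Could not connect vds '([^']+)' to pool '([^']+)'")
NON_OP_REASON = re.compile(
    r"HostName = ([^,\s]+),.*?nonOperationalReason='([^']+)'"
)


def non_operational_hosts(log_text):
    """Map each host name to the pool it failed to join and the reason logged.

    Hypothetical helper: returns {host: {'pool': ..., 'reason': ...}}, where
    either key may be missing if the matching log entry was not found.
    """
    hosts = {}
    for line in log_text.splitlines():
        match = CONNECT_FAILED.search(line)
        if match:
            hosts.setdefault(match.group(1), {})["pool"] = match.group(2)
        match = NON_OP_REASON.search(line)
        if match:
            hosts.setdefault(match.group(1), {})["reason"] = match.group(2)
    return hosts
```

A run where this returns every host in the DC with reason
STORAGE_DOMAIN_UNREACHABLE would be consistent with the failure discussed in
this thread.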


On 11 April 2017 at 18:12, Piotr Kliczewski <piotr.kliczewski at gmail.com>
wrote:

> It looks like SD activation failed on the engine side and the SPM connection
> was then closed gracefully by the engine:
>
> 2017-04-11 09:10:24,807-04 INFO  [org.ovirt.engine.core.bll.storage.domain.ActivateStorageDomainCommand] (default task-29) [2b66a42b] ActivateStorage Domain. After Connect all hosts to pool. Time: Tue Apr 11 09:10:24 EDT 2017
> 2017-04-11 09:10:24,842-04 ERROR [org.ovirt.engine.core.bll.storage.pool.RefreshPoolSingleAsyncOperation] (org.ovirt.thread.pool-7-thread-29) [2b66a42b] Could not connect vds 'lago-basic-suite-master-host0' to pool 'test-dc' - moving host to non-operational: null
> 2017-04-11 09:10:24,842-04 ERROR [org.ovirt.engine.core.bll.storage.pool.RefreshPoolSingleAsyncOperation] (org.ovirt.thread.pool-7-thread-28) [2b66a42b] Could not connect vds 'lago-basic-suite-master-host1' to pool 'test-dc' - moving host to non-operational: null
> 2017-04-11 09:10:24,904-04 INFO  [org.ovirt.engine.core.bll.SetNonOperationalVdsCommand] (default task-29) [2bc83b1c] Running command: SetNonOperationalVdsCommand internal: true. Entities affected :  ID: e1b611f2-67fb-4acb-8261-07133134e0e7 Type: VDS
> 2017-04-11 09:10:24,909-04 INFO  [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (default task-29) [2bc83b1c] START, SetVdsStatusVDSCommand(HostName = lago-basic-suite-master-host0, SetVdsStatusVDSCommandParameters:{runAsync='true', hostId='e1b611f2-67fb-4acb-8261-07133134e0e7', status='NonOperational', nonOperationalReason='STORAGE_DOMAIN_UNREACHABLE', stopSpmFailureLogged='false', maintenanceReason='null'}), log id: 5c45239
> 2017-04-11 09:10:24,919-04 INFO  [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (default task-29) [2bc83b1c] FINISH, SetVdsStatusVDSCommand, log id: 5c45239
> 2017-04-11 09:10:24,922-04 DEBUG [org.ovirt.engine.core.dal.dbbroker.PostgresDbEngineDialect$PostgresSimpleJdbcCall] (default task-29) [2bc83b1c] Compiled stored procedure. Call string is [{call getvmsrunningonvds(?)}]
> 2017-04-11 09:10:24,922-04 DEBUG [org.ovirt.engine.core.dal.dbbroker.PostgresDbEngineDialect$PostgresSimpleJdbcCall] (default task-29) [2bc83b1c] SqlCall for procedure [GetVmsRunningOnVds] compiled
> 2017-04-11 09:10:24,985-04 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-29) [2bc83b1c] EVENT_ID: VDS_SET_NONOPERATIONAL_DOMAIN(522), Correlation ID: 2bc83b1c, Job ID: d6455a45-48ad-42cd-9815-e9826944b9a0, Call Stack: null, Custom Event ID: -1, Message: Host lago-basic-suite-master-host0 cannot access the Storage Domain(s) iscsi attached to the Data Center test-dc. Setting Host state to Non-Operational.
> 2017-04-11 09:10:24,985-04 DEBUG [org.ovirt.engine.core.utils.timer.FixedDelayJobListener] (DefaultQuartzScheduler4) [] Rescheduling DEFAULT.org.ovirt.engine.core.vdsbroker.VdsManager.onTimer#-9223372036854775741 as there is no unfired trigger.
> 2017-04-11 09:10:25,036-04 INFO  [org.ovirt.engine.core.bll.SetNonOperationalVdsCommand] (default task-29) [2ffa5981] Running command: SetNonOperationalVdsCommand internal: true. Entities affected :  ID: 87a5f691-ea5c-4692-a4ab-52dad0dafe34 Type: VDS
> 2017-04-11 09:10:25,039-04 INFO  [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (default task-29) [2ffa5981] START, SetVdsStatusVDSCommand(HostName = lago-basic-suite-master-host1, SetVdsStatusVDSCommandParameters:{runAsync='true', hostId='87a5f691-ea5c-4692-a4ab-52dad0dafe34', status='NonOperational', nonOperationalReason='STORAGE_DOMAIN_UNREACHABLE', stopSpmFailureLogged='false', maintenanceReason='null'}), log id: 706ffc3b
>
> .....
>
> 2017-04-11 09:10:25,366-04 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (default task-29) [2ffa5981] FINISH, SpmStopVDSCommand, log id: 62f82cfb
> 2017-04-11 09:10:25,366-04 DEBUG [org.ovirt.vdsm.jsonrpc.client.reactors.stomp.impl.Message] (default task-29) [2ffa5981] UNSUBSCRIBE
> id:4cd59c67-1b57-4456-9a16-52a72ac6f689
>
> \00
> 2017-04-11 09:10:25,367-04 DEBUG [org.ovirt.vdsm.jsonrpc.client.reactors.stomp.StompCommonClient] (default task-29) [2ffa5981] Message sent: UNSUBSCRIBE
>
>
>
> On Tue, Apr 11, 2017 at 4:49 PM, Sandro Bonazzola <sbonazzo at redhat.com>
> wrote:
>
>> Link to Job: http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/6303
>>
>>
>> *OST is failing in the last 50 builds*
>>
>>
>>
>> *00:18:22.090* [basic_suit_el7]   # add_secondary_storage_domains:
>> *00:18:22.090* [basic_suit_el7] Error while running thread
>> *00:18:22.090* [basic_suit_el7] Traceback (most recent call last):
>> *00:18:22.090* [basic_suit_el7]   File "/usr/lib/python2.7/site-packages/lago/utils.py", line 57, in _ret_via_queue
>> *00:18:22.090* [basic_suit_el7]     queue.put({'return': func()})
>> *00:18:22.090* [basic_suit_el7]   File "/home/jenkins/workspace/test-repo_ovirt_experimental_master/ovirt-system-tests/basic-suite-master/test-scenarios/002_bootstrap.py", line 427, in add_nfs_storage_domain
>> *00:18:22.090* [basic_suit_el7]     add_generic_nfs_storage_domain(prefix, SD_NFS_NAME, SD_NFS_HOST_NAME, SD_NFS_PATH, nfs_version='v4_2')
>> *00:18:22.091* [basic_suit_el7]   File "/home/jenkins/workspace/test-repo_ovirt_experimental_master/ovirt-system-tests/basic-suite-master/test-scenarios/002_bootstrap.py", line 441, in add_generic_nfs_storage_domain
>> *00:18:22.091* [basic_suit_el7]     add_generic_nfs_storage_domain_4(prefix, sd_nfs_name, nfs_host_name, mount_path, sd_format, sd_type, nfs_version)
>> *00:18:22.091* [basic_suit_el7]   File "/home/jenkins/workspace/test-repo_ovirt_experimental_master/ovirt-system-tests/basic-suite-master/test-scenarios/002_bootstrap.py", line 490, in add_generic_nfs_storage_domain_4
>> *00:18:22.091* [basic_suit_el7]     host=_random_host_from_dc_4(api, DC_NAME),
>> *00:18:22.091* [basic_suit_el7]   File "/home/jenkins/workspace/test-repo_ovirt_experimental_master/ovirt-system-tests/basic-suite-master/test-scenarios/002_bootstrap.py", line 106, in _random_host_from_dc_4
>> *00:18:22.091* [basic_suit_el7]     return random.choice(_hosts_in_dc_4(api, dc_name))
>> *00:18:22.091* [basic_suit_el7]   File "/home/jenkins/workspace/test-repo_ovirt_experimental_master/ovirt-system-tests/basic-suite-master/test-scenarios/002_bootstrap.py", line 100, in _hosts_in_dc_4
>> *00:18:22.091* [basic_suit_el7]     raise RuntimeError('Could not find hosts that are up in DC %s' % dc_name)
>> *00:18:22.092* [basic_suit_el7] RuntimeError: Could not find hosts that are up in DC test-dc
>> *00:18:22.092* [basic_suit_el7] Error while running thread
>> *00:18:22.092* [basic_suit_el7] Traceback (most recent call last):
>> *00:18:22.092* [basic_suit_el7]   File "/usr/lib/python2.7/site-packages/lago/utils.py", line 57, in _ret_via_queue
>> *00:18:22.092* [basic_suit_el7]     queue.put({'return': func()})
>> *00:18:22.092* [basic_suit_el7]   File "/home/jenkins/workspace/test-repo_ovirt_experimental_master/ovirt-system-tests/basic-suite-master/test-scenarios/002_bootstrap.py", line 583, in add_iso_storage_domain
>> *00:18:22.092* [basic_suit_el7]     add_generic_nfs_storage_domain(prefix, SD_ISO_NAME, SD_ISO_HOST_NAME, SD_ISO_PATH, sd_format='v1', sd_type='iso', nfs_version='v3')
>> *00:18:22.092* [basic_suit_el7]   File "/home/jenkins/workspace/test-repo_ovirt_experimental_master/ovirt-system-tests/basic-suite-master/test-scenarios/002_bootstrap.py", line 441, in add_generic_nfs_storage_domain
>> *00:18:22.092* [basic_suit_el7]     add_generic_nfs_storage_domain_4(prefix, sd_nfs_name, nfs_host_name, mount_path, sd_format, sd_type, nfs_version)
>> *00:18:22.093* [basic_suit_el7]   File "/home/jenkins/workspace/test-repo_ovirt_experimental_master/ovirt-system-tests/basic-suite-master/test-scenarios/002_bootstrap.py", line 490, in add_generic_nfs_storage_domain_4
>> *00:18:22.093* [basic_suit_el7]     host=_random_host_from_dc_4(api, DC_NAME),
>> *00:18:22.093* [basic_suit_el7]   File "/home/jenkins/workspace/test-repo_ovirt_experimental_master/ovirt-system-tests/basic-suite-master/test-scenarios/002_bootstrap.py", line 106, in _random_host_from_dc_4
>> *00:18:22.093* [basic_suit_el7]     return random.choice(_hosts_in_dc_4(api, dc_name))
>> *00:18:22.093* [basic_suit_el7]   File "/home/jenkins/workspace/test-repo_ovirt_experimental_master/ovirt-system-tests/basic-suite-master/test-scenarios/002_bootstrap.py", line 100, in _hosts_in_dc_4
>> *00:18:22.093* [basic_suit_el7]     raise RuntimeError('Could not find hosts that are up in DC %s' % dc_name)
>> *00:18:22.093* [basic_suit_el7] RuntimeError: Could not find hosts that are up in DC test-dc
>> *00:18:22.094* [basic_suit_el7] Error while running thread
>> *00:18:22.094* [basic_suit_el7] Traceback (most recent call last):
>> *00:18:22.094* [basic_suit_el7]   File "/usr/lib/python2.7/site-packages/lago/utils.py", line 57, in _ret_via_queue
>> *00:18:22.094* [basic_suit_el7]     queue.put({'return': func()})
>> *00:18:22.094* [basic_suit_el7]   File "/home/jenkins/workspace/test-repo_ovirt_experimental_master/ovirt-system-tests/basic-suite-master/test-scenarios/002_bootstrap.py", line 587, in add_templates_storage_domain
>> *00:18:22.094* [basic_suit_el7]     add_generic_nfs_storage_domain(prefix, SD_TEMPLATES_NAME, SD_TEMPLATES_HOST_NAME, SD_TEMPLATES_PATH, sd_format='v1', sd_type='export', nfs_version='v4_1')
>> *00:18:22.094* [basic_suit_el7]   File "/home/jenkins/workspace/test-repo_ovirt_experimental_master/ovirt-system-tests/basic-suite-master/test-scenarios/002_bootstrap.py", line 441, in add_generic_nfs_storage_domain
>> *00:18:22.094* [basic_suit_el7]     add_generic_nfs_storage_domain_4(prefix, sd_nfs_name, nfs_host_name, mount_path, sd_format, sd_type, nfs_version)
>> *00:18:22.094* [basic_suit_el7]   File "/home/jenkins/workspace/test-repo_ovirt_experimental_master/ovirt-system-tests/basic-suite-master/test-scenarios/002_bootstrap.py", line 490, in add_generic_nfs_storage_domain_4
>> *00:18:22.095* [basic_suit_el7]     host=_random_host_from_dc_4(api, DC_NAME),
>> *00:18:22.095* [basic_suit_el7]   File "/home/jenkins/workspace/test-repo_ovirt_experimental_master/ovirt-system-tests/basic-suite-master/test-scenarios/002_bootstrap.py", line 106, in _random_host_from_dc_4
>> *00:18:22.095* [basic_suit_el7]     return random.choice(_hosts_in_dc_4(api, dc_name))
>> *00:18:22.095* [basic_suit_el7]   File "/home/jenkins/workspace/test-repo_ovirt_experimental_master/ovirt-system-tests/basic-suite-master/test-scenarios/002_bootstrap.py", line 100, in _hosts_in_dc_4
>> *00:18:22.095* [basic_suit_el7]     raise RuntimeError('Could not find hosts that are up in DC %s' % dc_name)
>> *00:18:22.095* [basic_suit_el7] RuntimeError: Could not find hosts that are up in DC test-dc
>> *00:18:22.095* [basic_suit_el7] Error while running thread
>> *00:18:22.095* [basic_suit_el7] Traceback (most recent call last):
>> *00:18:22.096* [basic_suit_el7]   File "/usr/lib/python2.7/site-packages/lago/utils.py", line 57, in _ret_via_queue
>> *00:18:22.096* [basic_suit_el7]     queue.put({'return': func()})
>> *00:18:22.096* [basic_suit_el7]   File "/home/jenkins/workspace/test-repo_ovirt_experimental_master/ovirt-system-tests/basic-suite-master/test-scenarios/002_bootstrap.py", line 436, in add_second_nfs_storage_domain
>> *00:18:22.096* [basic_suit_el7]     SD_NFS_HOST_NAME, SD_SECOND_NFS_PATH)
>> *00:18:22.096* [basic_suit_el7]   File "/home/jenkins/workspace/test-repo_ovirt_experimental_master/ovirt-system-tests/basic-suite-master/test-scenarios/002_bootstrap.py", line 441, in add_generic_nfs_storage_domain
>> *00:18:22.096* [basic_suit_el7]     add_generic_nfs_storage_domain_4(prefix, sd_nfs_name, nfs_host_name, mount_path, sd_format, sd_type, nfs_version)
>> *00:18:22.096* [basic_suit_el7]   File "/home/jenkins/workspace/test-repo_ovirt_experimental_master/ovirt-system-tests/basic-suite-master/test-scenarios/002_bootstrap.py", line 490, in add_generic_nfs_storage_domain_4
>> *00:18:22.096* [basic_suit_el7]     host=_random_host_from_dc_4(api, DC_NAME),
>> *00:18:22.096* [basic_suit_el7]   File "/home/jenkins/workspace/test-repo_ovirt_experimental_master/ovirt-system-tests/basic-suite-master/test-scenarios/002_bootstrap.py", line 106, in _random_host_from_dc_4
>> *00:18:22.097* [basic_suit_el7]     return random.choice(_hosts_in_dc_4(api, dc_name))
>> *00:18:22.097* [basic_suit_el7]   File "/home/jenkins/workspace/test-repo_ovirt_experimental_master/ovirt-system-tests/basic-suite-master/test-scenarios/002_bootstrap.py", line 100, in _hosts_in_dc_4
>> *00:18:22.097* [basic_suit_el7]     raise RuntimeError('Could not find hosts that are up in DC %s' % dc_name)
>> *00:18:22.097* [basic_suit_el7] RuntimeError: Could not find hosts that are up in DC test-dc
>> *00:18:22.097* [basic_suit_el7] Error while running thread
>> *00:18:22.097* [basic_suit_el7] Traceback (most recent call last):
>> *00:18:22.097* [basic_suit_el7]   File "/usr/lib/python2.7/site-packages/lago/utils.py", line 57, in _ret_via_queue
>> *00:18:22.098* [basic_suit_el7]     queue.put({'return': func()})
>> *00:18:22.098* [basic_suit_el7]   File "/home/jenkins/workspace/test-repo_ovirt_experimental_master/ovirt-system-tests/basic-suite-master/test-scenarios/002_bootstrap.py", line 803, in import_template_from_glance
>> *00:18:22.098* [basic_suit_el7]     generic_import_from_glance(api, image_name=CIRROS_IMAGE_NAME, image_ext='_glance_template', as_template=True)
>> *00:18:22.098* [basic_suit_el7]   File "/home/jenkins/workspace/test-repo_ovirt_experimental_master/ovirt-system-tests/basic-suite-master/test-scenarios/002_bootstrap.py", line 637, in generic_import_from_glance
>> *00:18:22.098* [basic_suit_el7]     target_image.import_image(import_action)
>> *00:18:22.098* [basic_suit_el7]   File "/usr/lib/python2.7/site-packages/ovirtsdk/infrastructure/brokers.py", line 26017, in import_image
>> *00:18:22.098* [basic_suit_el7]     headers={"Correlation-Id":correlation_id}
>> *00:18:22.098* [basic_suit_el7]   File "/usr/lib/python2.7/site-packages/ovirtsdk/infrastructure/proxy.py", line 122, in request
>> *00:18:22.098* [basic_suit_el7]     persistent_auth=self.__persistent_auth
>> *00:18:22.099* [basic_suit_el7]   File "/usr/lib/python2.7/site-packages/ovirtsdk/infrastructure/connectionspool.py", line 79, in do_request
>> *00:18:22.099* [basic_suit_el7]     persistent_auth)
>> *00:18:22.099* [basic_suit_el7]   File "/usr/lib/python2.7/site-packages/ovirtsdk/infrastructure/connectionspool.py", line 162, in __do_request
>> *00:18:22.099* [basic_suit_el7]     raise errors.RequestError(response_code, response_reason, response_body)
>> *00:18:22.099* [basic_suit_el7] RequestError:
>> *00:18:22.099* [basic_suit_el7] status: 400
>> *00:18:22.099* [basic_suit_el7] reason: Bad Request
>> *00:18:22.099* [basic_suit_el7] detail: Cannot import Virtual Disk: Storage Domain cannot be accessed.
>> *00:18:22.099* [basic_suit_el7] -Please check that at least one Host is operational and Data Center state is up.
>>
>>
>> --
>>
>> SANDRO BONAZZOLA
>>
>> ASSOCIATE MANAGER, SOFTWARE ENGINEERING, EMEA ENG VIRTUALIZATION R&D
>>
>> Red Hat EMEA <https://www.redhat.com/>
>> <https://red.ht/sig>
>> TRIED. TESTED. TRUSTED. <https://redhat.com/trusted>
>>
>> _______________________________________________
>> Devel mailing list
>> Devel at ovirt.org
>> http://lists.ovirt.org/mailman/listinfo/devel
>>
>
>
>
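
The console output quoted above contains five failed threads: four end in the
same RuntimeError and one in a RequestError. For longer console logs that kind
of summary can be produced mechanically. The following is a rough sketch under
stated assumptions, not an existing OST or lago tool: the function name is
hypothetical, the input is assumed to be raw Jenkins console text (no mail
quoting), and the prefix pattern is the `*HH:MM:SS.mmm* [basic_suit_el7]`
marker seen in this job.

```python
import re

# Timestamp marker plus suite prefix that starts every console line in this job.
PREFIX = re.compile(r"\*?\d{2}:\d{2}:\d{2}\.\d{3}\*?\s*\[basic_suit_el7\]")
# A traceback frame inside the test scenario file; captures the function name.
FRAME = re.compile(r'File "[^"]*002_bootstrap\.py", line \d+, in (\w+)')
# The final line of a traceback, e.g. "RuntimeError: ..." or "RequestError:".
ERROR = re.compile(r"^(\w+Error):")


def summarize_failures(console_text):
    """Return a list of (entry_function, exception_name), one per failed thread.

    entry_function is the first frame found in 002_bootstrap.py after an
    "Error while running thread" marker, or None if no such frame was seen.
    """
    failures = []
    entry = None
    # Splitting on the prefix works whether or not the lines are still
    # separated by newlines.
    for part in PREFIX.split(console_text):
        line = part.strip()
        if line == "Error while running thread":
            entry = None  # a new failed thread starts here
            continue
        frame = FRAME.search(line)
        if frame and entry is None:
            entry = frame.group(1)
        error = ERROR.match(line)
        if error:
            failures.append((entry, error.group(1)))
    return failures
```

Counting the second element of each tuple then shows at a glance whether all
threads failed for the same reason.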



-- 
Barak Korren
bkorren at redhat.com
RHCE, RHCi, RHV-DevOps Team
https://ifireball.wordpress.com/