Hi,
Changing the hostname to also include the domain name fixed the cert deployment issue:
https://gerrit.ovirt.org/#/c/109842/

I am not sure how it affects the engine certificate content.
From my offline discussion with @Martin Perina, this is the change that could have caused it:
https://gerrit.ovirt.org/#/c/109636/

Any thoughts?




On Wed, Jun 17, 2020 at 9:32 AM Yedidyah Bar David <didi@redhat.com> wrote:
On Wed, Jun 17, 2020 at 6:28 AM <jenkins@jenkins.phx.ovirt.org> wrote:
>
> Project: https://jenkins.ovirt.org/job/ovirt-system-tests_he-basic-suite-master/
> Build: https://jenkins.ovirt.org/job/ovirt-system-tests_he-basic-suite-master/1641/

This one failed while trying to create the disk image for the hosted-engine VM:

https://jenkins.ovirt.org/job/ovirt-system-tests_he-basic-suite-master/1641/artifact/exported-artifacts/test_logs/he-basic-suite-master/post-he_deploy/lago-he-basic-suite-master-host-0/_var_log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-ansible-create_target_vm-20200616230220-yfumoc.log
:

2020-06-16 23:03:20,527-0400 INFO ansible task start {'status': 'OK',
'ansible_type': 'task', 'ansible_playbook':
'/usr/share/ovirt-hosted-engine-setup/ansible/trigger_role.yml',
'ansible_task': 'ovirt.hosted_engine_setup : Add HE disks'}
...
2020-06-16 23:14:12,702-0400 DEBUG var changed: host "localhost" var
"add_disks" type "<class 'dict'>" value: "{
...
            "msg": "Timeout exceed while waiting on result state of the entity."

https://jenkins.ovirt.org/job/ovirt-system-tests_he-basic-suite-master/1641/artifact/exported-artifacts/test_logs/he-basic-suite-master/post-he_deploy/lago-he-basic-suite-master-host-0/_var_log/ovirt-hosted-engine-setup/engine-logs-2020-06-17T03%3A14%3A18Z/ovirt-engine/engine.log
:

2020-06-16 23:03:22,612-04 INFO
[org.ovirt.engine.core.bll.CommandMultiAsyncTasks] (default task-1)
[16c24599-0048-44eb-a410-d39b7ce98712]
CommandMultiAsyncTasks::attachTask: Attaching task
'6b2a7648-748c-430b-94b6-5e3f719df2ac' to command
'fa81759d-c57a-4237-81e0-beb210faa64d'.
2020-06-16 23:03:22,659-04 INFO
[org.ovirt.engine.core.bll.tasks.AsyncTaskManager] (default task-1)
[16c24599-0048-44eb-a410-d39b7ce98712] Adding task
'6b2a7648-748c-430b-94b6-5e3f719df2ac' (Parent Command
'AddImageFromScratch', Parameters Type
'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters'),
polling hasn't started yet..
2020-06-16 23:03:22,699-04 INFO
[org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (default task-1)
[16c24599-0048-44eb-a410-d39b7ce98712]
BaseAsyncTask::startPollingTask: Starting to poll task
'6b2a7648-748c-430b-94b6-5e3f719df2ac'.
...
2020-06-16 23:03:25,835-04 INFO
[org.ovirt.engine.core.bll.tasks.SPMAsyncTask]
(EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-25)
[] SPMAsyncTask::PollTask: Polling task
'6b2a7648-748c-430b-94b6-5e3f719df2ac' (Parent Command
'AddImageFromScratch', Parameters Type
'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters')
returned status 'finished', result 'success'.
2020-06-16 23:03:25,863-04 INFO
[org.ovirt.engine.core.bll.tasks.SPMAsyncTask]
(EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-25)
[] BaseAsyncTask::onTaskEndSuccess: Task
'6b2a7648-748c-430b-94b6-5e3f719df2ac' (Parent Command
'AddImageFromScratch', Parameters Type
'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') ended
successfully.

But then:

2020-06-16 23:03:25,897-04 INFO
[org.ovirt.engine.core.bll.tasks.CommandAsyncTask]
(EE-ManagedThreadFactory-engine-Thread-29)
[16c24599-0048-44eb-a410-d39b7ce98712]
CommandAsyncTask::HandleEndActionResult [within thread]: endAction for
action type 'AddImageFromScratch' succeeded, clearing tasks.
2020-06-16 23:03:25,897-04 INFO
[org.ovirt.engine.core.bll.tasks.SPMAsyncTask]
(EE-ManagedThreadFactory-engine-Thread-29)
[16c24599-0048-44eb-a410-d39b7ce98712] SPMAsyncTask::ClearAsyncTask:
Attempting to clear task '6b2a7648-748c-430b-94b6-5e3f719df2ac'
2020-06-16 23:03:25,899-04 INFO
[org.ovirt.engine.core.vdsbroker.irsbroker.SPMClearTaskVDSCommand]
(EE-ManagedThreadFactory-engine-Thread-29)
[16c24599-0048-44eb-a410-d39b7ce98712] START, SPMClearTaskVDSCommand(
SPMTaskGuidBaseVDSCommandParameters:{storagePoolId='3bcde3b4-b044-11ea-bbb6-5452c0a8c863',
ignoreFailoverLimit='false',
taskId='6b2a7648-748c-430b-94b6-5e3f719df2ac'}), log id: 481c2d3d
2020-06-16 23:03:25,900-04 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.HSMClearTaskVDSCommand]
(EE-ManagedThreadFactory-engine-Thread-29)
[16c24599-0048-44eb-a410-d39b7ce98712] START,
HSMClearTaskVDSCommand(HostName = lago-he-basic-suite-master-host-0,
HSMTaskGuidBaseVDSCommandParameters:{hostId='85ecc51c-f2cb-46a1-9452-fd487399d8dd',
taskId='6b2a7648-748c-430b-94b6-5e3f719df2ac'}), log id: 17360b3d
...
2020-06-16 23:03:26,054-04 INFO
[org.ovirt.engine.core.bll.tasks.SPMAsyncTask]
(EE-ManagedThreadFactory-engine-Thread-29)
[16c24599-0048-44eb-a410-d39b7ce98712]
BaseAsyncTask::removeTaskFromDB: Removed task
'6b2a7648-748c-430b-94b6-5e3f719df2ac' from DataBase

But then:

2020-06-16 23:03:26,315-04 ERROR
[org.ovirt.engine.core.vdsbroker.irsbroker.UploadStreamVDSCommand]
(EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-55)
[7fe7b467] Command 'UploadStreamVDSCommand(HostName =
lago-he-basic-suite-master-host-0,
UploadStreamVDSCommandParameters:{hostId='85ecc51c-f2cb-46a1-9452-fd487399d8dd'})'
execution failed: javax.net.ssl.SSLPeerUnverifiedException:
Certificate for <lago-he-basic-suite-master-host-0.lago.local> doesn't
match any of the subject alternative names:
[lago-he-basic-suite-master-host-0.lago.local]
2020-06-16 23:03:26,315-04 INFO
[org.ovirt.engine.core.vdsbroker.irsbroker.UploadStreamVDSCommand]
(EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-55)
[7fe7b467] FINISH, UploadStreamVDSCommand, return: , log id: 7e3a3e80
2020-06-16 23:03:26,316-04 ERROR
[org.ovirt.engine.core.bll.storage.ovfstore.UploadStreamCommand]
(EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-55)
[7fe7b467] Command
'org.ovirt.engine.core.bll.storage.ovfstore.UploadStreamCommand'
failed: EngineException:
org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException:
javax.net.ssl.SSLPeerUnverifiedException: Certificate for
<lago-he-basic-suite-master-host-0.lago.local> doesn't match any of
the subject alternative names:
[lago-he-basic-suite-master-host-0.lago.local] (Failed with error
VDS_NETWORK_ERROR and code 5022)

Any idea why?
Did anything change in how we check the certificate?
Perhaps it is related to the upgrade to CentOS 8.2?
And, how come it failed only this late? Don't we check the certificate earlier?
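
One way to dig into these questions is to look at the subject alternative
name (SAN) entries together with their types, not only their text. The
sketch below is a minimal illustration in Python, not the check the engine
performs: it takes a dict shaped like the output of
ssl.SSLSocket.getpeercert() and does a type-aware, case-insensitive exact
match with no wildcard handling. The helper names and the sample certificate
data are hypothetical; only the host name is taken from the log above.

```python
import ipaddress


def san_entries(cert):
    """Return (type, value) pairs from a dict shaped like getpeercert() output."""
    return list(cert.get("subjectAltName", ()))


def is_ip(name):
    """True if the name parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(name)
        return True
    except ValueError:
        return False


def matches_san(hostname, cert):
    """Type-aware exact match: IPs against 'IP Address', names against 'DNS'."""
    wanted_type = "IP Address" if is_ip(hostname) else "DNS"
    for san_type, value in san_entries(cert):
        if san_type != wanted_type:
            continue
        if wanted_type == "IP Address":
            if ipaddress.ip_address(value) == ipaddress.ip_address(hostname):
                return True
        elif value.lower() == hostname.lower():
            return True
    return False


# Hypothetical certificate data, for illustration only.
sample_cert = {
    "subjectAltName": (("DNS", "lago-he-basic-suite-master-host-0.lago.local"),),
}

for name in ("lago-he-basic-suite-master-host-0.lago.local",
             "lago-he-basic-suite-master-host-0"):
    print(name, "->", matches_san(name, sample_cert))
```

With this sample data the fully qualified name matches and the short name
does not. It does not explain why two identical-looking strings failed to
match in the engine log; comparing the entry types in the real certificate
would be one thing to check.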

Anyway, this left the host in "not responding" state, so:

2020-06-16 23:03:29,994-04 ERROR
[org.ovirt.engine.core.bll.storage.disk.AddDiskCommandCallback]
(EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-79)
[16c24599-0048-44eb-a410-d39b7ce98712] Failed to get volume info:
org.ovirt.engine.core.common.errors.EngineException: EngineException:
No host was found to perform the operation (Failed with error
RESOURCE_MANAGER_VDS_NOT_FOUND and code 5004)

And perhaps due to an unrelated issue, also:

2020-06-16 23:03:31,177-04 ERROR
[org.ovirt.engine.core.vdsbroker.vdsbroker.HSMRevertTaskVDSCommand]
(EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-43)
[16c24599-0048-44eb-a410-d39b7ce98712] Trying to revert unknown task
'6b2a7648-748c-430b-94b6-5e3f719df2ac'
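
To put the timeline in perspective, here is a small Python sketch that
computes the gap between the two hosted-engine-setup log timestamps quoted
earlier: the start of the 'Add HE disks' task and the entry that carries the
timeout message. The function names are made up for this example, and the
format string assumes the timestamp layout seen in that log.

```python
from datetime import datetime

# Timestamp layout as it appears in the hosted-engine-setup log excerpts.
LOG_TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f%z"


def parse_log_ts(text):
    """Parse a timestamp such as '2020-06-16 23:03:20,527-0400'."""
    return datetime.strptime(text, LOG_TS_FORMAT)


def elapsed_seconds(start, end):
    """Seconds between two log timestamps given as strings."""
    return (parse_log_ts(end) - parse_log_ts(start)).total_seconds()


task_start = "2020-06-16 23:03:20,527-0400"     # 'Add HE disks' task start
timeout_entry = "2020-06-16 23:14:12,702-0400"  # entry with the timeout message

waited = elapsed_seconds(task_start, timeout_entry)
print(f"{waited:.3f} s (about {waited / 60:.1f} min)")
```

So the setup side kept waiting for roughly eleven minutes, while the engine
log shows the SSL error within a few seconds of the task start.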

I also looked briefly at:

https://jenkins.ovirt.org/job/ovirt-system-tests_he-basic-suite-master/1641/artifact/exported-artifacts/test_logs/he-basic-suite-master/post-he_deploy/lago-he-basic-suite-master-host-0/_var_log/vdsm/vdsm.log

and see some relevant entries there, but nothing I can spot that points
to the root cause (e.g. the word "cert" does not appear there).

Can anyone please have a look? Thanks.

> Build Number: 1641
> Build Status:  Still Failing
> Triggered By: Started by timer
>
> -------------------------------------
> Changes Since Last Success:
> -------------------------------------
> Changes for Build #1633
> [Marcin Sobczyk] ost-images: Drop rebasing of qcows
>
> [Ehud Yonasi] mock: fix yum repos injection.
>
> [Ehud Yonasi] onboard ost-images to stdci.
>
>
> Changes for Build #1634
> [Marcin Sobczyk] ost-images: Drop rebasing of qcows
>
>
> Changes for Build #1635
> [Marcin Sobczyk] ost-images: Drop rebasing of qcows
>
>
> Changes for Build #1636
> [Marcin Sobczyk] ost-images: Drop rebasing of qcows
>
>
> Changes for Build #1637
> [Marcin Sobczyk] ost-images: Drop rebasing of qcows
>
>
> Changes for Build #1638
> [Marcin Sobczyk] ost-images: Drop rebasing of qcows
>
> [Ehud Yonasi] stdci_runner: update templates node to ost-images.
>
>
> Changes for Build #1639
> [Marcin Sobczyk] ost-images: Drop rebasing of qcows
>
>
> Changes for Build #1640
> [Yedidyah Bar David] Allow engine 20 minutes to come up after VM restart
>
>
> Changes for Build #1641
> [Michal Skrivanek] test live storage migration again
>
> [Ehud Yonasi] poll: add ost-images to nightly.
>
>
>
>
> -----------------
> Failed Tests:
> -----------------
> No tests ran.



--
Didi
List Archives: https://lists.ovirt.org/archives/list/devel@ovirt.org/message/AI6KENCA35EK5RDLKR5BWU7HC6H3FIJ7/