[JIRA] (OVIRT-2583) ovirt-engine failing on master on test check_update_host
by Dafna Ron (oVirt JIRA)
[ https://ovirt-jira.atlassian.net/browse/OVIRT-2583?page=com.atlassian.jir... ]
Dafna Ron updated OVIRT-2583:
-----------------------------
Resolution: Fixed
Status: Done (was: To Do)
> ovirt-engine failing on master on test check_update_host
> ---------------------------------------------------------
>
> Key: OVIRT-2583
> URL: https://ovirt-jira.atlassian.net/browse/OVIRT-2583
> Project: oVirt - virtualization made easy
> Issue Type: Bug
> Reporter: Dafna Ron
> Assignee: infra
> Priority: Highest
> Labels: ost_code_regression, ost_failures
>
> ansible: Remove duplication of vnc_tls option - https://gerrit.ovirt.org/#/c/95169/
> https://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/11267
> Mail sent to dev
> {noformat}
> [dron@dron post-002_bootstrap.py]$ egrep 89d99b74-fa6c-4d1d-a261-fb7ca7fcada2 lago-basic-suite-master-engine/_var_log/ovirt-engine/engine.log
> 2018-11-21 13:41:34,209-05 DEBUG [org.ovirt.engine.core.common.di.interceptor.DebugLoggingInterceptor] (default task-1) [89d99b74-fa6c-4d1d-a261-fb7ca7fcada2] method: get, params: [ad768a3b-d44b-4fd4-b57a-9029fca470a0], timeElapsed: 9ms
> 2018-11-21 13:41:34,224-05 INFO [org.ovirt.engine.core.bll.hostdeploy.HostUpgradeCheckCommand] (default task-1) [89d99b74-fa6c-4d1d-a261-fb7ca7fcada2] Running command: HostUpgradeCheckCommand internal: false. Entities affected : ID: ad768a3b-d44b-4fd4-b57a-9029fca470a0 Type: VDSAction group EDIT_HOST_CONFIGURATION with role type ADMIN
> 2018-11-21 13:41:34,259-05 DEBUG [org.ovirt.engine.core.common.di.interceptor.DebugLoggingInterceptor] (EE-ManagedThreadFactory-commandCoordinator-Thread-1) [89d99b74-fa6c-4d1d-a261-fb7ca7fcada2] method: get, params: [ad768a3b-d44b-4fd4-b57a-9029fca470a0], timeElapsed: 13ms
> 2018-11-21 13:41:34,273-05 INFO [org.ovirt.engine.core.bll.hostdeploy.HostUpgradeCheckInternalCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-1) [89d99b74-fa6c-4d1d-a261-fb7ca7fcada2] Running command: HostUpgradeCheckInternalCommand internal: true. Entities affected : ID: ad768a3b-d44b-4fd4-b57a-9029fca470a0 Type: VDS
> 2018-11-21 13:41:34,277-05 DEBUG [org.ovirt.engine.core.common.di.interceptor.DebugLoggingInterceptor] (EE-ManagedThreadFactory-commandCoordinator-Thread-1) [89d99b74-fa6c-4d1d-a261-fb7ca7fcada2] method: get, params: [ad768a3b-d44b-4fd4-b57a-9029fca470a0], timeElapsed: 4ms
> 2018-11-21 13:41:34,281-05 DEBUG [org.ovirt.engine.core.common.utils.ansible.AnsibleExecutor] (EE-ManagedThreadFactory-commandCoordinator-Thread-1) [89d99b74-fa6c-4d1d-a261-fb7ca7fcada2] Inventory hosts: [lago-basic-suite-master-host-0]
> 2018-11-21 13:41:34,282-05 INFO [org.ovirt.engine.core.common.utils.ansible.AnsibleExecutor] (EE-ManagedThreadFactory-commandCoordinator-Thread-1) [89d99b74-fa6c-4d1d-a261-fb7ca7fcada2] Executing Ansible command: ANSIBLE_STDOUT_CALLBACK=hostupgradeplugin [/usr/bin/ansible-playbook, --ssh-common-args=-F /var/lib/ovirt-engine/.ssh/config, --check, --private-key=/etc/pki/ovirt-engine/keys/engine_id_rsa, --inventory=/tmp/ansible-inventory2326258642232337463, /usr/share/ovirt-engine/playbooks/ovirt-host-upgrade.yml] [Logfile: null]
> 2018-11-21 13:41:37,610-05 INFO [org.ovirt.engine.core.common.utils.ansible.AnsibleExecutor] (EE-ManagedThreadFactory-commandCoordinator-Thread-1) [89d99b74-fa6c-4d1d-a261-fb7ca7fcada2] Ansible playbook command has exited with value: 2
> 2018-11-21 13:41:37,611-05 ERROR [org.ovirt.engine.core.bll.host.HostUpgradeManager] (EE-ManagedThreadFactory-commandCoordinator-Thread-1) [89d99b74-fa6c-4d1d-a261-fb7ca7fcada2] Failed to run check-update of host 'lago-basic-suite-master-host-0'.
> 2018-11-21 13:41:37,611-05 ERROR [org.ovirt.engine.core.bll.hostdeploy.HostUpdatesChecker] (EE-ManagedThreadFactory-commandCoordinator-Thread-1) [89d99b74-fa6c-4d1d-a261-fb7ca7fcada2] Failed to check if updates are available for host 'lago-basic-suite-master-host-0' with error message 'Failed to run check-update of host 'lago-basic-suite-master-host-0'.'
> 2018-11-21 13:41:37,616-05 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-commandCoordinator-Thread-1) [89d99b74-fa6c-4d1d-a261-fb7ca7fcada2] EVENT_ID: HOST_AVAILABLE_UPDATES_FAILED(839), Failed to check for available updates on host lago-basic-suite-master-host-0 with message 'Failed to run check-update of host 'lago-basic-suite-master-host-0'.'.
> [dron@dron post-002_bootstrap.py]$
> {noformat}
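The egrep above pulls every engine.log line that carries one correlation ID. A minimal Python sketch of the same filtering, assuming the line layout seen in the excerpt (timestamp, level, logger, thread, correlation ID, message). The names LINE_RE, SAMPLE and filter_by_correlation are hypothetical and are not part of ovirt-engine or OST; the sample lines are copied from the log above.

```python
import re

# Assumed engine.log layout, taken from the excerpt above:
# "<date> <time>,<ms><tz> LEVEL [logger] (thread) [correlation-id] message"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}[+-]\d{2})\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"\[(?P<logger>[^\]]+)\]\s+"
    r"\((?P<thread>[^)]*)\)\s+"
    r"\[(?P<corr>[^\]]*)\]\s+"
    r"(?P<msg>.*)$"
)

CORRELATION_ID = "89d99b74-fa6c-4d1d-a261-fb7ca7fcada2"
THREAD = "EE-ManagedThreadFactory-commandCoordinator-Thread-1"

# Three lines copied from the log excerpt, used only for illustration.
SAMPLE = [
    "2018-11-21 13:41:34,209-05 DEBUG "
    "[org.ovirt.engine.core.common.di.interceptor.DebugLoggingInterceptor] "
    "(default task-1) [" + CORRELATION_ID + "] method: get, "
    "params: [ad768a3b-d44b-4fd4-b57a-9029fca470a0], timeElapsed: 9ms",
    "2018-11-21 13:41:37,610-05 INFO "
    "[org.ovirt.engine.core.common.utils.ansible.AnsibleExecutor] "
    "(" + THREAD + ") [" + CORRELATION_ID + "] "
    "Ansible playbook command has exited with value: 2",
    "2018-11-21 13:41:37,611-05 ERROR "
    "[org.ovirt.engine.core.bll.host.HostUpgradeManager] "
    "(" + THREAD + ") [" + CORRELATION_ID + "] "
    "Failed to run check-update of host 'lago-basic-suite-master-host-0'.",
]


def filter_by_correlation(lines, correlation_id, level=None):
    """Return parsed entries (dicts) for one correlation ID, optionally one level."""
    entries = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip lines that do not follow the assumed layout
        entry = match.groupdict()
        if entry["corr"] != correlation_id:
            continue
        if level is not None and entry["level"] != level:
            continue
        entries.append(entry)
    return entries


if __name__ == "__main__":
    # Print only the ERROR entries of the flow, with timestamp and logger.
    for entry in filter_by_correlation(SAMPLE, CORRELATION_ID, level="ERROR"):
        print(entry["ts"], entry["logger"], entry["msg"])
```

Filtering on level="ERROR" narrows the flow to the failure lines that follow the Ansible exit value, which is usually the first thing to look at in a run like this one.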
--
This message was sent by Atlassian Jira
(v1001.0.0-SNAPSHOT#100095)
[JIRA] (OVIRT-2583) ovirt-engine failing on master on test check_update_host
by Dafna Ron (oVirt JIRA)
[ https://ovirt-jira.atlassian.net/browse/OVIRT-2583?page=com.atlassian.jir... ]
Dafna Ron commented on OVIRT-2583:
----------------------------------
The change causing this issue has been reverted.
Tomas is now debugging the issue, so I am closing this ticket.
[JIRA] (OVIRT-2586) Jenkins terribly slow and unresponsive
by Evgheni Dereveanchin (oVirt JIRA)
[ https://ovirt-jira.atlassian.net/browse/OVIRT-2586?page=com.atlassian.jir... ]
Evgheni Dereveanchin commented on OVIRT-2586:
---------------------------------------------
Regarding SSE-gateway, I've found an issue on the Jenkins issue tracker that is very similar to the one we're having:
https://issues.jenkins-ci.org/browse/JENKINS-51057
People report that a reboot is a good workaround, and I'm planning one in OVIRT-2606, so we should be good there. There's also a published Groovy script that cleans up the SSE-gateway heap, but it needs testing on staging.
> Jenkins terribly slow and unresponsive
> --------------------------------------
>
> Key: OVIRT-2586
> URL: https://ovirt-jira.atlassian.net/browse/OVIRT-2586
> Project: oVirt - virtualization made easy
> Issue Type: Outage
> Reporter: sbonazzo
> Assignee: Evgheni Dereveanchin
> Priority: Highest
>
> Hi,
> jenkins is terribly slow and becoming worse every day.
> I tried to gain some speed by adding 4 cores to the VM through engine-phx.
> It's a bit better but the real issue doesn't seem related to CPU power.
> Can anybody investigate?
> --
> SANDRO BONAZZOLA
> MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV
> Red Hat EMEA <https://www.redhat.com/>
> sbonazzo(a)redhat.com
> <https://red.ht/sig>
[JIRA] (OVIRT-2586) Jenkins terribly slow and unresponsive
by Evgheni Dereveanchin (oVirt JIRA)
[ https://ovirt-jira.atlassian.net/browse/OVIRT-2586?page=com.atlassian.jir... ]
Evgheni Dereveanchin commented on OVIRT-2586:
---------------------------------------------
Here's the page linking directly to Jenkins: https://www.ovirt.org/node/#ovirt-node-master
This is not the root cause of the major slowness, so let's split it into a separate ticket and decide where to publish these ISOs and how to ensure they don't pile up. I'm pretty sure that most downloads are currently performed by search engine crawlers, not end users.
Build failed in Jenkins: system-sync_mirrors-glusterfs-3.10-el7-x86_64 #1590
by jenkins@jenkins.phx.ovirt.org
See <http://jenkins.ovirt.org/job/system-sync_mirrors-glusterfs-3.10-el7-x86_6...>
Changes:
[Galit Rosenthal] Remove from lago fc27-EOL
------------------------------------------
Started by timer
[EnvInject] - Loading node environment variables.
Building remotely on mirrors.phx.ovirt.org (mirrors) in workspace <http://jenkins.ovirt.org/job/system-sync_mirrors-glusterfs-3.10-el7-x86_6...>
> git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
> git config remote.origin.url http://gerrit.ovirt.org/jenkins.git # timeout=10
Cleaning workspace
> git rev-parse --verify HEAD # timeout=10
Resetting working tree
> git reset --hard # timeout=10
> git clean -fdx # timeout=10
Pruning obsolete local branches
Fetching upstream changes from http://gerrit.ovirt.org/jenkins.git
> git --version # timeout=10
> git fetch --tags --progress http://gerrit.ovirt.org/jenkins.git +refs/heads/*:refs/remotes/origin/* --prune
> git rev-parse origin/master^{commit} # timeout=10
Checking out Revision fe8d4c3fee68d33b3407bba8a1408937639442a4 (origin/master)
> git config core.sparsecheckout # timeout=10
> git checkout -f fe8d4c3fee68d33b3407bba8a1408937639442a4
Commit message: "Remove from lago fc27-EOL"
> git rev-list --no-walk bb346506183a00e82791b798f0fbc591cc22957a # timeout=10
[system-sync_mirrors-glusterfs-3.10-el7-x86_64] $ /bin/bash -xe /tmp/jenkins5426108617155492741.sh
+ jenkins/scripts/mirror_mgr.sh resync_yum_mirror glusterfs-3.10-el7 x86_64 jenkins/data/mirrors-reposync.conf
Checking if mirror needs a resync
Traceback (most recent call last):
  File "/usr/bin/reposync", line 343, in <module>
    main()
  File "/usr/bin/reposync", line 175, in main
    my.doRepoSetup()
  File "/usr/lib/python2.7/site-packages/yum/__init__.py", line 681, in doRepoSetup
    return self._getRepos(thisrepo, True)
  File "/usr/lib/python2.7/site-packages/yum/__init__.py", line 721, in _getRepos
    self._repos.doSetup(thisrepo)
  File "/usr/lib/python2.7/site-packages/yum/repos.py", line 157, in doSetup
    self.retrieveAllMD()
  File "/usr/lib/python2.7/site-packages/yum/repos.py", line 88, in retrieveAllMD
    dl = repo._async and repo._commonLoadRepoXML(repo)
  File "/usr/lib/python2.7/site-packages/yum/yumRepo.py", line 1479, in _commonLoadRepoXML
    result = self._getFileRepoXML(local, text)
  File "/usr/lib/python2.7/site-packages/yum/yumRepo.py", line 1256, in _getFileRepoXML
    size=102400) # setting max size as 100K
  File "/usr/lib/python2.7/site-packages/yum/yumRepo.py", line 1039, in _getFile
    raise e
yum.Errors.NoMoreMirrorsRepoError: failure: repodata/repomd.xml from glusterfs-3.10-el7: [Errno 256] No more mirrors to try.
http://mirror.centos.org/centos/7/storage/x86_64/gluster-3.10/repodata/re...: [Errno 14] HTTP Error 404 - Not Found
Build step 'Execute shell' marked build as failure
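The traceback shows reposync failing because repodata/repomd.xml returned HTTP 404 on the upstream mirror. A minimal sketch of a pre-flight check that could run before a resync, assuming Python 3 (the yum code on the mirrors host runs on Python 2.7). The helpers repomd_url and repo_is_available are hypothetical and are not part of mirror_mgr.sh or reposync.

```python
import urllib.error
import urllib.request


def repomd_url(baseurl):
    """Build the URL of the metadata index that yum/reposync requests first."""
    return baseurl.rstrip("/") + "/repodata/repomd.xml"


def repo_is_available(baseurl, opener=urllib.request.urlopen, timeout=10):
    """Return True if repomd.xml answers with HTTP 200, False on any URL error.

    `opener` is injectable so the check can be exercised without network access.
    """
    request = urllib.request.Request(repomd_url(baseurl), method="HEAD")
    try:
        with opener(request, timeout=timeout) as response:
            return response.status == 200
    except urllib.error.URLError:
        # HTTPError (such as the 404 in the build log) is a subclass of URLError.
        return False


if __name__ == "__main__":
    # Placeholder base URL; substitute the baseurl of the repo being mirrored.
    base = "http://example.invalid/some/repo/"
    if not repo_is_available(base):
        print("upstream repo metadata missing, skipping resync:", repomd_url(base))
```

A check like this would let the sync job report a missing upstream repo explicitly instead of failing with a yum traceback; whether the job should then skip or fail is a policy choice left open here.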
[JIRA] (OVIRT-2586) Jenkins terribly slow and unresponsive
by Eyal Edri (oVirt JIRA)
[ https://ovirt-jira.atlassian.net/browse/OVIRT-2586?page=com.atlassian.jir... ]
Eyal Edri commented on OVIRT-2586:
----------------------------------
[~yturgema(a)redhat.com] can we change the OST job so that it does not read the artifact from Jenkins?
That might add load to a server that is already overloaded. Could you read it from another location, perhaps resources.ovirt.org?