[Ovirt] [CQ weekly status] [15-02-2019]
by Dafna Ron
Hi,
This mail provides the current status of the CQ and allows people to review the
status before and after the weekend.
Please refer to the colour map below for the meaning of the colours.
*CQ-4.2*: RED (#1)
1. We had a 4.2 failure because the selinux-policy package was not downloaded
automatically by OST as a dependency of the vmconsole package.
The package stopped downloading after this change in vmconsole:
https://gerrit.ovirt.org/#/c/97704/ - clean up and reorganize
I reported the issue to the list after discussing it with Sandro, and after
a discussion with the developer, Sandro approved merging a patch to CI that
manually adds the package:
https://gerrit.ovirt.org/#/c/97785/
I am re-running the project's last patch now.
- Action item for Galit: check why OST is not detecting the package if
the project dependencies are OK.
*CQ-Master:* RED (#1)
1. The same failure in vmconsole was causing the upgrade suite to fail on
master.
The currently running jobs for 4.2 [1] and master [2] can be found here:
[1]
http://jenkins.ovirt.org/view/Change%20queue%20jobs/job/ovirt-4.2_change-...
[2]
http://jenkins.ovirt.org/view/Change%20queue%20jobs/job/ovirt-master_chan...
Happy week!
Dafna
-------------------------------------------------------------------------------------------------------------------
COLOUR MAP
Green = job has been passing successfully
** green for more than 3 days may suggest we need a review of our test
coverage
1. 1-3 days GREEN (#1)
2. 4-7 days GREEN (#2)
3. Over 7 days GREEN (#3)
Yellow = intermittent failures for different projects but no lasting or
current regressions
** intermittent failures indicate a healthy project, as we expect a number of
failures during the week
** I will not report any of the solved failures or regressions.
1. Solved job failures YELLOW (#1)
2. Solved regressions YELLOW (#2)
Red = job has been failing
** Active failures. The colour will change based on the amount of time the
project(s) has been broken. Only active regressions will be reported.
1. 1-3 days RED (#1)
2. 4-7 days RED (#2)
3. Over 7 days RED (#3)
[VDSM] Running the new storage tests on your laptop
by Nir Soffer
I want to share our new block storage tests, running on your laptop, from
your editor, creating a real block storage domain with real logical volumes.
One catch: these tests require root - there is no way to create devices
without root.
To make it easy to run only the tests that need root as root, they are
marked with the "root" mark.
Here is an example of running the root tests for the block storage domain:
$ sudo ~/.local/bin/tox -e storage-py27 tests/storage/blocksd_test.py
-- -m root
And for lvm:
$ sudo ~/.local/bin/tox -e storage-py27 tests/storage/lvm_test.py -- -m
root
To run all storage tests that require root:
$ sudo ~/.local/bin/tox -e storage-py27 -- -m root tests/storage
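For reference, here is a minimal sketch of how such a "root" mark can be wired
up with plain pytest. The marker name matches the one above, but the skipif
guard and the test body are assumptions - vdsm's actual fixture and marker
setup may differ:

    import os

    import pytest

    # Hypothetical sketch: skip root-only tests when not running as root.
    requires_root = pytest.mark.skipif(
        os.geteuid() != 0, reason="requires root")


    @requires_root
    @pytest.mark.root
    def test_create_block_domain():
        # A real root-only test would create loop devices and LVs here.
        assert os.geteuid() == 0

With a layout like this, "-m root" selects only the marked tests, and running
the suite without sudo simply skips them.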
Another issue: after running the tests as root, you need to fix the ownership
of some files in .tox/, tests/htmlcov*, and /var/tmp/vdsm. You can run:
$ sudo chown -R $USER:$USER .tox tests /var/tmp/vdsm
We will improve this later.
Note that I'm running a user-installed tox:
$ pip install --user tox
This gets the most recent tox with minimal breakage of the system Python.
With the new tests, our code coverage is now 57%:
https://jenkins.ovirt.org/job/vdsm_standard-check-patch/2888/artifact/che...
- blockSD: 47%
https://jenkins.ovirt.org/job/vdsm_standard-check-patch/2888/artifact/che...
- lvm: 74%
https://jenkins.ovirt.org/job/vdsm_standard-check-patch/2888/artifact/che...
These tests are rather slow; all the root tests take 26 seconds. But OST
takes more than
40 minutes, and covers less code in this area.
OST coverage for the lvm module: 71%
https://jenkins.ovirt.org/job/ovirt-system-tests_manual/4071/artifact/exp...
To debug the tests, you can use a new option in recent pytest,
--log-cli-level:
$ sudo ~/.local/bin/tox -e storage-py27 tests/storage/blocksd_test.py
-- -m root --log-cli-level=info
Here is an example output from a test creating a storage domain
(use --log-cli-level=debug if this is not verbose enough):
-------------------------------------------------------------------------------
live log call
-------------------------------------------------------------------------------
blockSD.py 1034 INFO
sdUUID=d4d7649d-4849-4413-bdc2-b7b84f239092 domainName=loop-domain
domClass=1 vgUUID=3OJX6U-UDLc-VFtg-2cRO-q3kR-UH2g-Nvf78I storageType=3
version=3, block_size=512, alignment=1048576
blockSD.py 600 INFO size 512 MB (metaratio 262144)
lvm.py 1168 INFO Creating LV
(vg=d4d7649d-4849-4413-bdc2-b7b84f239092, lv=metadata, size=512m,
activate=True, contiguous=False, initialTags=(), device=None)
lvm.py 1198 WARNING Could not change ownership of one
or more volumes in vg (d4d7649d-4849-4413-bdc2-b7b84f239092) - metadata
blockSD.py 522 INFO Create: SORT MAPPING: ['/dev/loop2']
lvm.py 1168 INFO Creating LV
(vg=d4d7649d-4849-4413-bdc2-b7b84f239092, lv=inbox, size=16m,
activate=True, contiguous=False, initialTags=(), device=None)
lvm.py 1198 WARNING Could not change ownership of one
or more volumes in vg (d4d7649d-4849-4413-bdc2-b7b84f239092) - inbox
lvm.py 1168 INFO Creating LV
(vg=d4d7649d-4849-4413-bdc2-b7b84f239092, lv=outbox, size=16m,
activate=True, contiguous=False, initialTags=(), device=None)
lvm.py 1198 WARNING Could not change ownership of one
or more volumes in vg (d4d7649d-4849-4413-bdc2-b7b84f239092) - outbox
lvm.py 1168 INFO Creating LV
(vg=d4d7649d-4849-4413-bdc2-b7b84f239092, lv=ids, size=8m, activate=True,
contiguous=False, initialTags=(), device=None)
lvm.py 1198 WARNING Could not change ownership of one
or more volumes in vg (d4d7649d-4849-4413-bdc2-b7b84f239092) - ids
lvm.py 1168 INFO Creating LV
(vg=d4d7649d-4849-4413-bdc2-b7b84f239092, lv=leases, size=2048m,
activate=True, contiguous=False, initialTags=(), device=None)
lvm.py 1198 WARNING Could not change ownership of one
or more volumes in vg (d4d7649d-4849-4413-bdc2-b7b84f239092) - leases
lvm.py 1168 INFO Creating LV
(vg=d4d7649d-4849-4413-bdc2-b7b84f239092, lv=master, size=1024m,
activate=True, contiguous=False, initialTags=(), device=None)
lvm.py 1198 WARNING Could not change ownership of one
or more volumes in vg (d4d7649d-4849-4413-bdc2-b7b84f239092) - master
lvm.py 1333 INFO Deactivating lvs:
vg=d4d7649d-4849-4413-bdc2-b7b84f239092 lvs=['master']
blockdev.py 84 INFO Zeroing device
/dev/d4d7649d-4849-4413-bdc2-b7b84f239092/metadata (size=41943040)
utils.py 454 INFO Zero device
/dev/d4d7649d-4849-4413-bdc2-b7b84f239092/metadata: 0.00 seconds
blockdev.py 84 INFO Zeroing device
/dev/d4d7649d-4849-4413-bdc2-b7b84f239092/inbox (size=1024000)
utils.py 454 INFO Zero device
/dev/d4d7649d-4849-4413-bdc2-b7b84f239092/inbox: 0.02 seconds
blockdev.py 84 INFO Zeroing device
/dev/d4d7649d-4849-4413-bdc2-b7b84f239092/outbox (size=1024000)
utils.py 454 INFO Zero device
/dev/d4d7649d-4849-4413-bdc2-b7b84f239092/outbox: 0.02 seconds
lvm.py 1438 INFO Changing VG tags
(vg=d4d7649d-4849-4413-bdc2-b7b84f239092, delTags=[],
addTags=['MDT_LEASETIMESEC=60', 'MDT_IOOPTIMEOUTSEC=10',
'MDT_LEASERETRIES=3', 'MDT_LOCKRENEWALINTERVALSEC=5',
'MDT_SDUUID=d4d7649d-4849-4413-bdc2-b7b84f239092', 'MDT_ROLE=Regular',
'MDT_POOL_UUID=',
'MDT_PV0=pv:loop2&44&uuid:wzupcF-uQME-3PIa-4WNf-THJJ-57MS-3UIdxD&44&pestart:0&44&pecount:157&44&mapoffset:0',
'MDT_CLASS=Data',
'MDT__SHA_CKSUM=ee58868dee52c4cc128f0ee89a0c382de1fe6419',
'MDT_LOGBLKSIZE=512', 'MDT_VGUUID=3OJX6U-UDLc-VFtg-2cRO-q3kR-UH2g-Nvf78I',
'MDT_PHYBLKSIZE=512', 'MDT_DESCRIPTION=loop-domain', 'MDT_TYPE=ISCSI',
'MDT_VERSION=3', 'MDT_LOCKPOLICY='])
lvm.py 1438 INFO Changing VG tags
(vg=d4d7649d-4849-4413-bdc2-b7b84f239092,
delTags=['RHAT_storage_domain_UNREADY'], addTags=['RHAT_storage_domain'])
lvm.py 1325 INFO Activating lvs:
vg=d4d7649d-4849-4413-bdc2-b7b84f239092 lvs=['master']
PASSED
[ 50%]
Nir
Failed dependencies Ovirt 4.3 on clean install Centos 7.6
by Erick Perez
Centos 7.6 (minimal install ISO)
UEFI boot
yum -y update
reboot
[root@ovirt01] yum install cockpit-ovirt-dashboard
Error: Package: 1:openvswitch-2.10.1-1.el7.x86_64 (ovirt-4.3-centos-ovirt43)
Requires: librte_mbuf.so.3()(64bit)
Available: dpdk-17.11-13.el7.x86_64 (extras)
librte_mbuf.so.3()(64bit)
Available: dpdk-17.11-15.el7.x86_64 (extras)
librte_mbuf.so.3()(64bit)
Installed:dpdk-18.11-2.el7_6.x86_64 (@extras)
~librte_mbuf.so.4()(64bit)
You could try using --skip-broken to work around the problem
You could try running: rpm -Va --nofiles --nodigest
[root@ovirt01]
Note: The above message appears for librte_mbuf/mempool/pmd and several others.
It seems openvswitch specifically needs DPDK v17 and not v18.
Please clarify.
thanks,
oVirt Glance repo not accessible, blocking OST
by Tian Xu
Hi Experts,
I am trying to run ovirt-system-tests in my local lab, which needs a proxy
to access the internet. oVirt cannot work with a proxy to access the oVirt
Glance repository, whose URL is http://glance.ovirt.org:9292; a related
Bugzilla bug was filed by someone and is still open:
https://bugzilla.redhat.com/show_bug.cgi?id=1362433
Some oVirt system tests depend on the oVirt Glance repository for VM images;
these tests are either skipped or fail, and then subsequent VM-related tests
cannot be run. Here is the test file:
https://github.com/oVirt/ovirt-system-tests/blob/master/basic-suite-4.2/t...,
where verify_glance_import failed, blocking the other tests too.
Below is the exception message; see the attachment for the detailed test log.
Running test scenario 004_basic_sanity.py
<?xml version="1.0" encoding="UTF-8"?>
<testsuite name="nosetests" tests="3" errors="1" failures="0" skip="1">
<testcase classname="004_basic_sanity" name="add_blank_vms"
time="2.473"/>
<testcase classname="004_basic_sanity" name="verify_glance_import"
time="0.050">
<skipped type="unittest.case.SkipTest" message="Glance is not
available"><![CDATA[SkipTest: Glance is not available
]]></skipped>
</testcase>
<testcase classname="004_basic_sanity" name="add_vm1_from_template"
time="0.266">
<error type="exceptions.IndexError" message="list index out of
range"><![CDATA[Traceback (most recent call last):
File "/usr/lib64/python2.7/unittest/case.py", line 369, in run
testMethod()
File "/usr/lib/python2.7/site-packages/nose/case.py", line 197, in
runTest
self.test(*self.arg)
File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line
142, in wrapped_test
test()
File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line
60, in wrapper
return func(get_test_prefix(), *args, **kwargs)
File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line
79, in wrapper
prefix.virt_env.engine_vm().get_api(api_ver=4), *args, **kwargs
File
"/mnt/zz/bak/ovirt-system-tests-master/basic-suite-4.2/test-scenarios/004_basic_sanity.py",
line 748, in add_vm1_from_template
)[0]
IndexError: list index out of range
]]></error>
</testcase>
</testsuite>
Thanks,
Xu
Fwd: [ovirt-users] Ovirt Cluster completely unstable
by Sandro Bonazzola
Any suggestions from the Gluster team on how to get back to a stable system in a
very short loop?
I opened https://bugzilla.redhat.com/show_bug.cgi?id=1677160 to track this on
the Gluster side.
---------- Forwarded message ---------
From: <dscott(a)umbctraining.com>
Date: Thu, Feb 14, 2019 at 00:26
Subject: [ovirt-users] Ovirt Cluster completely unstable
To: <users(a)ovirt.org>
I'm abandoning my production oVirt cluster due to instability. I have a 7
host cluster running about 300 VMs and have been for over a year. It has
become unstable over the past three days. I have random hosts, both
compute and storage, disconnecting, AND many VMs disconnecting and becoming
unusable.
The 7 hosts are 4 compute hosts running oVirt 4.2.8 and three GlusterFS hosts
running 3.12.5. I submitted a Bugzilla bug and they immediately assigned
it to the storage people, but they have not responded with any meaningful
information. I have submitted several logs.
I have found some discussion of instability problems with Gluster
3.12.5. I would be willing to upgrade my Gluster to a more stable version
if that's the culprit. I installed Gluster using the oVirt GUI, and this is
the version the oVirt GUI installed.
Is there an oVirt health monitor available? Where should I be looking to
get a resolution to the problems I'm facing?
--
SANDRO BONAZZOLA
MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV
Red Hat EMEA <https://www.redhat.com/>
sbonazzo(a)redhat.com
<https://red.ht/sig>
disk cache issues
by Hetz Ben Hamo
Hi,
After digging around and finding a bit of info about viodiskcache, I
understand that if the user enables it, then the VM cannot be live migrated.
Umm, unless the operator decides to do a live migration that also changes
storage, I don't understand why live migration is disabled. If the VM
will only be live migrated between nodes, then the storage is the same and
nothing is saved locally on the node's hard disk, so what is the reason to
disable live migration?
Thanks,
Hetz
[ OST Failure Report ] [ oVirt Master (ovirt-engine) ] [ 13-02-2019 ] [ 001_initialize_engine.initialize_engine ]
by Dafna Ron
Hi,
We are failing on project ovirt-engine, on the master branch, in initialize engine.
Change suspected: https://gerrit.ovirt.org/#/c/95743/8 - core: Remove
unused field VdsStatic.vdsStrength
The setup log is here:
https://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/12864/arti...
The full logs are here:
https://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/12864/arti...
Can you please check the issue?
Thanks,
Dafna
<error>
INSERT INTO gluster_config_master(config_key, config_description,
minimum_supported_cluster, config_possible_values, config_feature)
values('use_meta_volume', 'Meta volume for the geo-replication
session', '3.5', 'false;true', 'geo_replication');
**************************
INSERT 0 1
2019-02-13 09:03:04,406-0500 Saving custom users permissions on
database objects...
********* QUERY **********
copy (
select count(*)
from pg_available_extensions
where
name = 'uuid-ossp' and
installed_version IS NOT NULL
) to stdout with delimiter as '|';
**************************
2019-02-13 09:03:04,906-0500 dbfunc_psql_die
--file=/usr/share/ovirt-engine/dbscripts/upgrade/04_00_0000_set_version.sql
********* QUERY **********
select 4000000;
**************************
4000000
2019-02-13 09:03:05,534-0500 DEBUG
otopi.plugins.ovirt_engine_setup.ovirt_engine.db.schema
plugin.executeRaw:863 execute-result:
['/usr/share/ovirt-engine/dbscripts/schema.sh', '-s', 'localhost',
'-p', '5432', '-u', 'engine', '-d', 'engine', '-l',
'/var/log/ovirt-engine/setup/ovirt-engine-setup-20190213090242-msviji.log',
'-c', 'apply'], rc=1
2019-02-13 09:03:05,534-0500 DEBUG
otopi.plugins.ovirt_engine_setup.ovirt_engine.db.schema
plugin.execute:921 execute-output:
['/usr/share/ovirt-engine/dbscripts/schema.sh', '-s', 'localhost',
'-p', '5432', '-u', 'engine', '-d', 'engine', '-l',
'/var/log/ovirt-engine/setup/ovirt-engine-setup-20190213090242-msviji.log',
'-c', 'apply'] stdout:
2019-02-13 09:03:05,535-0500 DEBUG
otopi.plugins.ovirt_engine_setup.ovirt_engine.db.schema
plugin.execute:926 execute-output:
['/usr/share/ovirt-engine/dbscripts/schema.sh', '-s', 'localhost',
'-p', '5432', '-u', 'engine', '-d', 'engine', '-l',
'/var/log/ovirt-engine/setup/ovirt-engine-setup-20190213090242-msviji.log',
'-c', 'apply'] stderr:
FATAL: Operation aborted, found duplicate version: 04030790
2019-02-13 09:03:05,535-0500 ERROR
otopi.plugins.ovirt_engine_setup.ovirt_engine.db.schema
schema._misc:435 schema.sh: FATAL: Operation aborted, found duplicate
version: 04030790
2019-02-13 09:03:05,535-0500 DEBUG otopi.context
context._executeMethod:142 method exception
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/otopi/context.py", line 132,
in _executeMethod
method['method']()
File "/usr/share/ovirt-engine/setup/bin/../plugins/ovirt-engine-setup/ovirt-engine/db/schema.py",
line 437, in _misc
raise RuntimeError(_('Engine schema refresh failed'))
RuntimeError: Engine schema refresh failed
2019-02-13 09:03:05,536-0500 ERROR otopi.context
context._executeMethod:151 Failed to execute stage 'Misc
configuration': Engine schema refresh failed
2019-02-13 09:03:05,562-0500 DEBUG
otopi.plugins.otopi.debug.debug_failure.debug_failure
debug_failure._notification:100 tcp connections:
id uid local foreign state pid exe
</error>
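For context, this FATAL usually means two upgrade scripts under
dbscripts/upgrade ended up with the same numeric version prefix (04030790
here), which can happen when two patches each add a 04_03_0790_*.sql script.
A quick, hypothetical way to spot such duplicates in a checkout (the path and
the naming convention are assumptions):

    import os
    import re
    from collections import Counter

    # Hypothetical helper: adjust UPGRADE_DIR to your ovirt-engine checkout.
    UPGRADE_DIR = "packaging/dbscripts/upgrade"

    versions = []
    for name in os.listdir(UPGRADE_DIR):
        m = re.match(r"(\d{2})_(\d{2})_(\d{4})_.+\.sql$", name)
        if m:
            versions.append("".join(m.groups()))  # e.g. "04030790"

    for version, count in sorted(Counter(versions).items()):
        if count > 1:
            print("duplicate version:", version)

Renaming one of the clashing scripts to the next free version number should
let the schema refresh proceed again.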
Recovering from power failure
by Sandro Bonazzola
Moving the discussion to the devel list for this scenario.
On Tue, Feb 12, 2019 at 11:16 Hetz Ben Hamo <hetz(a)hetz.biz>
wrote:
> Hi,
>
> Well, there is a severe bug that I complained about on 4.2 (or 4.1? I
> don't remember) regarding "yanking the power cable".
> Basically I'm performing a simple test: kill all hosts immediately to
> simulate a power loss without UPS.
>
> For this test I have 2 nodes, and 4 storage domains: hosted_storage (that
> was setup during the HE installation), 1 iSCSI domain, 1 NAS domain and 1
> ISO domain.
>
> After all the nodes lose power, I power them on and the following
> procedure happens:
> 1. The node with the HE finishes booting, and it takes a few minutes until the
> HE is up.
> 2. When the HE is up, all the storage domains come back to life as online
> and VMs with high availability start to boot.
> 3. A few minutes later, *all* storage domains (with the exception of
> hosted_storage) go down.
> 4. After about 5 minutes, all the other storage domains which went down
> come back up, but by then any VMs without high availability that are not
> hosted on hosted_storage remain down; you'll need to power them back on
> manually.
>
> This whole procedure takes about 15-25 minutes after booting the nodes,
> and this issue is always repeatable: just kill the power to the nodes,
> power them up again and see for yourself.
>
> The solution would be to change the code so that if a storage domain is up,
> *leave it up* and skip the check.
>
>
Tal, Nir, what do you think about this?
> Thanks
>
>
> On Tue, Feb 12, 2019 at 11:56 AM Sandro Bonazzola <sbonazzo(a)redhat.com>
> wrote:
>
>> Hi,
>> We are planning to release the first candidate of 4.3.1 on February
>> 20th[1] and the final release on February 26th.
>> Please join us in testing this release candidate right after it is
>> announced!
>> We are going to coordinate the testing effort with a public Trello board
>> at https://trello.com/b/5ZNJgPC3
>> You'll find instructions on how to use the board there.
>>
>> If you have an environment dedicated to testing, remember you can set up a
>> few VMs and test the deployment with nested virtualization.
>> To ease the setup of such an environment you can use Lago (
>> https://github.com/lago-project)
>>
>> The oVirt team will monitor the Trello board, the #ovirt IRC channel on
>> irc.oftc.net server and the users(a)ovirt.org mailing list to assist with
>> the testing.
>>
>> [1]
>> https://www.ovirt.org/develop/release-management/releases/4.3.z/release-m...
>>
>> --
>>
>> SANDRO BONAZZOLA
>>
>> MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV
>>
>> Red Hat EMEA <https://www.redhat.com/>
>>
>> sbonazzo(a)redhat.com
>> <https://red.ht/sig>
>> _______________________________________________
>> Users mailing list -- users(a)ovirt.org
>> To unsubscribe send an email to users-leave(a)ovirt.org
>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>> oVirt Code of Conduct:
>> https://www.ovirt.org/community/about/community-guidelines/
>> List Archives:
>> https://lists.ovirt.org/archives/list/users@ovirt.org/message/URIGV3LPTE2...
>>
>
--
SANDRO BONAZZOLA
MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV
Red Hat EMEA <https://www.redhat.com/>
sbonazzo(a)redhat.com
<https://red.ht/sig>