[Ovirt] [CQ weekly status] [15-02-2019]
by Dafna Ron
Hi,
This mail provides the current status of the CQ and allows people to review the
status before and after the weekend.
Please refer to the colour map below for the meaning of the colours.
*CQ-4.2*: RED (#1)
1. We had a 4.2 failure because the selinux-policy package was not downloaded
automatically by OST as a dependency of the vmconsole package.
The package stopped downloading after this change in vmconsole:
https://gerrit.ovirt.org/#/c/97704/ - clean up and reorganize
I reported the issue to the list after discussing it with Sandro, and after
a discussion with the developer, Sandro approved merging a patch to CI that
manually adds the package:
https://gerrit.ovirt.org/#/c/97785/
I am re-running the project's last patch now.
- Action item for Galit: check why OST is not detecting the package if
the project dependencies are OK.
*CQ-Master:* RED (#1)
1. The same failure in vmconsole was causing the upgrade suite to fail on
master.
The currently running jobs for 4.2 [1] and master [2] can be found here:
[1]
http://jenkins.ovirt.org/view/Change%20queue%20jobs/job/ovirt-4.2_change-...
[2]
http://jenkins.ovirt.org/view/Change%20queue%20jobs/job/ovirt-master_chan...
Happy week!
Dafna
-------------------------------------------------------------------------------------------------------------------
COLOUR MAP
Green = job has been passing successfully
** green for more than 3 days may suggest we need a review of our test
coverage
1. 1-3 days GREEN (#1)
2. 4-7 days GREEN (#2)
3. Over 7 days GREEN (#3)
Yellow = intermittent failures for different projects but no lasting or
current regressions
** intermittent failures indicate a healthy project, as we expect a number of
failures during the week
** I will not report any of the solved failures or regressions.
1. Solved job failures YELLOW (#1)
2. Solved regressions YELLOW (#2)
Red = job has been failing
** Active failures. The colour will change based on the amount of time the
project(s) has been broken. Only active regressions will be reported.
1. 1-3 days RED (#1)
2. 4-7 days RED (#2)
3. Over 7 days RED (#3)
[VDSM] Running the new storage tests on your laptop
by Nir Soffer
I want to share our new block storage tests, running on your laptop, from
your editor, creating a real block storage domain with real logical volumes.
One catch: these tests require root - there is no way to create devices
without root.
To make it easy to run only the tests that need root as root, they are
marked with the "root" mark.
Here is an example of running the root tests for the block storage domain:
$ sudo ~/.local/bin/tox -e storage-py27 tests/storage/blocksd_test.py
-- -m root
And for lvm:
$ sudo ~/.local/bin/tox -e storage-py27 tests/storage/lvm_test.py -- -m
root
To run all storage tests that require root:
$ sudo ~/.local/bin/tox -e storage-py27 -- -m root tests/storage
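For reference, here is a minimal sketch of how such a "root" mark can be wired
up with plain pytest. The marker name matches the one above, but the skipif
guard and the test body are assumptions - vdsm's actual fixture and marker
setup may differ:

    import os

    import pytest

    # Hypothetical sketch: skip root-only tests when not running as root.
    requires_root = pytest.mark.skipif(
        os.geteuid() != 0, reason="requires root")


    @requires_root
    @pytest.mark.root
    def test_create_block_domain():
        # A real root-only test would create loop devices and LVs here.
        assert os.geteuid() == 0

With a layout like this, "-m root" selects only the marked tests, and running
the suite without sudo simply skips them.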
Another issue: after running the tests as root, you need to fix the ownership
of some files in .tox/, tests/htmlcov*, and /var/tmp/vdsm. You can run:
$ sudo chown -R $USER:$USER .tox tests /var/tmp/vdsm
We will improve this later.
Note that I'm running a user-installed tox:
$ pip install --user tox
This gets the most recent tox with minimal breakage of the system Python.
With the new tests, our code coverage is now 57%:
https://jenkins.ovirt.org/job/vdsm_standard-check-patch/2888/artifact/che...
- blockSD: 47%
https://jenkins.ovirt.org/job/vdsm_standard-check-patch/2888/artifact/che...
- lvm: 74%
https://jenkins.ovirt.org/job/vdsm_standard-check-patch/2888/artifact/che...
These tests are rather slow; all the root tests take 26 seconds. But OST
takes more than
40 minutes, and covers less code in this area.
OST coverage for the lvm module: 71%
https://jenkins.ovirt.org/job/ovirt-system-tests_manual/4071/artifact/exp...
To debug the tests, you can use a new option in recent pytest,
--log-cli-level:
$ sudo ~/.local/bin/tox -e storage-py27 tests/storage/blocksd_test.py
-- -m root --log-cli-level=info
Here is an example output from a test creating a storage domain
(use --log-cli-level=debug if this is not verbose enough):
-------------------------------------------------------------------------------
live log call
-------------------------------------------------------------------------------
blockSD.py 1034 INFO
sdUUID=d4d7649d-4849-4413-bdc2-b7b84f239092 domainName=loop-domain
domClass=1 vgUUID=3OJX6U-UDLc-VFtg-2cRO-q3kR-UH2g-Nvf78I storageType=3
version=3, block_size=512, alignment=1048576
blockSD.py 600 INFO size 512 MB (metaratio 262144)
lvm.py 1168 INFO Creating LV
(vg=d4d7649d-4849-4413-bdc2-b7b84f239092, lv=metadata, size=512m,
activate=True, contiguous=False, initialTags=(), device=None)
lvm.py 1198 WARNING Could not change ownership of one
or more volumes in vg (d4d7649d-4849-4413-bdc2-b7b84f239092) - metadata
blockSD.py 522 INFO Create: SORT MAPPING: ['/dev/loop2']
lvm.py 1168 INFO Creating LV
(vg=d4d7649d-4849-4413-bdc2-b7b84f239092, lv=inbox, size=16m,
activate=True, contiguous=False, initialTags=(), device=None)
lvm.py 1198 WARNING Could not change ownership of one
or more volumes in vg (d4d7649d-4849-4413-bdc2-b7b84f239092) - inbox
lvm.py 1168 INFO Creating LV
(vg=d4d7649d-4849-4413-bdc2-b7b84f239092, lv=outbox, size=16m,
activate=True, contiguous=False, initialTags=(), device=None)
lvm.py 1198 WARNING Could not change ownership of one
or more volumes in vg (d4d7649d-4849-4413-bdc2-b7b84f239092) - outbox
lvm.py 1168 INFO Creating LV
(vg=d4d7649d-4849-4413-bdc2-b7b84f239092, lv=ids, size=8m, activate=True,
contiguous=False, initialTags=(), device=None)
lvm.py 1198 WARNING Could not change ownership of one
or more volumes in vg (d4d7649d-4849-4413-bdc2-b7b84f239092) - ids
lvm.py 1168 INFO Creating LV
(vg=d4d7649d-4849-4413-bdc2-b7b84f239092, lv=leases, size=2048m,
activate=True, contiguous=False, initialTags=(), device=None)
lvm.py 1198 WARNING Could not change ownership of one
or more volumes in vg (d4d7649d-4849-4413-bdc2-b7b84f239092) - leases
lvm.py 1168 INFO Creating LV
(vg=d4d7649d-4849-4413-bdc2-b7b84f239092, lv=master, size=1024m,
activate=True, contiguous=False, initialTags=(), device=None)
lvm.py 1198 WARNING Could not change ownership of one
or more volumes in vg (d4d7649d-4849-4413-bdc2-b7b84f239092) - master
lvm.py 1333 INFO Deactivating lvs:
vg=d4d7649d-4849-4413-bdc2-b7b84f239092 lvs=['master']
blockdev.py 84 INFO Zeroing device
/dev/d4d7649d-4849-4413-bdc2-b7b84f239092/metadata (size=41943040)
utils.py 454 INFO Zero device
/dev/d4d7649d-4849-4413-bdc2-b7b84f239092/metadata: 0.00 seconds
blockdev.py 84 INFO Zeroing device
/dev/d4d7649d-4849-4413-bdc2-b7b84f239092/inbox (size=1024000)
utils.py 454 INFO Zero device
/dev/d4d7649d-4849-4413-bdc2-b7b84f239092/inbox: 0.02 seconds
blockdev.py 84 INFO Zeroing device
/dev/d4d7649d-4849-4413-bdc2-b7b84f239092/outbox (size=1024000)
utils.py 454 INFO Zero device
/dev/d4d7649d-4849-4413-bdc2-b7b84f239092/outbox: 0.02 seconds
lvm.py 1438 INFO Changing VG tags
(vg=d4d7649d-4849-4413-bdc2-b7b84f239092, delTags=[],
addTags=['MDT_LEASETIMESEC=60', 'MDT_IOOPTIMEOUTSEC=10',
'MDT_LEASERETRIES=3', 'MDT_LOCKRENEWALINTERVALSEC=5',
'MDT_SDUUID=d4d7649d-4849-4413-bdc2-b7b84f239092', 'MDT_ROLE=Regular',
'MDT_POOL_UUID=',
'MDT_PV0=pv:loop2&44&uuid:wzupcF-uQME-3PIa-4WNf-THJJ-57MS-3UIdxD&44&pestart:0&44&pecount:157&44&mapoffset:0',
'MDT_CLASS=Data',
'MDT__SHA_CKSUM=ee58868dee52c4cc128f0ee89a0c382de1fe6419',
'MDT_LOGBLKSIZE=512', 'MDT_VGUUID=3OJX6U-UDLc-VFtg-2cRO-q3kR-UH2g-Nvf78I',
'MDT_PHYBLKSIZE=512', 'MDT_DESCRIPTION=loop-domain', 'MDT_TYPE=ISCSI',
'MDT_VERSION=3', 'MDT_LOCKPOLICY='])
lvm.py 1438 INFO Changing VG tags
(vg=d4d7649d-4849-4413-bdc2-b7b84f239092,
delTags=['RHAT_storage_domain_UNREADY'], addTags=['RHAT_storage_domain'])
lvm.py 1325 INFO Activating lvs:
vg=d4d7649d-4849-4413-bdc2-b7b84f239092 lvs=['master']
PASSED
[ 50%]
Nir
Failed dependencies Ovirt 4.3 on clean install Centos 7.6
by Erick Perez
Centos 7.6 (minimal install ISO)
UEFI boot
yum -y update
reboot
[root@ovirt01] yum install cockpit-ovirt-dashboard
Error: Package: 1:openvswitch-2.10.1-1.el7.x86_64 (ovirt-4.3-centos-ovirt43)
Requires: librte_mbuf.so.3()(64bit)
Available: dpdk-17.11-13.el7.x86_64 (extras)
librte_mbuf.so.3()(64bit)
Available: dpdk-17.11-15.el7.x86_64 (extras)
librte_mbuf.so.3()(64bit)
Installed:dpdk-18.11-2.el7_6.x86_64 (@extras)
~librte_mbuf.so.4()(64bit)
You could try using --skip-broken to work around the problem
You could try running: rpm -Va --nofiles --nodigest
[root@ovirt01]
Note: The above message appears for librte_mbuf/mempool/pmd and several others.
It seems openvswitch specifically needs DPDK v17 and not v18.
Please clarify.
thanks,
oVirt Glance repo not accessible, blocking OST
by Tian Xu
Hi Experts,
I am trying to run ovirt-system-tests in my local lab, which needs a proxy
to access the internet. oVirt cannot work with a proxy to access the oVirt
Glance repository, whose URL is http://glance.ovirt.org:9292; a related
Bugzilla bug was filed by someone and is still open:
https://bugzilla.redhat.com/show_bug.cgi?id=1362433
Some oVirt system tests depend on the oVirt Glance repository for VM images;
these tests are either skipped or fail, and then subsequent VM-related tests
cannot be run. Here is the test file:
https://github.com/oVirt/ovirt-system-tests/blob/master/basic-suite-4.2/t...,
where verify_glance_import failed, blocking the other tests too.
Below is the exception message; see the attachment for the detailed test log.
Running test scenario 004_basic_sanity.py
<?xml version="1.0" encoding="UTF-8"?>
<testsuite name="nosetests" tests="3" errors="1" failures="0" skip="1">
<testcase classname="004_basic_sanity" name="add_blank_vms"
time="2.473"/>
<testcase classname="004_basic_sanity" name="verify_glance_import"
time="0.050">
<skipped type="unittest.case.SkipTest" message="Glance is not
available"><![CDATA[SkipTest: Glance is not available
]]></skipped>
</testcase>
<testcase classname="004_basic_sanity" name="add_vm1_from_template"
time="0.266">
<error type="exceptions.IndexError" message="list index out of
range"><![CDATA[Traceback (most recent call last):
File "/usr/lib64/python2.7/unittest/case.py", line 369, in run
testMethod()
File "/usr/lib/python2.7/site-packages/nose/case.py", line 197, in
runTest
self.test(*self.arg)
File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line
142, in wrapped_test
test()
File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line
60, in wrapper
return func(get_test_prefix(), *args, **kwargs)
File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line
79, in wrapper
prefix.virt_env.engine_vm().get_api(api_ver=4), *args, **kwargs
File
"/mnt/zz/bak/ovirt-system-tests-master/basic-suite-4.2/test-scenarios/004_basic_sanity.py",
line 748, in add_vm1_from_template
)[0]
IndexError: list index out of range
]]></error>
</testcase>
</testsuite>
Thanks,
Xu
Fwd: [ovirt-users] Ovirt Cluster completely unstable
by Sandro Bonazzola
Any suggestions from the Gluster team on how to get back to a stable system in a
very short loop?
I opened https://bugzilla.redhat.com/show_bug.cgi?id=1677160 to track this on
the Gluster side.
---------- Forwarded message ---------
From: <dscott(a)umbctraining.com>
Date: Thu, Feb 14, 2019 at 00:26
Subject: [ovirt-users] Ovirt Cluster completely unstable
To: <users(a)ovirt.org>
I'm abandoning my production oVirt cluster due to instability. I have a 7
host cluster running about 300 VMs and have been for over a year. It has
become unstable over the past three days. I have random hosts, both
compute and storage, disconnecting, AND many VMs disconnecting and becoming
unusable.
The 7 hosts are 4 compute hosts running oVirt 4.2.8 and three GlusterFS hosts
running 3.12.5. I submitted a Bugzilla bug and they immediately assigned
it to the storage people, but they have not responded with any meaningful
information. I have submitted several logs.
I have found some discussion of instability problems with Gluster
3.12.5. I would be willing to upgrade my Gluster to a more stable version
if that's the culprit. I installed Gluster using the oVirt GUI, and this is
the version the oVirt GUI installed.
Is there an oVirt health monitor available? Where should I be looking to
get a resolution to the problems I'm facing?
--
SANDRO BONAZZOLA
MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV
Red Hat EMEA <https://www.redhat.com/>
sbonazzo(a)redhat.com
<https://red.ht/sig>
disk cache issues
by Hetz Ben Hamo
Hi,
After digging around and finding a bit of info about viodiskcache, I
understand that if the user enables it, then the VM cannot be live migrated.
Umm, unless the operator decides to do a live migration that also changes
storage, I don't understand why live migration is disabled. If the VM
will only be live migrated between nodes, then the storage is the same and
nothing is saved locally on the node's hard disk, so what is the reason to
disable live migration?
Thanks,
Hetz
[ OST Failure Report ] [ oVirt Master (ovirt-engine) ] [ 13-02-2019 ] [ 001_initialize_engine.initialize_engine ]
by Dafna Ron
Hi,
We are failing on project ovirt-engine, on the master branch, in initialize engine.
Change suspected: https://gerrit.ovirt.org/#/c/95743/8 - core: Remove
unused field VdsStatic.vdsStrength
The setup log is here:
https://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/12864/arti...
The full logs are here:
https://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/12864/arti...
Can you please check the issue?
Thanks,
Dafna
<error>
INSERT INTO gluster_config_master(config_key, config_description,
minimum_supported_cluster, config_possible_values, config_feature)
values('use_meta_volume', 'Meta volume for the geo-replication
session', '3.5', 'false;true', 'geo_replication');
**************************
INSERT 0 1
2019-02-13 09:03:04,406-0500 Saving custom users permissions on
database objects...
********* QUERY **********
copy (
select count(*)
from pg_available_extensions
where
name = 'uuid-ossp' and
installed_version IS NOT NULL
) to stdout with delimiter as '|';
**************************
2019-02-13 09:03:04,906-0500 dbfunc_psql_die
--file=/usr/share/ovirt-engine/dbscripts/upgrade/04_00_0000_set_version.sql
********* QUERY **********
select 4000000;
**************************
4000000
2019-02-13 09:03:05,534-0500 DEBUG
otopi.plugins.ovirt_engine_setup.ovirt_engine.db.schema
plugin.executeRaw:863 execute-result:
['/usr/share/ovirt-engine/dbscripts/schema.sh', '-s', 'localhost',
'-p', '5432', '-u', 'engine', '-d', 'engine', '-l',
'/var/log/ovirt-engine/setup/ovirt-engine-setup-20190213090242-msviji.log',
'-c', 'apply'], rc=1
2019-02-13 09:03:05,534-0500 DEBUG
otopi.plugins.ovirt_engine_setup.ovirt_engine.db.schema
plugin.execute:921 execute-output:
['/usr/share/ovirt-engine/dbscripts/schema.sh', '-s', 'localhost',
'-p', '5432', '-u', 'engine', '-d', 'engine', '-l',
'/var/log/ovirt-engine/setup/ovirt-engine-setup-20190213090242-msviji.log',
'-c', 'apply'] stdout:
2019-02-13 09:03:05,535-0500 DEBUG
otopi.plugins.ovirt_engine_setup.ovirt_engine.db.schema
plugin.execute:926 execute-output:
['/usr/share/ovirt-engine/dbscripts/schema.sh', '-s', 'localhost',
'-p', '5432', '-u', 'engine', '-d', 'engine', '-l',
'/var/log/ovirt-engine/setup/ovirt-engine-setup-20190213090242-msviji.log',
'-c', 'apply'] stderr:
FATAL: Operation aborted, found duplicate version: 04030790
2019-02-13 09:03:05,535-0500 ERROR
otopi.plugins.ovirt_engine_setup.ovirt_engine.db.schema
schema._misc:435 schema.sh: FATAL: Operation aborted, found duplicate
version: 04030790
2019-02-13 09:03:05,535-0500 DEBUG otopi.context
context._executeMethod:142 method exception
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/otopi/context.py", line 132,
in _executeMethod
method['method']()
File "/usr/share/ovirt-engine/setup/bin/../plugins/ovirt-engine-setup/ovirt-engine/db/schema.py",
line 437, in _misc
raise RuntimeError(_('Engine schema refresh failed'))
RuntimeError: Engine schema refresh failed
2019-02-13 09:03:05,536-0500 ERROR otopi.context
context._executeMethod:151 Failed to execute stage 'Misc
configuration': Engine schema refresh failed
2019-02-13 09:03:05,562-0500 DEBUG
otopi.plugins.otopi.debug.debug_failure.debug_failure
debug_failure._notification:100 tcp connections:
id uid local foreign state pid exe
</error>
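For context, this FATAL usually means two upgrade scripts under
dbscripts/upgrade ended up with the same numeric version prefix (04030790
here), which can happen when two patches each add a 04_03_0790_*.sql script.
A quick, hypothetical way to spot such duplicates in a checkout (the path and
the naming convention are assumptions):

    import os
    import re
    from collections import Counter

    # Hypothetical helper: adjust UPGRADE_DIR to your ovirt-engine checkout.
    UPGRADE_DIR = "packaging/dbscripts/upgrade"

    versions = []
    for name in os.listdir(UPGRADE_DIR):
        m = re.match(r"(\d{2})_(\d{2})_(\d{4})_.+\.sql$", name)
        if m:
            versions.append("".join(m.groups()))  # e.g. "04030790"

    for version, count in sorted(Counter(versions).items()):
        if count > 1:
            print("duplicate version:", version)

Renaming one of the clashing scripts to the next free version number should
let the schema refresh proceed again.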
Recovering from power failure
by Sandro Bonazzola
Moving the discussion to the devel list for this scenario.
On Tue, Feb 12, 2019 at 11:16 Hetz Ben Hamo <hetz(a)hetz.biz>
wrote:
> Hi,
>
> Well, there is a severe bug that I complained about on 4.2 (or 4.1? I
> don't remember) regarding "yanking the power cable".
> Basically I'm performing a simple test: kill all hosts immediately to
> simulate a power loss without UPS.
>
> For this test I have 2 nodes, and 4 storage domains: hosted_storage (that
> was setup during the HE installation), 1 iSCSI domain, 1 NAS domain and 1
> ISO domain.
>
> After all the nodes lose power, I power them on and the following
> procedure happens:
> 1. The node with the HE finishes booting, and it takes a few minutes until the
> HE is up.
> 2. When the HE is up, all the storage domains come back to life as online
> and VMs with high availability start to boot.
> 3. A few minutes later, *all* storage domains (with the exception of
> hosted_storage) go down.
> 4. After about 5 minutes, all the other storage domains which went down
> come back up, but by then any VMs without high availability that are not
> hosted on hosted_storage remain down; you'll need to power them back on
> manually.
>
> This whole procedure takes about 15-25 minutes after booting the nodes,
> and this issue is always repeatable: just kill the power to the nodes,
> power them up again and see for yourself.
>
> The solution would be to change the code so that if a storage domain is up,
> *leave it up* and skip the check.
>
>
Tal, Nir, what do you think about this?
> Thanks
>
>
> On Tue, Feb 12, 2019 at 11:56 AM Sandro Bonazzola <sbonazzo(a)redhat.com>
> wrote:
>
>> Hi,
>> We are planning to release the first candidate of 4.3.1 on February
>> 20th[1] and the final release on February 26th.
>> Please join us in testing this release candidate right after it is
>> announced!
>> We are going to coordinate the testing effort with a public Trello board
>> at https://trello.com/b/5ZNJgPC3
>> You'll find instructions on how to use the board there.
>>
>> If you have an environment dedicated to testing, remember you can set up a
>> few VMs and test the deployment with nested virtualization.
>> To ease the setup of such an environment you can use Lago (
>> https://github.com/lago-project)
>>
>> The oVirt team will monitor the Trello board, the #ovirt IRC channel on
>> irc.oftc.net server and the users(a)ovirt.org mailing list to assist with
>> the testing.
>>
>> [1]
>> https://www.ovirt.org/develop/release-management/releases/4.3.z/release-m...
>>
>> --
>>
>> SANDRO BONAZZOLA
>>
>> MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV
>>
>> Red Hat EMEA <https://www.redhat.com/>
>>
>> sbonazzo(a)redhat.com
>> <https://red.ht/sig>
>> _______________________________________________
>> Users mailing list -- users(a)ovirt.org
>> To unsubscribe send an email to users-leave(a)ovirt.org
>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>> oVirt Code of Conduct:
>> https://www.ovirt.org/community/about/community-guidelines/
>> List Archives:
>> https://lists.ovirt.org/archives/list/users@ovirt.org/message/URIGV3LPTE2...
>>
>
--
SANDRO BONAZZOLA
MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV
Red Hat EMEA <https://www.redhat.com/>
sbonazzo(a)redhat.com
<https://red.ht/sig>