
On Wed, Nov 4, 2020 at 12:18 PM Marcin Sobczyk <msobczyk@redhat.com> wrote:
On 11/3/20 7:21 PM, Nir Soffer wrote:
On Tue, Nov 3, 2020 at 8:05 PM Nir Soffer <nsoffer@redhat.com> wrote:
On Tue, Nov 3, 2020 at 6:53 PM Nir Soffer <nsoffer@redhat.com> wrote:
On Tue, Nov 3, 2020 at 3:22 PM Marcin Sobczyk <msobczyk@redhat.com> wrote:
Hi All,
there are multiple pieces of information floating around on how to set up a machine for running OST. Some of them are outdated (like dealing with el7), some are more recent, but still a bit messy.
Not long ago, in an email conversation, Milan presented an Ansible playbook with the steps necessary to do that. We've picked up the playbook, tweaked it a bit, made a convenience shell script wrapper that runs it, and pushed that into the OST project [1].
This script, along with the playbook, should be our single-source-of-truth, one-stop solution for the job. It's been tested by a couple of people and proved able to set up everything on a bare (rh)el8 machine. If you encounter any problems with the script, please either report them on the devel mailing list, directly to me, or simply file a patch. Let's keep it maintained.

Awesome, thanks! So setup_for_ost.sh finished successfully (after more than an hour), but now I see conflicting documentation and comments about how to run test suites and how to clean up after the run.
The docs say: https://ovirt-system-tests.readthedocs.io/en/latest/general/running_tests/in...
./run_suite.sh basic-suite-4.0
But I see other undocumented ways in recent threads:
run_tests

Trying the run_tests option, from a recent mail:
. lagofy.sh
lago_init /usr/share/ost-images/el8-engine-installed.qcow2 -k /usr/share/ost-images/el8_id_rsa

This fails:
$ . lagofy.sh
Suite basic-suite-master - lago_init /usr/share/ost-images/el8-engine-installed.qcow2 -k /usr/share/ost-images/el8_id_rsa
Add your group to qemu's group: "usermod -a -G qemu nsoffer"
setup_for_ost.sh should handle this, no?

It does: https://github.com/oVirt/ovirt-system-tests/blob/e1c1873d1e7de3f136e46b6355b... Maybe you didn't relog, so the group membership hasn't taken effect yet? But I agree a message should be printed to the user if relogging is necessary - I will write a patch for it.
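For reference, a minimal sketch of the fix (the usermod command is the one suggested by lagofy.sh above; newgrp is just one way to pick up the new group without a full relog):

$ sudo usermod -a -G qemu $USER   # add yourself to the qemu group
$ newgrp qemu                     # or log out and back in
$ id -nG | grep -w qemu           # verify the membership is effective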
[nsoffer@ost ovirt-system-tests]$ lago_init /usr/share/ost-images/el8-engine-installed.qcow2 -k /usr/share/ost-images/el8_id_rsa
Using images ost-images-el8-host-installed-1-202011021248.x86_64, ost-images-el8-engine-installed-1-202011021248.x86_64 containing ovirt-engine-4.4.4-0.0.master.20201031195930.git8f858d6c01d.el8.noarch vdsm-4.40.35.1-1.el8.x86_64
@ Initialize and populate prefix:
  # Initialize prefix:
    * Create prefix dirs:
    * Create prefix dirs: Success (in 0:00:00)
    * Generate prefix uuid:
    * Generate prefix uuid: Success (in 0:00:00)
    * Copying ssh key:
    * Copying ssh key: Success (in 0:00:00)
    * Tag prefix as initialized:
    * Tag prefix as initialized: Success (in 0:00:00)
  # Initialize prefix: Success (in 0:00:00)
  # Create disks for VM lago-basic-suite-master-engine:
    * Create disk root:
    * Create disk root: Success (in 0:00:00)
    * Create disk nfs:
    * Create disk nfs: Success (in 0:00:00)
    * Create disk iscsi:
    * Create disk iscsi: Success (in 0:00:00)
  # Create disks for VM lago-basic-suite-master-engine: Success (in 0:00:00)
  # Create disks for VM lago-basic-suite-master-host-0:
    * Create disk root:
    * Create disk root: Success (in 0:00:00)
  # Create disks for VM lago-basic-suite-master-host-0: Success (in 0:00:00)
  # Create disks for VM lago-basic-suite-master-host-1:
    * Create disk root:
    * Create disk root: Success (in 0:00:00)
  # Create disks for VM lago-basic-suite-master-host-1: Success (in 0:00:00)
  # Copying any deploy scripts:
  # Copying any deploy scripts: Success (in 0:00:00)
  # calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  # Missing current link, setting it to default
@ Initialize and populate prefix: ERROR (in 0:00:01)
Error occured, aborting
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/lago/cmd.py", line 987, in main
    cli_plugins[args.verb].do_run(args)
  File "/usr/lib/python3.6/site-packages/lago/plugins/cli.py", line 186, in do_run
    self._do_run(**vars(args))
  File "/usr/lib/python3.6/site-packages/lago/cmd.py", line 207, in do_init
    ssh_key=ssh_key,
  File "/usr/lib/python3.6/site-packages/lago/prefix.py", line 1143, in virt_conf_from_stream
    ssh_key=ssh_key
  File "/usr/lib/python3.6/site-packages/lago/prefix.py", line 1269, in virt_conf
    net_specs=conf['nets'],
  File "/usr/lib/python3.6/site-packages/lago/virt.py", line 101, in __init__
    self._nets[name] = self._create_net(spec, compat)
  File "/usr/lib/python3.6/site-packages/lago/virt.py", line 113, in _create_net
    return cls(self, net_spec, compat=compat)
  File "/usr/lib/python3.6/site-packages/lago/providers/libvirt/network.py", line 44, in __init__
    name=env.uuid,
  File "/usr/lib/python3.6/site-packages/lago/providers/libvirt/utils.py", line 96, in get_libvirt_connection
    return libvirt.openAuth(libvirt_url, auth)
  File "/usr/lib64/python3.6/site-packages/libvirt.py", line 104, in openAuth
    if ret is None:raise libvirtError('virConnectOpenAuth() failed')
libvirt.libvirtError: Failed to connect socket to '/var/run/libvirt/libvirt-sock': No such file or directory
$ systemctl status libvirtd
● libvirtd.service - Virtualization daemon
   Loaded: loaded (/usr/lib/systemd/system/libvirtd.service; enabled; vendor preset: enabled)
   Active: inactive (dead)
     Docs: man:libvirtd(8)
           https://libvirt.org

[nsoffer@ost ovirt-system-tests]$ systemctl status libvirtd.socket
● libvirtd.socket - Libvirt local socket
   Loaded: loaded (/usr/lib/systemd/system/libvirtd.socket; enabled; vendor preset: disabled)
   Active: inactive (dead)
   Listen: /run/libvirt/libvirt-sock (Stream)

[nsoffer@ost ovirt-system-tests]$ systemctl status libvirtd-ro.socket
● libvirtd-ro.socket - Libvirt local read-only socket
   Loaded: loaded (/usr/lib/systemd/system/libvirtd-ro.socket; enabled; vendor preset: disabled)
   Active: inactive (dead)
   Listen: /run/libvirt/libvirt-sock-ro (Stream)

[nsoffer@ost ovirt-system-tests]$ systemctl status libvirtd-admin.socket
● libvirtd-admin.socket - Libvirt admin socket
   Loaded: loaded (/usr/lib/systemd/system/libvirtd-admin.socket; disabled; vendor preset: disabled)
   Active: inactive (dead)
   Listen: /run/libvirt/libvirt-admin-sock (Stream)
Another setup step missing from setup_for_ost.sh?
Never encountered it myself, but I can also try enabling the 'libvirtd.socket' service in the playbook.
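For reference, a sketch of the manual workaround in the meantime (socket unit names are taken from the systemctl output above):

$ sudo systemctl enable --now libvirtd.socket libvirtd-ro.socket
# or just start the daemon itself for the current session:
$ sudo systemctl start libvirtd.service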
After adding myself to the qemu group and starting the libvirtd sockets, lago_init seems to work.
time run_tests

Not sure which tests will run, and:
run_tc basic-suite-master/test-scenarios/001_initialize_engine.py
Which seems to run only one test module.

There's also 'run_tests', which runs all the tests for you.
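Putting the lagofy.sh pieces from this thread together, the basic flow looks roughly like this (passing the suite name when sourcing lagofy.sh is an assumption - the run above simply defaulted to basic-suite-master):

$ cd ovirt-system-tests
$ . lagofy.sh basic-suite-master      # source the helper functions for a suite
$ lago_init /usr/share/ost-images/el8-engine-installed.qcow2 -k /usr/share/ost-images/el8_id_rsa
$ run_tests                           # run the whole suite
$ run_tc basic-suite-master/test-scenarios/001_initialize_engine.py   # or a single module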
This seems useful, but for one module I found this undocumented command:
python -B -m pytest -s -v -x --junit-xml=test.xml ${SUITE}/test-scenarios/name_test_pytest.py
This looks most promising, assuming that I can use -k test_name or -m marker to select only some tests for quick feedback. However, due to the way OST is built - mixing setup and test code, with later tests depending on earlier setup "tests" - I don't see how this is going to work with the current suites.
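For reference, the selection itself is plain pytest; it would look something like this (test_name_fragment is a placeholder, and as noted above, skipping earlier setup tests may break the later ones):

$ python -B -m pytest -s -v -x --junit-xml=test.xml \
    -k test_name_fragment \
    ${SUITE}/test-scenarios/name_test_pytest.py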
Perhaps what you want, some day, is for individual tests to have make-style dependencies: you'd ask for a single test, and OST would run only the bare minimum needed to reach it. (Also, mind you, the integration team is not perfect either - we also have bugs - so "mixing setup and test code" is not clear cut: setup code is also test code, as it exercises code that we want to test, and which sometimes fails.)
That is true - the way OST works is that tests depend on previous tests. It's not nice, but this is how OST has worked from the beginning - it's not feasible to set up a whole oVirt deployment for each test case. Changing that would require a complete redesign of the project.
'run_tc' is still useful though. You can use 'run_tests' to run OST up to some point, e.g. force a failure with 'assert False', and then work on your new test case with the whole setup left hanging at that stage. There is currently no way to freeze and restore the state of the environment though, so if you apply some intrusive changes you may need to rerun the suite from the beginning.
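A rough sketch of that workflow (the module names and the location of the temporary 'assert False' are placeholders):

# 1. add a temporary 'assert False' to the test module where you want the run to stop
$ run_tests     # runs the suite until the forced failure; the environment stays up
# 2. iterate on your new test case against the running environment:
$ run_tc basic-suite-master/test-scenarios/00X_my_new_test.py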
There used to be 'lago snapshot', no? I think I even used it at some point.
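If it still works, the idea would be roughly the following (verbs from memory and unverified, and the deployment directory name is an assumption - check 'lago --help' from the prefix created by lago_init):

$ cd deployment-basic-suite-master   # the lago prefix created by lago_init
$ lago snapshot before-my-test       # save the current state of the VMs
$ lago revert before-my-test         # roll back to it later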
What is the difference between these two approaches, and which one is the right way?
'run_suite.sh' is used by CI and carries all the possible legacy and burden (el7/py2/keeping all the suites happy etc.). 'lagofy.sh', on the other hand, was written by Michal recently and is more lightweight and user-friendly. There is no single right way - work with the one that suits you better.
My plan is to add a new storage suite that will run after some basic setup is done - engine, hosts, and storage are ready. Which test scenarios are needed to reach this state?
Probably up until '002_bootstrap' finishes, but you'll have to find the point that satisfies you yourself. If you think there's too much stuff in that module, please make a suggestion to split it into separate ones.
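A rough sketch of getting to that point with run_tc (the second module's filename is an assumption, based on the naming of 001_initialize_engine.py above):

$ run_tc basic-suite-master/test-scenarios/001_initialize_engine.py
$ run_tc basic-suite-master/test-scenarios/002_bootstrap.py
# engine, hosts and storage should now be up; a new storage suite could pick up from here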
Do we have any documentation on how to add a new suite, or is my only reference the network suite?
Unfortunately not, but feel free to reach out to me if you have any questions.
Thanks! (Still on my TODO list to try this - hopefully soon.)

-- Didi