On Thu, Dec 16, 2021 at 10:47 PM Nir Soffer <nsoffer@redhat.com> wrote:
On Thu, Dec 16, 2021 at 10:14 PM Milan Zamazal <mzamazal@redhat.com> wrote:
>
> Hi,
>
> when I run Vdsm tests as a non-root user in vdsm-test-centos-8
> container, they systematically fail on several storage tests.  I run the
> tests as
>
>   podman run -it -v $HOME:$HOME --userns=keep-id vdsm-test-centos-8 $HOME/test.sh

Why do you need the --userns option?

You run this in a container because your development environment
is not compatible - right?

> where test.sh is
>
>   #!/bin/sh
>
>   export TRAVIS_CI=1

This is not correct for your local container - in travis we run a privileged container
as root and we create loop devices before the test.
 
>   cd .../vdsm
>   ./autogen.sh --system
>   make clean
>   make

I never run autogen and make during the tests; this can be done once
after checkout or when modifying the makefiles.

>   make lint
>   make tests

You are missing:

    make storage

Without this, a lot of tests will be skipped or xfailed.

>   make tests-storage
>
> The failing tests are in devicemapper_test.py, outofprocess_test.py and
> qemuimg_test.py.  I have also seen a test failure in nbd_test.py but not
> always.  Is it a problem of the tests or of my environment?

nbd_test.py should be skipped unless you run as root, or have a running
supervdsm serving your user.

I usually run the storage tests like this locally on Fedora (35 now):

    make storage
    tox -e storage

Results:

storage/devicemapper_test.py - 5 pass, rest skip
storage/nbd_test.py - 2 pass, rest skip
storage/qemuimg_test.py - 2 skips 
storage/outofprocess_test.py - 3 skips, rest pass 

2350 passed, 161 skipped, 102 deselected, 3 xfailed, 13093 warnings in 134.80s (0:02:14) 


I'll try later using the container.


I tried this:

    podman run --rm -it -v `pwd`:/src:Z --userns=keep-id vdsm-test-centos-8
    cd src
    make tests-storage

Got 2 test failures:


============================================================= FAILURES ==============================================================
______________________________________________________ test_block_device_name _______________________________________________________

    def test_block_device_name():
        devs = glob.glob("/sys/block/*/dev")
        dev_name = os.path.basename(os.path.dirname(devs[0]))
        with open(devs[0], 'r') as f:
            major_minor = f.readline().rstrip()
>           assert devicemapper.device_name(major_minor) == dev_name
E           AssertionError: assert '7:1' == 'loop1'
E             - loop1
E             + 7:1

This likely failed since there are no loop devices in the container:

bash-4.4$ ls /dev/
console  core  fd  full  mqueue  null  ptmx  pts  random  shm  stderr  stdin  stdout  tty  urandom  zero

And there is no way to create them, since you run as a regular user
and sudo does not work. Even if it worked, I think you would not be able
to create the loop devices since the container is not privileged.

It may be possible to map the loop devices from the host to the container
but I never tried.
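
The mismatch seen above (the test finds loop1 under /sys/block, but
ls /dev/ shows no loop nodes) can be checked with a short script. This is
only a sketch: missing_dev_nodes() is a made-up helper for illustration, not
part of vdsm, and it assumes the device node has the same name as the sysfs
entry.

```python
import glob
import os


def missing_dev_nodes(sys_block="/sys/block", dev_dir="/dev"):
    """
    Return names of block devices listed in sysfs that have no node in
    dev_dir. Hypothetical helper for illustration, not part of vdsm.
    """
    missing = []
    # Same glob pattern the failing test uses to find block devices.
    for path in sorted(glob.glob(os.path.join(sys_block, "*", "dev"))):
        name = os.path.basename(os.path.dirname(path))
        # Assumption: the node in dev_dir is named like the sysfs entry.
        if not os.path.exists(os.path.join(dev_dir, name)):
            missing.append(name)
    return missing
```

Running print(missing_dev_nodes()) inside the container should list the
devices that sysfs knows about but that are not available in /dev.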


storage/devicemapper_test.py:243: AssertionError
___________________________________________________ test_stop_server_not_running ____________________________________________________

    @broken_on_ci
    def test_stop_server_not_running():
        # Stopping non-existing server should succeed.
>       nbd.stop_server("no-such-server-uuid")

storage/nbd_test.py:806:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../lib/vdsm/storage/nbd.py:179: in stop_server
    info = systemctl.show(service, properties=("LoadState",))
../lib/vdsm/common/systemctl.py:74: in show
    out = commands.run(cmd).decode("utf-8")
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

args = ['/usr/bin/systemctl', 'show', '--property=LoadState', 'vdsm-nbd-no-such-server-uuid.service'], input = None, cwd = None
env = None, sudo = False, setsid = False, nice = None, ioclass = None, ioclassdata = None, reset_cpu_affinity = True

    def run(args, input=None, cwd=None, env=None, sudo=False, setsid=False,
            nice=None, ioclass=None, ioclassdata=None, reset_cpu_affinity=True):
        """
        Starts a command communicate with it, and wait until the command
        terminates. Ensures that the command is killed if an unexpected error is
        raised.
   
        args are logged when command starts, and are included in the exception if a
        command has failed. If args contain sensitive information that should not
        be logged, such as passwords, they must be wrapped with ProtectedPassword.
   
        The child process stdout and stderr are always buffered. If you have
        special needs, such as running the command without buffering stdout, or
        create a pipeline of several commands, use the lower level start()
        function.
   
        Arguments:
            args (list): Command arguments
            input (bytes): Data to send to the command via stdin.
            cwd (str): working directory for the child process
            env (dict): environment of the new child process
            sudo (bool): if set to True, run the command via sudo
            nice (int): if not None, run the command via nice command with the
                specified nice value
            ioclass (int): if not None, run the command with the ionice command
                using specified ioclass value.
            ioclassdata (int): if ioclass is set, the scheduling class data. 0-7
                are valid data (priority levels).
            reset_cpu_affinity (bool): Run the command via the taskset command,
                allowing the child process to run on all cpus (default True).
   
        Returns:
            The command output (bytes)
   
        Raises:
            OSError if the command could not start.
            cmdutils.Error if the command terminated with a non-zero exit code.
            utils.TerminatingFailure if command could not be terminated.
        """
        p = start(args,
                  stdin=subprocess.PIPE if input else None,
                  stdout=subprocess.PIPE,
                  stderr=subprocess.PIPE,
                  cwd=cwd,
                  env=env,
                  sudo=sudo,
                  setsid=setsid,
                  nice=nice,
                  ioclass=ioclass,
                  ioclassdata=ioclassdata,
                  reset_cpu_affinity=reset_cpu_affinity)
   
        with terminating(p):
            out, err = p.communicate(input)
   
        log.debug(cmdutils.retcode_log_line(p.returncode, err))
   
        if p.returncode != 0:
>           raise cmdutils.Error(args, p.returncode, out, err)
E           vdsm.common.cmdutils.Error: Command ['/usr/bin/systemctl', 'show', '--property=LoadState', 'vdsm-nbd-no-such-server-uuid.service'] failed with rc=1 out=b'' err=b"System has not been booted with systemd as init system (PID 1). Can't operate.\nFailed to connect to bus: Host is down\n"


This fails because systemctl is installed in the container, but the
container was not started with systemd as PID 1, so systemctl cannot
connect to it.

I'm not sure why the test was not skipped; probably a bug in the skip condition.


2 failed, 1965 passed, 164 skipped, 102 deselected, 383 xfailed, 1 warning in 150.30s (0:02:30)

I'll try to fix the 2 failing tests so they skip in this condition.

The first test can probably be skipped if we don't have any /dev/loop* devices.
The second can be skipped if systemctl show does not work.
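
A rough sketch of how these two skip conditions could look. The helper
names have_loop_devices() and systemctl_show_works() are hypothetical and
are not existing vdsm code; the real fix may check this differently.

```python
import glob
import os
import subprocess


def have_loop_devices(dev_dir="/dev"):
    # True if at least one loop device node (loop0, loop1, ...) exists.
    # The pattern does not match /dev/loop-control.
    return bool(glob.glob(os.path.join(dev_dir, "loop[0-9]*")))


def systemctl_show_works(systemctl="/usr/bin/systemctl"):
    # True if "systemctl show" can talk to systemd. In the container above
    # it exits with rc=1 ("System has not been booted with systemd...").
    try:
        res = subprocess.run(
            [systemctl, "show", "--property=Version"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        # systemctl is not installed or cannot be executed.
        return False
    return res.returncode == 0
```

In the tests these could be used with pytest.mark.skipif, for example
skipif(not have_loop_devices(), reason="no loop devices").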

Nir