On Mon, Jun 11, 2018 at 7:20 PM Kevin Wolf <kwolf@redhat.com> wrote:
Am 11.06.2018 um 12:43 hat Nir Soffer geschrieben:
> On Mon, Jun 11, 2018 at 1:03 PM Maor Lipchuk <mlipchuk@redhat.com> wrote:
>
> > On Tue, Jun 5, 2018 at 9:40 AM, Dan Kenigsberg <danken@redhat.com> wrote:
> >
> >> On Mon, Jun 4, 2018 at 7:14 PM, Nir Soffer <nsoffer@redhat.com> wrote:
> >> > On Mon, Jun 4, 2018 at 6:56 PM Dan Kenigsberg <danken@redhat.com>
> >> wrote:
> >> >>
> >> >> On Tue, May 8, 2018 at 11:59 AM, Nir Soffer <nsoffer@redhat.com>
> >> wrote:
> >> >> > There are several issues:
> >> >> >
> >> >> > 1. coverage fail after this patch:
> >> >> >
> >> >> >
> >> https://github.com/oVirt/vdsm/commit/6b905c2c134bcf344961d28eefbd05f2838d2ec8
> >> >> >
> >> >> > https://travis-ci.org/oVirt/vdsm/builds/366574414
> >> >> > ...
> >> >> > pwd
> >> >> > /vdsm/tests
> >> >> > ls .cov*
> >> >> > ls: cannot access .cov*: No such file or directory
> >> >> > make[1]: *** [check] Error 2
> >> >> > make[1]: Leaving directory `/vdsm/tests'
> >> >>
> >> >> That was me, sorry. This should solve it:
> >> >> https://gerrit.ovirt.org/#/c/91925/
> >> >
> >> > Thanks!
> >> >
> >> >> BTW, on my fc28 I see TestCountClusters.test_multiple_blocks failing
> >> with
> >> >>
> >> >> E           Error: Command ['/usr/bin/qemu-img', 'map', '--output',
> >> >> 'json', '/var/tmp/vdsm/test_multiple_blocks0/test'] failed with rc=-6
> >> >> out='' err="qemu-img:
> >> >> /builddir/build/BUILD/qemu-2.12.0-rc1/qemu-img.c:2680:
> >> >> get_block_status: Assertion `bytes' failed.\n"
> >> >>
> >> >> Any idea what's that?
> >> >
> >> >
> >> > Looks like qemu-img bug.
> >> >
> >> > Can you file a qemu-img bug?
> >>
> >> I hope Maor can translate the test to qemu-img speak.
> >>
> >
> > Opened the following bug:
> >   https://bugzilla.redhat.com/1589738
> >
>
> Adding qemu-block

It's related to the unaligned image size.

Right, I think we forgot the -1 in the test setup:

    f.seek(16 * 1024 - 1)
    f.write(b"x")

But it is good that we made this error :-)
 
Correct image files should be
aligned to 512 byte sectors, so something is wrong with your image to
start with (hard disks don't have half sectors).

Right. We indeed forbid uploads on unaligned images in the UI.

But we found that qemu-img creates unaligned images:

$ qemu-img create -f qcow2 empty.raw 10g
Formatting 'empty.raw', fmt=qcow2 size=10737418240 cluster_size=65536 lazy_refcounts=off refcount_bits=16

$ ls -l empty.raw 
-rw-r--r--. 1 nsoffer nsoffer 196768 Jun 12 01:59 empty.raw

$ python -c "print 196768 % 512"
160

$ qemu-img map -f raw --output json test.raw 
[{ "start": 0, "length": 4096, "depth": 0, "zero": false, "data": true, "offset": 0},
{ "start": 4096, "length": 12288, "depth": 0, "zero": true, "data": false, "offset": 4096},
{ "start": 16384, "length": 4096, "depth": 0, "zero": false, "data": true, "offset": 16384},
qemu-img: /builddir/build/BUILD/qemu-2.12.0-rc1/qemu-img.c:2680: get_block_status: Assertion `bytes' failed.
Aborted (core dumped)

The image becomes aligned once you write anything into it, but I still find it 
strange that qemu-img create such files. Shouldn't it create always 3 complete
clusters?
 

Anyway, git bisects points to this one:

a290f085901b528265787cd27ebda19c970be4ee is the first bad commit
commit a290f085901b528265787cd27ebda19c970be4ee
Author: Eric Blake <eblake@redhat.com>
Date:   Tue Feb 13 14:26:44 2018 -0600

    file-posix: Switch to .bdrv_co_block_status()

    We are gradually moving away from sector-based interfaces, towards
    byte-based.  Update the file protocol driver accordingly.

    In want_zero mode, we continue to report fine-grained hole
    information (the caller wants as much mapping detail as possible);
    but when not in that mode, the caller prefers larger *pnum and
    merely cares about what offsets are allocated at this layer, rather
    than where the holes live.  Since holes still read as zeroes at
    this layer (rather than deferring to a backing layer), we can take
    the shortcut of skipping lseek(), and merely state that all bytes
    are allocated.

    We can also drop redundant bounds checks that are already
    guaranteed by the block layer.

    Signed-off-by: Eric Blake <eblake@redhat.com>
    Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
    Reviewed-by: Fam Zheng <famz@redhat.com>
    Signed-off-by: Kevin Wolf <kwolf@redhat.com>

I think the problem is a bit higher up the call stack, but I'm not
completely sure yet. It manifests in img_map(), in this code:

        while (curr.start + curr.length < length) {
            ...
            n = QEMU_ALIGN_DOWN(MIN(1 << 30, length - offset), BDRV_SECTOR_SIZE);
            ret = get_block_status(bs, offset, n, &next);

The loop condition is still true because a single byte is left to be
processed, but n is aligned down to 0. I'm not sure why the
QEMU_ALIGN_DOWN() is even there.

Eric, would just removing the QEMU_ALIGN_DOWN() be correct?

Kevin