Based on the feedback, I changed the code now such that we read the PropagateErrors config at the very first time disk is created (BaseImage Constructor) and the disk property is set accordingly. This change is very straightforward but I am stuck with a lots of unit test failures and it is becoming a challenging journey to fix all the issues.

First of all, I realized that the majority of the disk related tests are written with no mock configuration setup. So when the tests are run the config is never initialized. But since the now code tries to use the config, the test simply fails. Once I fixed the first test, the next case fails and I am literally spending hours to understand the tests and get those fixed.

One particular testcase scenario is pretty unique. There is a parameterizedTest (StorageDomainValidatorFreeSpaceTest) where the function that create parameters (createParam)is called "before" individual tests. This function has code that ends up adding a new disk snapshot via a copyOf method. The MockConfigExtension is the standard way for us to add config. I found that there is no way to inject config before the parameters are created. I have rearranged the code to workaround the issue here.

Anyways, I thought I will ask if there is a better way to accomplish the goal here because it has very challenging and time consuming problem I am running into with the tests.

Thanks

Shubha

From: Shubha Kulkarni
Sent: Tuesday, August 11, 2020 6:46 PM
To: Kevin Wolf <kwolf@redhat.com>; Nir Soffer <nsoffer@redhat.com>
Cc: devel <devel@ovirt.org>; Simon Coter <simon.coter@oracle.com>; Benjamin Marzinski <bmarzins@redhat.com>; greg King <greg.king@oracle.com>; Pierre Lecomte <pierre.lecomte@oracle.com>
Subject: [ovirt-devel] Re: Improving VM behavior in case of IO errors

Okay. So here is my understanding/summary based on the discussion.

Disk Storage Type	Current Behavior (propagate error = On)	Current Behavior (propagate error = Off)	Recommended Changes (propagate error = On)	Recommended Changes (propagate error = Off)
Image	Error policy – enospace	Error policy - stop	No change	No change
Lun	Error policy - enospace	Error policy - stop	Error policy - report	No change
Cinder	Error policy - stop	Error policy - stop	Error policy - report	No change
Managed Block Storage	Use default	Use default	No change	No change

Can you please confirm?

Thanks,

Shubha

On 8/11/2020 1:34 PM, Kevin Wolf wrote:

Am 11.08.2020 um 19:22 hat Nir Soffer geschrieben:

On Tue, Aug 11, 2020 at 7:21 PM Kevin Wolf <kwolf@redhat.com> wrote:

Am 11.08.2020 um 17:44 hat Nir Soffer geschrieben:

On Mon, Aug 10, 2020 at 11:53 AM Kevin Wolf <kwolf@redhat.com> wrote:

Am 09.08.2020 um 23:50 hat Nir Soffer geschrieben:

On Wed, Jul 29, 2020 at 2:30 PM Shubha Kulkarni

<shubha.kulkarni@oracle.com> wrote:

Thanks for the feedback Nir.

I agree in general that having an additional engine config for disk

level error handling default would be the right way. It would be

good

to

decide the granularity. Would it make sense to have this for a

specific

disk type like lun or would you prefer to make it generic for all

types?

This must be for a specific disk type, since for thin images on block

storage we cannot support propagating errors to the guest. This will

break thin provisioning.

Is werror=enospc not enough for thin provisioning to work? This will

still stop the guest for any other kinds of I/O errors.

Right, this should work, and what we actually use now for propagating

errors for anything but cdrom.

Hm, wait, the options you quote below are all either 'stop' or 'report',

but never 'enospc'. Is 'enospc' used for yet another kind of disk?

Currently as a user there is no good way to get enospc, this is what Shubha

is trying to fix.

For LUN using werror=enospc,rerror=enospc seems wrong, but we do this for

many years.

This is how we handle cdrom:

-device

ide-cd,bus=ide.2,id=ua-346e176c-f983-4510-af4b-786b368efdd6,bootindex=2,werror=report,rerror=report

Makes sense to me. This is read-only and removable media. Stopping the

guest usually makes sense so that it won't assume the disk is broken,

but if it happens with removable media, you can just eject and re-insert

the same image and it's fixed.

BTW this was changed since users typically leave cdrom attached

with otherwise unused ISO storage domain (NFS). When the NFS server breaks

the VM was stopped.

Image:

-device

virtio-blk-pci,iothread=iothread1,scsi=off,bus=pci.5,addr=0x0,drive=libvirt-2-format,id=ua-1d93fa9e-1665-40d7-9ffc-770513242795,bootindex=1,write-cache=on,serial=1d93fa9e-1665-40d7-9ffc-770513242795,werror=stop,rerror=stop

I assume this is the one that could use 'enospc'?

Yes, if we propagate errors, this will become werror=enospc,rerror=enospc

LUN:

-device

virtio-blk-pci,iothread=iothread2,scsi=off,bus=pci.7,addr=0x0,drive=libvirt-1-format,id=ua-19b06845-2c54-422d-921b-6ec0ee2e935b,write-cache=on,werror=stop,rerror=stop

Kevin, any reason not to use werror=report,rerror=report for LUN when

we want to propagate errors to the guest?

If you want to propagate errors, then 'report' is the right setting.

What does "LUN" mean exactly?

When we attach a multipath device, this is called "Direct LUN" in oVirt.

The underlying device can iSCSI or FC, managed by the user, or managed by

Cindelib.

We have 3 options:

1. As virtio or virtio-scsi

    <disk type='block' device='disk' snapshot='no'>

      <driver name='qemu' type='raw' cache='none' error_policy='stop'

io='native' iothread='1'/>

      <source dev='/dev/mapper/360014058657c2a1941841348f19c1a50' index='1'>

        <seclabel model='dac' relabel='no'/>

      </source>

      <backingStore/>

      <target dev='vdb' bus='virtio'/>

      <alias name='ua-c1bf9168-00f0-422f-a190-9ddf6bcd449b'/>

      <address type='pci' domain='0x0000' bus='0x06' slot='0x00'

function='0x0'/>

    </disk>

-blockdev

'{"driver":"host_device","filename":"/dev/mapper/360014058657c2a1941841348f19c1a50","aio":"native","node-name":"libvirt-1-storage","cache":{"direct":true,"no-flush":false},"auto-read-only":true,"discard":"unmap"}'

-blockdev

'{"node-name":"libvirt-1-format","read-only":false,"cache":{"direct":true,"no-flush":false},"driver":"raw","file":"libvirt-1-storage"}'

-device

virtio-blk-pci,iothread=iothread1,scsi=off,bus=pci.6,addr=0x0,drive=libvirt-1-format,id=ua-c1bf9168-00f0-422f-a190-9ddf6bcd449b,write-cache=on,werror=stop,rerror=stop

2. same + passthrough

    <disk type='block' device='lun' sgio='filtered' snapshot='no'>

      <driver name='qemu' type='raw' cache='none' error_policy='stop'

io='native'/>

      <source dev='/dev/mapper/360014058657c2a1941841348f19c1a50' index='1'>

        <seclabel model='dac' relabel='no'/>

      </source>

      <backingStore/>

      <target dev='sda' bus='scsi'/>

      <alias name='ua-c1bf9168-00f0-422f-a190-9ddf6bcd449b'/>

      <address type='drive' controller='0' bus='0' target='0' unit='0'/>

    </disk>

-blockdev

'{"driver":"host_device","filename":"/dev/mapper/360014058657c2a1941841348f19c1a50","aio":"native","node-name":"libvirt-1-storage","cache":{"direct":true,"no-flush":false},"auto-read-only":true,"discard":"unmap"}'

-blockdev

'{"node-name":"libvirt-1-format","read-only":false,"cache":{"direct":true,"no-flush":false},"driver":"raw","file":"libvirt-1-storage"}'

-device

scsi-block,bus=ua-50240806-3d5a-4e5b-a220-bc394698a641.0,channel=0,scsi-id=0,lun=0,drive=libvirt-1-format,id=ua-c1bf9168-00f0-422f-a190-9ddf6bcd449b,werror=stop,rerror=stop

3. same + privileged I/O

    <disk type='block' device='lun' sgio='unfiltered' snapshot='no'>

      <driver name='qemu' type='raw' cache='none' error_policy='stop'

io='native'/>

      <source dev='/dev/mapper/360014058657c2a1941841348f19c1a50' index='1'>

        <seclabel model='dac' relabel='no'/>

      </source>

      <backingStore/>

      <target dev='sda' bus='scsi'/>

      <alias name='ua-c1bf9168-00f0-422f-a190-9ddf6bcd449b'/>

      <address type='drive' controller='0' bus='0' target='0' unit='0'/>

    </disk>

-blockdev

'{"driver":"host_device","filename":"/dev/mapper/360014058657c2a1941841348f19c1a50","aio":"native","node-name":"libvirt-1-storage","cache":{"direct":true,"no-flush":false},"auto-read-only":true,"discard":"unmap"}'

-blockdev

'{"node-name":"libvirt-1-format","read-only":false,"cache":{"direct":true,"no-flush":false},"driver":"raw","file":"libvirt-1-storage"}'

-device

scsi-block,bus=ua-9c2c7e43-d32d-4ea4-9cfd-e2bb36d26fdb.0,channel=0,scsi-id=0,lun=0,drive=libvirt-1-format,id=ua-c1bf9168-00f0-422f-a190-9ddf6bcd449b,werror=stop,rerror=stop

It doesn't seem to be passthrough, so is

it just that you have some restriction like that it's always raw?

Yes, we don't support qcow2 for these disks (yet). Theoretically we

can support qcow2 to enable incremental backup. But the qcow2 image

will never be larger than the block device, actually smaller to leave

room for metadata.  Thin provisioning is done on the storage side.

Maybe

I would use 'enospc' for consistency even though you never expect this

error to happen. But 'report' is fine, too.

enospc looks wrong since this error should not be possible, and if it

happens we cannot handle it. Sounds like good way to confuse futures

maintainers of this code.

If it's a different code path anyway, then I agree. If you would have to

make it a special case just for this, I'd rather add a comment.

Maybe libvirt or qemu did not support "report" when this code was added in

2010?

At least as far as QEMU is concerned, 'report' has existed as long as

rerror/werror in general. If you're interested, commit 428c570512c added

it in 2009.

Kevin