Okay. So here is my understanding/summary based on the discussion.
Disk Storage Type
Current Behavior
(propagate error = On)
Current Behavior (propagate error = Off)
Recommended Changes (propagate error = On)
Recommended Changes (propagate error = Off)
Image
Error policy – enospace
Error policy - stop
No change
No change
Lun
Error policy - enospace
Error policy - stop
*Error policy - report*
No change
Cinder
Error policy - stop
Error policy - stop
*Error policy - report*
No change
Managed Block Storage
Use default
Use default
No change
No change
Can you please confirm?
Thanks,
Shubha
On 8/11/2020 1:34 PM, Kevin Wolf wrote:
Am 11.08.2020 um 19:22 hat Nir Soffer geschrieben:
> On Tue, Aug 11, 2020 at 7:21 PM Kevin Wolf <kwolf(a)redhat.com> wrote:
>
>> Am 11.08.2020 um 17:44 hat Nir Soffer geschrieben:
>>> On Mon, Aug 10, 2020 at 11:53 AM Kevin Wolf <kwolf(a)redhat.com> wrote:
>>>
>>>> Am 09.08.2020 um 23:50 hat Nir Soffer geschrieben:
>>>>> On Wed, Jul 29, 2020 at 2:30 PM Shubha Kulkarni
>>>>> <shubha.kulkarni(a)oracle.com> wrote:
>>>>>> Thanks for the feedback Nir.
>>>>>>
>>>>>> I agree in general that having an additional engine config for
disk
>>>>>> level error handling default would be the right way. It would be
>> good
>>>> to
>>>>>> decide the granularity. Would it make sense to have this for a
>> specific
>>>>>> disk type like lun or would you prefer to make it generic for
all
>>>> types?
>>>>> This must be for a specific disk type, since for thin images on
block
>>>>> storage we cannot support propagating errors to the guest. This will
>>>>> break thin provisioning.
>>>> Is werror=enospc not enough for thin provisioning to work? This will
>>>> still stop the guest for any other kinds of I/O errors.
>>>>
>>> Right, this should work, and what we actually use now for propagating
>>> errors for anything but cdrom.
>> Hm, wait, the options you quote below are all either 'stop' or
'report',
>> but never 'enospc'. Is 'enospc' used for yet another kind of
disk?
>>
> Currently as a user there is no good way to get enospc, this is what Shubha
> is trying to fix.
>
>
>
>> For LUN using werror=enospc,rerror=enospc seems wrong, but we do this for
>>> many years.
>>>
>>> This is how we handle cdrom:
>>>
>>> -device
>>>
>>
ide-cd,bus=ide.2,id=ua-346e176c-f983-4510-af4b-786b368efdd6,bootindex=2,werror=report,rerror=report
>>
>> Makes sense to me. This is read-only and removable media. Stopping the
>> guest usually makes sense so that it won't assume the disk is broken,
>> but if it happens with removable media, you can just eject and re-insert
>> the same image and it's fixed.
>>
> BTW this was changed since users typically leave cdrom attached
> with otherwise unused ISO storage domain (NFS). When the NFS server breaks
> the VM was stopped.
>
>> Image:
>>> -device
>>>
>>
virtio-blk-pci,iothread=iothread1,scsi=off,bus=pci.5,addr=0x0,drive=libvirt-2-format,id=ua-1d93fa9e-1665-40d7-9ffc-770513242795,bootindex=1,write-cache=on,serial=1d93fa9e-1665-40d7-9ffc-770513242795,werror=stop,rerror=stop
>>
>> I assume this is the one that could use 'enospc'?
>>
> Yes, if we propagate errors, this will become werror=enospc,rerror=enospc
>
>
>>> LUN:
>>>
>>> -device
>>>
>>
virtio-blk-pci,iothread=iothread2,scsi=off,bus=pci.7,addr=0x0,drive=libvirt-1-format,id=ua-19b06845-2c54-422d-921b-6ec0ee2e935b,write-cache=on,werror=stop,rerror=stop
>>> \
>>>
>>> Kevin, any reason not to use werror=report,rerror=report for LUN when
>>> we want to propagate errors to the guest?
>> If you want to propagate errors, then 'report' is the right setting.
>>
>> What does "LUN" mean exactly?
>
> When we attach a multipath device, this is called "Direct LUN" in oVirt.
> The underlying device can iSCSI or FC, managed by the user, or managed by
> Cindelib.
>
> We have 3 options:
>
> 1. As virtio or virtio-scsi
>
> <disk type='block' device='disk' snapshot='no'>
> <driver name='qemu' type='raw' cache='none'
error_policy='stop'
> io='native' iothread='1'/>
> <source dev='/dev/mapper/360014058657c2a1941841348f19c1a50'
index='1'>
> <seclabel model='dac' relabel='no'/>
> </source>
> <backingStore/>
> <target dev='vdb' bus='virtio'/>
> <alias name='ua-c1bf9168-00f0-422f-a190-9ddf6bcd449b'/>
> <address type='pci' domain='0x0000' bus='0x06'
slot='0x00'
> function='0x0'/>
> </disk>
>
> -blockdev
>
'{"driver":"host_device","filename":"/dev/mapper/360014058657c2a1941841348f19c1a50","aio":"native","node-name":"libvirt-1-storage","cache":{"direct":true,"no-flush":false},"auto-read-only":true,"discard":"unmap"}'
> \
> -blockdev
>
'{"node-name":"libvirt-1-format","read-only":false,"cache":{"direct":true,"no-flush":false},"driver":"raw","file":"libvirt-1-storage"}'
> \
> -device
>
virtio-blk-pci,iothread=iothread1,scsi=off,bus=pci.6,addr=0x0,drive=libvirt-1-format,id=ua-c1bf9168-00f0-422f-a190-9ddf6bcd449b,write-cache=on,werror=stop,rerror=stop
> \
>
> 2. same + passthrough
>
> <disk type='block' device='lun' sgio='filtered'
snapshot='no'>
> <driver name='qemu' type='raw' cache='none'
error_policy='stop'
> io='native'/>
> <source dev='/dev/mapper/360014058657c2a1941841348f19c1a50'
index='1'>
> <seclabel model='dac' relabel='no'/>
> </source>
> <backingStore/>
> <target dev='sda' bus='scsi'/>
> <alias name='ua-c1bf9168-00f0-422f-a190-9ddf6bcd449b'/>
> <address type='drive' controller='0' bus='0'
target='0' unit='0'/>
> </disk>
>
> -blockdev
>
'{"driver":"host_device","filename":"/dev/mapper/360014058657c2a1941841348f19c1a50","aio":"native","node-name":"libvirt-1-storage","cache":{"direct":true,"no-flush":false},"auto-read-only":true,"discard":"unmap"}'
> \
> -blockdev
>
'{"node-name":"libvirt-1-format","read-only":false,"cache":{"direct":true,"no-flush":false},"driver":"raw","file":"libvirt-1-storage"}'
> \
> -device
>
scsi-block,bus=ua-50240806-3d5a-4e5b-a220-bc394698a641.0,channel=0,scsi-id=0,lun=0,drive=libvirt-1-format,id=ua-c1bf9168-00f0-422f-a190-9ddf6bcd449b,werror=stop,rerror=stop
> \
>
> 3. same + privileged I/O
>
> <disk type='block' device='lun' sgio='unfiltered'
snapshot='no'>
> <driver name='qemu' type='raw' cache='none'
error_policy='stop'
> io='native'/>
> <source dev='/dev/mapper/360014058657c2a1941841348f19c1a50'
index='1'>
> <seclabel model='dac' relabel='no'/>
> </source>
> <backingStore/>
> <target dev='sda' bus='scsi'/>
> <alias name='ua-c1bf9168-00f0-422f-a190-9ddf6bcd449b'/>
> <address type='drive' controller='0' bus='0'
target='0' unit='0'/>
> </disk>
>
> -blockdev
>
'{"driver":"host_device","filename":"/dev/mapper/360014058657c2a1941841348f19c1a50","aio":"native","node-name":"libvirt-1-storage","cache":{"direct":true,"no-flush":false},"auto-read-only":true,"discard":"unmap"}'
> \
> -blockdev
>
'{"node-name":"libvirt-1-format","read-only":false,"cache":{"direct":true,"no-flush":false},"driver":"raw","file":"libvirt-1-storage"}'
> \
> -device
>
scsi-block,bus=ua-9c2c7e43-d32d-4ea4-9cfd-e2bb36d26fdb.0,channel=0,scsi-id=0,lun=0,drive=libvirt-1-format,id=ua-c1bf9168-00f0-422f-a190-9ddf6bcd449b,werror=stop,rerror=stop
> \
>
> It doesn't seem to be passthrough, so is
>> it just that you have some restriction like that it's always raw?
>
> Yes, we don't support qcow2 for these disks (yet). Theoretically we
> can support qcow2 to enable incremental backup. But the qcow2 image
> will never be larger than the block device, actually smaller to leave
> room for metadata. Thin provisioning is done on the storage side.
>
>
>> Maybe
>> I would use 'enospc' for consistency even though you never expect this
>> error to happen. But 'report' is fine, too.
>>
> enospc looks wrong since this error should not be possible, and if it
> happens we cannot handle it. Sounds like good way to confuse futures
> maintainers of this code.
If it's a different code path anyway, then I agree. If you would have to
make it a special case just for this, I'd rather add a comment.
> Maybe libvirt or qemu did not support "report" when this code was added in
> 2010?
At least as far as QEMU is concerned, 'report' has existed as long as
rerror/werror in general. If you're interested, commit 428c570512c added
it in 2009.
Kevin