Yes, on file-based storage a snapshot is a file, and it grows as needed.
On block-based storage, a snapshot is a logical volume, and oVirt needs
to extend it when needed.

Forgive my ignorance; I'm coming from a vSphere background, where a filesystem was created on the iSCSI LUN.
I take it this isn't the case with an iSCSI Storage Domain in oVirt.

On Wed, Aug 11, 2021 at 12:26 PM Nir Soffer <nsoffer@redhat.com> wrote:
On Wed, Aug 11, 2021 at 12:43 AM Shantur Rathore
<shantur.rathore@gmail.com> wrote:
>
> Thanks for the detailed response Nir.
>
> In my use case, we keep creating VMs from templates and deleting them so we need the VMs to be created quickly and cloning it will use a lot of time and storage.

That's a good reason to use a template.

If your vm is temporary and you'd like to drop the data written while
the vm is running, you could use a temporary disk based on the template.
This is called a "transient disk" in vdsm.

Arik, maybe you remember how transient disks are used in engine?
Do we have an API to run a VM once, dropping the changes made to the
disk while the VM was running?

> I will try to add the config and try again tomorrow. Also I like the Managed Block storage idea, I had read about it in the past and used it with Ceph.
>
> Just to understand it better, is this issue only on iSCSI based storage?

Yes, on file-based storage a snapshot is a file, and it grows as needed.
On block-based storage, a snapshot is a logical volume, and oVirt needs
to extend it when needed.
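The difference can be sketched as a toy model (plain Python, not vdsm
code; the class names, initial size, and chunk size are illustrative
assumptions, not oVirt internals):

```python
class FileSnapshot:
    """File-based storage: a sparse file that grows with each write."""
    def __init__(self):
        self.size_mb = 0

    def write(self, mb):
        # The filesystem allocates space on demand.
        self.size_mb += mb


class BlockSnapshot:
    """Block-based storage: a logical volume with a fixed allocation
    that oVirt must extend in chunks when it runs low."""
    def __init__(self, initial_mb=1024, chunk_mb=1024):
        self.allocated_mb = initial_mb
        self.used_mb = 0
        self.chunk_mb = chunk_mb

    def write(self, mb):
        while self.used_mb + mb > self.allocated_mb:
            # oVirt must extend the LV before the write can complete;
            # if extension is too slow, qemu pauses the VM (no space).
            self.allocated_mb += self.chunk_mb
        self.used_mb += mb


f = FileSnapshot()
f.write(300)
b = BlockSnapshot()
b.write(1200)
print(f.size_mb, b.allocated_mb, b.used_mb)  # → 300 2048 1200
```

The toy `write` extends synchronously; in oVirt the extension is
asynchronous, which is why a fast writer can hit the limit and pause.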

Nir

> Thanks again.
>
> Regards
> Shantur
>
> On Tue, Aug 10, 2021 at 9:26 PM Nir Soffer <nsoffer@redhat.com> wrote:
>>
>> On Tue, Aug 10, 2021 at 4:24 PM Shantur Rathore
>> <shantur.rathore@gmail.com> wrote:
>> >
>> > Hi all,
>> >
>> > I have a setup as detailed below
>> >
>> > - iSCSI Storage Domain
>> > - Template with Thin QCOW2 disk
>> > - Multiple VMs from Template with Thin disk
>>
>> Note that a single template disk used by many vms can become a performance
>> bottleneck, and is a single point of failure. Cloning the template when creating
>> vms avoids such issues.
>>
>> > oVirt Node 4.4.4
>>
>> 4.4.4 is old, you should upgrade to 4.4.7.
>>
>> > When the VM boots up it downloads some data, and that leads to an increase in volume size.
>> > I see that every few seconds the VM gets paused with
>> >
>> > "VM X has been paused due to no Storage space error."
>> >
>> >  and then after a few seconds
>> >
>> > "VM X has recovered from paused back to up"
>>
>> This is normal operation when a vm writes too quickly and oVirt cannot
>> extend the disk quickly enough. To mitigate this, you can increase the
>> volume chunk size.
>>
>> Create this configuration drop-in file:
>>
>> # cat /etc/vdsm/vdsm.conf.d/99-local.conf
>> [irs]
>> volume_utilization_percent = 25
>> volume_utilization_chunk_mb = 2048
>>
>> And restart vdsm.
>>
>> With this setting, when free space in a disk drops below 1.5g, the disk
>> will be extended by 2g. With the default settings, the disk was extended
>> by 1g when free space dropped below 0.5g.
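The numbers above are consistent with an extension trigger of
chunk_mb * (100 - volume_utilization_percent) / 100 free space; this is
a formula inferred from the thread, not taken from the vdsm source:

```python
def extension_threshold_mb(utilization_percent, chunk_mb):
    """Free space (MB) below which the disk is assumed to be extended,
    per the inferred formula above."""
    return chunk_mb * (100 - utilization_percent) // 100

# Default settings (percent=50, chunk=1024): extend by 1g
# when free space drops below 0.5g.
print(extension_threshold_mb(50, 1024))  # → 512

# Drop-in settings (percent=25, chunk=2048): extend by 2g
# when free space drops below 1.5g.
print(extension_threshold_mb(25, 2048))  # → 1536
```

A larger chunk raises both the trigger point and the extension size, so
a bursty writer has more headroom before hitting the limit.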
>>
>> If this does not eliminate the pauses, try a larger chunk size
>> like 4096.
>>
>> > Sometimes after many pauses and recoveries the VM dies with
>> >
>> > "VM X is down with error. Exit message: Lost connection with qemu process."
>>
>> This means qemu has crashed. You can find more info in the vm log at:
>> /var/log/libvirt/qemu/vm-name.log
>>
>> We know about bugs in qemu that cause such crashes when a vm disk is
>> extended. I think the latest bug was fixed in 4.4.6, so upgrading to 4.4.7
>> will fix this issue.
>>
>> Even with these settings, if the vm does very bursty io, it may
>> become paused. The only way to completely avoid these pauses is to
>> use a preallocated disk, or use file storage (e.g. NFS). A preallocated disk
>> can be thin provisioned on the server side, so it does not mean you need
>> more storage, but you will not be able to use shared templates the way
>> you use them now. You can create a vm from a template, but the template
>> is cloned to the new vm.
>>
>> Another option (still tech preview) is Managed Block Storage (Cinder
>> based storage). If your storage server is supported by Cinder, we can
>> manage it using cinderlib. In this setup every disk is a LUN, which may
>> be thin provisioned on the storage server. This can also offload storage
>> operations, like cloning disks, to the server, which may be much faster
>> and more efficient.
>>
>> Nir
>>