[ovirt-users] Re: Sparse VMs from Templates - Storage issues

Tuesday, 10 August 2021

Thanks for the detailed response Nir.

In my use case, we keep creating VMs from templates and deleting them, so
we need the VMs to be created quickly; cloning them would take a lot of time
and storage.
I will try to add the config and try again tomorrow. Also I like the
Managed Block storage idea, I had read about it in the past and used it
with Ceph.

Just to understand it better, is this issue only on iSCSI based storage?

Thanks again.

Regards
Shantur

On Tue, Aug 10, 2021 at 9:26 PM Nir Soffer <nsoffer(a)redhat.com> wrote:

...
 On Tue, Aug 10, 2021 at 4:24 PM Shantur Rathore
 <shantur.rathore(a)gmail.com> wrote:
 >
 > Hi all,
 >
 > I have a setup as detailed below
 >
 > - iSCSI Storage Domain
 > - Template with Thin QCOW2 disk
 > - Multiple VMs from Template with Thin disk

 Note that a single template disk used by many vms can become a performance
 bottleneck, and is a single point of failure. Cloning the template when
 creating vms avoids such issues.

 > oVirt Node 4.4.4

 4.4.4 is old, you should upgrade to 4.4.7.

 > When a VM boots up it downloads some data, and that leads to an increase in volume size.
 > I see that every few seconds the VM gets paused with
 >
 > "VM X has been paused due to no Storage space error."
 >
 >  and then after few seconds
 >
 > "VM X has recovered from paused back to up"

 This is normal operation when a vm writes too quickly and oVirt cannot
 extend the disk quickly enough. To mitigate this, you can increase the
 volume chunk size.

 Create this configuration drop-in file:

 # cat /etc/vdsm/vdsm.conf.d/99-local.conf
 [irs]
 volume_utilization_percent = 25
 volume_utilization_chunk_mb = 2048

 And restart vdsm.

 With this setting, when free space in a disk drops to 1.5g, the disk will
 be extended by 2g. With the default setting, the disk is extended by 1g
 when free space drops to 0.5g.
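 The numbers above can be checked with a small sketch. It assumes the
 threshold is chunk size times (100 - utilization percent) / 100, which is
 inferred from the figures in this message and not taken from the vdsm
 source; the function name is hypothetical, and the defaults of 50 and 1024
 are likewise inferred from the "0.5g / 1g" statement.

```python
def extend_threshold_mb(utilization_percent, chunk_mb):
    """Free space (in MB) at which an extension would be requested.

    Assumed formula, inferred from the numbers quoted in the message:
    threshold = chunk_mb * (100 - utilization_percent) / 100
    """
    return chunk_mb * (100 - utilization_percent) / 100


# Assumed defaults (50%, 1024 MB): extend by 1g when 0.5g is free.
print(extend_threshold_mb(50, 1024))   # 512.0

# Settings from the drop-in file: extend by 2g when 1.5g is free.
print(extend_threshold_mb(25, 2048))   # 1536.0
```

 Under the same assumption, a lower percentage or a larger chunk both give
 oVirt more headroom before the vm runs out of allocated space.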

 If this does not eliminate the pauses, try a larger chunk size
 like 4096.

 > Sometimes after many pauses and recoveries the VM dies with
 >
 > "VM X is down with error. Exit message: Lost connection with qemu process."

 This means qemu has crashed. You can find more info in the vm log at:
 /var/log/libvirt/qemu/vm-name.log

 We know about bugs in qemu that cause such crashes when vm disk is
 extended. I think the latest bug was fixed in 4.4.6, so upgrading to 4.4.7
 will fix this issue.

 Even with these settings, if you have very bursty I/O in the vm, it may
 become paused. The only way to completely avoid these pauses is to
 use a preallocated disk, or use file storage (e.g. NFS). Preallocated disk
 can be thin provisioned on the server side so it does not mean you need
 more storage, but you will not be able to use shared templates in the way
 you use them now. You can create vm from template, but the template
 is cloned to the new vm.

 Another option (still tech preview) is Managed Block Storage (Cinder
 based storage). If your storage server is supported by Cinder, we can
 manage it using cinderlib. In this setup every disk is a LUN, which may
 be thin provisioned on the storage server. This can also offload storage
 operations to the server, like cloning disks, which may be much faster and
 more efficient.

 Nir
