On Fri, Nov 23, 2018 at 1:01 PM Nathanaël Blanchet <blanchet@abes.fr> wrote:
Hi,

What are the best practices about vm snapshots?

I think the general guideline is to keep only the snapshots you need, since
every snapshot has a potential performance cost.

qemu caches some image metadata in memory, so accessing data from an
image with 20 snapshots should be as efficient as from an image with 2
snapshots, but using more snapshots will consume more memory.

Kevin, do we have performance tests comparing VMs with different numbers
of snapshots?

On the oVirt side, there is a little additional overhead for every snapshot. We
never measured this overhead, but I don't think it will be an issue in normal use.

With block storage, oVirt has a soft limit of 1300 volumes per storage domain,
so keeping more snapshots per VM means you can keep fewer VMs on the same
storage domain.
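
To make the trade-off concrete, here is a rough back-of-the-envelope sketch.
It assumes every disk needs one active volume plus one volume per snapshot,
and it ignores any other volumes on the domain (for example memory volumes
from snapshots taken with memory). The function name and the disks_per_vm
parameter are only illustrative, not part of oVirt.

```python
def max_vms_per_domain(snapshots_per_vm, disks_per_vm=1, volume_limit=1300):
    """Estimate how many VMs fit under the soft volume limit.

    Assumption: each disk uses one active volume plus one volume
    per snapshot; other volumes on the domain are not counted.
    """
    if snapshots_per_vm < 0 or disks_per_vm < 1:
        raise ValueError("need snapshots_per_vm >= 0 and disks_per_vm >= 1")
    volumes_per_vm = disks_per_vm * (snapshots_per_vm + 1)
    return volume_limit // volumes_per_vm


# Compare a few snapshot counts for single-disk VMs.
for count in (0, 2, 20):
    print(count, "snapshots ->", max_vms_per_domain(count), "VMs")
```

Under these assumptions, single-disk VMs with 2 snapshots each give about
433 VMs per domain, and with 20 snapshots each about 61.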
 
Is there a maximum number of snapshots per vm?

How many snapshots do you plan to keep?
 

Does a high number of existing snapshots have an impact on the vm performance?
... on how long a snapshot takes to complete?

Taking a snapshot is fast - basically the time it takes to create a new empty image,
plus the time it takes to freeze the guest file systems before the snapshot.

Before 4.2 this could be slow with block storage, depending on the number of
snapshots in the VM:
https://bugzilla.redhat.com/1395941

Taking a snapshot with memory is not fast. The time depends on the amount of
RAM in the VM and how fast you can compress and write memory to storage.

Deleting a snapshot is not fast, since the operation requires copying the data in
the snapshot to the previous snapshot. But if the snapshot is not too old, and the
VM did not write a lot of data to that snapshot, the operation is not very long.

Reverting a VM to an older snapshot is pretty fast and requires no data operations.
This is basically the time to create a snapshot based on the snapshot you want
to revert to.

You can measure these operations on your system. Clone an existing VM and
try to add and remove snapshots with and without memory. You can also test
whether having a lot of snapshots causes noticeable performance issues with the
planned workload.
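
If you want to script the measurement, a minimal timing helper could look
like the sketch below. It only times whatever callable you give it.
fake_snapshot is a hypothetical placeholder that just sleeps; you would
replace it with your own code that creates or removes a snapshot on the
cloned VM and waits for the operation to finish.

```python
import statistics
import time


def time_operation(operation, repeats=3):
    """Call operation() `repeats` times and return each duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        operation()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Return the min, median and max of the measured durations."""
    return {
        "min": min(durations),
        "median": statistics.median(durations),
        "max": max(durations),
    }


def fake_snapshot():
    """Hypothetical placeholder; replace with a real snapshot operation."""
    time.sleep(0.01)


if __name__ == "__main__":
    print(summarize(time_operation(fake_snapshot)))
```

Running each operation a few times and looking at the median makes it
easier to compare, for example, a VM with 2 snapshots against one with 20.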

Nir