I had corruption of the Hosted Engine due to a power failure on all hosts (Gluster).
My lab has no UPS.

You can create a snapshot via the API, back up the snapshot and then delete it via the API while the VM is running (no downtime).
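A rough sketch of the create and delete calls against the REST API, using only the Python standard library. The engine URL, VM ID, snapshot ID and credentials are placeholders I made up, and the functions only build the requests; sending them, waiting for the snapshot to become ready and copying the disk data out are left out here.

```python
import base64
import urllib.request
from xml.sax.saxutils import escape

# Hypothetical engine address; replace with your own.
API = "https://engine.example.com/ovirt-engine/api"


def _request(method, path, user, password, body=None):
    """Build (but do not send) an authenticated request to the engine API."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    headers = {"Authorization": f"Basic {token}", "Accept": "application/xml"}
    data = None
    if body is not None:
        headers["Content-Type"] = "application/xml"
        data = body.encode()
    return urllib.request.Request(API + path, data=data, headers=headers, method=method)


def create_snapshot_request(vm_id, description, user, password):
    """POST /vms/{id}/snapshots: snapshot the disks only, without memory state."""
    body = (
        f"<snapshot><description>{escape(description)}</description>"
        "<persist_memorystate>false</persist_memorystate></snapshot>"
    )
    return _request("POST", f"/vms/{vm_id}/snapshots", user, password, body)


def delete_snapshot_request(vm_id, snapshot_id, user, password):
    """DELETE /vms/{id}/snapshots/{snapshot_id}: remove the snapshot after the backup."""
    return _request("DELETE", f"/vms/{vm_id}/snapshots/{snapshot_id}", user, password)
```

Each request can then be sent with urllib.request.urlopen(), passing an SSL context that trusts the engine CA. The snapshot ID needed for the delete call comes back in the response to the create call.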

Still, this approach won't work for databases, as the snapshot can be taken in the middle of a transaction.

Best Regards,
Strahil Nikolov

On Aug 7, 2019 20:33, Douglas Duckworth <dod2014@med.cornell.edu> wrote:

Hi


We are running oVirt 4.2.8.2-1.el7.  Should probably upgrade but it works.


We are backing up the engine every day, with the dump going to an external NFS file system and then on to the cloud.  For VMs we are doing backups within Linux itself using a program called Restic, which then sends the data to a cloud S3 service.  That runs daily as well.
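As a rough illustration of that daily routine, here is a minimal sketch that only assembles the two command lines. The NFS mount point, S3 bucket, password file and backed-up paths are placeholders, not our real values, and the helper names are made up for this example.

```python
import datetime


def engine_backup_cmd(dest_dir, today=None):
    """Command line for a full engine backup written to dest_dir (e.g. an NFS mount)."""
    stamp = (today or datetime.date.today()).isoformat()
    return [
        "engine-backup",
        "--mode=backup",
        "--scope=all",
        f"--file={dest_dir}/engine-{stamp}.tar.gz",
        f"--log={dest_dir}/engine-{stamp}.log",
    ]


def restic_backup_cmd(repo, password_file, paths):
    """Command line for a Restic backup of the given paths to the given repository."""
    return ["restic", "--repo", repo, "--password-file", password_file, "backup", *paths]


if __name__ == "__main__":
    # Placeholder values; print the commands instead of running them.
    print(" ".join(engine_backup_cmd("/mnt/nfs/engine-backups")))
    print(" ".join(restic_backup_cmd(
        "s3:s3.amazonaws.com/example-bucket", "/etc/restic/password", ["/etc", "/home"])))
```

The lists can be handed to subprocess.run(cmd, check=True) from a daily cron job or systemd timer; for an S3 repository Restic also expects the AWS credentials in the environment.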


We also save all configuration data for applications running on our VMs, such as Apache, within Ansible.  So we can quickly recreate the VM using Ansible, along with any applications, then restore from Restic any data not saved in Ansible, such as private PKI keys or a PostgreSQL dump.  Dockerized applications are even easier.  There would be some downtime to redeploy a new VM, but this is acceptable given the constraints of our environment.


I am wondering under what circumstances anyone has experienced VM corruption.  This would help me determine whether more effort should be invested in snapshotting VMs and possibly exporting their disks.  Though, as I recall, removing snapshots from my storage domain would require shutting down the VM, right?


-- 
Thanks,

Douglas Duckworth, MSc, LFCS
HPC System Administrator
Scientific Computing Unit

https://github.com/restic/restic