On Mon, Feb 6, 2017 at 1:52 PM, Doug Ingham <dougti@gmail.com> wrote:
Hi All, Simone,

On 24 January 2017 at 10:11, Simone Tiraboschi <stirabos@redhat.com> wrote:


On Tue, Jan 24, 2017 at 1:49 PM, Doug Ingham <dougti@gmail.com> wrote:
Hey guys,
 Just giving this a bump in the hope that someone might be able to advise...

Hi all,
 One of our engines has had a DB failure* & it seems there was an unnoticed problem in its backup routine, meaning the last backup I've got is a couple of weeks old.
Luckily, VDSM has kept the underlying VMs running without any interruptions, so my objective is to get the HE back online & get the hosts & VMs back under its control with minimal downtime.

So, my questions are the following...
  1. What problems can I expect to have with VMs added/modified since the last backup?
Modified VMs will be reverted to the configuration they had at backup time; VMs added since the backup should show up as external VMs, which you can then import.

Given VDSM kept the VMs up whilst the HE's been down, how will the running VMs that were present before & after the backup be affected?

Many of the VMs that were present during the last backup are now on different hosts, including the HE VM. Will that cause any issues?

For normal VMs I don't expect any issue: the engine will simply update the corresponding record once it finds them on the managed hosts.
A serious issue could instead happen with HA VMs:
if the engine first finds an HA VM running on a different host, it will simply update its record. The problem is if it first finds that the VM is not on its original host, since it will then try to restart it, causing a split brain and probably VM corruption.
I opened a bug to track it:
https://bugzilla.redhat.com/show_bug.cgi?id=1419649
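
To make the ordering problem above concrete, here is a toy Python model. It is not the engine's actual code: the function name, the host names and the action labels are all made up for illustration, and it assumes the engine hears from the hosts one at a time in some order.

```python
# Toy model of the HA VM ordering problem; NOT oVirt engine code.
# All names here (first_action, host names, action labels) are hypothetical.

def first_action(recorded_host, actual_host, poll_order):
    """Return what the engine would do first for a single HA VM.

    recorded_host: host the (stale) restored DB believes the VM runs on
    actual_host:   host the VM is really running on
    poll_order:    order in which the engine hears back from the hosts
    """
    for host in poll_order:
        if host == actual_host:
            # The VM is seen running here, so the record is simply updated.
            return "update_record"
        if host == recorded_host:
            # The VM is missing where the DB expects it, so an HA restart
            # is attempted while the original is still running elsewhere.
            return "restart_vm"
    return "no_action"


if __name__ == "__main__":
    # Safe ordering: the host actually running the VM answers first.
    print(first_action("hostA", "hostB", ["hostB", "hostA"]))
    # Dangerous ordering: the stale host answers first.
    print(first_action("hostA", "hostB", ["hostA", "hostB"]))
```

The second call is the split-brain case: the restart is attempted even though the VM is still up on the other host.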

 
 
 
  2. As it's only the DB that's been affected, can I skip redeploying the Engine & jump straight to restoring the DB & rerunning engine-setup?

Yes, if the engine VM is fine, you could just import the previous backup and run engine-setup again.
Please set global maintenance mode for hosted-engine beforehand, since engine-backup and engine-setup are going to bring down the engine.
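
The sequence above could be sketched roughly as follows. This is only a dry-run plan builder in Python that prints the commands and runs nothing. The function name and file paths are hypothetical, and the last step (leaving global maintenance) is an assumption that is not stated above. Depending on the version, the restore may need further engine-backup options, so check engine-backup --help before running anything. Note that the hosted-engine commands run on one of the hosts, while engine-backup and engine-setup run inside the engine VM.

```python
# Dry-run sketch of the restore sequence; it only builds and prints commands.
# restore_plan and the example paths are hypothetical, for illustration only.

def restore_plan(backup_file, log_file):
    """Return the ordered list of commands as argument lists."""
    return [
        # On a host: stop the HA agents from reacting while the engine is down.
        ["hosted-engine", "--set-maintenance", "--mode=global"],
        # On the engine VM: restore the previous backup.
        ["engine-backup", "--mode=restore",
         f"--file={backup_file}", f"--log={log_file}"],
        # On the engine VM: rerun setup against the restored DB.
        ["engine-setup"],
        # On a host: leave global maintenance (assumed final step).
        ["hosted-engine", "--set-maintenance", "--mode=none"],
    ]


if __name__ == "__main__":
    for cmd in restore_plan("/path/to/engine-backup.tar.gz",
                            "/path/to/restore.log"):
        print(" ".join(cmd))
```

Given the HA VM issue mentioned earlier, it may be worth reviewing the HA VMs before taking that final step.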

As per above, do I still only need to import the previous backup even if all of the VMs (including the HE VM) are now on different hosts from when the backup was made?

Please take care of the HA VMs.
 


And as for the future, is it going to be necessary to always keep an unused host in the cluster to allow for emergency restores? I'm a bit concerned that if we ever utilised all of our hosts for running VMs, then we'd be completely stuck if the HE ever imploded again.

Honestly I don't see any special issue there.
 

Cheers,
--
Doug