hosted Engine - moving a host to maintenance

Oved Ourfalli ovedo at redhat.com
Thu Oct 3 12:29:15 UTC 2013


Hey,

When moving a host to maintenance, the oVirt engine migrates all the running VMs to another host.
If this host is the one running the hosted engine, then the hosted engine VM needs to be migrated as well.
However, the hosted engine HA cluster might not be identical to the oVirt cluster (the oVirt cluster will usually be a superset of the HA one).

So, in order to move a host to maintenance in a hosted engine environment we thought of doing the following:
1. VDSM changes
 - Report the HA agent score (if HA agent exists on host)
2. HA agent changes
 - Provide an API to get the score, for VDSM to use
3. Engine changes
 - Add a scheduling filter, to filter out all hosts without a score, when the VM in question is the hosted engine
 - Add a weight policy unit that weights each host by "max weight" minus its HA score, so hosts with a higher HA score are preferred (further normalization might be needed here, I guess)
 - Mark the hosted engine VM as manual-migration only
4. Add the score to the engine (in VdsDynamic).
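
To make the filter and weight idea above a bit more concrete, here is a rough sketch in Python. It is not the engine's scheduler API (the real policy units live in the engine's Java code); Host, MAX_WEIGHT, and the function names are all made up for illustration, and it assumes that a lower weight means a more preferred host.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

MAX_WEIGHT = 1000  # hypothetical upper bound; the real normalization is undecided


@dataclass
class Host:
    name: str
    ha_score: Optional[int]  # None when the host has no HA agent reporting a score


def filter_hosts(hosts: List[Host], vm_is_hosted_engine: bool) -> List[Host]:
    """Drop hosts without an HA score, but only when scheduling the hosted engine VM."""
    if not vm_is_hosted_engine:
        return list(hosts)
    return [h for h in hosts if h.ha_score is not None]


def weigh_hosts(hosts: List[Host]) -> List[Tuple[str, int]]:
    """Weight = MAX_WEIGHT - score, with the score clamped to [0, MAX_WEIGHT].

    Assumption: a lower weight marks a more preferred host.
    """
    result = []
    for h in hosts:
        score = max(0, min(h.ha_score or 0, MAX_WEIGHT))
        result.append((h.name, MAX_WEIGHT - score))
    return result


def pick_target(hosts: List[Host], vm_is_hosted_engine: bool = True) -> Optional[str]:
    """Return the name of the lowest-weight candidate, or None if nothing passes the filter."""
    weights = weigh_hosts(filter_hosts(hosts, vm_is_hosted_engine))
    return min(weights, key=lambda w: w[1])[0] if weights else None
```

With hosts a (no score), b (score 300) and c (score 800), the filter drops a and the weights come out as 700 for b and 200 for c, so c is picked for the hosted engine VM.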


Issues with this approach:
1. A host being moved to maintenance will stay in "preparing for maintenance" until you manually migrate the hosted engine VM
 - Possible solution: define the VM as migratable, and add a condition to the load balancing logic to prevent it from migrating it
2. If the migration takes too long, it might be aborted, and it might also decrease the hosted engine's performance.
 - Two alternatives here, either to prevent the migration or to handle an aborted one:
  a. To solve it perhaps we can add a "re-elect" command to the HA agents, to initiate a re-election of the host running the hosted engine VM:
   * add a re-election initiated bit to each host
   * if some host initiated a re-election, stop the VM and run it on another host according to the score
   * when the re-election is finished clear *ALL* re-election bits
  b. When asked via the CLI, or when the hosted-engine VM goes down (stopped by vdsm), the HA logic will temporarily reduce its score to allow other hosts to be elected.
3. Changing the cluster policy to a policy without the new filter / weight policy units will cancel this entire behaviour
 - Perhaps we can put some units in by default in the UI, allowing users to remove them.
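
For alternative 2a above (the "re-elect" command), the flow could look roughly like the sketch below. This is only an illustration of the three steps, not the HA agent's actual code: HaHost, request_reelection, and run_reelection are hypothetical names, the shared state is modelled as a plain dict, and it assumes the VM stays where it is when there is no other host to elect.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class HaHost:
    score: int
    reelect_requested: bool = False  # the "re-election initiated" bit
    runs_engine_vm: bool = False


def request_reelection(hosts: Dict[str, HaHost], name: str) -> None:
    """Set the re-election bit on the given host."""
    hosts[name].reelect_requested = True


def run_reelection(hosts: Dict[str, HaHost]) -> Optional[str]:
    """If any bit is set, move the VM to the best-scoring other host, then clear ALL bits.

    Returns the host that runs the VM afterwards, or None if no re-election was requested.
    """
    if not any(h.reelect_requested for h in hosts.values()):
        return None
    current = next((n for n, h in hosts.items() if h.runs_engine_vm), None)
    candidates = [n for n in hosts if n != current]
    # Assumption: with no other host available, the VM stays on the current one.
    winner = max(candidates, key=lambda n: hosts[n].score) if candidates else current
    if winner != current:
        if current is not None:
            hosts[current].runs_engine_vm = False  # stop the VM on the old host
        hosts[winner].runs_engine_vm = True  # run it on the elected host
    for h in hosts.values():
        h.reelect_requested = False  # re-election finished: clear all bits
    return winner
```

Note that the current host is excluded from the candidates even if it has the best score, since the whole point of the re-election is to get the VM off it.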

Thoughts and comments are welcome.

Thank you,
Oved


