[Users] ovirt engine in high availability

Doron Fediuck dfediuck at redhat.com
Tue Mar 11 23:01:16 EDT 2014


----- Original Message -----
> From: "Gianluca Cecchi" <gianluca.cecchi at gmail.com>
> To: "users" <users at ovirt.org>
> Sent: Thursday, March 6, 2014 12:53:40 AM
> Subject: [Users] ovirt engine in high availability
> 
> Hello,
> I have set up a basic test environment with the engine in HA.
> I think it could complement the self-hosted engine.
> As this could be of interest for enterprise based installations, I
> used CentOS 6.5 as OS environment for engine.
> Although it seems corosync will replace cman in the upcoming RHEL 7, in
> this 6.5 test I used cman/pacemaker with pcs commands.
> The engine servers are two vSphere VMs, so I tested the configuration
> both without stonith and with stonith, using the vSphere CLI SDK and
> the fence_vmware fencing agent.
> 
> Also, as this kind of environment could be suitable for DR too, I set
> up the filesystem resource with DRBD in active/passive.
> There are many possible configurations; to stay close to the
> standalone engine environment, I configured the whole stack of
> resources in one group, including the NFS share for the default
> ISO-DOMAIN.
> Actually, in this kind of environment it would be preferable to have a
> dedicated NFS resource on another IP so that it doesn't influence the
> ovirt-engine resource, which is more critical.
> In the same way one could choose to put the PostgreSQL resource as a
> separate one, optimizing and balancing nodes' utilization.
> 
> So far I have tested successfully on ovirt-engine-3.3.3-2.el6.noarch
> and I have not yet configured any hypervisor hosts. I will do so in the
> next few days.
> I also plan to test upgrades to 3.3.4 and 3.4.0, to see whether
> maintenance activity would prove too complex and be a show stopper.
> In the meantime, what I have tested to be transparent for a client
> session connected to the webadmin portal is:
> 
> - relocation of resources
> - power off of the passive node
> - power off of the master node
> 
> By transparent I mean that in the same browser window the user gets
> the login page (the same as after a session timeout) and, after logging
> in, sees the same state as before.
> Obviously the PostgreSQL RDBMS has been stopped (or has crashed, in the
> third case), so problems could arise just as if you had stopped your DB
> or powered off your server in a single-server environment.
> If there is interest I can post my configuration steps on a wiki
> page on the oVirt web site (after learning the formatting syntax... ;-).
> The page name could be something like "clustered_engine".
> In the meantime, if you have any scenario I can test to verify
> consistency, you are welcome to suggest it.
> 
> Gianluca

Thanks Gianluca for sharing.
The thing with wiki pages is that they tend to get out of date
over time. However, if you (or anyone else) are willing to keep it
updated, go ahead and create the page.

Doron

