[Engine-devel] [node-devel] Support for stateless nodes

Mike Burns mburns at redhat.com
Wed Feb 22 15:58:48 UTC 2012


On Wed, 2012-02-22 at 17:33 +0200, Doron Fediuck wrote:
> On 22/02/12 16:57, Mike Burns wrote:
> > There has been a lot of interest in being able to run stateless Nodes
> > with ovirt-engine.  ovirt-node has designed a way [1] to achieve this on
> > the node side, but we need input from the engine and vdsm teams to see
> > if we're missing some requirement or if there needs to be changes on the
> > engine/vdsm side to achieve this.
> > 
> > As it currently stands, every time you reboot an ovirt-node that is
> > stateless, it would require manually removing the host in engine, then
> > re-registering/approving it again in engine.  
> > 
> > Any thoughts, concerns, input on how to solve this?
> > 
> > Thanks
> > 
> > Mike
> > 
> > [1] http://ovirt.org/wiki/Node_Stateless
> > 
> 
> Some points need to be considered:
> 
> - Installation issues
> 
> * Just stating the obvious, which is users need
> to remove-add the host on every reboot. This will
> not make this feature a lovable one from user's point of view.

Yes, this would make the feature a non-starter.  We'd need to change
something in engine/vdsm to make it smoother.  Perhaps a flag in engine
on the host saying that it's stateless.  Then, if a host comes up with
the same information but no certs, etc., engine would validate some
other embedded key (TPM, or a key embedded in the node itself) and
auto-approve it into the same state it had on the previous boot.
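
To make the idea concrete, here is a rough sketch of what the engine
side could do.  Nothing here is existing engine or vdsm code: the
HostRecord fields, the status strings, and the use of an HMAC over a
challenge as a stand-in for a TPM or embedded key are all assumptions
made for illustration.

```python
import hashlib
import hmac
from dataclasses import dataclass
from typing import Dict


@dataclass
class HostRecord:
    stateless: bool      # hypothetical per-host "stateless" flag in engine
    embedded_key: bytes  # hypothetical secret recorded at first approval
    status: str          # state the host had before it rebooted


def sign_challenge(key: bytes, challenge: bytes) -> str:
    """What the node would compute with its embedded key (HMAC stand-in)."""
    return hmac.new(key, challenge, hashlib.sha256).hexdigest()


def handle_registration(hosts: Dict[str, HostRecord], host_id: str,
                        challenge: bytes, response: str) -> str:
    """Decide what to do with a host that registers without certs."""
    record = hosts.get(host_id)
    # Unknown hosts, and known hosts not flagged stateless, keep the
    # manual approval flow.
    if record is None or not record.stateless:
        return "pending_approval"
    expected = sign_challenge(record.embedded_key, challenge)
    # Constant-time comparison of the expected and received responses.
    if hmac.compare_digest(expected, response):
        return record.status  # auto-approve into the previous state
    return "rejected"
```

The point of the sketch is only that the manual remove/re-add step is
replaced by a key check, and that anything not flagged stateless still
goes through normal approval.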

> 
> * During initial boot, vdsm-reg configures the networking
> and creates a management network bridge. This is a very
> delicate process which may fail due to networking issues
> such as resolution, routing, etc. So re-doing this on
> every boot increases the chances of losing a node due
> to network problems.

vdsm-reg runs on *every* boot anyway and renames the bridge.  This is
something that was debated previously, but it was decided to re-run it
every boot.

> 
> * CA pollution; generating a certificate on each reboot
> for each node will create a huge number of certificates
> in the engine side, which eventually may damage the CA.
> (Unsure if there's a limitation to certificates number,
> but having hundreds of junk certs can't be good).

We could have vdsm/engine store the certs on the engine side, and on
boot, after validating the host (however that is done), it will load the
certs onto the node machine.  
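
As a minimal sketch of that approach (the CertStore class and the
issue callback are hypothetical names, not part of engine or vdsm), the
engine would sign a cert only the first time it sees a host and hand
back the stored one on every later boot:

```python
from typing import Callable, Dict


class CertStore:
    """Engine-side cache: one cert per host, reused across reboots."""

    def __init__(self, issue: Callable[[str], str]) -> None:
        self._issue = issue              # hypothetical hook into the CA
        self._certs: Dict[str, str] = {}
        self.issued = 0                  # number of certs the CA signed

    def cert_for(self, host_id: str) -> str:
        # Only go to the CA if this host has never been given a cert.
        if host_id not in self._certs:
            self._certs[host_id] = self._issue(host_id)
            self.issued += 1
        return self._certs[host_id]
```

With this shape, the number of certs the CA signs grows with the number
of hosts rather than with the number of reboots.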

> 
> * Today there's a supported flow that for nodes with
> password, the user is allowed to use the "add host"
> scenario. For stateless, it means re-configuring a password
> on every boot...

Stateless is really targeted at a PXE environment.  There is a
supported kernel parameter that sets this password.  Also, if we follow
the design mentioned above, then it's not an issue, since the host will
be auto-approved when it connects.

> 
> - Other issues
> 
> * Local storage; so far we were able to define a local
> storage in ovirt node. Stateless will block this ability.

Yes, this would be unavailable if you're running stateless.  I think
that's a fine tradeoff since people want the host to be diskless.
> 
> * Node upgrade; currently it's possible to upgrade a node
> from the engine. In stateless it will fail, since there is
> nowhere to download the ISO file to.

Upgrade is handled easily by rebooting the host after updating the PXE
server.

> 
> * Collecting information; core dumps and logging may not
> be available due to lack of space? Or will it cause kernel
> panic if all space is consumed?

A valid concern, but a stateless environment would likely have
collectd/rsyslog/netconsole servers running elsewhere that will collect
the logs.  kdumps can be configured to dump remotely as well.  
> 

Another concern raised is swap and overcommit.  The first version would
likely disable swap completely, which would disable overcommit as well.
Future versions could allow a local disk to be used entirely for swap,
but that is another tradeoff that people would need to evaluate when
choosing between stateless and stateful installs.

Mike



