oVirt Node designs for stateless operation and 3rd party plugins

Perry Myers pmyers at redhat.com
Tue Dec 6 14:11:38 UTC 2011


Hi Geoff,

Thanks for weighing in on this.  Let me preface my comments by saying
that I think by and large we're on the same page about the design and
intent of stateless here; there just seems to be some confusion about
terminology and about what stateless will and won't imply for the end
user.  I'll try to clarify that below.

On 12/06/2011 05:18 AM, Geoff O'Callaghan wrote:
> On Thu, 2011-12-01 at 09:32 -0500, Perry Myers wrote: 
>> the Node development team has been trying to write up rough requirements
>> around the stateless and plugins concepts.  And also some working high
>> level design.
>>
>> They can be reviewed on these two wiki pages:
>>
>> http://ovirt.org/wiki/Node_plugins
>> http://ovirt.org/wiki/Node_Stateless
>>
>> Since the plugin model and the stateless model affect more than just the
>> oVirt Node itself, we definitely would like to get input from other
>> teams on the oVirt project.
>>
>> Please add comments here or directly to the wiki.
>>
> 
> Hi There
> 
> I work for a *large* organisation, and I have issues with the goal of a
> stateless design.
> 
> * Being able to install without a local disk
> 
> I don't see this as a compelling reason for doing anything.   In fact,
> in many cases for other nameless hypervisors we use local disk as a
> source for logging / dumps etc.

This feature doesn't mandate that a local disk be absent.  In fact, we
must support using the local disk for things like swap and kernel dumps,
as you mention.

I'm not sure why you thought stateless required no local disk; running
diskless is purely one of many deployment options.

i.e. you could do:

stateless w/ no local disk
stateless w/ local disk for swap/kdump etc

> I think the goal for stateless should instead be configuration
> neutral, i.e. if the node is destroyed the configuration can be
> re-deployed without issue.

That is the goal.

> The other issue is that the node should continue to be re-bootable even
> if the configuration server is unavailable, which is a reason for having
> the configuration on a local disk or a SAN-attached LUN.   This should
> apply to the entire operational environment - if the engine is
> unavailable during a restart I should continue working the way I was
> configured to do so - that implies state is retained.  It needs to be
> easily refreshable :-)

The Config Server != oVirt Engine Server.  They are two separate servers.

I agree that if the config server is down, you cannot boot the oVirt
Node.  However, a common deployment scenario is PXE-booting the node,
which already suffers from this drawback, and people seem comfortable
with that scenario.

Basically, 'stateless' is an option, not a mandate.

If you want/need to continue persisting local config data on the Node,
we're not going to prevent that.  We're just adding the ability to use a
centralized config server, optionally with no local disks at all.

Adding this option won't prevent you from still doing a stateful
install, so I don't think this feature conflicts with your requirements.

> The configuration bundle should be refreshable from a configuration
> server (part of the engine) and that could be just configuration, or
> agents, or even s/w images - all would be preferred, and it's pretty
> simple conceptually to keep an active/backup image on local disk
> to allow easy rollbacks etc.  Yes, all of this, except for the logging /
> swap, could be on a USB key.
> 
> The bundle should be pushed via an SSL-encrypted RESTful API using
> known non-privileged credentials, preferably with rotating passwords or
> some cert-based approach.   The server should also know who previously
> managed it, to reduce hostile attempts to change ownership of the node.
> 
> * DHCP and PXE booting
> 
> Many corporate security policies prohibit the use of DHCP or PXE booting
> servers for production environments.   I don't see it as a big issue to
> boot an install image and be a good woodpecker and hit enter a few times
> and configure a management IP address.   It should be possible to script
> the complete configuration / addition of the node after that step.   I
> see the initial install as a trivial part of the complete node
> lifecycle.

You are correct.  Many do prohibit this, but many don't.  So as stated
above, we'll allow booting from USB key, local disk, or PXE.  We already
do this today.

> * DNS SRV records
> 
> Sorry,  I hate the idea.  Large corporates have so many different teams
> doing little things that adding this in as a requirement simply adds
> delays to the deployments and opportunities for misconfiguration.

Sorry that you hate the idea.  Others like it, so we'll provide it as an
option.  Again, we're not _mandating_ the use of DNS SRV; it's simply one
option among several.  If you want, you can certainly configure every
Node in your datacenter manually.

Things like DHCP and DNS SRV help automate large deployments, but we
certainly won't require them for those who wish to have more control
over every Node's configuration.
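
To make the DNS SRV option concrete, here is a rough sketch of how a
Node could discover its config server at boot.  The service name
"_ovirt-config._tcp" and the example.com zone are placeholders I'm
making up for illustration (nothing is settled), and I'm assuming the
dnspython resolver:

    import dns.resolver

    # Ask DNS which host/port serves bootstrap config for this domain.
    # SRV records carry a priority and a weight; prefer the lowest
    # priority and, within that, the highest weight.
    answers = dns.resolver.query('_ovirt-config._tcp.example.com', 'SRV')
    best = sorted(answers, key=lambda r: (r.priority, -r.weight))[0]

    config_server = str(best.target).rstrip('.')
    print("bootstrap config lives at https://%s:%d/" % (config_server, best.port))

The point is just that a Node can find its config server with no local
state at all; sites that don't want to delegate this to DNS can skip it
and configure the Node statically.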

> Having the node image and config on local disk (or usb) avoids this
> requirement as the node knows who manages it.   A complete rebuild could
> occur and the configuration reloaded once added back into the engine.

We'll continue allowing local disk usage for config.  No plans to remove
this as an install option.

> * Previously configured state
> 
> Yes,  the node should remember the previous operational state if it
> can't talk to the engine.   This is not a bad thing.   
> 
> *  Configuration server
> 
> This should be part of the engine.   It should know the complete
> configuration of a node, right down to hypervisor 'firmware' image.  The
> process should be 2-way.  An admin should be able to 'pull' the
> image/config from an operational and accessible node and new
> configurations/images should be pushable to it.

It will be co-located with the Engine, but we will design it in such a
way that it can run independently of the oVirt Engine.  There are
several reasons for this:

1. re-usability outside of the oVirt context
2. scaling multiple config servers for different areas of the datacenter
3. perhaps one part of the datacenter is comfortable using a config
   server and another is not.  You could co-locate the config server
   with the portion of Nodes that are using DNS SRV/DHCP, etc., and keep
   them physically separate from the Nodes that are using static config
   and local disks for configuration.

Keep in mind that most of the Node configuration is _already_ done by
oVirt Engine (advanced storage config, network config, VM information).
The only things that this config server will need to store are:

* config of the mgmt network interface
* config of vdsm so that the Node can talk back to the oVirt Engine
* config of the local passwd file

Almost everything else is (or can be) applied dynamically by oVirt
Engine sending config to vdsm.  So this config server really just
bootstraps the basics, and we let the oVirt Engine handle anything more
complex.
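
Just to give a feel for how small that bootstrap payload is, the bundle
for one Node could be on the order of the following.  The field names
are invented purely for illustration; this is not a proposed schema:

    # Hypothetical bootstrap bundle the config server hands to one Node.
    bootstrap = {
        # enough to bring up the mgmt network interface...
        "mgmt_interface": {
            "device":  "eth0",
            "ipaddr":  "192.168.1.50",
            "netmask": "255.255.255.0",
            "gateway": "192.168.1.1",
        },
        # ...enough for vdsm to register with its oVirt Engine...
        "vdsm": {
            "engine_host": "engine.example.com",
            "engine_port": 443,
        },
        # ...and the local passwd entries (hashes, not plaintext).
        "passwd": {
            "root": "$6$examplehash",
        },
    }

Everything beyond that (storage, VM networks, and so on) keeps flowing
from oVirt Engine through vdsm once the Node has registered.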

> I really don't think this needs to be a separate server to the engine.

Noted.  I'd be interested to see if others have an opinion here.

> *  New bundle deployments / Upgrades
> 
> The engine should keep track of what images are on a node.   If a new
> config / image is to be deployed then for example, the node would be
> tagged with the new image.  If the node was online, an alternate image
> would be pushed, vm's migrated to an alternate node and the node
> restarted implementing the new image when requested.

Engine already does this.  It knows which version of the oVirt Node ISO
has been pushed to each node in the datacenter.  That is also how it
knows when a Node is eligible for an upgrade.

> If the node was offline at the time the new image was configured in the
> engine or if the node was built say with an old image then when it
> connects to the engine the image would be refreshed and the node
> recycled.
> 
> * Swap
> 
> Local disk swap is likely to be required.  Overcommit is common and SSD
> local disk is something that is quite useful :-)

Yes, please read the wiki where it says:

> In order to overcommit a host, you need to have swap space to support it
> First implementation will probably disable swap
> Future implementation may allow the system to configure a local disk as swap space

So yes, the plan is to allow swap even during stateless operation if the
administrator chooses to do so.
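
If/when we get to that future implementation, the mechanics of using a
local disk for swap are nothing exotic.  Here is a minimal sketch,
assuming the admin has already told us which local partition to use (the
device path below is just an example):

    import subprocess

    def enable_local_swap(device="/dev/sda2"):
        """Format the given local partition as swap and enable it for
        this boot; on a stateless Node this would be redone each boot
        rather than persisted in /etc/fstab."""
        subprocess.check_call(["mkswap", device])
        subprocess.check_call(["swapon", device])

Where that choice gets expressed (kernel command line, config server, or
oVirt Engine) is a separate design detail.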

> So in summary,  I prefer to think that the target should be
> configuration neutrality or even just plain old distributed
> configuration from a central source rather than completely stateless.

You're confusing stateless with diskless.

Stateless is configuration neutrality.  Nowhere does the wiki imply that
the Node must be diskless.

> The goal should be toleration of complete destruction of a node image
> and configuration and a simple process to re-add it and automatically
> re-apply the configuration/sw image.

Agreed, and that is indeed the intent of this design.

> Just some thoughts for discussion / abuse ;-)

Thanks for the feedback.  I think that, roughly speaking, we are on the
same page.  It just seems like the wiki made you think we require
diskless operation and PXE/DNS SRV, rather than offering them as options
the administrator can choose or reject as a deployment detail.

I think the only area of contention is whether or not the Config Server
should be integral to oVirt Engine, and on that point I think we can
discuss further.  But please keep in mind that this Config Server holds
only the bare minimum config info; anything more complex will come from
the oVirt Engine server via vdsm anyhow.  If you limit the scope of the
Config Server to that, it is more reasonable to make it a
standalone/separate component.

Perhaps to make it less confusing we should call it the "Bootstrap
Server", since it won't be a true "Config Server": it only holds the
bootstrap config needed for the Node to get additional config from the
oVirt Engine via vdsm.

Perry


