oVirt Node designs for stateless operation and 3rd party plugins
Jarrod B Johnson
jbjohnso at us.ibm.com
Tue Dec 6 14:45:20 UTC 2011
arch-bounces at ovirt.org wrote on 12/06/2011 05:18:56 AM:
> From: "Geoff O'Callaghan" <geoffocallaghan at gmail.com>
> To: arch at ovirt.org, node-devel <node-devel at ovirt.org>,
> Date: 12/06/2011 05:19 AM
> Subject: Re: oVirt Node designs for stateless operation and 3rd party plugins
> Sent by: arch-bounces at ovirt.org
>
> On Thu, 2011-12-01 at 09:32 -0500, Perry Myers wrote:
> > the Node development team has been trying to write up rough requirements
> > around the stateless and plugins concepts. And also some working high
> > level design.
> >
> > They can be reviewed on these two wiki pages:
> >
> > http://ovirt.org/wiki/Node_plugins
> > http://ovirt.org/wiki/Node_Stateless
> >
> > Since the plugin model and the stateless model affect more than just the
> > oVirt Node itself, we definitely would like to get input from other
> > teams on the oVirt project.
> >
> > Please add comments here or directly to the wiki.
> >
>
> Hi There
>
> I work for a *large* organisation, and I have issues with the goal of a
> stateless design.
>
> * Being able to install without a local disk
>
> I don't see this as a compelling reason for doing anything. In fact,
> in many cases for other nameless hypervisors we use local disk as a
> source for logging / dumps etc.
>
The ability to operate without disk does not imply the inability to take
advantage of local disk when available and appropriate.
> I think the goal for stateless should instead be configuration
> neutrality, i.e. if the node is destroyed the configuration can be
> re-deployed without issue.
>
> The other issue is that the node should continue to be re-bootable even
> if the configuration server is unavailable, which is a reason for having
> the configuration on a local disk or a san attached LUN. This should
> apply to the entire operational environment - if the engine is
> unavailable during a restart I should continue working the way I was
> configured to do so - that implies state is retained. It needs to be
> easily refreshable :-)
For one, I would expect 'the' configuration server to be more robust than
implied here. Certain implementations to date have been fairly poor, but a
well-designed architecture would hold up.
There is some risk if a 'central' authority can repair configuration
automatically (including restarting VMs on another node) *and* the downed
node can also independently resume operating as last expected: that is a
recipe for a split-brain scenario. Perhaps fencing and storage locking can
be trusted to mitigate this sufficiently.
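To make the split-brain concern concrete, a downed node could be required to reclaim a lease on shared storage before restoring its last-known state. This is only a minimal sketch of the idea; the lease path, TTL, and file-mtime mechanism are all hypothetical stand-ins for a real storage-locking scheme (e.g. sanlock):

```python
import os
import time

LEASE_TTL = 30  # seconds a lease is considered live (illustrative value)

def may_restore_state(lease_path, ttl=LEASE_TTL, now=None):
    """Return True if this node may safely restore its last-known state.

    A missing or stale lease means no other party (e.g. the engine having
    restarted our VMs elsewhere) holds authority, so resuming is safe; a
    live lease means someone else may own our workload, so we must not.
    """
    now = time.time() if now is None else now
    try:
        age = now - os.stat(lease_path).st_mtime
    except FileNotFoundError:
        age = ttl + 1  # no lease at all: treat as stale
    if age <= ttl:
        return False  # another party holds a live lease; do not resume
    with open(lease_path, "w") as f:  # (re)claim the lease
        f.write("claimed\n")
    return True
```

The key property is that restoring state is gated on an external, shared arbiter rather than on the node's own opinion of its health.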
>
> The configuration bundle should be refreshable from a configuration
> server (part of the engine) and that could either be just configuration
> or agents or even s/w images - all would be preferred and it's pretty
> simple conceptually to have an active/backup image on local disk concept
> to allow easy rollbacks etc. Yes all this , except for the logging /
> swap could be in a usb key.
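The active/backup image idea above is commonly implemented as an A/B scheme where a boot symlink is flipped atomically between two image slots. The sketch below is illustrative only; the link and image names are hypothetical, not any existing oVirt layout:

```python
import os

def flip_boot_image(boot_link, new_image):
    """Atomically repoint a 'boot' symlink to a new image (A/B scheme).

    Returns the previous image path (or None if the link did not exist)
    so the caller can keep it around for an easy rollback.
    """
    prev = os.readlink(boot_link) if os.path.islink(boot_link) else None
    tmp = boot_link + ".tmp"
    os.symlink(new_image, tmp)     # stage the new link alongside the old
    os.replace(tmp, boot_link)     # atomic rename on POSIX filesystems
    return prev
```

Because the swap is a single atomic rename, a crash mid-upgrade leaves the node booting either the old image or the new one, never a half-written state, which is what makes rollback simple.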
>
> The bundle should all be pushed via a SSL encrypted RESTful api using
> known non-priv credentials, preferably with rotating passwords or some
> cert based approach. The server should also know who previously
> managed it to reduce hostile attempts to change ownership of the node.
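The push described above could be as simple as an authenticated HTTPS PUT of the bundle. The sketch below only builds the request; the URL scheme, resource path, and bearer-token header are illustrative assumptions, not an actual oVirt engine API:

```python
import json
import urllib.request

def build_bundle_push(host, node_id, bundle, token):
    """Build an HTTPS PUT request pushing a config bundle to a node.

    'token' stands in for a rotating credential or cert-derived secret;
    the /api/nodes/... path is hypothetical.
    """
    url = f"https://{host}/api/nodes/{node_id}/bundle"
    body = json.dumps(bundle).encode()
    req = urllib.request.Request(url, data=body, method="PUT")
    req.add_header("Content-Type", "application/json")
    req.add_header("Authorization", f"Bearer {token}")
    return req  # caller would send via urllib.request.urlopen with TLS verification
```

A real deployment would additionally pin or verify the server certificate and record which engine last managed the node, per the ownership concern above.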
>
> * DHCP and PXE booting
>
> Many corporate security policies prohibit the use of DHCP or PXE booting
> servers for production environments. I don't see it as a big issue to
> boot an install image and be a good woodpecker and hit enter a few times
> and configure a management IP address. It should be possible to script
> the complete configuration / addition of the node after that step. I
> see the initial install as a trivial part of the complete node
> lifecycle.
It becomes a big issue when configuring a few thousand servers. Taking the
bigger picture, unattended stateless image booting can be made secure even
over PXE. Either way, from the node's perspective, supporting an ISO boot
covers largely the same concerns as a PXE boot anyway.
>
> * DNS SRV records
>
> Sorry, I hate the idea. Large corporates have so many different teams
> doing little things that adding this in as a requirement simply adds
> delays to the deployments and opportunities for misconfiguration.
A capability does not imply a requirement. Active Directory makes use of
SRV records, as can Kerberos, so this is not without precedent. I would
expect /proc/cmdline to be the typical path, however. SLP, mDNS, or
roll-your-own multicast discovery could also be added to the list.
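The /proc/cmdline path mentioned above amounts to parsing a key=value hint out of the kernel command line. A minimal sketch, where the `ovirt_engine=` parameter name is purely hypothetical (with SRV/mDNS/SLP as fallbacks when it is absent):

```python
def engine_from_cmdline(cmdline, key="ovirt_engine"):
    """Extract an engine-address hint from kernel command-line text.

    'cmdline' is the contents of /proc/cmdline; returns the value of
    key=value if present, else None (caller falls back to discovery).
    """
    prefix = key + "="
    for token in cmdline.split():
        if token.startswith(prefix):
            return token[len(prefix):]
    return None
```

This keeps the common case (a PXE/ISO boot whose boot loader already embeds the engine address) free of any dependency on DNS team cooperation.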
>
> Having the node image and config on local disk (or usb) avoids this
> requirement as the node knows who manages it. A complete rebuild could
> occur and the configuration reloaded once added back into the engine.
Once you sanely provide for the automated 'rebuild' case, you've solved the
problem for arbitrary boots anyway.
>
> * Previously configured state
>
> Yes, the node should remember the previous operational state if it
> can't talk to the engine. This is not a bad thing.
That depends on where the 'weak' point is. If an inability to talk to the
configuration infrastructure makes a split-brain scenario likely, restoring
the last state could be a bad thing.
>
> * Configuration server
>
> This should be part of the engine. It should know the complete
> configuration of a node, right down to hypervisor 'firmware' image. The
> process should be 2-way. An admin should be able to 'pull' the
> image/config from an operational and accessible node and new
> configurations/images should be pushable to it.
>
> I really don't think this needs to be a separate server to the engine.
>
> * New bundle deployments / Upgrades
>
> The engine should keep track of what images are on a node. If a new
> config / image is to be deployed then for example, the node would be
> tagged with the new image. If the node was online, an alternate image
> would be pushed, vm's migrated to an alternate node and the node
> restarted implementing the new image when requested.
>
> If the node was offline at the time the new image was configured in the
> engine or if the node was built say with an old image then when it
> connects to the engine the image would be refreshed and the node
> recycled.
The nice thing about stateless is that it makes these workflows a lot more
straightforward to work through.
>
> * Swap
>
> Local disk swap is likely to be required. Overcommit is common and SSD
> local disk is something that is quite useful :-)
Flexibility is good: support both swap and non-swap cases. In fact,
stateless is a good match here; the local disk can be used for things like
distributed filesystems, logging, and swap, while neither the OS image nor
the configuration is bound to it.
>
> So in summary, I prefer to think that the target should be
> configuration neutrality or even just plain old distributed
> configuration from a central source rather than completely stateless.
> The goal should be toleration of complete destruction of a node image
> and configuration and a simple process to re-add it and automatically
> re-apply the configuration/sw image.
I think a bulletproof configuration infrastructure that makes stateless
just as good as stateful handles the failure cases a lot more smoothly.
Emphasis on bulletproof.
>
> Just some thoughts for discussion / abuse ;-)
>
> Tks
> Geoff
>
> > Cheers,
> >
> > Perry
> > _______________________________________________
> > Arch mailing list
> > Arch at ovirt.org
> > http://lists.ovirt.org/mailman/listinfo/arch
>
>
> _______________________________________________
> Arch mailing list
> Arch at ovirt.org
> http://lists.ovirt.org/mailman/listinfo/arch
>