[node-devel] oVirt Node designs for stateless operation and 3rd party plugins

Mike Burns mburns at redhat.com
Tue Dec 6 14:26:29 UTC 2011


On Tue, 2011-12-06 at 08:52 -0500, Mike Burns wrote:
> Comments Inline
> 
> On Tue, 2011-12-06 at 21:18 +1100, Geoff O'Callaghan wrote:
> > On Thu, 2011-12-01 at 09:32 -0500, Perry Myers wrote: 
> > > the Node development team has been trying to write up rough requirements
> > > around the stateless and plugins concepts.  And also some working high
> > > level design.
> > > 
> > > They can be reviewed on these two wiki pages:
> > > 
> > > http://ovirt.org/wiki/Node_plugins
> > > http://ovirt.org/wiki/Node_Stateless
> > > 
> > > Since the plugin model and the stateless model affect more than just the
> > > oVirt Node itself, we definitely would like to get input from other
> > > teams on the oVirt project.
> > > 
> > > Please add comments here or directly to the wiki.
> > > 
> > 
> > Hi There
> > 
> > I work for a *large* organisation, and I have issues with the goal of a
> > stateless design.
> 
> Thanks for the feedback overall.  I'll try to address all your points
> below.
> 
> > 
> > * Being able to install without a local disk
> > 
> > I don't see this as a compelling reason for doing anything.   In fact,
> > in many cases for other nameless hypervisors we use local disk as a
> > source for logging / dumps etc.
> 
> That may be the case in your environment, but when we presented this at
> the oVirt Workshop, the idea of a diskless deployment was very well
> received.  

> I suppose that what we're calling stateless is really more of
> a diskless feature rather than truly stateless since we're keeping the
> stateful information in a configuration server.

This is actually not correct.  My mind was just caught up in thinking of
a totally diskless system.  What Perry said is correct.  We're looking
to move all configuration to a central location (configuration
neutrality).  Disk would then become optional for things like swap
and/or kdump, etc.
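To make that concrete, here is a toy sketch of what a diskless node could do at boot: fetch its configuration bundle from the central config server and lay the files down on its (tmpfs) root.  The endpoint layout, the JSON bundle format, and every name below are invented for illustration; nothing here is from the wiki designs.

```python
# Hypothetical sketch: a diskless node pulls its configuration bundle
# from a central config server at boot.  The URL scheme and bundle
# format are illustrative assumptions, not the actual design.
import json
import urllib.request
from pathlib import Path


def bundle_url(server: str, mac: str) -> str:
    """Build the per-node bundle URL; nodes are keyed by MAC address."""
    return "https://%s/api/nodes/%s/bundle" % (server, mac.lower().replace(":", "-"))


def fetch_bundle(server: str, mac: str) -> dict:
    """Download the JSON config bundle (network, storage, vdsm config, ...)."""
    with urllib.request.urlopen(bundle_url(server, mac)) as resp:
        return json.load(resp)


def apply_bundle(bundle: dict, root: Path) -> None:
    """Write each config file carried in the bundle onto the node's root."""
    for relpath, content in bundle.get("files", {}).items():
        dest = root / relpath
        dest.parent.mkdir(parents=True, exist_ok=True)
        dest.write_text(content)
```

The point is only the shape of the flow: identity (here, MAC) maps to a bundle on the server, and the node is reconstructable from that bundle alone.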

> 
> > 
> > I think the goal for stateless should instead be configuration
> > neutrality, i.e. if the node is destroyed the configuration can be
> > re-deployed without issue.
> 
> Redeployed on the same machine?  Or redeployed on a different machine?
> We already provide autoinstallation options that will do redeployments
> easily and one of the goals or ideas along with the proposed stateless
> model is that the machine gets re-provisioned and downloads its config
> bundle.  This would successfully recover the node if someone were to
> power it off or destroy it somehow.  If you're looking to move the
> config to a new machine, then that's not quite as simple.  The easiest
> would be to simply install it again from scratch.
> 
> > 
> > The other issue is that the node should continue to be re-bootable even
> > if the configuration server is unavailable, which is a reason for having
> > the configuration on a local disk or a SAN-attached LUN.   This should
> > apply to the entire operational environment - if the engine is
> > unavailable during a restart I should continue working the way I was
> > configured to do so - that implies state is retained.  It needs to be
> > easily refreshable :-)
> 
> I will admit that the thought of the config server being unavailable
> hadn't come up previously.  If this is something that you're
> legitimately concerned about, then it sounds like you'd want to continue
> doing local installations and not stateless installs.
> 
> Currently, node images will install to local disk and they will boot
> fine without a management server or config server.  But they won't be
> truly functional unless there is a management server available to tell
> it what to do.  This is the case for all hypervisors, whether they're
> ovirt-node images or Fedora 16 images with VDSM installed or any of the
> other distributions.  It's a limitation that VDSM and the Engine need to
> solve outside the scope of ovirt-node.
> > 
> > The configuration bundle should be refreshable from a configuration
> > server (part of the engine) and that could either be just configuration
> > or agents or even s/w images - all would be preferred and it's pretty
> > simple conceptually to have an active/backup image concept on local disk
> > to allow easy rollbacks, etc.  Yes, all of this, except for the logging /
> > swap, could be on a USB key.
> 
> We do provide a RootBackup partition that we automatically activate if
> something goes wrong with an upgrade.  It would make sense that we
> should keep a backup configuration bundle on the management server as
> well.  The actual image itself is a livecd, so updating that would be a
> matter of changing the USB stick/CD-ROM/PXE image to the old/new version.
> 
> > 
> > The bundle should all be pushed via a SSL encrypted RESTful api using
> > known non-priv credentials, preferably with rotating passwords or some
> > cert based approach.   The server should also know who previously
> > managed it to reduce hostile attempts to change ownership of the node.
> 
> Yes, the security issues are something that we're definitely aware of
> and not taking lightly.  The actual process for how we do this is
> something that still would need to be worked out.  The initial design
> was something along the lines of a free posting to the config server
> that the admin has to approve.  The thought was that we would have
> different levels of security that could be configured depending on your
> deployment and the strictness of the rules in your environment.
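As a toy illustration of that "free posting plus admin approval" idea (every name below is invented, not part of the design): a node may announce itself freely, but it stays pending and gets no config bundle until an admin promotes it.

```python
# Toy sketch of a register-then-approve flow: registration is open,
# but config is only served to admin-approved nodes.  All names are
# illustrative; this is not the actual config-server design.


class ConfigServer:
    def __init__(self):
        self.pending = {}    # mac -> registration details
        self.approved = {}   # mac -> registration details

    def register(self, mac, details):
        """Free posting: any node may announce itself."""
        if mac not in self.approved:
            self.pending[mac] = details

    def approve(self, mac):
        """Admin action: promote a pending node to approved."""
        self.approved[mac] = self.pending.pop(mac)

    def bundle_for(self, mac):
        """Only approved nodes are served a config bundle."""
        if mac not in self.approved:
            raise PermissionError("node not yet approved by an admin")
        return {"node": self.approved[mac]}
```

Stricter deployments could replace the open register() with cert-based authentication; looser ones could auto-approve.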
> 
> > 
> > * DHCP and PXE booting
> > 
> > Many corporate security policies prohibit the use of DHCP or PXE booting
> > servers for production environments.   I don't see it as a big issue to
> > boot an install image and be a good woodpecker and hit enter a few times
> > and configure a management IP address.   It should be possible to script
> > the complete configuration / addition of the node after that step.   I
> > see the initial install as a trivial part of the complete node
> > lifecycle.
> 
> So a couple thoughts here:
> 
> 1.  If only PXE is restricted, then you could have a USB stick or CD-ROM
> with the image in each machine and still do stateless as defined
> otherwise.
> 2.  If just DHCP is restricted, then you could have a PXE profile per
> machine that sets up the static networking options needed.
> 3.  If both are restricted, then you would have to go with a stateful
> installation.  It's not going away, just another mode that we will
> provide.
> 
> Actual installation and configuration can be completed automatically
> using kernel command line options.  That is independent of whether
> you're using a stateful or stateless installation.
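For option 2, a per-machine pxelinux profile (named after the node's MAC address, which is a standard pxelinux convention) can carry static network settings on the kernel command line.  The dracut-style ip= syntax is real; the trailing installer parameter is a placeholder, not a confirmed oVirt Node option name:

```text
# /tftpboot/pxelinux.cfg/01-aa-bb-cc-dd-ee-ff  (matched by MAC address)
DEFAULT ovirt-node
LABEL ovirt-node
  KERNEL vmlinuz0
  APPEND initrd=initrd0.img root=live:/ovirt-node-image.iso \
         ip=192.0.2.10::192.0.2.1:255.255.255.0:node01:eth0:none \
         management_server=engine.example.com
```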
> 
> >   
> > * DNS SRV records
> > 
> > Sorry,  I hate the idea.  Large corporates have so many different teams
> > doing little things that adding this in as a requirement simply adds
> > delays to the deployments and opportunities for misconfiguration.
> 
> Sure, that's a valid possibility.  Perhaps another command-line option
> that allows someone to specify the config server manually.
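Something like the following could sit in the node's boot path: check the kernel command line first and fall back to the DNS SRV lookup only when no override is given.  The ovirt_config_server= parameter name is made up for illustration.

```python
# Hypothetical discovery order for the config server: an explicit
# kernel command-line override first, DNS SRV only as a fallback.
# The parameter name "ovirt_config_server" is illustrative, not real.


def config_server_from_cmdline(cmdline):
    """Return the server named on the kernel command line, if any."""
    for token in cmdline.split():
        if token.startswith("ovirt_config_server="):
            return token.split("=", 1)[1]
    return None


def discover_config_server(cmdline):
    """Prefer the explicit override; otherwise fall back to a DNS SRV
    record such as:  _ovirt._tcp.example.com. IN SRV 0 0 443 cfg.example.com.
    (the SRV lookup needs a resolver library and is omitted here)."""
    explicit = config_server_from_cmdline(cmdline)
    if explicit:
        return explicit
    raise LookupError("no override on cmdline; fall back to DNS SRV lookup")
```

That way sites that can't touch DNS never depend on the SRV record at all.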
> 
> > 
> > Having the node image and config on local disk (or usb) avoids this
> > requirement as the node knows who manages it.   A complete rebuild could
> > occur and the configuration reloaded once added back into the engine.
> 
> Yes, this is a valid use case.  And if that's the way you want to deploy
> your environment, then use the install-to-disk option and not stateless.
> We will provide both.
> 
> > 
> > * Previously configured state
> > 
> > Yes,  the node should remember the previous operational state if it
> > can't talk to the engine.   This is not a bad thing.   
> > 
> > *  Configuration server
> > 
> > This should be part of the engine.   It should know the complete
> > configuration of a node, right down to hypervisor 'firmware' image.  The
> > process should be 2-way.  An admin should be able to 'pull' the
> > image/config from an operational and accessible node and new
> > configurations/images should be pushable to it.
> > 
> > I really don't think this needs to be a separate server to the engine.
> 
> I agree, it should be part of the engine, probably will be.  Depending
> on time frames and availability, it might be developed separately
> initially, but long term we probably want to integrate with the
> management server.
> > 
> > *  New bundle deployments / Upgrades
> > 
> > The engine should keep track of what images are on a node.   If a new
> > config / image is to be deployed then for example, the node would be
> > tagged with the new image.  If the node was online, an alternate image
> > would be pushed, VMs migrated to an alternate node and the node
> > restarted implementing the new image when requested.
> 
> This is mostly already done, I think.  I know the functionality is there
> in RHEV-M, but not sure if it's all in the webadmin UI yet.  I know the
> backend pieces are all there though.
> 
> A running node has its version info that vdsm reads initially and
> reports back to the engine.  An admin logs into the engine, and can see
> the details of the node including the version that it's currently
> running.  There is an option to push out a new image to the node and
> have it upgrade itself.  The node does have to be in maintenance mode to
> start the process, which causes all VMs to be migrated away.
> 
> > 
> > If the node was offline at the time the new image was configured in the
> > engine or if the node was built say with an old image then when it
> > connects to the engine the image would be refreshed and the node
> > recycled.
> 
> Automatic upgrades like this aren't done at the moment.  There probably
> needs to be some policy engine that can control it so all machines don't
> suddenly try to upgrade themselves.  
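One simple shape such a policy engine could take is a batched rollout: never upgrade more than a bounded fraction of the registered nodes at once, so capacity is always available to absorb the migrated VMs.  A toy sketch, with invented names:

```python
# Toy sketch of an upgrade-throttling policy: split the node list into
# ordered waves of bounded size, so only one wave is in maintenance
# mode (with its VMs migrated away) at any time.  Purely illustrative.


def upgrade_waves(nodes, max_fraction=0.25):
    """Partition nodes into waves of at most
    max(1, floor(len(nodes) * max_fraction)) members each."""
    if not nodes:
        return []
    wave_size = max(1, int(len(nodes) * max_fraction))
    return [nodes[i:i + wave_size] for i in range(0, len(nodes), wave_size)]
```

The engine would then drain, upgrade, and verify one wave before releasing the next.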
> 
> This whole section really applies to stateful installations though.  In
> stateless, you just need to refresh the image on the PXE
> server/CD-ROM/USB stick and reboot the machine (after putting it in
> maintenance mode).
> 
> > 
> > * Swap
> > 
> > Local disk swap is likely to be required.  Overcommit is common and SSD
> > local disk is something that is quite useful :-)
> 
> I agree, in general.  I did talk to one person at the workshop that had
> a machine with 300+GB RAM and had 0 interest in doing overcommit.  So
> there is certainly a use case for being able to support both.  
> 
> > 
> > So in summary,  I prefer to think that the target should be
> > configuration neutrality or even just plain old distributed
> > configuration from a central source rather than completely stateless.
> > The goal should be toleration of complete destruction of a node image
> > and configuration and a simple process to re-add it and automatically
> > re-apply the configuration/sw image.
> 
> I like the thought of storing the configuration to a central location
> even when having the image installed locally.  I definitely think there
> will be people that can't or won't go with stateless for various reasons
> many of which you state above.  But I also think there are some that
> will want it as well.  
> 
> The simplest use case for wanting a stateless model like we designed is
> someone who has a rack of blades without local disks.  They set up PXE
> and DHCP, and just turn on the blades.
> 
> Mike
> > 
> > Just some thoughts for discussion / abuse ;-)
> > 
> > Tks
> > Geoff
> > 
> > > Cheers,
> > > 
> > > Perry
> > > _______________________________________________
> > > Arch mailing list
> > > Arch at ovirt.org
> > > http://lists.ovirt.org/mailman/listinfo/arch
> > 
> > 
> 
> 
> _______________________________________________
> node-devel mailing list
> node-devel at ovirt.org
> http://lists.ovirt.org/mailman/listinfo/node-devel




