[Engine-devel] [node-devel] Support for stateless nodes

Wed Feb 22 17:23:36 UTC 2012

>> Well, if the network is busted which leads to the bridge rename failing,
>> wouldn't the fact that the network is broken cause other problems anyhow?
>>
> Perry, my point is that we're increasing the chances to get
> into these holes. Network is not busted most of the time, but occasionally
> there's a glitch and we'd like to stay away from it. I'm sure
> you know what I'm talking about.

What if oVirt Node creates ifcfg-ovirt (instead of ifcfg-br0) by default
as part of bringing up the network on each boot (via either DHCP or
kernel args)?

Then vdsm would never need to rename the bridge.  This step could be
enabled only when vdsm is installed, so that it doesn't affect any
non-oVirt usages of oVirt Node (Archipel, etc.)
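
To make the idea concrete, here is a rough sketch in Python of what the
boot-time step might generate.  It is illustrative only: the function names
are made up, the bridge device name "ovirt" is inferred from the ifcfg-ovirt
file name, and the keys are the usual initscripts ifcfg keys rather than
anything oVirt Node is confirmed to write.

```python
def bridge_name(vdsm_installed: bool) -> str:
    """Pick the bridge name; only use 'ovirt' when vdsm is present (assumption)."""
    return "ovirt" if vdsm_installed else "br0"


def render_bridge_ifcfg(name: str, bootproto: str = "dhcp") -> str:
    """Return the text of a hypothetical ifcfg-<name> file for a bridge."""
    lines = [
        f"DEVICE={name}",
        "TYPE=Bridge",
        "ONBOOT=yes",
        f"BOOTPROTO={bootproto}",
        "DELAY=0",  # assumed: no forwarding delay on the bridge
    ]
    return "\n".join(lines) + "\n"


def render_nic_ifcfg(nic: str, bridge: str) -> str:
    """Return the text of a hypothetical ifcfg-<nic> file enslaving nic to bridge."""
    lines = [
        f"DEVICE={nic}",
        f"BRIDGE={bridge}",
        "ONBOOT=yes",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    name = bridge_name(vdsm_installed=True)
    print(render_bridge_ifcfg(name))
    print(render_nic_ifcfg("eth0", name))
```

With something like this run on every boot, the bridge already has its final
name by the time vdsm starts, so no rename is needed afterwards.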

>>> * Today there's a supported flow that for nodes with
>>> password, the user is allowed to use the "add host"
>>> scenario. For stateless, it means re-configuring a password
>>> on every boot...
>>
>> This flow would still be applicable.  We are going to allow setting of
>> the admin password embedded in the core ISO via an offline process.
>> Once vdsm is fixed to use a non-root account for installation flow, this
>> is no longer a problem
> This is not exactly vdsm. More like vdsm-bootstrap.

ack

>>
>> Also, if we (as described above) make registrations persistent across
>> reboots by changing the registration flow a bit, then the install user
>> password only need be set for the initial boot anyhow.
>>
>> Therefore I think as a requirement for stateless oVirt Node, we must
>> have as a prerequisite removing root account usage for
>> registration/installation
>>
> This is both for vdsm and engine, and I'm not sure it's that trivial.

Understood, but it's a requirement for other things.  There are security
considerations for requiring remote root ssh access as part of your core
infrastructure.  So this needs to be dealt with regardless.

>>> - Other issues
>>>
>>> * Local storage; so far we were able to define a local
>>> storage in ovirt node. Stateless will block this ability.
>>
>> It shouldn't.  The Node should be able to automatically scan locally
>> attached disks to look for a well defined VG or partition label and
>> based on that automatically activate/mount
>>
>> Stateless doesn't imply diskless.  It is a requirement even for
>> stateless node usage to be able to leverage locally attached disks both
>> for VM storage and also for Swap.
>>
> Still, in a pure disk-less setup you will not have local storage.
> See also Mike's answer.

Sure.  If you want diskless specifically and then complain about lack of
swap or local storage for VMs...  then you might not be getting the point :)

That has no bearing on the stateless discussion, except that the first
pass of stateless might not allow config of local disk/swap to start
with.  We might do it incrementally
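
As a rough illustration of the label-based scan mentioned above, here is a
small Python sketch.  The label names (OVIRT_LOCAL_VM, OVIRT_SWAP), the
BlockDevice type and the function name are all hypothetical, and the device
list is passed in rather than read from the system; how the Node would
really enumerate disks is not decided in this thread.

```python
from typing import Iterable, List, NamedTuple, Tuple


class BlockDevice(NamedTuple):
    path: str   # e.g. /dev/sda1
    label: str  # filesystem, partition or VG label found on the device


# Hypothetical well-known labels and the action each one triggers.
LABEL_ACTIONS = {
    "OVIRT_LOCAL_VM": "mount",
    "OVIRT_SWAP": "swapon",
}


def plan_local_storage(devices: Iterable[BlockDevice]) -> List[Tuple[str, str]]:
    """Return (action, device path) pairs for devices carrying a known label.

    Devices with any other label are ignored, so a disk is never touched
    unless it was explicitly labelled for the Node.
    """
    return [
        (LABEL_ACTIONS[dev.label], dev.path)
        for dev in devices
        if dev.label in LABEL_ACTIONS
    ]


if __name__ == "__main__":
    found = [
        BlockDevice("/dev/sda1", "OVIRT_SWAP"),
        BlockDevice("/dev/sda2", "root"),
        BlockDevice("/dev/sdb1", "OVIRT_LOCAL_VM"),
    ]
    for action, path in plan_local_storage(found):
        print(action, path)
```

The point is that nothing needs to be persisted on the Node itself: the
label on the disk carries the configuration.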

>>> * Node upgrade; currently it's possible to upgrade a node
>>> from the engine. In stateless it will fail, since there is nowhere
>>> to download the ISO file to.
>>
>> Upgrades are no longer needed with stateless.  To upgrade a stateless
>> node all you need to do is 'reboot from a newer image'.  i.e. all
>> upgrades would be done via PXE server image replacement.  So the flow of
>> 'upload ISO to running oVirt Node' is no longer even necessary
>>
> This is assuming PXE only use-case. I'm not sure it's the only one.

Nope, PXE isn't the only use case.  For example:

copy oVirt Node 2.2.3 to a USB stick (via ovirt-iso-to-usb-disk)
boot a host with it

Later...

copy oVirt Node 2.2.4 to same USB stick (via ovirt-iso-to-usb-disk)
boot the host with it

Yes, it requires you to touch the USB stick.  If you specifically want
stateless (implying no 'installation' of the Node) and you won't be
using PXE to run, then it involves legwork.

But again, we're not planning to eliminate the current 'install'
methods.  Stateless is in addition to installing to disk, and using the
'iso upload' upgrade method

>>> * Collecting information; core dumps and logging may not
>>> be available due to lack of space? Or will it cause kernel
>>> panic if all space is consumed?
>>
>> We already provide the ability to send kdumps to a remote ssh/NFS location
>> and already provide the ability to use both collectd and rsyslog to pipe
>> logs/stats to remote server(s).  Local logs can be set to logrotate to a
>> reasonable size so that local RAM FS always contains recent log
>> information for quick triage, but long term historical logging would be
>> maintained on the rsyslog server
>>
> This needs to be co-ordinated with log-collection, as well as the bootstrapping
> code.

Yep.  Lots of stuff for vdsm/oVirt Engine team to do in order to meet
this requirement :)
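
For whoever ends up touching the logging side, the split described above
(a small size-capped local log for quick triage, plus a remote syslog server
for history) looks roughly like this from a Python daemon.  This is a sketch
using only the standard logging module; the logger name, size limits and
function name are invented, and it does not reflect how vdsm actually sets
up its logging.

```python
import logging
import logging.handlers
from typing import Optional


def build_logger(local_path: str,
                 remote_host: Optional[str] = None,
                 max_bytes: int = 1_000_000,
                 backups: int = 2) -> logging.Logger:
    """Log to a size-capped local file and, optionally, to a remote syslog host."""
    logger = logging.getLogger("node-sketch")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.handlers.clear()

    # Local copy: rotated by size so it cannot fill a RAM-backed filesystem.
    local = logging.handlers.RotatingFileHandler(
        local_path, maxBytes=max_bytes, backupCount=backups)
    logger.addHandler(local)

    # Remote copy: long-term history lives on the syslog server (UDP port 514).
    if remote_host is not None:
        remote = logging.handlers.SysLogHandler(address=(remote_host, 514))
        logger.addHandler(remote)

    return logger
```

Local usage is then bounded by roughly max_bytes * (backups + 1), whatever
the uptime of the host.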

In contrast, making oVirt Node stateless is quite trivial.  Most of the
work here is actually for vdsm and other related utilities (like log
collector)

Perry