On 22/02/12 18:06, Perry Myers wrote:
> * Just stating the obvious, which is users need
> to remove-add the host on every reboot. This will
> not make this feature a lovable one from the user's point of view.
I think the point mburns is trying to make in his initial email is that
we're going to need to do some joint work between node and vdsm teams to
change the registration process so that this is no longer necessary.
It will require some redesigning of the registration process.
I'm aware of it, and that's why I'm raising my concerns, so we'll
have a (partial) task list ;)
> * During initial boot, vdsm-reg configures the networking
> and creates a management network bridge. This is a very
> delicate process which may fail due to networking issues
> such as resolution, routing, etc. So re-doing this on
> every boot increases the chances of losing a node due
> to network problems.
Well, if the network is busted which leads to the bridge rename failing,
wouldn't the fact that the network is broken cause other problems anyhow?
Perry, my point is that we're increasing the chances to get
into these holes. Network is not busted most of the time, but occasionally
there's a glitch and we'd like to stay away from it. I'm sure
you know what I'm talking about.
So I don't see this as a problem. If your network doesn't work properly,
don't expect hosts in the network to subsequently work properly.
See previous answer.
As an aside, why is reverse DNS lookup a requirement? If we remove that,
it makes things a lot easier, no?
Not sure I'm the right guy to defend it, but in order to drop reverse-dns,
you need to consider dropping SSL, LDAP and some other important shortcuts...
> * CA pollution; generating a certificate on each reboot
> for each node will create a huge number of certificates
> on the engine side, which eventually may damage the CA.
> (Unsure if there's a limit on the number of certificates,
> but having hundreds of junk certs can't be good).
I don't think we should regenerate a new certificate on each boot. I
think we need a way for 'an already registered host to retrieve its
certificate from the oVirt Engine server'.
Using an embedded encryption key (if you trust your mgmt network or are
booting from embedded flash), or for the paranoid a key stored in TPM
can be used to have vdsm safely retrieve this from the oVirt Engine
server on each boot so that it's not required to regenerate/reregister
on each boot.
Thoughtful redesign needed here...
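
To make the "retrieve instead of regenerate" idea concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the `EngineCertStore` class, the `host_proof` helper, the HMAC-based proof and the placeholder certificate strings are hypothetical and are not part of vdsm, vdsm-reg or oVirt Engine. It only shows that a host holding an embedded key could fetch the same certificate on every boot, so the CA issues one certificate per host rather than one per reboot.

```python
import hashlib
import hmac


class EngineCertStore:
    """Hypothetical engine-side store: at most one certificate per host."""

    def __init__(self):
        self._hosts = {}   # host_id -> (shared_key, certificate)
        self.issued = 0    # how many certificates the CA has issued

    def register(self, host_id, shared_key):
        """First-time registration: issue a certificate exactly once."""
        if host_id not in self._hosts:
            self.issued += 1
            # Placeholder string, standing in for a real signed certificate.
            cert = f"CERT-{self.issued}-for-{host_id}"
            self._hosts[host_id] = (shared_key, cert)
        return self._hosts[host_id][1]

    def retrieve(self, host_id, nonce, proof):
        """Return the stored certificate if the host proves it holds the key."""
        entry = self._hosts.get(host_id)
        if entry is None:
            return None
        key, cert = entry
        expected = hmac.new(key, nonce + host_id.encode(), hashlib.sha256).digest()
        return cert if hmac.compare_digest(expected, proof) else None


def host_proof(shared_key, host_id, nonce):
    """Host side: prove possession of the embedded (or TPM-held) key."""
    return hmac.new(shared_key, nonce + host_id.encode(), hashlib.sha256).digest()


if __name__ == "__main__":
    store = EngineCertStore()
    key = b"k" * 32                      # assumed to be embedded in the image
    first = store.register("node-01", key)
    nonce = b"boot-nonce"                # a real flow would use an engine-chosen nonce
    again = store.retrieve("node-01", nonce, host_proof(key, "node-01", nonce))
    print(first == again, store.issued)
```

In a real design the nonce would have to come from the engine and be single-use to prevent replay, and the key would need the protection discussed above (trusted management network, embedded flash, or TPM).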
> * Today there's a supported flow that for nodes with
> password, the user is allowed to use the "add host"
> scenario. For stateless, it means re-configuring a password
> on every boot...
This flow would still be applicable. We are going to allow setting of
the admin password embedded in the core ISO via an offline process.
Once vdsm is fixed to use a non-root account for installation flow, this
is no longer a problem.
This is not exactly vdsm. More like vdsm-bootstrap.
Also, if we (as described above) make registrations persistent across
reboots by changing the registration flow a bit, then the install user
password only need be set for the initial boot anyhow.
Therefore I think as a requirement for stateless oVirt Node, we must
have, as a prerequisite, the removal of root account usage for
registration/installation.
This is both for vdsm and engine, and I'm not sure it's that trivial.
> - Other issues
>
> * Local storage; so far we were able to define a local
> storage in ovirt node. Stateless will block this ability.
It shouldn't. The Node should be able to automatically scan locally
attached disks to look for a well defined VG or partition label and,
based on that, automatically activate/mount them.
Stateless doesn't imply diskless. It is a requirement even for
stateless node usage to be able to leverage locally attached disks both
for VM storage and also for Swap.
Still, in a pure diskless setup you will not have local storage.
See also Mike's answer.
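
The label-based scan described above could look roughly like the sketch below. It is only an illustration: the function name `find_node_volumes` and the labels `OVIRT_DATA` and `OVIRT_SWAP` are made up for this example, and the device-to-label mapping is passed in as plain data instead of being read from the system, so nothing here reflects how oVirt Node actually discovers disks.

```python
def find_node_volumes(devices, wanted_labels):
    """Pick one device per well-known label.

    devices: mapping of device path -> label (None if unlabeled).
    wanted_labels: set of labels the node should activate or mount.
    Returns a mapping of label -> first matching device, in sorted path order.
    """
    matches = {}
    for dev, label in sorted(devices.items()):
        if label in wanted_labels:
            # Keep the first device found for each label; ignore duplicates.
            matches.setdefault(label, dev)
    return matches


if __name__ == "__main__":
    # Hypothetical labels; a real node would agree on these with the engine.
    wanted = {"OVIRT_DATA", "OVIRT_SWAP"}
    scanned = {
        "/dev/sda1": "OVIRT_SWAP",
        "/dev/sda2": "OVIRT_DATA",
        "/dev/sdb1": None,
    }
    print(find_node_volumes(scanned, wanted))
    # A diskless node simply yields no matches and no local storage.
    print(find_node_volumes({}, wanted))
```

Because the decision is driven only by what is found at boot, the same image works on nodes with and without local disks, which is the stateless-but-not-diskless case.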
> * Node upgrade; currently it's possible to upgrade a node
> from the engine. In stateless it will fail, since there is nowhere
> to download the ISO file to.
Upgrades are no longer needed with stateless. To upgrade a stateless
node all you need to do is 'reboot from a newer image'. i.e. all
upgrades would be done via PXE server image replacement. So the flow of
'upload ISO to running oVirt Node' is no longer even necessary.
This assumes a PXE-only use case. I'm not sure it's the only one.
> * Collecting information; core dumps and logging may not
> be available due to lack of space? Or will it cause kernel
> panic if all space is consumed?
We already provide the ability to send kdumps to a remote ssh/NFS location
and already provide the ability to use both collectd and rsyslog to pipe
logs/stats to remote server(s). Local logs can be set to logrotate to a
reasonable size so that local RAM FS always contains recent log
information for quick triage, but long term historical logging would be
maintained on the rsyslog server.
This needs to be co-ordinated with log-collection, as well as the bootstrapping
code.
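
As a rough illustration of the "small local log, long-term history on a remote server" split, the sketch below uses Python's standard `logging` handlers in place of logrotate and rsyslog, which is a substitution made purely for the example. The function name `make_node_logger`, the size limits and the remote address are all assumptions.

```python
import logging
import logging.handlers


def make_node_logger(name, local_path, max_bytes=64 * 1024, backups=2, remote=None):
    """Build a logger with a size-capped local file and optional remote syslog.

    Local disk (or RAM FS) usage is bounded by roughly
    max_bytes * (backups + 1); anything older is rotated away.
    remote: optional (host, port) of a syslog server for long-term history.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False

    # Bounded local copy, kept for quick triage on the node itself.
    local = logging.handlers.RotatingFileHandler(
        local_path, maxBytes=max_bytes, backupCount=backups
    )
    logger.addHandler(local)

    if remote is not None:
        # SysLogHandler sends over UDP by default.
        logger.addHandler(logging.handlers.SysLogHandler(address=remote))
    return logger


if __name__ == "__main__":
    import os
    import tempfile

    log_dir = tempfile.mkdtemp()
    log = make_node_logger("node-demo", os.path.join(log_dir, "node.log"),
                           max_bytes=1024, backups=2)
    for i in range(500):
        log.info("event %d", i)
    print(sorted(os.listdir(log_dir)))
```

The point of the cap is that log volume can never fill the RAM-backed filesystem, whatever the node writes; whether core dumps need the same treatment is a separate question for the kdump configuration.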
Perry
--
/d
"Willyoupleasehelpmefixmykeyboard?Thespacebarisbroken!"