
* Just stating the obvious: users need to remove and re-add the host on every reboot. This will not make the feature a lovable one from the user's point of view.
I think the point mburns is trying to make in his initial email is that we're going to need to do some joint work between the node and vdsm teams to change the registration process so that this is no longer necessary. It will require some redesigning of the registration process.
* During initial boot, vdsm-reg configures the networking and creates a management network bridge. This is a very delicate process which may fail due to networking issues such as resolution, routing, etc. So re-doing this on every boot increases the chances of losing a node due to network problems.
Well, if the network is busted which leads to the bridge rename failing, wouldn't the fact that the network is broken cause other problems anyhow? So I don't see this as a problem. If your network doesn't work properly, don't expect hosts in the network to subsequently work properly. As an aside, why is reverse DNS lookup a requirement? If we remove that it makes things a lot easier, no?
* CA pollution; generating a certificate on each reboot for each node will create a huge number of certificates on the engine side, which eventually may damage the CA. (Unsure if there's a limit on the number of certificates, but having hundreds of junk certs can't be good).
I don't think we should regenerate a new certificate on each boot. I think we need a way for an already registered host to retrieve its certificate from the oVirt Engine server. Using an embedded encryption key (if you trust your mgmt network or are booting from embedded flash), or for the paranoid a key stored in the TPM, vdsm can safely retrieve the certificate from the oVirt Engine server on each boot, so that it's not required to regenerate/reregister on each boot.
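To make the idea a bit more concrete, here is a rough Python sketch of how an already registered node might prove its identity with an embedded key and get its stored certificate back. Everything in it is an assumption for illustration only: the names `sign_request` and `EngineCertStore`, the HMAC-SHA256 scheme, and the message layout are hypothetical and are not how vdsm or oVirt Engine work today. Replay protection (e.g. an engine-issued nonce) and transport security are left out.

```python
import hashlib
import hmac


def sign_request(shared_key: bytes, host_id: str, nonce: bytes) -> str:
    """Node side (hypothetical): sign host id + nonce with the embedded key."""
    msg = host_id.encode("utf-8") + b"|" + nonce
    return hmac.new(shared_key, msg, hashlib.sha256).hexdigest()


class EngineCertStore:
    """Engine side (hypothetical): keeps the certificate issued at first registration."""

    def __init__(self):
        self._hosts = {}  # host_id -> (shared_key, cert_pem)

    def register(self, host_id: str, shared_key: bytes, cert_pem: str) -> None:
        # Done once, at initial registration; no new certificate per boot.
        self._hosts[host_id] = (shared_key, cert_pem)

    def retrieve(self, host_id: str, nonce: bytes, signature: str):
        """Return the stored certificate if the signature checks out, else None."""
        record = self._hosts.get(host_id)
        if record is None:
            return None
        shared_key, cert_pem = record
        expected = sign_request(shared_key, host_id, nonce)
        # Constant-time comparison to avoid leaking how much of the signature matched.
        if not hmac.compare_digest(expected, signature):
            return None
        return cert_pem
```

On each boot the node would call something like `sign_request(key, "node-01", nonce)` and send the result to the engine, which hands back the same certificate every time instead of minting a new one.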
* Today there's a supported flow where, for nodes with a password, the user is allowed to use the "add host" scenario. For stateless, it means re-configuring a password on every boot...
This flow would still be applicable. We are going to allow setting of the admin password embedded in the core ISO via an offline process. Once vdsm is fixed to use a non-root account for the installation flow, this is no longer a problem. Also, if we (as described above) make registrations persistent across reboots by changing the registration flow a bit, then the install user password only need be set for the initial boot anyhow. Therefore I think that, as a prerequisite for stateless oVirt Node, we must remove root account usage for registration/installation.
- Other issues
* Local storage; so far we were able to define a local storage in ovirt node. Stateless will block this ability.
It shouldn't. The Node should be able to automatically scan locally attached disks to look for a well-defined VG or partition label and, based on that, automatically activate/mount. Stateless doesn't imply diskless. It is a requirement even for stateless node usage to be able to leverage locally attached disks both for VM storage and also for swap.
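As a rough sketch of the label-scan part, something like the following Python could pick out the devices to mount. It assumes the input is text in the key=value block format printed by `blkid -o export`; the label `OVIRT_LOCAL` and the function names are made up for this example, and the actual activate/mount step is not shown.

```python
def parse_blkid_export(text):
    """Parse `blkid -o export`-style output into a list of dicts, one per device."""
    devices = []
    # Devices are separated by blank lines; each line is KEY=value.
    for block in text.strip().split("\n\n"):
        entry = {}
        for line in block.splitlines():
            key, sep, value = line.partition("=")
            if sep:
                entry[key.strip()] = value.strip()
        if entry:
            devices.append(entry)
    return devices


def find_by_label(devices, label):
    """Return device names whose LABEL matches the agreed-upon label."""
    return [d["DEVNAME"] for d in devices
            if d.get("LABEL") == label and "DEVNAME" in d]


# Hypothetical sample input; the label name is an assumption for illustration.
SAMPLE = """DEVNAME=/dev/sda1
LABEL=OVIRT_LOCAL
TYPE=ext4

DEVNAME=/dev/sdb1
TYPE=swap
"""

if __name__ == "__main__":
    print(find_by_label(parse_blkid_export(SAMPLE), "OVIRT_LOCAL"))
```

The same approach would apply to a well-known VG name; only the source of the device list changes.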
* Node upgrade; currently it's possible to upgrade a node from the engine. In stateless it will fail, since there is nowhere to download the ISO file to.
Upgrades are no longer needed with stateless. To upgrade a stateless node all you need to do is 'reboot from a newer image', i.e. all upgrades would be done via PXE server image replacement. So the flow of 'upload ISO to running oVirt Node' is no longer even necessary.
* Collecting information; core dumps and logging may not be available due to lack of space? Or will it cause kernel panic if all space is consumed?
We already provide the ability to send kdumps to a remote ssh/NFS location, and already provide the ability to use both collectd and rsyslog to pipe logs/stats to remote server(s). Local logs can be set to logrotate to a reasonable size so that the local RAM FS always contains recent log information for quick triage, but long-term historical logging would be maintained on the rsyslog server.

Perry