feature suggestion: initial generation of management network

Omer Frenkel ofrenkel at redhat.com
Tue May 7 13:11:05 UTC 2013



----- Original Message -----
> From: "Moti Asayag" <masayag at redhat.com>
> To: "arch" <arch at ovirt.org>
> Cc: "Alon Bar-Lev" <abarlev at redhat.com>
> Sent: Tuesday, May 7, 2013 2:22:19 PM
> Subject: Re: feature suggestion: initial generation of management network
> 
> I stumbled upon a few issues with the current design while implementing it:
> 
> There seems to be a requirement to reboot the host after the installation
> is completed in order to assure the host is recoverable.
> 
> Therefore, the building blocks of the installation process of 3.3 are:
> 1. host deploy, which installs the host except for configuring its
> management network.
> 2. SetupNetwork (and CommitNetworkChanges) - for creating the management
> network on the host and persisting the network configuration.
> 3. Reboot the host - This is a missing piece. (engine has FenceVds command,
> but it requires the power management to be configured prior to the
> installation and might be irrelevant for hosts without PM.)
> 
> So, there are a couple of issues here:
> 1. How to reboot the host?
> 1.1. By exposing new RebootNode verb in VDSM and invoking it from the engine
> 1.2. By opening ssh dialog to the host in order to execute the reboot
> 

Why not send a reboot flag with CommitNetworkChanges, which is sent anyway?
That is one less call (or connection, if you choose ssh) and easier to do.
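
Roughly what I have in mind, as a Python sketch. The function name, the
reboot parameter and the two callables are invented for illustration; this
is not the existing VDSM verb signature:

```python
from typing import Callable


def commit_network_changes(
    persist: Callable[[], None],
    reboot_host: Callable[[], None],
    reboot: bool = False,
) -> dict:
    """Hypothetical commit handler: persist the config, then optionally reboot.

    `persist` and `reboot_host` stand in for whatever the host really does;
    both are assumptions made for this sketch.
    """
    persist()  # save the running network configuration first
    if reboot:
        reboot_host()  # only reached if persisting did not raise
    return {"status": "done", "reboot_requested": reboot}


# Example: record the order of operations instead of touching a real host.
calls = []
result = commit_network_changes(
    persist=lambda: calls.append("persist"),
    reboot_host=lambda: calls.append("reboot"),
    reboot=True,
)
```

The point is only the ordering: the reboot rides on the call that is sent
anyway, and happens after the configuration has been persisted.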

> 2. When to perform the reboot?
> 2.1. After host deploy, by utilizing the host deploy to perform the reboot.
> It requires the monitor to configure the network when the host is detected
> by the engine, detached from the installation flow. However, it is a step
> toward the non-persistent network feature yet to be defined.
> 2.2. After setupNetwork is done and the network was configured and persisted
> on the host.
> There is no special advantage from the recoverability aspect, as setupNetwork
> is constantly used to persist the network configuration (by the
> complementary CommitNetworkChanges command).
> In case the network configuration fails, VDSM will revert to the last
> known-good configuration, so connectivity with the engine should be restored.
> Design-wise, it fits to configure the management network as part of the
> installation sequence.
> If the network configuration fails in this context, the host status will be
> set to "InstallFailed" rather than "NonOperational",
> as might occur as a result of a failed setupNetwork command.
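
Option 2.2 could be sketched in Python as below. The "InstallFailed" status
comes from the description above; the function, the exception class, the
step callables and the "Installed" success value are placeholders assumed
for the sketch, not Engine code:

```python
from typing import Callable


class NetworkSetupError(Exception):
    """Raised by the hypothetical network steps when configuration fails."""


def install_host(
    deploy: Callable[[], None],
    setup_management_network: Callable[[], None],
    commit_network_changes: Callable[[], None],
) -> str:
    """Run the installation steps in order and return the host status."""
    deploy()
    try:
        setup_management_network()
        commit_network_changes()
    except NetworkSetupError:
        # The failure happened inside the installation flow, so report
        # InstallFailed rather than NonOperational.
        return "InstallFailed"
    return "Installed"  # placeholder name for the success status


def failing_setup() -> None:
    raise NetworkSetupError("management network could not be configured")


failed_status = install_host(lambda: None, failing_setup, lambda: None)
ok_status = install_host(lambda: None, lambda: None, lambda: None)
```
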
> 
> 
> Your inputs are welcome.
> 
> Thanks,
> Moti
> ----- Original Message -----
> > From: "Dan Kenigsberg" <danken at redhat.com>
> > To: "Simon Grinberg" <simon at redhat.com>, "Moti Asayag" <masayag at redhat.com>
> > Cc: "arch" <arch at ovirt.org>
> > Sent: Tuesday, January 1, 2013 2:47:57 PM
> > Subject: Re: feature suggestion: initial generation of management network
> > 
> > On Thu, Dec 27, 2012 at 07:36:40AM -0500, Simon Grinberg wrote:
> > > 
> > > 
> > > ----- Original Message -----
> > > > From: "Dan Kenigsberg" <danken at redhat.com>
> > > > To: "Simon Grinberg" <simon at redhat.com>
> > > > Cc: "arch" <arch at ovirt.org>
> > > > Sent: Thursday, December 27, 2012 2:14:06 PM
> > > > Subject: Re: feature suggestion: initial generation of management
> > > > network
> > > > 
> > > > On Tue, Dec 25, 2012 at 09:29:26AM -0500, Simon Grinberg wrote:
> > > > > 
> > > > > 
> > > > > ----- Original Message -----
> > > > > > From: "Dan Kenigsberg" <danken at redhat.com>
> > > > > > To: "arch" <arch at ovirt.org>
> > > > > > Sent: Tuesday, December 25, 2012 2:27:22 PM
> > > > > > Subject: feature suggestion: initial generation of management
> > > > > > network
> > > > > > 
> > > > > > Current condition:
> > > > > > ==================
> > > > > > The management network, named ovirtmgmt, is created during host
> > > > > > bootstrap. It consists of a bridge device, connected to the
> > > > > > network
> > > > > > device that was used to communicate with Engine (nic, bonding or
> > > > > > vlan).
> > > > > > It inherits its ip settings from the latter device.
> > > > > > 
> > > > > > Why Is the Management Network Needed?
> > > > > > =====================================
> > > > > > > Understandably, some may ask why we need to have a management
> > > > > > > network, and why having a host with IPv4 configured on it is not
> > > > > > > enough.
> > > > > > > The answer is twofold:
> > > > > > > 1. In oVirt, a network is an abstraction of the resources
> > > > > > >    required for connectivity of a host for a specific usage.
> > > > > > >    This is true for the management network just as it is for a
> > > > > > >    VM network or a display network.
> > > > > > >    The network entity is the key for adding/changing nics and IP
> > > > > > >    addresses.
> > > > > > 2. In many occasions (such as small setups) the management
> > > > > > network is
> > > > > >    used as a VM/display network as well.
> > > > > > 
> > > > > > Problems in current connectivity:
> > > > > > ================================
> > > > > > > According to alonbl of ovirt-host-deploy fame, and with no conflict
> > > > > > > to my own experience, creating the management network is the most
> > > > > > > fragile, error-prone step of bootstrap.
> > > > > 
> > > > > +1,
> > > > > > I've raised that repeatedly in the past: bootstrap should not create
> > > > > > the management network but pick up the existing configuration and
> > > > > > let the engine override it later with its own configuration if it
> > > > > > differs. I'm glad that we finally get to that.
> > > > > 
> > > > > > 
> > > > > > > Currently it always creates a bridged network (even if the DC
> > > > > > > requires a non-bridged ovirtmgmt), it knows nothing about the
> > > > > > > defined MTU for ovirtmgmt, it uses ping to guess on top of which
> > > > > > > device to build (and thus requires Vdsm-to-Engine reverse
> > > > > > > connectivity), and is the sole remaining user of the
> > > > > > > addNetwork/vdsm-store-net-conf scripts.
> > > > > > 
> > > > > > Suggested feature:
> > > > > > ==================
> > > > > > > Bootstrap would avoid creating a management network. Instead,
> > > > > > > after bootstrapping a host, Engine would send a getVdsCaps probe
> > > > > > > to the installed host, receiving a complete picture of the network
> > > > > > > configuration on the host. Part of this picture is the device
> > > > > > > that holds the host's management IP address.
> > > > > > > 
> > > > > > > Engine would send a setupNetwork command to generate ovirtmgmt with
> > > > > > > details derived from this picture, and according to the DC
> > > > > > > definition of ovirtmgmt.  For example, if Vdsm reports:
> > > > > > 
> > > > > > - vlan bond4.3000 has the host's IP, configured to use dhcp.
> > > > > > > - bond4 comprises eth2 and eth3
> > > > > > - ovirtmgmt is defined as a VM network with MTU 9000
> > > > > > 
> > > > > > then Engine sends the likes of:
> > > > > > >   setupNetworks(ovirtmgmt: {bridged=True, vlan=3000, iface=bond4,
> > > > > > >                 bonding=bond4: {eth2,eth3}, MTU=9000})
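
As a minimal Python sketch of how such a request might be derived from the
host report and the DC definition: the dictionary layouts and key names
below (management_device, bondings, vm_network, mtu, iface, nics) are
assumptions for illustration, not the real getVdsCaps or setupNetworks
formats:

```python
def build_ovirtmgmt_request(host_report: dict, dc_network: dict) -> dict:
    """Combine what the host reports with the DC definition of ovirtmgmt.

    Both input layouts are hypothetical and chosen only for this sketch.
    """
    device = host_report["management_device"]  # e.g. "bond4.3000"
    base, _, vlan = device.partition(".")      # split off the vlan tag, if any

    network = {
        "bridged": dc_network["vm_network"],   # a VM network is bridged
        "mtu": dc_network["mtu"],
        "iface": base,
    }
    if vlan:
        network["vlan"] = int(vlan)

    bondings = {}
    if base in host_report["bondings"]:
        # Keep the existing bond membership as reported by the host.
        bondings[base] = {"nics": list(host_report["bondings"][base])}

    return {"networks": {"ovirtmgmt": network}, "bondings": bondings}


report = {
    "management_device": "bond4.3000",
    "bondings": {"bond4": ["eth2", "eth3"]},
}
dc_definition = {"vm_network": True, "mtu": 9000}
request = build_ovirtmgmt_request(report, dc_definition)
```

With the example report above, the resulting request carries vlan 3000 on
top of bond4 (eth2 and eth3), bridged, with MTU 9000.
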
> > > > > 
> > > > > Just one comment here,
> > > > > In order to save time and confusion - if the ovirtmgmt is defined
> > > > > with default values meaning the user did not bother to touch it,
> > > > > let it pick up the VLAN configuration from the first host added in
> > > > > the Data Center.
> > > > > 
> > > > > > Otherwise, you may override the host VLAN and lose connectivity.
> > > > > 
> > > > > This will also solve the situation many users encounter today.
> > > > > > 1. The engine is on a host that actually has VLAN defined
> > > > > 2. The ovirtmgmt network was not updated in the DC
> > > > > > 3. A host with VLAN already defined is added - everything works
> > > > > > fine
> > > > > 4. Any number of hosts are now added, again everything seems to
> > > > > work fine.
> > > > > 
> > > > > > But now try to use setupNetworks, and you'll find out that you
> > > > > > can't do much on the interface that contains ovirtmgmt, since
> > > > > > the definition does not match. You can't sync (since this will
> > > > > > remove the VLAN and cause connectivity loss), and you can't add more
> > > > > > networks on top, since it already has a non-VLAN network on top
> > > > > > according to the DC definition, etc.
> > > > > 
> > > > > On the other hand you can't update the ovirtmgmt definition on the
> > > > > DC since there are clusters in the DC that use the network.
> > > > > 
> > > > > > The only workaround not involving a DB hack to change the VLAN on
> > > > > > the network is to:
> > > > > 1. Create new DC
> > > > > 2. Do not use the wizard that pops up to create your cluster.
> > > > > 3. Modify the ovirtmgmt network to have VLANs
> > > > > 4. Now create a cluster and add your hosts.
> > > > > 
> > > > > If you insist on using the default DC and cluster then before
> > > > > adding the first host, create an additional DC and move the
> > > > > Default cluster over there. You may then change the network on the
> > > > > > Default cluster and then move the Default cluster back.
> > > > > 
> > > > > > Both are ugly, and should be solved by the proposal above.
> > > > > 
> > > > > > We do something similar for the Default cluster CPU level, where we
> > > > > > set the initial level based on the first host added to the cluster.
> > > > 
> > > > > I'm not sure what Engine has for Default cluster CPU level. But I
> > > > > have reservations about the hysteresis in your proposal - after a
> > > > > host is added, the DC cannot forget ovirtmgmt's vlan.
> > > > 
> > > > > How about letting the admin edit ovirtmgmt's vlan at the DC level,
> > > > > thus rendering all hosts out-of-sync? Then the admin could manually,
> > > > > or through a script, or in the future through a distributed
> > > > > operation, sync all the hosts to the definition.
> > > 
> > > > Usually if you do that you will lose connectivity to the hosts.
> > 
> > Yes, changing the management vlan id (or ip address) is never fun, and
> > requires out-of-band intervention.
> > 
> > > > I'm not insisting on the automatic adjustment of the ovirtmgmt network to
> > > > match the hosts' (that is just a nice touch); we can take the allow-edit
> > > > approach.
> > > 
> > > > But allowing a VLAN change on the ovirtmgmt network will indeed solve the
> > > > issue I'm trying to solve while creating another issue: users expecting
> > > > that we'll be able to re-tag the host from the engine side, which is
> > > > challenging to do.
> > > 
> > > > On the other hand, if we allow changing the VLAN only as long as the
> > > > change matches the hosts' configuration, it will solve the issue while
> > > > not misleading the user into thinking that we really can solve the
> > > > chicken-and-egg issue of re-tagging the entire system.
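
The "allow the change only if it matches the hosts" rule could look like the
following Python sketch. The function name and the use of None for "no VLAN
tag" are assumptions for illustration; Engine's actual validation is not
described in this thread:

```python
from typing import Iterable, Optional


def can_change_mgmt_vlan(
    new_vlan: Optional[int],
    host_vlans: Iterable[Optional[int]],
) -> bool:
    """Allow the DC-level edit only if every host already uses the new tag.

    None stands for "no VLAN tag" in this sketch.
    """
    return all(vlan == new_vlan for vlan in host_vlans)


all_tagged = can_change_mgmt_vlan(3000, [3000, 3000, 3000])
one_untagged = can_change_mgmt_vlan(3000, [3000, None, 3000])
```

Note that with no hosts in the DC the check passes trivially, which matches
the case where the network can still be edited freely.
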
> > > 
> > > Now with the above ability you do get a flow to do the re-tag.
> > > 1. Place all the hosts in maintenance
> > > 2. Re-tag the ovirtmgmt on all the hosts
> > > > 3. Re-tag the host on which the engine runs
> > > > 4. Activate the hosts - this should work well now since connectivity
> > > > exists
> > > 5. Change the tag on ovirtmgmt on the engine to match the hosts'
> > > 
> > > Simple and clear process.
> > > 
> > > > When the workaround of creating another DC was not possible, since the
> > > > system was already long in use and the need was to re-tag the network, the
> > > > above is what I've recommended in the past, except that steps 4-5 were
> > > > done as:
> > > 4. Stop the engine
> > > 5. Change the tag in the DB
> > > 6. Start the engine
> > > 7. Activate the hosts
> > 
> > Sounds reasonable to me - but as far as I am aware this is not tightly
> > related to the $Subject, which is the post-boot ovirtmgmt definition.
> > 
> > I've added a few details to
> > http://www.ovirt.org/Features/Normalized_ovirtmgmt_Initialization#Engine
> > > and I would appreciate a review from someone with intimate Engine
> > know-how.
> > 
> > Dan.
> > 
> _______________________________________________
> Arch mailing list
> Arch at ovirt.org
> http://lists.ovirt.org/mailman/listinfo/arch
> 


