feature suggestion: initial generation of management network

Alon Bar-Lev alonbl at redhat.com
Sun May 12 08:25:45 UTC 2013



----- Original Message -----
> From: "Barak Azulay" <bazulay at redhat.com>
> To: "Livnat Peer" <lpeer at redhat.com>
> Cc: "Alon Bar-Lev" <abarlev at redhat.com>, "arch" <arch at ovirt.org>, "Simon Grinberg" <sgrinber at redhat.com>
> Sent: Sunday, May 12, 2013 11:15:20 AM
> Subject: Re: feature suggestion: initial generation of management network
> 
> 
> 
> ----- Original Message -----
> > From: "Livnat Peer" <lpeer at redhat.com>
> > To: "Moti Asayag" <masayag at redhat.com>
> > Cc: "arch" <arch at ovirt.org>, "Alon Bar-Lev" <abarlev at redhat.com>, "Barak
> > Azulay" <bazulay at redhat.com>, "Simon
> > Grinberg" <sgrinber at redhat.com>
> > Sent: Sunday, May 12, 2013 9:59:07 AM
> > Subject: Re: feature suggestion: initial generation of management network
> > 
> > Thread Summary -
> > 
> > 1. We all agree the automatic reboot after host installation is not
> > needed anymore and can be removed.
> > 
> > 2. There is broad agreement that we need to add a new VDSM verb for
> > reboot.
> 
> I disagree with the above
> 
> In addition to the fact that it will not work when VDSM is not responsive
> (when this action will be needed the most)

If vdsm is unresponsive because of a fault in vdsm, we can add a fail-safe mechanism for critical commands within vdsm.
And we can always fall back to the standard fencing in such cases.

Can you please describe the scenario in which host-deploy succeeds and vdsm is unresponsive?

Current sequence:
1. host-deploy + reboot - all via single ssh session.

New sequence:
1. host-deploy - via ssh.
2. network setup - via vdsm.
3. optional reboot - via vdsm.

In the new sequence, vdsm must be responsive to accomplish (2), and if (2) succeeds, vdsm must, again, be responsive for (3).
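
To make the ordering argument concrete, here is a minimal sketch in Python. The function and exception names are hypothetical and are not engine or vdsm APIs; each step is simply passed in as a callable, and a step that needs vdsm is assumed to raise when vdsm does not answer.

```python
class VdsmUnresponsive(Exception):
    """Hypothetical error raised by a step when vdsm cannot be reached."""


def run_new_sequence(host_deploy, setup_network, reboot, want_reboot=False):
    """Run the steps of the new sequence in order.

    Returns the names of the steps that completed. Any exception raised by
    a step propagates, so later steps are never attempted after a failure.
    """
    done = []

    host_deploy()              # step 1: via ssh
    done.append("host-deploy")

    setup_network()            # step 2: via vdsm, fails if vdsm is unresponsive
    done.append("network-setup")

    if want_reboot:
        reboot()               # step 3: via vdsm, reached only after step 2
        done.append("reboot")

    return done
```

The optional reboot is only ever attempted after vdsm has just answered the network setup request, which is the point of the paragraph above.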

Thanks!

> 
> 
> > 
> > 3. There was a suggestion to add a checkbox when adding a host to reboot
> > the host after installation, default would be not to reboot. (leaving
> > the option to reboot to the administrator).
> > 
> > 
> > If there is no objection we'll go with the above.
> > 
> > Thanks, Livnat
> > 
> > 
> > On 05/07/2013 02:22 PM, Moti Asayag wrote:
> > > > I stumbled upon a few issues with the current design while implementing it:
> > > 
> > > There seems to be a requirement to reboot the host after the installation
> > > is completed in order to assure the host is recoverable.
> > > 
> > > Therefore, the building blocks of the installation process of 3.3 are:
> > > > 1. Host deploy, which installs the host except for configuring its
> > > >    management network.
> > > > 2. SetupNetwork (and CommitNetworkChanges) - for creating the management
> > > >    network on the host and persisting the network configuration.
> > > > 3. Reboot the host - this is a missing piece. (Engine has the FenceVds
> > > >    command, but it requires power management to be configured prior to
> > > >    the installation and might be irrelevant for hosts without PM.)
> > > 
> > > > So, there are a couple of issues here:
> > > > 1. How to reboot the host?
> > > > 1.1. By exposing a new RebootNode verb in VDSM and invoking it from the
> > > >      engine.
> > > > 1.2. By opening an ssh dialog to the host in order to execute the reboot.
> > > > 
> > > > 2. When to perform the reboot?
> > > > 2.1. After host deploy, by utilizing the host deploy to perform the
> > > >      reboot. This requires the monitor to configure the network when the
> > > >      host is detected by the engine, detached from the installation flow.
> > > >      However, it is a step toward the non-persistent network feature yet
> > > >      to be defined.
> > > > 2.2. After setupNetwork is done and the network was configured and
> > > >      persisted on the host. There is no special advantage from the
> > > >      recoverability aspect, as setupNetwork is constantly used to persist
> > > >      the network configuration (by the complementary CommitNetworkChanges
> > > >      command). In case the network configuration fails, VDSM will revert
> > > >      to the last known good configuration, so connectivity with the
> > > >      engine should be restored. Design-wise, it fits to configure the
> > > >      management network as part of the installation sequence. If the
> > > >      network configuration fails in this context, the host status will be
> > > >      set to "InstallFailed" rather than "NonOperational", as might occur
> > > >      as a result of a failed setupNetwork command.
> > > 
> > > 
> > > Your inputs are welcome.
> > > 
> > > Thanks,
> > > Moti
> > > ----- Original Message -----
> > >> From: "Dan Kenigsberg" <danken at redhat.com>
> > >> To: "Simon Grinberg" <simon at redhat.com>, "Moti Asayag"
> > >> <masayag at redhat.com>
> > >> Cc: "arch" <arch at ovirt.org>
> > >> Sent: Tuesday, January 1, 2013 2:47:57 PM
> > >> Subject: Re: feature suggestion: initial generation of management
> > >> network
> > >>
> > >> On Thu, Dec 27, 2012 at 07:36:40AM -0500, Simon Grinberg wrote:
> > >>>
> > >>>
> > >>> ----- Original Message -----
> > >>>> From: "Dan Kenigsberg" <danken at redhat.com>
> > >>>> To: "Simon Grinberg" <simon at redhat.com>
> > >>>> Cc: "arch" <arch at ovirt.org>
> > >>>> Sent: Thursday, December 27, 2012 2:14:06 PM
> > >>>> Subject: Re: feature suggestion: initial generation of management
> > >>>> network
> > >>>>
> > >>>> On Tue, Dec 25, 2012 at 09:29:26AM -0500, Simon Grinberg wrote:
> > >>>>>
> > >>>>>
> > >>>>> ----- Original Message -----
> > >>>>>> From: "Dan Kenigsberg" <danken at redhat.com>
> > >>>>>> To: "arch" <arch at ovirt.org>
> > >>>>>> Sent: Tuesday, December 25, 2012 2:27:22 PM
> > >>>>>> Subject: feature suggestion: initial generation of management
> > >>>>>> network
> > >>>>>>
> > >>>>>> Current condition:
> > >>>>>> ==================
> > > >>>>>> The management network, named ovirtmgmt, is created during host
> > > >>>>>> bootstrap. It consists of a bridge device, connected to the network
> > > >>>>>> device that was used to communicate with Engine (nic, bonding or
> > > >>>>>> vlan). It inherits its ip settings from the latter device.
> > >>>>>>
> > >>>>>> Why Is the Management Network Needed?
> > >>>>>> =====================================
> > > >>>>>> Understandably, some may ask why we need to have a management
> > > >>>>>> network - why having a host with IPv4 configured on it is not
> > > >>>>>> enough. The answer is twofold:
> > > >>>>>> 1. In oVirt, a network is an abstraction of the resources required
> > > >>>>>>    for connectivity of a host for a specific usage. This is true for
> > > >>>>>>    the management network just as it is for a VM network or a display
> > > >>>>>>    network. The network entity is the key for adding/changing nics
> > > >>>>>>    and IP addresses.
> > > >>>>>> 2. On many occasions (such as small setups) the management network is
> > > >>>>>>    used as a VM/display network as well.
> > >>>>>>
> > > >>>>>> Problems in current connectivity:
> > > >>>>>> =================================
> > > >>>>>> According to alonbl of ovirt-host-deploy fame, and with no conflict
> > > >>>>>> to my own experience, creating the management network is the most
> > > >>>>>> fragile, error-prone step of bootstrap.
> > >>>>>
> > >>>>> +1,
> > > >>>>> I've raised that repeatedly in the past: bootstrap should not create
> > > >>>>> the management network but pick up the existing configuration and
> > > >>>>> let the engine override it later with its own configuration if it
> > > >>>>> differs. I'm glad that we finally get to that.
> > >>>>>
> > >>>>>>
> > > >>>>>> Currently it always creates a bridged network (even if the DC
> > > >>>>>> requires a non-bridged ovirtmgmt), it knows nothing about the
> > > >>>>>> defined MTU for ovirtmgmt, it uses ping to guess on top of which
> > > >>>>>> device to build (and thus requires Vdsm-to-Engine reverse
> > > >>>>>> connectivity), and is the sole remaining user of the
> > > >>>>>> addNetwork/vdsm-store-net-conf scripts.
> > >>>>>>
> > >>>>>> Suggested feature:
> > >>>>>> ==================
> > > >>>>>> Bootstrap would avoid creating a management network. Instead, after
> > > >>>>>> bootstrapping a host, Engine would send a getVdsCaps probe to the
> > > >>>>>> installed host, receiving a complete picture of the network
> > > >>>>>> configuration on the host. Part of this picture is the device that
> > > >>>>>> holds the host's management IP address.
> > > >>>>>>
> > > >>>>>> Engine would send setupNetwork command to generate ovirtmgmt with
> > > >>>>>> details derived from this picture, and according to the DC definition
> > > >>>>>> of ovirtmgmt.  For example, if Vdsm reports:
> > > >>>>>>
> > > >>>>>> - vlan bond4.3000 has the host's IP, configured to use dhcp.
> > > >>>>>> - bond4 comprises eth2 and eth3
> > > >>>>>> - ovirtmgmt is defined as a VM network with MTU 9000
> > >>>>>> - ovirtmgmt is defined as a VM network with MTU 9000
> > >>>>>>
> > >>>>>> then Engine sends the likes of:
> > >>>>>>   setupNetworks(ovirtmgmt: {bridged=True, vlan=3000, iface=bond4,
> > >>>>>>                 bonding=bond4: {eth2,eth3}, MTU=9000)
> > >>>>>
> > >>>>> Just one comment here,
> > >>>>> In order to save time and confusion - if the ovirtmgmt is defined
> > >>>>> with default values meaning the user did not bother to touch it,
> > >>>>> let it pick up the VLAN configuration from the first host added in
> > >>>>> the Data Center.
> > >>>>>
> > > >>>>> Otherwise, you may override the host VLAN and lose connectivity.
> > >>>>>
> > >>>>> This will also solve the situation many users encounter today.
> > > >>>>> 1. The engine is on a host that actually has a VLAN defined
> > > >>>>> 2. The ovirtmgmt network was not updated in the DC
> > > >>>>> 3. A host with a VLAN already defined is added - everything works
> > > >>>>> fine
> > >>>>> 4. Any number of hosts are now added, again everything seems to
> > >>>>> work fine.
> > >>>>>
> > > >>>>> But, now try to use setupNetworks, and you'll find out that you
> > > >>>>> can't do much on the interface that contains the ovirtmgmt since
> > > >>>>> the definition does not match. You can't sync (since this will
> > > >>>>> remove the VLAN and cause connectivity loss), you can't add more
> > > >>>>> networks on top since it already has a non-VLAN network on top
> > > >>>>> according to the DC definition, etc.
> > >>>>>
> > >>>>> On the other hand you can't update the ovirtmgmt definition on the
> > >>>>> DC since there are clusters in the DC that use the network.
> > >>>>>
> > > >>>>> The only workaround not involving a DB hack to change the VLAN on the
> > > >>>>> network is to:
> > >>>>> 1. Create new DC
> > >>>>> 2. Do not use the wizard that pops up to create your cluster.
> > >>>>> 3. Modify the ovirtmgmt network to have VLANs
> > >>>>> 4. Now create a cluster and add your hosts.
> > >>>>>
> > >>>>> If you insist on using the default DC and cluster then before
> > >>>>> adding the first host, create an additional DC and move the
> > >>>>> Default cluster over there. You may then change the network on the
> > >>>>> Default cluster and then move the Default cluster back
> > >>>>>
> > >>>>> Both are ugly. And should be solved by the proposal above.
> > >>>>>
> > > >>>>> We do something similar for the Default cluster CPU level, where we
> > > >>>>> set the initial level based on the first host added to the cluster.
> > >>>>
> > > >>>> I'm not sure what Engine has for Default cluster CPU level. But I
> > > >>>> have reservations about the hysteresis in your proposal - after a
> > > >>>> host is added, the DC cannot forget ovirtmgmt's vlan.
> > > >>>>
> > > >>>> How about letting the admin edit ovirtmgmt's vlan at the DC level,
> > > >>>> thus rendering all hosts out-of-sync? Then the admin could manually,
> > > >>>> or through a script, or in the future through a distributed
> > > >>>> operation, sync all the hosts to the definition.
> > >>>
> > > >>> Usually if you do that you will lose connectivity to the hosts.
> > >>
> > >> Yes, changing the management vlan id (or ip address) is never fun, and
> > >> requires out-of-band intervention.
> > >>
> > > >>> I'm not insisting on the automatic adjustment of the ovirtmgmt network
> > > >>> to match the hosts' (that is just a nice touch); we can take the
> > > >>> allow-edit approach.
> > > >>>
> > > >>> But allowing a change of VLAN on the ovirtmgmt network will indeed
> > > >>> solve the issue I'm trying to solve while creating another issue: users
> > > >>> expecting that we'll be able to re-tag the host from the engine side,
> > > >>> which is challenging to do.
> > > >>>
> > > >>> On the other hand, if we allow changing the VLAN only as long as the
> > > >>> change matches the hosts' configuration, it will solve the issue while
> > > >>> not misleading the user into thinking that we really can solve the
> > > >>> chicken-and-egg issue of re-tagging the entire system.
> > >>>
> > >>> Now with the above ability you do get a flow to do the re-tag.
> > >>> 1. Place all the hosts in maintenance
> > >>> 2. Re-tag the ovirtmgmt on all the hosts
> > > >>> 3. Re-tag the host on which the engine runs
> > > >>> 4. Activate the hosts - this should work well now since connectivity
> > > >>>    exists
> > >>> 5. Change the tag on ovirtmgmt on the engine to match the hosts'
> > >>>
> > >>> Simple and clear process.
> > >>>
> > > >>> When the workaround of creating another DC was not possible since the
> > > >>> system was already long in use and the need was to re-tag the network,
> > > >>> the above is what I've recommended in the past, except that steps 4-5
> > > >>> were done as:
> > >>> 4. Stop the engine
> > >>> 5. Change the tag in the DB
> > >>> 6. Start the engine
> > >>> 7. Activate the hosts
> > >>
> > >> Sounds reasonable to me - but as far as I am aware this is not tightly
> > >> related to the $Subject, which is the post-boot ovirtmgmt definition.
> > >>
> > >> I've added a few details to
> > >> http://www.ovirt.org/Features/Normalized_ovirtmgmt_Initialization#Engine
> > > >> and I would appreciate a review from someone with intimate Engine
> > >> know-how.
> > >>
> > >> Dan.
> > >>
> > > _______________________________________________
> > > Arch mailing list
> > > Arch at ovirt.org
> > > http://lists.ovirt.org/mailman/listinfo/arch
> > > 
> > > 
> > 
> > 
> 


