feature suggestion: initial generation of management network

Livnat Peer lpeer at redhat.com
Sun May 12 08:46:06 UTC 2013


On 05/12/2013 11:15 AM, Barak Azulay wrote:
> 
> 
> ----- Original Message -----
>> From: "Livnat Peer" <lpeer at redhat.com>
>> To: "Moti Asayag" <masayag at redhat.com>
>> Cc: "arch" <arch at ovirt.org>, "Alon Bar-Lev" <abarlev at redhat.com>, "Barak Azulay" <bazulay at redhat.com>, "Simon
>> Grinberg" <sgrinber at redhat.com>
>> Sent: Sunday, May 12, 2013 9:59:07 AM
>> Subject: Re: feature suggestion: initial generation of management network
>>
>> Thread Summary -
>>
>> 1. We all agree the automatic reboot after host installation is not
>> needed anymore and can be removed.
>>
>> 2. There is broad agreement that we need to add a new VDSM verb for reboot.
> 
> I disagree with the above
> 
> In addition to the fact that it will not work when VDSM is not responsive (when this action will be needed the most) 
> 

You can fence the node if VDSM is non-responsive; that's the mechanism
we use today to deal with such cases.
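
To make the fallback concrete, here is a minimal sketch of the decision
(not engine code). The function name, its arguments and the returned
labels are all hypothetical:

```python
def choose_restart_method(vdsm_responsive, pm_configured):
    """Pick how the engine could restart a host (illustrative only)."""
    if vdsm_responsive:
        # The proposed VDSM reboot verb only works while VDSM answers.
        return "vdsm-reboot-verb"
    if pm_configured:
        # Fencing goes through power management, so it does not need VDSM.
        return "fence"
    # Neither path is available; an administrator has to step in.
    return "manual"
```

The last branch is the case of a host without PM whose VDSM is down,
which neither mechanism covers.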

> 
>>
>> 3. There was a suggestion to add a checkbox when adding a host to reboot
>> the host after installation, default would be not to reboot. (leaving
>> the option to reboot to the administrator).
>>
>>
>> If there is no objection we'll go with the above.
>>
>> Thanks, Livnat
>>
>>
>> On 05/07/2013 02:22 PM, Moti Asayag wrote:
>>> I stumbled upon a few issues with the current design while implementing it:
>>>
>>> There seems to be a requirement to reboot the host after the installation
>>> is completed in order to assure the host is recoverable.
>>>
>>> Therefore, the building blocks of the installation process of 3.3 are:
>>> 1. Host deploy, which installs the host except for configuring its
>>> management network.
>>> 2. SetupNetwork (and CommitNetworkChanges) - for creating the management
>>> network on the host and persisting the network configuration.
>>> 3. Reboot the host - this is a missing piece. (Engine has the FenceVds
>>> command, but it requires power management to be configured prior to the
>>> installation and might be irrelevant for hosts without PM.)
>>>
>>> So, there are a couple of issues here:
>>> 1. How to reboot the host?
>>> 1.1. By exposing a new RebootNode verb in VDSM and invoking it from the
>>> engine.
>>> 1.2. By opening an ssh dialog to the host in order to execute the reboot.
>>>
>>> 2. When to perform the reboot?
>>> 2.1. After host deploy, by utilizing the host deploy to perform the reboot.
>>> This requires the monitor to configure the network when the host is
>>> detected by the engine, detached from the installation flow. However, it
>>> is a step toward the non-persistent network feature yet to be defined.
>>> 2.2. After setupNetwork is done and the network was configured and
>>> persisted on the host.
>>> There is no special advantage from the recoverability aspect, as
>>> setupNetwork is constantly used to persist the network configuration (by
>>> the complementary CommitNetworkChanges command).
>>> If the network configuration fails, VDSM will revert to the last
>>> known-good configuration, so connectivity with the engine should be
>>> restored. Design-wise, it fits to configure the management network as
>>> part of the installation sequence.
>>> If the network configuration fails in this context, the host status will
>>> be set to "InstallFailed" rather than to "NonOperational", which is what
>>> a failed setupNetwork command might otherwise produce.
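
As a rough sketch of option 2.2, the install sequence could look like
the following. This is not engine code: the function, the three
callables and the "Installed" label are hypothetical placeholders, and
the reboot step is left out because how to perform it is still open.

```python
def install_host(deploy, setup_network, commit_network):
    """Run the install steps in order and return a host status string.

    Each argument is a callable returning True on success. All names
    are hypothetical stand-ins, not engine or VDSM APIs.
    """
    if not deploy():
        return "InstallFailed"
    if not setup_network():
        # VDSM is expected to roll back to the last known-good
        # configuration, so the engine can still reach the host.
        return "InstallFailed"
    if not commit_network():
        # The configuration was applied but could not be persisted.
        return "InstallFailed"
    return "Installed"  # placeholder label for the success case
```

Any network failure inside this flow surfaces as "InstallFailed", never
as "NonOperational".
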
>>>
>>>
>>> Your inputs are welcome.
>>>
>>> Thanks,
>>> Moti
>>> ----- Original Message -----
>>>> From: "Dan Kenigsberg" <danken at redhat.com>
>>>> To: "Simon Grinberg" <simon at redhat.com>, "Moti Asayag"
>>>> <masayag at redhat.com>
>>>> Cc: "arch" <arch at ovirt.org>
>>>> Sent: Tuesday, January 1, 2013 2:47:57 PM
>>>> Subject: Re: feature suggestion: initial generation of management network
>>>>
>>>> On Thu, Dec 27, 2012 at 07:36:40AM -0500, Simon Grinberg wrote:
>>>>>
>>>>>
>>>>> ----- Original Message -----
>>>>>> From: "Dan Kenigsberg" <danken at redhat.com>
>>>>>> To: "Simon Grinberg" <simon at redhat.com>
>>>>>> Cc: "arch" <arch at ovirt.org>
>>>>>> Sent: Thursday, December 27, 2012 2:14:06 PM
>>>>>> Subject: Re: feature suggestion: initial generation of management
>>>>>> network
>>>>>>
>>>>>> On Tue, Dec 25, 2012 at 09:29:26AM -0500, Simon Grinberg wrote:
>>>>>>>
>>>>>>>
>>>>>>> ----- Original Message -----
>>>>>>>> From: "Dan Kenigsberg" <danken at redhat.com>
>>>>>>>> To: "arch" <arch at ovirt.org>
>>>>>>>> Sent: Tuesday, December 25, 2012 2:27:22 PM
>>>>>>>> Subject: feature suggestion: initial generation of management
>>>>>>>> network
>>>>>>>>
>>>>>>>> Current condition:
>>>>>>>> ==================
>>>>>>>> The management network, named ovirtmgmt, is created during host
>>>>>>>> bootstrap. It consists of a bridge device, connected to the
>>>>>>>> network
>>>>>>>> device that was used to communicate with Engine (nic, bonding or
>>>>>>>> vlan).
>>>>>>>> It inherits its ip settings from the latter device.
>>>>>>>>
>>>>>>>> Why Is the Management Network Needed?
>>>>>>>> =====================================
>>>>>>>> Understandably, some may ask why we need a management network, and
>>>>>>>> why having a host with IPv4 configured on it is not enough.
>>>>>>>> The answer is twofold:
>>>>>>>> 1. In oVirt, a network is an abstraction of the resources required
>>>>>>>>    for connectivity of a host for a specific usage. This is true for
>>>>>>>>    the management network just as it is for a VM network or a display
>>>>>>>>    network. The network entity is the key for adding/changing nics
>>>>>>>>    and IP addresses.
>>>>>>>> 2. On many occasions (such as small setups) the management network is
>>>>>>>>    used as a VM/display network as well.
>>>>>>>>
>>>>>>>> Problems in current connectivity:
>>>>>>>> ================================
>>>>>>>> According to alonbl of ovirt-host-deploy fame, and with no conflict
>>>>>>>> to my own experience, creating the management network is the most
>>>>>>>> fragile, error-prone step of bootstrap.
>>>>>>>
>>>>>>> +1,
>>>>>>> I've raised that repeatedly in the past: bootstrap should not create
>>>>>>> the management network but pick up the existing configuration and
>>>>>>> let the engine override it later with its own configuration if it
>>>>>>> differs. I'm glad that we finally get to that.
>>>>>>>
>>>>>>>>
>>>>>>>> Currently it always creates a bridged network (even if the DC
>>>>>>>> requires a non-bridged ovirtmgmt), it knows nothing about the defined
>>>>>>>> MTU for ovirtmgmt, it uses ping to guess on top of which device to
>>>>>>>> build (and thus requires Vdsm-to-Engine reverse connectivity), and is
>>>>>>>> the sole remaining user of the addNetwork/vdsm-store-net-conf scripts.
>>>>>>>>
>>>>>>>> Suggested feature:
>>>>>>>> ==================
>>>>>>>> Bootstrap would avoid creating a management network. Instead, after
>>>>>>>> bootstrapping a host, Engine would send a getVdsCaps probe to the
>>>>>>>> installed host, receiving a complete picture of the network
>>>>>>>> configuration on the host. Part of this picture is the device that
>>>>>>>> holds the host's management IP address.
>>>>>>>>
>>>>>>>> Engine would send a setupNetwork command to generate ovirtmgmt with
>>>>>>>> details derived from this picture, and according to the DC definition
>>>>>>>> of ovirtmgmt.  For example, if Vdsm reports:
>>>>>>>>
>>>>>>>> - vlan bond4.3000 has the host's IP, configured to use dhcp.
>>>>>>>> - bond4 comprises eth2 and eth3
>>>>>>>> - ovirtmgmt is defined as a VM network with MTU 9000
>>>>>>>> then Engine sends the likes of:
>>>>>>>>   setupNetworks(ovirtmgmt: {bridged=True, vlan=3000, iface=bond4,
>>>>>>>>                 bonding=bond4: {eth2,eth3}, MTU=9000})
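
For illustration, the derivation in this example could be sketched as
below. The input dicts are a simplified, hypothetical shape and do not
mirror the real getVdsCaps output; the returned structure is likewise
only "the likes of" a setupNetworks call, not its exact signature.

```python
def build_ovirtmgmt_request(host_report, dc_network):
    """Derive setupNetworks-style arguments for ovirtmgmt.

    host_report and dc_network are simplified, hypothetical dicts.
    """
    device = host_report["mgmt_device"]        # e.g. "bond4.3000"
    base, _, vlan = device.partition(".")      # vlan is "" if untagged
    attrs = {
        "bridged": dc_network["vm_network"],   # VM network implies a bridge
        "mtu": dc_network["mtu"],
        "bootproto": host_report["bootproto"],
    }
    if vlan:
        attrs["vlan"] = int(vlan)
    slaves = host_report.get("bondings", {}).get(base)
    if slaves:
        # The management device sits on a bond; keep its members.
        attrs["bonding"] = base
        bondings = {base: {"nics": sorted(slaves)}}
    else:
        attrs["nic"] = base
        bondings = {}
    return {"ovirtmgmt": attrs}, bondings


# The example from the mail: bond4.3000 over eth2+eth3, dhcp, MTU 9000.
networks, bondings = build_ovirtmgmt_request(
    {"mgmt_device": "bond4.3000", "bootproto": "dhcp",
     "bondings": {"bond4": ["eth2", "eth3"]}},
    {"vm_network": True, "mtu": 9000},
)
```

Note that the VLAN id here is taken from the host, while bridging and
MTU come from the DC definition.
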
>>>>>>>
>>>>>>> Just one comment here,
>>>>>>> In order to save time and confusion - if the ovirtmgmt is defined
>>>>>>> with default values meaning the user did not bother to touch it,
>>>>>>> let it pick up the VLAN configuration from the first host added in
>>>>>>> the Data Center.
>>>>>>>
>>>>>>> Otherwise, you may override the host VLAN and lose connectivity.
>>>>>>>
>>>>>>> This will also solve the situation many users encounter today.
>>>>>>> 1. The engine is on a host that actually has a VLAN defined
>>>>>>> 2. The ovirtmgmt network was not updated in the DC
>>>>>>> 3. A host with a VLAN already defined is added - everything works
>>>>>>> fine
>>>>>>> 4. Any number of hosts are now added, again everything seems to
>>>>>>> work fine.
>>>>>>>
>>>>>>> But now try to use setupNetworks, and you'll find out that you
>>>>>>> can't do much on the interface that contains ovirtmgmt, since
>>>>>>> the definition does not match. You can't sync (since this will
>>>>>>> remove the VLAN and cause connectivity loss), and you can't add more
>>>>>>> networks on top, since it already has a non-VLAN network on top
>>>>>>> according to the DC definition, etc.
>>>>>>>
>>>>>>> On the other hand you can't update the ovirtmgmt definition on the
>>>>>>> DC since there are clusters in the DC that use the network.
>>>>>>>
>>>>>>> The only workaround not involving a DB hack to change the VLAN on
>>>>>>> the network is to:
>>>>>>> 1. Create a new DC
>>>>>>> 2. Do not use the wizard that pops up to create your cluster.
>>>>>>> 3. Modify the ovirtmgmt network to have VLANs
>>>>>>> 4. Now create a cluster and add your hosts.
>>>>>>>
>>>>>>> If you insist on using the default DC and cluster then before
>>>>>>> adding the first host, create an additional DC and move the
>>>>>>> Default cluster over there. You may then change the network on the
>>>>>>> Default cluster and then move the Default cluster back.
>>>>>>>
>>>>>>> Both are ugly, and should be solved by the proposal above.
>>>>>>>
>>>>>>> We do something similar for the Default cluster CPU level, where we
>>>>>>> set the initial level based on the first host added to the cluster.
>>>>>>
>>>>>> I'm not sure what Engine has for Default cluster CPU level. But I have
>>>>>> reservations about the hysteresis in your proposal - after a host is
>>>>>> added, the DC cannot forget ovirtmgmt's vlan.
>>>>>>
>>>>>> How about letting the admin edit ovirtmgmt's vlan at the DC level, thus
>>>>>> rendering all hosts out-of-sync? Then the admin could manually, or
>>>>>> through a script, or in the future through a distributed operation,
>>>>>> sync all the hosts to the definition.
>>>>>
>>>>> Usually if you do that you will lose connectivity to the hosts.
>>>>
>>>> Yes, changing the management vlan id (or ip address) is never fun, and
>>>> requires out-of-band intervention.
>>>>
>>>>> I'm not insisting on the automatic adjustment of the ovirtmgmt network to
>>>>> match the hosts' (that is just a nice touch); we can take the allow-edit
>>>>> approach.
>>>>>
>>>>> But allowing a VLAN change on the ovirtmgmt network will indeed solve
>>>>> the issue I'm trying to solve, while creating another issue: users
>>>>> expecting that we'll be able to re-tag the host from the engine side,
>>>>> which is challenging to do.
>>>>>
>>>>> On the other hand, if we allow changing the VLAN as long as the change
>>>>> matches the hosts' configuration, it will solve the issue without
>>>>> misleading the user into thinking that we can really solve the
>>>>> chicken-and-egg issue of re-tagging the entire system.
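
The "allow the edit only when it matches the hosts" rule could be
checked along these lines. This is a sketch under assumptions: the
function name and the shape of host_vlans are hypothetical, not an
existing engine validation.

```python
def can_change_mgmt_vlan(new_vlan, host_vlans):
    """Allow the DC-level change only if every host already uses new_vlan.

    host_vlans maps a host name to the VLAN id reported for its
    management device (None for untagged). Returns (allowed, mismatched).
    """
    mismatched = sorted(h for h, v in host_vlans.items() if v != new_vlan)
    return (not mismatched, mismatched)
```

Returning the mismatched hosts lets the caller tell the admin which
hosts still need to be re-tagged out of band before the edit is
accepted.
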
>>>>>
>>>>> Now with the above ability you do get a flow to do the re-tag.
>>>>> 1. Place all the hosts in maintenance
>>>>> 2. Re-tag the ovirtmgmt on all the hosts
>>>>> 3. Re-tag the host on which the engine runs
>>>>> 4. Activate the hosts - this should work well now since connectivity
>>>>> exists
>>>>> 5. Change the tag on ovirtmgmt on the engine to match the hosts'
>>>>>
>>>>> Simple and clear process.
>>>>>
>>>>> When the workaround of creating another DC was not possible, since the
>>>>> system was already long in use and the need was to re-tag the network,
>>>>> the above is what I've recommended in the past, except that steps 4-5
>>>>> were done as:
>>>>> 4. Stop the engine
>>>>> 5. Change the tag in the DB
>>>>> 6. Start the engine
>>>>> 7. Activate the hosts
>>>>
>>>> Sounds reasonable to me - but as far as I am aware this is not tightly
>>>> related to the $Subject, which is the post-boot ovirtmgmt definition.
>>>>
>>>> I've added a few details to
>>>> http://www.ovirt.org/Features/Normalized_ovirtmgmt_Initialization#Engine
>>>> and I would appreciate a review from someone with intimate Engine
>>>> know-how.
>>>>
>>>> Dan.
>>>>
>>> _______________________________________________
>>> Arch mailing list
>>> Arch at ovirt.org
>>> http://lists.ovirt.org/mailman/listinfo/arch
>>>
>>>
>>
>>



