feature suggestion: migration network

Livnat Peer lpeer at redhat.com
Sun Jan 13 11:53:23 UTC 2013


On 01/10/2013 02:54 PM, Simon Grinberg wrote:
> 
> 
> ----- Original Message -----
>> From: "Dan Kenigsberg" <danken at redhat.com>
>> To: "Doron Fediuck" <dfediuck at redhat.com>
>> Cc: "Simon Grinberg" <simon at redhat.com>, "Orit Wasserman" <owasserm at redhat.com>, "Laine Stump" <lstump at redhat.com>,
>> "Yuval M" <yuvalme at gmail.com>, "Limor Gavish" <lgavish at gmail.com>, arch at ovirt.org, "Mark Wu"
>> <wudxw at linux.vnet.ibm.com>
>> Sent: Thursday, January 10, 2013 1:46:08 PM
>> Subject: Re: feature suggestion: migration network
>>
>> On Thu, Jan 10, 2013 at 04:43:45AM -0500, Doron Fediuck wrote:
>>>
>>>
>>> ----- Original Message -----
>>>> From: "Simon Grinberg" <simon at redhat.com>
>>>> To: "Mark Wu" <wudxw at linux.vnet.ibm.com>, "Doron Fediuck"
>>>> <dfediuck at redhat.com>
>>>> Cc: "Orit Wasserman" <owasserm at redhat.com>, "Laine Stump"
>>>> <lstump at redhat.com>, "Yuval M" <yuvalme at gmail.com>, "Limor
>>>> Gavish" <lgavish at gmail.com>, arch at ovirt.org, "Dan Kenigsberg"
>>>> <danken at redhat.com>
>>>> Sent: Thursday, January 10, 2013 10:38:56 AM
>>>> Subject: Re: feature suggestion: migration network
>>>>
>>>>
>>>>
>>>> ----- Original Message -----
>>>>> From: "Mark Wu" <wudxw at linux.vnet.ibm.com>
>>>>> To: "Dan Kenigsberg" <danken at redhat.com>
>>>>> Cc: "Simon Grinberg" <simon at redhat.com>, "Orit Wasserman"
>>>>> <owasserm at redhat.com>, "Laine Stump" <lstump at redhat.com>,
>>>>> "Yuval M" <yuvalme at gmail.com>, "Limor Gavish"
>>>>> <lgavish at gmail.com>,
>>>>> arch at ovirt.org
>>>>> Sent: Thursday, January 10, 2013 5:13:23 AM
>>>>> Subject: Re: feature suggestion: migration network
>>>>>
>>>>> On 01/09/2013 03:34 AM, Dan Kenigsberg wrote:
>>>>>> On Tue, Jan 08, 2013 at 01:23:02PM -0500, Simon Grinberg
>>>>>> wrote:
>>>>>>>
>>>>>>> ----- Original Message -----
>>>>>>>> From: "Yaniv Kaul" <ykaul at redhat.com>
>>>>>>>> To: "Dan Kenigsberg" <danken at redhat.com>
>>>>>>>> Cc: "Limor Gavish" <lgavish at gmail.com>, "Yuval M"
>>>>>>>> <yuvalme at gmail.com>, arch at ovirt.org, "Simon Grinberg"
>>>>>>>> <sgrinber at redhat.com>
>>>>>>>> Sent: Tuesday, January 8, 2013 4:46:10 PM
>>>>>>>> Subject: Re: feature suggestion: migration network
>>>>>>>>
>>>>>>>> On 08/01/13 15:04, Dan Kenigsberg wrote:
>>>>>>>>> There's been talk about this for ages, so it's time to have a
>>>>>>>>> proper discussion and a feature page about it: let us have a
>>>>>>>>> "migration" network role, and use such networks to carry
>>>>>>>>> migration data.
>>>>>>>>>
>>>>>>>>> When Engine requests to migrate a VM from one node to another,
>>>>>>>>> the VM state (BIOS, IO devices, RAM) is transferred over a
>>>>>>>>> TCP/IP connection that is opened from the source qemu process
>>>>>>>>> to the destination qemu. Currently, destination qemu listens
>>>>>>>>> for the incoming connection on the management IP address of the
>>>>>>>>> destination host. This has serious downsides: a "migration
>>>>>>>>> storm" may choke the destination's management interface;
>>>>>>>>> migration is plaintext, and ovirtmgmt includes Engine, which
>>>>>>>>> may sit outside the node cluster.
>>>>>>>>>
>>>>>>>>> With this feature, a cluster administrator may grant the
>>>>>>>>> "migration" role to one of the cluster networks. Engine would
>>>>>>>>> use that network's IP address on the destination host when it
>>>>>>>>> requests a migration of a VM. With proper network setup,
>>>>>>>>> migration data would be separated to that network.
>>>>>>>>>
>>>>>>>>> === Benefit to oVirt ===
>>>>>>>>> * Users would be able to define and dedicate a separate network
>>>>>>>>>   for migration. Users that need quick migration would use nics
>>>>>>>>>   with high bandwidth. Users who want to cap the bandwidth
>>>>>>>>>   consumed by migration could define a migration network over
>>>>>>>>>   nics with bandwidth limitation.
>>>>>>>>> * Migration data can be limited to a separate network that has
>>>>>>>>>   no layer-2 access from Engine.
>>>>>>>>>
>>>>>>>>> === Vdsm ===
>>>>>>>>> The "migrate" verb should be extended with an additional
>>>>>>>>> parameter, specifying the address that the remote qemu process
>>>>>>>>> should listen on. A new argument is to be added to the
>>>>>>>>> currently-defined migration arguments:
>>>>>>>>> * vmId: UUID
>>>>>>>>> * dst: management address of destination host
>>>>>>>>> * dstparams: hibernation volumes definition
>>>>>>>>> * mode: migration/hibernation
>>>>>>>>> * method: rotten legacy
>>>>>>>>> * ''New'': migration uri, according to
>>>>>>>>>   http://libvirt.org/html/libvirt-libvirt.html#virDomainMigrateToURI2
>>>>>>>>>   such as tcp://<ip of migration network on remote node>
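To make the shape of the proposed verb concrete, here is a rough sketch
of the extended parameter set; the 'miguri' key name and all values are
illustrative only, not a settled vdsm API:

    # Illustrative only: candidate parameter set for the extended
    # "migrate" verb. The 'miguri' name is hypothetical; the rest
    # mirrors the argument list quoted above.
    migrate_params = {
        'vmId': '12345678-1234-1234-1234-123456789abc',  # UUID of the VM
        'dst': '192.0.2.10:54321',     # management address of destination
        'dstparams': {},               # hibernation volumes, if relevant
        'mode': 'remote',              # migration, as opposed to hibernation
        'method': 'online',            # the "rotten legacy" knob
        'miguri': 'tcp://198.51.100.10',  # NEW: migration-network IP on
                                          # the remote node
    }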
>>>>>>>>>
>>>>>>>>> === Engine ===
>>>>>>>>> As usual, complexity lies here, and several changes are
>>>>>>>>> required:
>>>>>>>>>
>>>>>>>>> 1. Network definition.
>>>>>>>>> 1.1 A new network role - not unlike "display network" - should
>>>>>>>>>     be added. Only one migration network should be defined on
>>>>>>>>>     a cluster.
>>>>>>> We are considering multiple display networks already, then why
>>>>>>> not the same for migration?
>>>>>> What is the motivation for having multiple migration networks?
>>>>>> Extending the bandwidth (and thus, any network can be taken when
>>>>>> needed) or data separation (and thus, a migration network should
>>>>>> be assigned to each VM in the cluster)? Or another motivation,
>>>>>> with its own consequences?
>>>>> My suggestion is to determine the migration network dynamically on
>>>>> each migration. If we only define one migration network per
>>>>> cluster, a migration storm could hit that network and have a bad
>>>>> impact on VM applications. So I think Engine could choose the
>>>>> network with the lower traffic load for each migration, or leave
>>>>> the choice to the user.
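As a sketch of what that dynamic choice might look like, assuming a
made-up data shape and 'load' metric (Engine itself is Java; Python is
used here only for brevity):

    # Illustrative sketch: pick the least-loaded network that carries
    # the "migration" role. The dict layout and 'load' field are invented.
    def pick_migration_network(cluster_networks):
        candidates = [net for net in cluster_networks
                      if 'migration' in net['roles']]
        if not candidates:
            raise LookupError('no migration network defined on cluster')
        return min(candidates, key=lambda net: net['load'])

    # e.g. pick_migration_network([
    #     {'name': 'mig1', 'roles': {'migration'}, 'load': 0.4},
    #     {'name': 'mig2', 'roles': {'migration'}, 'load': 0.1},
    # ]) returns the 'mig2' entry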
>>>>
>>>> Dynamic migration network selection is indeed desirable, but only
>>>> from among migration networks - migration traffic is insecure, so
>>>> it's undesirable to have it mixed with VM traffic unless permitted
>>>> by the admin by marking that network as a migration network.
>>>>
>>>> To clarify what I meant in the previous response to Livnat - when I
>>>> said "...if the customer, due to the unsymmetrical nature of most
>>>> bonding modes, prefers to use multiple networks for migration and
>>>> will ask us to optimize migration across these..."
>>>>
>>>> But the dynamic selection should be based on SLA, of which the
>>>> above is just a part:
>>>> 1. Need to consider tenant traffic segregation rules = security
>>>> 2. SLA contracts
>>
>> We could devise a complex logic of assigning each VM a pool of
>> applicable migration networks, where one of them is chosen by Engine
>> upon migration startup.
>>
>> I am, however, not at all sure that extending the migration bandwidth
>> by means of multiple migration networks is worth the design hassle
>> and the GUI noise. A simpler solution would be to build a single
>> migration network on top of a fat bond, tweaked by a fine-tuned SLA.
> 
> Except for mode 4 (802.3ad/LACP), most bonding modes are optimized for either outbound or inbound traffic - not both. It's far from optimal.
> And you are forgetting the other reason I've raised: isolation of tenants' traffic, and not just for SLA reasons.
> 

Why do we need isolation of tenants' migration traffic if not for SLA
reasons?

> Even for pure active-active redundancy you may want to have more than one, or you may have asymmetrical hosts

That's again going back to SLA policies, and it's not specific to the
migration network.

> Example. 
> We have a host with 3 nics - you dedicate one each to management, migration, and storage, respectively. But if the migration network fails, you want the management network to become your migration network (automatically)
> 

Or you may not want that.
That's a policy for handling network roles, not something specific to
the migration network.


> Another:
> A large host with many nics and a smaller host with fewer - as long as there is a route between the migration and management networks, you could imagine a scenario where the larger host has a separate network for each role while the smaller host has a single network assuming both roles.
> 

I'm not sure this is the main use case, or that we want to complicate
the general flow because of exotic use cases.

Maybe what you are looking for is a host-level override of network
roles. Not sure how useful this is, though.

> Other examples can be found. 
> 

If you have some main use cases, I would love to hear them; maybe they
can make the requirement clearer.

> There's really more than one reason to support more than one migration network, or display network, or storage, or any other 'facility' network. Any facility network may call for more than one per cluster. 
> 

I'm not sure display can be put in the same bucket as migration,
management, and storage.

> 
>>
>>>>
>>>> If you keep #2, migration storm mitigation is granted. But you are
>>>> right that another feature required for #2 above is to control the
>>>> migration bandwidth (BW) per migration. We had a discussion in the
>>>> past about VDSM doing a dynamic calculation based on f(line speed,
>>>> max migration BW, max allowed per VM, free BW, number of migrating
>>>> machines) when starting a migration. (I actually wanted to do so
>>>> years ago, but never got to it - one of those things you always
>>>> postpone until you find the time.) We did not think that the engine
>>>> should provide some of these, but come to think of it, you are
>>>> right and it makes sense. For SLA - max per VM + min guaranteed
>>>> should be provided by the engine to maintain SLA. And it's up to
>>>> the engine to ensure that min-guaranteed times the number of
>>>> concurrent migrations does not exceed the max migration BW.
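A back-of-the-envelope sketch of the f() calculation described above;
the function names and the exact formula are guesses, not an agreed
design:

    # Illustrative only: per-migration bandwidth derived from the
    # inputs listed above. All names are hypothetical.
    def migration_bw_mbps(line_speed, max_migration_bw, max_per_vm,
                          free_bw, n_migrating):
        pool = min(line_speed, max_migration_bw, free_bw)   # usable pool
        return min(max_per_vm, pool / max(n_migrating, 1))  # fair share

    # Engine-side admission check: concurrent migrations times the
    # min-guaranteed BW must stay within the max migration BW.
    def admits_new_migration(n_migrating, min_guaranteed,
                             max_migration_bw):
        return (n_migrating + 1) * min_guaranteed <= max_migration_bw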
>>>>
>>>> Dan, this is way too much for an initial implementation, but don't
>>>> you think we should at least add placeholders in the migration API?
>>
>> In my opinion this should wait for another feature. For each VM, I'd
>> like to see a means to define the SLA of each of its vNICs. When we
>> have that, we should similarly define how much bandwidth it has for
>> migration.
>>
>>>> Maybe Doron can assist with the required verbs.
>>>>
>>>> (P.S., I don't want to alarm anyone, but we may need SLA parameters
>>>> for setupNetworks as well :) unless we want these as a separate
>>>> API, though that means more calls during setup)
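For illustration only, SLA parameters piggybacking on a setupNetworks
request might look something like this; the 'qos' sub-dict and its keys
are hypothetical, not an existing vdsm API:

    # Hypothetical sketch: SLA knobs riding along on setupNetworks.
    # The 'qos' keys do not exist today; everything here is a guess.
    networks = {
        'migration': {
            'nic': 'eth2',
            'bridged': False,
            'qos': {'min_mbps': 1000, 'max_mbps': 4000},  # hypothetical
        },
    }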
>>
>> Exactly - when we have a migration network concept, and when we have
>> a general network SLA definition, we could easily apply the latter to
>> the former.
>>
>>>>
>>>
>>> As with other resources, the bare minimum is usually MIN capacity
>>> and MAX, to avoid choking other tenants / VMs. In this context we
>>> may need to consider other QoS elements (delays, etc.), but indeed
>>> they can be an additional limitation on top of the basic one.
>>>
>>
> _______________________________________________
> Arch mailing list
> Arch at ovirt.org
> http://lists.ovirt.org/mailman/listinfo/arch
> 



