feature suggestion: migration network

Doron Fediuck dfediuck at redhat.com
Thu Jan 10 09:43:45 UTC 2013



----- Original Message -----
> From: "Simon Grinberg" <simon at redhat.com>
> To: "Mark Wu" <wudxw at linux.vnet.ibm.com>, "Doron Fediuck" <dfediuck at redhat.com>
> Cc: "Orit Wasserman" <owasserm at redhat.com>, "Laine Stump" <lstump at redhat.com>, "Yuval M" <yuvalme at gmail.com>, "Limor
> Gavish" <lgavish at gmail.com>, arch at ovirt.org, "Dan Kenigsberg" <danken at redhat.com>
> Sent: Thursday, January 10, 2013 10:38:56 AM
> Subject: Re: feature suggestion: migration network
> 
> 
> 
> ----- Original Message -----
> > From: "Mark Wu" <wudxw at linux.vnet.ibm.com>
> > To: "Dan Kenigsberg" <danken at redhat.com>
> > Cc: "Simon Grinberg" <simon at redhat.com>, "Orit Wasserman"
> > <owasserm at redhat.com>, "Laine Stump" <lstump at redhat.com>,
> > "Yuval M" <yuvalme at gmail.com>, "Limor Gavish" <lgavish at gmail.com>,
> > arch at ovirt.org
> > Sent: Thursday, January 10, 2013 5:13:23 AM
> > Subject: Re: feature suggestion: migration network
> > 
> > On 01/09/2013 03:34 AM, Dan Kenigsberg wrote:
> > > On Tue, Jan 08, 2013 at 01:23:02PM -0500, Simon Grinberg wrote:
> > >>
> > >> ----- Original Message -----
> > >>> From: "Yaniv Kaul" <ykaul at redhat.com>
> > >>> To: "Dan Kenigsberg" <danken at redhat.com>
> > >>> Cc: "Limor Gavish" <lgavish at gmail.com>, "Yuval M"
> > >>> <yuvalme at gmail.com>, arch at ovirt.org, "Simon Grinberg"
> > >>> <sgrinber at redhat.com>
> > >>> Sent: Tuesday, January 8, 2013 4:46:10 PM
> > >>> Subject: Re: feature suggestion: migration network
> > >>>
> > >>> On 08/01/13 15:04, Dan Kenigsberg wrote:
> > >>>> There's been talk about this for ages, so it's time to have a
> > >>>> proper discussion and a feature page about it: let us have a
> > >>>> "migration" network role, and use such networks to carry
> > >>>> migration data.
> > >>>>
> > >>>> When Engine requests to migrate a VM from one node to another,
> > >>>> the VM state (BIOS, IO devices, RAM) is transferred over a TCP/IP
> > >>>> connection that is opened from the source qemu process to the
> > >>>> destination qemu. Currently, the destination qemu listens for the
> > >>>> incoming connection on the management IP address of the
> > >>>> destination host. This has serious downsides: a "migration storm"
> > >>>> may choke the destination's management interface, and migration
> > >>>> is plaintext while ovirtmgmt includes Engine, which may sit
> > >>>> outside the node cluster.
> > >>>>
> > >>>> With this feature, a cluster administrator may grant the
> > >>>> "migration" role to one of the cluster networks. Engine would use
> > >>>> that network's IP address on the destination host when it
> > >>>> requests a migration of a VM. With proper network setup,
> > >>>> migration data would be separated to that network.
> > >>>>
> > >>>> === Benefit to oVirt ===
> > >>>> * Users would be able to define and dedicate a separate network
> > >>>>   for migration. Users that need quick migration would use nics
> > >>>>   with high bandwidth. Users who want to cap the bandwidth
> > >>>>   consumed by migration could define a migration network over
> > >>>>   nics with bandwidth limitation.
> > >>>> * Migration data can be limited to a separate network that has no
> > >>>>   layer-2 access from Engine.
> > >>>>
> > >>>> === Vdsm ===
> > >>>> The "migrate" verb should be extended with an additional
> > >>>> parameter, specifying the address that the remote qemu process
> > >>>> should listen on. A new argument is to be added to the
> > >>>> currently-defined migration arguments:
> > >>>> * vmId: UUID
> > >>>> * dst: management address of destination host
> > >>>> * dstparams: hibernation volumes definition
> > >>>> * mode: migration/hibernation
> > >>>> * method: rotten legacy
> > >>>> * ''New'': migration uri, according to
> > >>>>   http://libvirt.org/html/libvirt-libvirt.html#virDomainMigrateToURI2
> > >>>>   such as tcp://<ip of migration network on remote node>
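> > >>>>
> > >>>> A minimal sketch of how such a URI could be passed to the
> > >>>> virDomainMigrateToURI2 call above (addresses and the VM name are
> > >>>> invented for illustration, not the final vdsm code):
> > >>>>
> > >>>>   import libvirt
> > >>>>
> > >>>>   dst_mgmt_ip = "192.0.2.10"          # management address
> > >>>>   dst_migration_ip = "198.51.100.10"  # migration-network address
> > >>>>
> > >>>>   conn = libvirt.open("qemu:///system")
> > >>>>   dom = conn.lookupByName("myvm")
> > >>>>
> > >>>>   # destination libvirtd is still reached over the management
> > >>>>   # network; only the qemu-to-qemu stream uses the migration net
> > >>>>   dconnuri = "qemu+tls://%s/system" % dst_mgmt_ip
> > >>>>   miguri = "tcp://%s" % dst_migration_ip
> > >>>>
> > >>>>   flags = libvirt.VIR_MIGRATE_LIVE | libvirt.VIR_MIGRATE_PEER2PEER
> > >>>>   dom.migrateToURI2(dconnuri, miguri, None, flags, None, 0)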
> > >>>>
> > >>>> === Engine ===
> > >>>> As usual, complexity lies here, and several changes are required:
> > >>>>
> > >>>> 1. Network definition.
> > >>>> 1.1 A new network role - not unlike "display network" - should be
> > >>>>     added. Only one migration network should be defined on a
> > >>>>     cluster.
> > >> We are considering multiple display networks already, so why not
> > >> the same for migration?
> > > What is the motivation for having multiple migration networks?
> > > Extending the bandwidth (and thus, any network can be taken when
> > > needed), or data separation (and thus, a migration network should be
> > > assigned to each VM in the cluster)? Or another motivation with
> > > consequences?
> > My suggestion is to determine the migration network dynamically on
> > each migrate command. If we only define one migration network per
> > cluster, a migration storm could hit that network and badly impact VM
> > applications. So I think engine could choose the network which has
> > the lowest traffic load at migration time, or leave the choice to the
> > user.
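> >
> > A rough sketch of such a selection (attribute names are invented, not
> > an existing vdsm or engine API):
> >
> >     def pick_migration_network(networks):
> >         """Pick the least-loaded network carrying the migration role."""
> >         candidates = [net for net in networks if net.migration_role]
> >         if not candidates:
> >             return None  # fall back to the legacy ovirtmgmt behavior
> >         return min(candidates, key=lambda net: net.current_load)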
> 
> Dynamic migration network selection is indeed desirable, but only from
> among migration networks - migration traffic is insecure, so it's
> undesirable to mix it with VM traffic unless the admin permits it by
> marking that network as a migration network.
> 
> To clarify what I meant in the previous response to Livnat - when I
> said "...if the customer, due to the unsymmetrical nature of most
> bonding modes, prefers to use multiple networks for migration and will
> ask us to optimize migration across these..."
> 
> But the dynamic selection should be based on SLA, of which the above
> is just one part:
> 1. Need to consider tenant traffic segregation rules = security
> 2. SLA contracts
>
> If you honor #2, migration-storm mitigation is granted. But you are
> right that another feature required for #2 above is to control the
> migration bandwidth (BW) per migration. We had a discussion in the
> past about VDSM doing a dynamic calculation based on f(Line Speed, Max
> Migration BW, Max allowed per VM, Free BW, number of migrating
> machines) when starting a migration. (I actually wanted to do so years
> ago, but never got to it - one of those things you always postpone
> until you find the time.) We did not think that the engine should
> provide some of these parameters, but coming to think of it, you are
> right and it makes sense. For SLA, Max per VM + Min guaranteed should
> be provided by the engine to maintain the SLA. And it's up to the
> engine to ensure that the number of concurrent migrations times the
> per-VM Min-Guaranteed does not exceed the Max Migration BW.
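>
> A back-of-the-envelope version of that f() calculation (all names are
> illustrative; how the inputs combine is only one possible reading):
>
>     def migration_bw_mbps(line_speed, max_migration_bw, max_per_vm,
>                           free_bw, migrating_vms):
>         """Bandwidth cap for a single migration, in Mbps."""
>         pool = min(line_speed, max_migration_bw, free_bw)
>         return min(max_per_vm, pool // max(migrating_vms, 1))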
> 
> Dan, this is way too much for an initial implementation, but don't you
> think we should at least add placeholders in the migration API? Maybe
> Doron can assist with the required verbs.
> 
> (P.S. I don't want to alarm anyone, but we may need SLA parameters for
> setupNetworks as well :) unless we want these as a separate API,
> though that means more calls during setup.)
> 

As with other resources, the bare minimum is usually MIN capacity and
MAX capacity, to avoid choking other tenants / VMs. In this context we
may need to consider other QoS elements (delays, etc.), but indeed they
can be an additional limitation on top of the basic one.
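
As a placeholder, something as simple as the following could be carried
by the migrate verb until the full SLA work matures (field names are
only a strawman):

    sla = {
        'minGuaranteedMbps': 200,  # floor, so a migration can converge
        'maxMbps': 1000,           # ceiling, so tenants are not starved
    }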

> > 
> > >
> > >>
> > >>>> 1.2 If none is defined, the legacy "use ovirtmgmt for migration"
> > >>>>     behavior would apply.
> > >>>> 1.3 A migration network is more likely to be a ''required''
> > >>>>     network, but a user may opt for non-required. He may face
> > >>>>     unpleasant surprises if he wants to migrate his machine, but
> > >>>>     no candidate host has the network available.
> > >> I think the enforcement should be at least one migration network
> > >> per host -> in case we support more than one.
> > >> Else, always required.
> > > Fine by me - if we keep the backward behavior of ovirtmgmt being a
> > > migration network by default. I think the worst case is that the
> > > user finds out - at the least convenient moment - that oVirt 3.3
> > > would not migrate his VMs without the "migration" role explicitly
> > > assigned.
> > >
> > >>>> 1.4 The "migration" role can be granted or taken on-the-fly,
> > >>>>     when hosts are active, as long as there are no
> > >>>>     currently-migrating VMs.
> > >>>>
> > >>>> 2. Scheduler
> > >>>> 2.1 When deciding which host should be used for automatic
> > >>>>     migration, take into account the existence and availability
> > >>>>     of the migration network on the destination host (see the
> > >>>>     sketch below).
> > >>>> 2.2 For manual migration, let the user migrate a VM to a host
> > >>>>     with no migration network - if the admin wants to keep
> > >>>>     jamming the management network with migration traffic, let
> > >>>>     her.
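> > >>>>
> > >>>> A sketch of the filter in 2.1 (hypothetical helper, not actual
> > >>>> scheduler code):
> > >>>>
> > >>>>   def hosts_for_auto_migration(hosts, migration_net):
> > >>>>       # keep only hosts where the migration network is attached
> > >>>>       # and operational
> > >>>>       return [h for h in hosts
> > >>>>               if migration_net in h.operational_networks]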
> > >> Since you send the migration network per migration command, why
> > >> not allow choosing any network on the host, the same as you allow
> > >> choosing a host? If no host is selected, then allow choosing from
> > >> the cluster's networks. The default should be the cluster's
> > >> migration network.
> > > Cool. Added to wiki page.
> > >
> > >> If you allow for the above, we can waive the enforcement of a
> > >> migration network per host. No migration network == no automatic
> > >> migration to/from this host.
> > > Again, I'd prefer to keep the current default status of ovirtmgmt
> > > as a migration network. Besides that, +1.
> > >
> > >>
> > >>>> 3. VdsBroker migration verb.
> > >>>> 3.1 For a modern cluster level, with the migration network
> > >>>>     defined on the destination host, an additional ''miguri''
> > >>>>     parameter should be added to the "migrate" command.
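> > >>>>
> > >>>> For illustration, the resulting migrate parameters might then
> > >>>> look like this (all values invented):
> > >>>>
> > >>>>   {'vmId': 'b6f0cbbe-0000-0000-0000-000000000000',
> > >>>>    'dst': '192.0.2.10:54321',
> > >>>>    'mode': 'migration',
> > >>>>    'miguri': 'tcp://198.51.100.10'}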
> > >>>>
> > >>> How is the authentication of the peers handled? Do we need a cert
> > >>> for each source/destination logical interface?
> > > I hope Orit or Laine will correct me, but I am not aware of any
> > > authentication scheme that protects a non-tunneled qemu destination
> > > from an evil process with network access to the host.
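> > >
> > > For comparison, a tunneled variant carries the stream inside the
> > > authenticated libvirtd connection, at the cost of extra data
> > > copies - roughly (continuing the earlier sketch):
> > >
> > >     flags = (libvirt.VIR_MIGRATE_LIVE
> > >              | libvirt.VIR_MIGRATE_PEER2PEER
> > >              | libvirt.VIR_MIGRATE_TUNNELLED)
> > >     dom.migrateToURI2(dconnuri, None, None, flags, None, 0)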
> > >
> > > Dan.
> > >
> > 
> > 
> 


