feature suggestion: migration network

Simon Grinberg simon at redhat.com
Thu Jan 10 08:38:56 UTC 2013



----- Original Message -----
> From: "Mark Wu" <wudxw at linux.vnet.ibm.com>
> To: "Dan Kenigsberg" <danken at redhat.com>
> Cc: "Simon Grinberg" <simon at redhat.com>, "Orit Wasserman" <owasserm at redhat.com>, "Laine Stump" <lstump at redhat.com>,
> "Yuval M" <yuvalme at gmail.com>, "Limor Gavish" <lgavish at gmail.com>, arch at ovirt.org
> Sent: Thursday, January 10, 2013 5:13:23 AM
> Subject: Re: feature suggestion: migration network
> 
> On 01/09/2013 03:34 AM, Dan Kenigsberg wrote:
> > On Tue, Jan 08, 2013 at 01:23:02PM -0500, Simon Grinberg wrote:
> >>
> >> ----- Original Message -----
> >>> From: "Yaniv Kaul" <ykaul at redhat.com>
> >>> To: "Dan Kenigsberg" <danken at redhat.com>
> >>> Cc: "Limor Gavish" <lgavish at gmail.com>, "Yuval M"
> >>> <yuvalme at gmail.com>, arch at ovirt.org, "Simon Grinberg"
> >>> <sgrinber at redhat.com>
> >>> Sent: Tuesday, January 8, 2013 4:46:10 PM
> >>> Subject: Re: feature suggestion: migration network
> >>>
> >>> On 08/01/13 15:04, Dan Kenigsberg wrote:
> >>>> There's been talk about this for ages, so it's time to have a
> >>>> proper discussion and a feature page about it: let us have a
> >>>> "migration" network role, and use such networks to carry
> >>>> migration data.
> >>>>
> >>>> When Engine requests to migrate a VM from one node to another,
> >>>> the
> >>>> VM
> >>>> state (Bios, IO devices, RAM) is transferred over a TCP/IP
> >>>> connection
> >>>> that is opened from the source qemu process to the destination
> >>>> qemu.
> >>>> Currently, destination qemu listens for the incoming connection
> >>>> on
> >>>> the
> >>>> management IP address of the destination host. This has serious
> >>>> downsides: a "migration storm" may choke the destination's
> >>>> management interface; migration is plaintext, and ovirtmgmt
> >>>> includes Engine, which may sit outside the node cluster.
> >>>>
> >>>> With this feature, a cluster administrator may grant the
> >>>> "migration"
> >>>> role to one of the cluster networks. Engine would use that
> >>>> network's IP
> >>>> address on the destination host when it requests a migration of
> >>>> a
> >>>> VM.
> >>>> With proper network setup, migration data would be separated to
> >>>> that
> >>>> network.
> >>>>
> >>>> === Benefit to oVirt ===
> >>>> * Users would be able to define and dedicate a separate network
> >>>> for
> >>>>     migration. Users that need quick migration would use nics
> >>>>     with
> >>>>     high
> >>>>     bandwidth. Users who want to cap the bandwidth consumed by
> >>>>     migration
> >>>>     could define a migration network over nics with bandwidth
> >>>>     limitation.
> >>>> * Migration data can be limited to a separate network that has
> >>>>     no layer-2 access from Engine
> >>>>
> >>>> === Vdsm ===
> >>>> The "migrate" verb should be extended with an additional
> >>>> parameter,
> >>>> specifying the address that the remote qemu process should
> >>>> listen
> >>>> on. A
> >>>> new argument is to be added to the currently-defined migration
> >>>> arguments:
> >>>> * vmId: UUID
> >>>> * dst: management address of destination host
> >>>> * dstparams: hibernation volumes definition
> >>>> * mode: migration/hibernation
> >>>> * method: rotten legacy
> >>>> * ''New'': migration uri, according to
> >>>> http://libvirt.org/html/libvirt-libvirt.html#virDomainMigrateToURI2
> >>>> such as tcp://<ip of migration network on remote node>
> >>>>
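
To make the verb extension concrete, here is a rough sketch of the call as I picture it (the stand-in function and parameter values are illustrative, not the final VDSM API):

    # Illustrative only -- a stand-in for the extended VDSM verb.
    def migrate(vmId, dst, dstparams, mode, method, miguri=None):
        """miguri is the proposed new argument."""
        ...

    migrate(
        vmId='11111111-2222-3333-4444-555555555555',
        dst='192.0.2.10',               # management address of destination
        dstparams={},                   # hibernation volumes, when relevant
        mode='migration',
        method='online',
        miguri='tcp://198.51.100.10',   # new: IP on the migration network
    )
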
> >>>> === Engine ===
> >>>> As usual, complexity lies here, and several changes are
> >>>> required:
> >>>>
> >>>> 1. Network definition.
> >>>> 1.1 A new network role - not unlike "display network" - should be
> >>>>       added. Only one migration network should be defined on a
> >>>>       cluster.
> >> We are already considering multiple display networks, so why not
> >> the same for migration?
> > What is the motivation for having multiple migration networks?
> > Extending the bandwidth (and thus, any network can be taken when
> > needed) or data separation (and thus, a migration network should be
> > assigned to each VM in the cluster)? Or another motivation with
> > other consequences?
> My suggestion is to determine the migration network dynamically on
> each migrate. If we only define one migration network per cluster,
> a migration storm could hit that network and badly impact the VM
> applications running on it. So I think the engine could choose the
> network with the lowest traffic load for migration, or leave the
> choice to the user.

Dynamic migration-network selection is indeed desirable, but only among networks with the migration role: migration traffic is insecure, so it should not be mixed with VM traffic unless the admin explicitly permits it by marking that network as a migration network.
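
Roughly the selection logic I have in mind, assuming the engine can query per-network utilization (a minimal sketch; all names are made up):

    # Illustration only: choose the least-loaded network among those the
    # admin explicitly marked with the "migration" role.
    def pick_migration_network(networks, load_of):
        candidates = [n for n in networks if n.get('is_migration')]
        if not candidates:
            return None   # legacy fallback: migrate over ovirtmgmt
        return min(candidates, key=load_of)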

To clarify what I meant in my previous response to Livnat, I said: "...if the customer, due to the unsymmetrical nature of most bonding modes, prefers to use multiple networks for migration and will ask us to optimize migration across these..."

But the dynamic selection should be based on SLA, of which the above is just one part:
1. Tenant traffic segregation rules need to be considered = security
2. SLA contracts

If you honor #2, migration-storm mitigation comes for free. But you are right that another feature required for #2 above is controlling the migration bandwidth (BW) per migration. We discussed in the past having VDSM do a dynamic calculation, based on f(Line Speed, Max Migration BW, Max allowed per VM, Free BW, number of migrating machines), when starting a migration. (I actually wanted to do this years ago, but never got to it - one of those things you always postpone until you find the time.) Back then we did not think the engine should provide any of these values, but come to think of it, you are right and it makes sense. For SLA, Max per VM + Min guaranteed should be provided by the engine, and it's up to the engine to ensure that (Min guaranteed x number of concurrent migrations) never exceeds Max Migration BW.
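
The dynamic VDSM-side calculation I mentioned could look roughly like this (the function name and exact formula are illustrative, not an agreed design):

    # Illustrative sketch of the dynamic per-migration cap; all values
    # in Mbps. f(Line Speed, Max Migration BW, Max per VM, Free BW, N).
    def migration_bw_cap(line_speed, max_migration_bw, max_per_vm,
                         free_bw, n_migrating):
        # never exceed the link speed, the migration budget, or the
        # currently free capacity
        budget = min(line_speed, max_migration_bw, free_bw)
        # split the budget across concurrent migrations and respect the
        # per-VM ceiling
        return min(max_per_vm, budget / max(n_migrating, 1))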

Dan, this is way too much for an initial implementation, but don't you think we should at least add placeholders in the migration API?
Maybe Doron can assist with the required verbs.
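
For example, the verb could reserve optional arguments that early implementations simply ignore (hypothetical names, just to sketch the idea):

    # Hypothetical SLA placeholders for the migrate verb -- names are
    # illustrative and would be ignored until the SLA logic lands.
    sla_placeholders = {
        'maxBandwidth': None,      # hard cap for this migration, Mbps
        'minGuaranteedBw': None,   # SLA floor for this migration, Mbps
    }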

(P.S. I don't want to alarm anyone, but we may need SLA parameters for setupNetworks as well :) unless we want these as a separate API, though that means more calls during setup.)

> 
> >
> >>
> >>>> 1.2 If none is defined, the legacy "use ovirtmgmt for migration"
> >>>>       behavior would apply.
> >>>> 1.3 A migration network is more likely to be a ''required''
> >>>> network, but
> >>>>       a user may opt for non-required. He may face unpleasant
> >>>>       surprises if he
> >>>>       wants to migrate his machine, but no candidate host has
> >>>>       the
> >>>>       network
> >>>>       available.
> >> I think the enforcement should be at least one migration network
> >> per host -> in case we support more than one.
> >> Else, always required.
> > Fine by me - if we keep the backward-compatible behavior of
> > ovirtmgmt being a migration network by default. I think the worst
> > case is that the user finds out - at the least convenient moment -
> > that ovirt 3.3 would not migrate his VMs without explicitly
> > assigning the "migration" role.
> >
> >>>> 1.4 The "migration" role can be granted or taken on-the-fly,
> >>>> when
> >>>> hosts
> >>>>       are active, as long as there are no currently-migrating
> >>>>       VMs.
> >>>>
> >>>> 2. Scheduler
> >>>> 2.1 when deciding which host should be used for automatic
> >>>>       migration, take into account the existence and
> >>>>       availability of
> >>>>       the
> >>>>       migration network on the destination host.
> >>>> 2.2 For manual migration, let user migrate a VM to a host with
> >>>> no
> >>>>       migration network - if the admin wants to keep jamming the
> >>>>       management network with migration traffic, let her.
> >> Since you send the migration network per migration command, why
> >> not allow choosing any network on the host, the same as you allow
> >> choosing the host? If no host is selected, then allow choosing
> >> from the cluster's networks. The default should be the cluster's
> >> migration network.
> > Cool. Added to wiki page.
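
On the scheduler side, this boils down to one extra host filter for automatic migration, roughly (an illustrative sketch, not engine code):

    # Illustration: for automatic migration, consider only hosts that
    # have the migration network available; manual migration may bypass
    # this filter, per 2.2 above.
    def eligible_destinations(hosts, migration_net):
        return [h for h in hosts
                if migration_net in h.get('active_networks', [])]
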
> >
> >> If you allow for the above, we can waive the enforcement of a
> >> migration network per host. No migration network == no automatic
> >> migration to/from this host.
> > again, I'd prefer to keep the current default status of ovirtmgmt
> > as a
> > migration network. Besides that, +1.
> >
> >>
> >>>> 3. VdsBroker migration verb.
> >>>> 3.1 For a modern cluster level, with a migration network defined on
> >>>>       the destination host, an additional ''miguri'' parameter
> >>>>       should be added
> >>>>       to the "migrate" command
> >>>>
> >>> How is the authentication of the peers handled? Do we need a cert
> >>> per source/destination logical interface?
> > I hope Orit or Laine will correct me, but I am not aware of any
> > authentication scheme that protects a non-tunneled qemu destination
> > from an evil process with network access to the host.
> >
> > Dan.
> >
> 
> 


