feature suggestion: migration network

Wed Jan 9 11:23:50 UTC 2013

----- Original Message -----
> From: "Livnat Peer" <lpeer at redhat.com>
> To: "Dan Kenigsberg" <danken at redhat.com>, "Simon Grinberg" <simon at redhat.com>
> Cc: "Orit Wasserman" <owasserm at redhat.com>, "Laine Stump" <lstump at redhat.com>, "Yuval M" <yuvalme at gmail.com>, "Limor
> Gavish" <lgavish at gmail.com>, arch at ovirt.org
> Sent: Wednesday, January 9, 2013 9:26:25 AM
> Subject: Re: feature suggestion: migration network
> 
> On 01/08/2013 09:34 PM, Dan Kenigsberg wrote:
> > On Tue, Jan 08, 2013 at 01:23:02PM -0500, Simon Grinberg wrote:
> >>
> >>
> >> ----- Original Message -----
> >>> From: "Yaniv Kaul" <ykaul at redhat.com>
> >>> To: "Dan Kenigsberg" <danken at redhat.com>
> >>> Cc: "Limor Gavish" <lgavish at gmail.com>, "Yuval M"
> >>> <yuvalme at gmail.com>, arch at ovirt.org, "Simon Grinberg"
> >>> <sgrinber at redhat.com>
> >>> Sent: Tuesday, January 8, 2013 4:46:10 PM
> >>> Subject: Re: feature suggestion: migration network
> >>>
> >>> On 08/01/13 15:04, Dan Kenigsberg wrote:
> >>>> There has been talk about this for ages, so it's time to have a
> >>>> proper discussion
> >>>> and a feature page about it: let us have a "migration" network
> >>>> role, and
> >>>> use such networks to carry migration data.
> >>>>
> >>>> When Engine requests to migrate a VM from one node to another,
> >>>> the
> >>>> VM
> >>>> state (BIOS, IO devices, RAM) is transferred over a TCP/IP
> >>>> connection
> >>>> that is opened from the source qemu process to the destination
> >>>> qemu.
> >>>> Currently, destination qemu listens for the incoming connection
> >>>> on
> >>>> the
> >>>> management IP address of the destination host. This has serious
> >>>> downsides: a "migration storm" may choke the destination's
> >>>> management
> >>>> interface; migration is plaintext and ovirtmgmt includes Engine
> >>>> which
> >>>> may sit outside the node cluster.
> >>>>
> >>>> With this feature, a cluster administrator may grant the
> >>>> "migration"
> >>>> role to one of the cluster networks. Engine would use that
> >>>> network's IP
> >>>> address on the destination host when it requests a migration of
> >>>> a
> >>>> VM.
> >>>> With proper network setup, migration data would be separated to
> >>>> that
> >>>> network.
> >>>>
> >>>> === Benefit to oVirt ===
> >>>> * Users would be able to define and dedicate a separate network
> >>>> for
> >>>>    migration. Users that need quick migration would use nics
> >>>>    with
> >>>>    high
> >>>>    bandwidth. Users who want to cap the bandwidth consumed by
> >>>>    migration
> >>>>    could define a migration network over nics with bandwidth
> >>>>    limitation.
> >>>> * Migration data can be limited to a separate network, that has
> >>>> no
> >>>>    layer-2 access from Engine
> >>>>
> >>>> === Vdsm ===
> >>>> The "migrate" verb should be extended with an additional
> >>>> parameter,
> >>>> specifying the address that the remote qemu process should
> >>>> listen
> >>>> on. A
> >>>> new argument is to be added to the currently-defined migration
> >>>> arguments:
> >>>> * vmId: UUID
> >>>> * dst: management address of destination host
> >>>> * dstparams: hibernation volumes definition
> >>>> * mode: migration/hibernation
> >>>> * method: rotten legacy
> >>>> * ''New'': migration uri, according to
> >>>> http://libvirt.org/html/libvirt-libvirt.html#virDomainMigrateToURI2
> >>>> such as tcp://<ip of migration network on remote node>
> >>>>
> >>>> === Engine ===
> >>>> As usual, complexity lies here, and several changes are
> >>>> required:
> >>>>
> >>>> 1. Network definition.
> >>>> 1.1 A new network role - not unlike "display network" - should be
> >>>>      added. Only one migration network should be defined on a
> >>>>      cluster.
> >>
> >> We are considering multiple display networks already, then why not
> >> the
> >> same for migration?
> > 
> > What is the motivation of having multiple migration networks?
> > Extending
> > the bandwidth (and thus, any network can be taken when needed) or
> > data separation (and thus, a migration network should be assigned
> > to
> > each VM in the cluster)? Or another motivation with consequence?
> > 

All of the above, I'll explain in my answer to Livnat. 

> 
> In addition to the questions above there are some behavioral changes
> driven by supporting multiple-migration-network -
> 
> 1. Today a cluster is a migration domain (give or take optional networks
> -
> which were designed to enable dynamic network provisioning, not for
> breaking the cluster migration domain...)

It was designed as a simple workaround for many issues; however, it does not fully satisfy any of them:
As a workaround for Dynamic-Network - It misses the part of set up on demand, but at least keeps the host from becoming non-operational when we use hooks to set up the networks
As a workaround to allow for multiple Storage-Networks to be used with multi-path - It misses the property of 'at least one available -> dependency', but at least keeps the host from becoming non-operational

And so on; it's none of the above, but it's easy and helpful

> adding multiple migration
> network in the cluster means you break this assumption and now only hosts
> with shared migration network can migrate VMs between them...that's
> splitting the cluster to sub-migration domain. what is the meaning of cluster
> now?

Good question.
For a data center use case, the current cluster definition is good and should be maintained - it's simple, easy to understand, and covers most of the use cases here.
For multi-tenant and cloud use cases, where multiple tenants share physical resources - it's probably not enough and we'll need further partitioning into sub-resource-domains
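
To make the "sub-migration domain" idea concrete, here is a minimal sketch. It assumes a hypothetical model where each host is mapped to the set of migration networks attached to it; the function names and the data layout are illustrative only and are not part of Engine.

```python
from collections import defaultdict


def migration_domains(host_networks):
    """Group hosts by the migration networks they carry.

    host_networks is a hypothetical mapping: host name -> set of
    migration network names attached to that host.
    """
    domains = defaultdict(set)
    for host, networks in host_networks.items():
        for network in networks:
            domains[network].add(host)
    return dict(domains)


def can_migrate(host_networks, src, dst):
    """Two hosts can exchange VMs only if they share a migration network."""
    shared = host_networks.get(src, set()) & host_networks.get(dst, set())
    return bool(shared)
```

With hosts h1 on mig-a, h2 on both mig-a and mig-b, and h3 on mig-b, h1 and h3 end up in different sub-domains and can only reach each other by going through h2.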

> Or did you mean ALL hosts in the cluster should have ALL migration
> networks? (a motivation for that is not clear to me)

Indeed, this is not the best use case for this. But it still has some uses, in case there is a small number of tenants sharing the cluster, or if the customer, due to the asymmetrical nature of most bonding modes, prefers to use multiple networks for migration and asks us to optimize migration across these. See further details below

> 
> 2. What happens if a single host has multiple migration networks
> assigned to it. (I am assuming the migration role is not necessarily
> a
> standalone role but can be an additional tag on an existing network
> that
> can be used for management or VMs or any future role).
> Do we really want to get into managing a policy around it, which
> migration network to use - random/RR/even-traffic-load etc.

Yes we do and now I'll finally explain: 

Having a single Display network and Migration network per cluster is good enough when there is a single tenant per cluster.
But what happens when you have for example two tenants sharing the same physical resources? 

Then their Display networks should be different (so you'll be able to route traffic to different client proxies if needed, guarantee SLA, security etc). The same may go for the migration network, for SLA guarantee and security.

But I'll take it for simplicity just from the SLA POV. 

1. You may need to guarantee global resources per tenant - that is what he signs up for.
2. Within a tenant's resources he may need to guarantee specific VM / VMs-group resources - for his own better use of the resources he signed up for.

The second should probably be done by a VM SLA, while the first will be done more easily on the network level. Yes, it could be done by setting some kind of rules on traffic type and aggregated sources list etc, but they all funnel in the end into a logical network. The above is even more obvious when you use external hardware network management, like Mellanox or Cisco UCS. There you set the SLA on profiles, which are the equivalent of a logical network.

I know that when you think SLA you think of VM data traffic, but it is not enough. Display traffic SLA is important as well, and so are 'facility' SLAs - the migration network falls under this.

So what am I trying to say?
That except for the management network, which is strictly for our use (and needs an SLA for other reasons), all other 'Facility' networks may need to be set per tenant, since tenants may not share the same interfaces, and even when they practically do share the same interface.

> 
> 
> 
> >>
> >>
> >>>> 1.2 If none is defined, the legacy "use ovirtmgmt for migration"
> >>>>      behavior would apply.
> >>>> 1.3 A migration network is more likely to be a ''required''
> >>>> network, but
> >>>>      a user may opt for non-required. He may face unpleasant
> >>>>      surprises if he
> >>>>      wants to migrate his machine, but no candidate host has the
> >>>>      network
> >>>>      available.
> >>
> >> I think the enforcement should be at least one migration network
> >> per host -> in the case we support more than one
> >> Else always required.
> 
> Why?
> If we are thinking that migration network is optional there could be
> hosts that do not have a migration network and only hold pin-to-host
> VMs
> for example....
> This one falls under "don't nanny the user" I think...

Makes sense 

> 
> > 
> > Fine by me - if we keep backward behavior of ovirtmgmt being a
> > migration
> > network by default. 

Yep, my bad for not mentioning that it is built on top of what you said, not instead of it.
The default should be the management network as before; the customer will need to explicitly remove that role to end up with no migration network.
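
As a rough illustration of that default, here is a minimal sketch. It assumes a hypothetical model where a cluster maps each network name to a set of role strings; only the "ovirtmgmt" name and the "migration" role come from this thread, the rest is made up for the example.

```python
MANAGEMENT_NETWORK = "ovirtmgmt"
MIGRATION_ROLE = "migration"


def assign_default_role(cluster_roles):
    """Return a copy where the management network also has the migration role.

    cluster_roles is a hypothetical mapping: network name -> set of roles.
    This is what an upgrade step could do; afterwards the admin may
    remove the role explicitly.
    """
    upgraded = {net: set(roles) for net, roles in cluster_roles.items()}
    if MANAGEMENT_NETWORK in upgraded:
        upgraded[MANAGEMENT_NETWORK].add(MIGRATION_ROLE)
    return upgraded


def hosts_without_migration_network(cluster_roles, host_networks):
    """List hosts that deserve a warning (and nothing more than a warning)."""
    migration_nets = {
        net for net, roles in cluster_roles.items() if MIGRATION_ROLE in roles
    }
    return sorted(
        host
        for host, nets in host_networks.items()
        if not (set(nets) & migration_nets)
    )
```

The input mapping is left untouched on purpose, so the upgrade step can be compared against the previous definitions before anything is saved.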

> > I think that the worst case is that the user
> > finds
> > out - in the least convenient moment - that ovirt 3.3 would not
> > migrate
> > his VMs without explicitly assigning the "migration" role.
> > 
> 
> We can assign the migration role in the engine to all existing
> management
> networks upon upgrade, from there the user can change definitions the
> way
> he sees fit.
> If we get to a point of having a host with no migration network a
> warning to the user is in place but no more than that IMO.

makes sense 

> 
> >>
> >>>> 1.4 The "migration" role can be granted or taken on-the-fly,
> >>>> when
> >>>> hosts
> >>>>      are active, as long as there are no currently-migrating
> >>>>      VMs.
> >>>>
> >>>> 2. Scheduler
> >>>> 2.1 when deciding which host should be used for automatic
> >>>>      migration, take into account the existence and availability
> >>>>      of
> >>>>      the
> >>>>      migration network on the destination host.
> >>>> 2.2 For manual migration, let user migrate a VM to a host with
> >>>> no
> >>>>      migration network - if the admin wants to keep jamming the
> >>>>      management network with migration traffic, let her.
> >>
> >> Since you send migration network per migration command, why not
> >> allow
> >> to choose any network on the host same as you allow to choose
> >> host? If
> >> host is not selected then allow to choose from cluster's networks.
> >> The default should be the cluster's migration network.
> > 
> 
> WHY???
> It is only adding complexity to the user experience, I don't see the
> big
> benefit of having it.
> The user can do more things and the UI is getting more and more
> complex
> for simple actions.
> 
> I think we should keep the requirement reasonable, what should lead
> us
> when adding a requirement is how exotic this use-case is and what is
> the
> complexity it adds to the more common use cases.

What complexity does it add?
Drop box to select a network filtered by the selected host, or not filtered. 

With the suggested VDSM migration API, it isn't more complex than allowing the user to select the host, while at the same time it gives the user flexibility.

Consider the case where you have huge VMs that take ages to migrate - this will allow temporarily diverting some of the migrations to another network. This will also allow for your suggestion above not to enforce a migration network but still allow for migrations (though manual only).

If migration for a customer (and we have those) is a rare event that is always fully orchestrated, why does he need to have a migration network in the first place? Why not allow him to select during migration what network to use?
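
A minimal sketch of what the engine side could pass down when a network is picked per migration. The vmId, dst and miguri names and the tcp://<ip> form are taken from Dan's proposal quoted in this thread; the helper names and the network-to-IP mapping are hypothetical.

```python
import ipaddress


def migration_uri(ip):
    """Build a tcp://<ip> URI; IPv6 literals need brackets in a URI."""
    addr = ipaddress.ip_address(ip)
    host = f"[{addr}]" if addr.version == 6 else str(addr)
    return f"tcp://{host}"


def build_migrate_params(vm_id, dst, dst_network_ips, network=None):
    """Assemble parameters for the "migrate" verb.

    dst_network_ips is a hypothetical mapping: network name -> IP address
    of that network on the destination host. When no network is chosen,
    miguri is left out and the legacy behavior (management address) applies.
    """
    params = {"vmId": vm_id, "dst": dst}
    if network is not None:
        if network not in dst_network_ips:
            raise ValueError(f"network {network!r} is not on the destination host")
        params["miguri"] = migration_uri(dst_network_ips[network])
    return params
```

The drop box in the UI would then only have to supply the network argument; everything else stays as it is today.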

> 
> 
> > Cool. Added to wiki page.
> > 
> >>
> >> If you allow for the above, we can waive the enforcement of
> >> migration network per host. No migration network == no automatic
> >> migration to/from this host.
> > 
> 
> I think you can waive the enforcement anyway...
> 
> > again, I'd prefer to keep the current default status of ovirtmgmt
> > as a
> > migration network. Besides that, +1.
> > 
> >>
> >>
> >>>>
> >>>> 3. VdsBroker migration verb.
> >>>> 3.1 For a modern cluster level, with migration network
> >>>> defined
> >>>> on
> >>>>      the destination host, an additional ''miguri'' parameter
> >>>>      should be added
> >>>>      to the "migrate" command
> >>>>
> >>>> _______________________________________________
> >>>> Arch mailing list
> >>>> Arch at ovirt.org
> >>>> http://lists.ovirt.org/mailman/listinfo/arch
> >>>
> >>> How is the authentication of the peers handled? Do we need a cert
> >>> per
> >>> each source/destination logical interface?
> > 
> > I hope Orit or Laine correct me, but I am not aware of any
> > authentication scheme that protects non-tunneled qemu destination
> > from
> > an evil process with network access to the host.
> > 
> > Dan.
> > 
> 
>