Design issue when using optional networks for administrative usages

Mike Kolesnik mkolesni at redhat.com
Tue Sep 10 05:28:52 UTC 2013


----- Original Message -----
> On Sun, Sep 08, 2013 at 05:30:20AM -0400, Mike Kolesnik wrote:
> > Hi,
> > 
> > I would like to hear opinions about what I consider a design issue in
> > oVirt.
> > 
> > First of all, a short description of the current situation in oVirt 3.3:
> > Network is a data-center level entity, representing a L2 broadcast domain.
> > Each network can be attached to one or more clusters, where the attachment
> > can have several properties:
> > - Required/Optional - Does the network have to be on all hosts or not?
> > - Usages (administrative):
> >   - Display network - used for the display traffic
> >   - Migration network - used for the migration traffic
> > 
> > Now, what bothers me is the affinity between these two properties - if a
> > network is defined "optional", can it be used for an "administrative"
> > usage?
> > 
> > Currently I can have the following situation:
> > 0. Fresh install with some hosts and a shared storage, and no networks
> > other than default.
> > 1. Create a network X.
> > 2. Attach to a cluster as "migration", "display", "optional".
> > 3. Create a VM in the same cluster.
> > 
> > Now all is well and everything is green across the board, BUT:
> > 1. The VM can't be run on any host in that cluster if the host doesn't have
> > the display network.
> > 2. VM will migrate over the default network if the network is not present
> > on the source host.
> > 3. Migration will not work if the network is not present on the destination
> > host.
> > 
> > I find this situation very troublesome!
> > We give the admin the impression that everything is fine and dandy, but
> > underneath the surface everything is NOT.
> > 
> > If we look at the previous points we can see that:
> > 1. No VM can run in that cluster, but hosts and network seem A-OK - this is
> > intrinsically awful as we don't reflect the real problem anywhere in the
> > network nor the host statuses but rather postpone it until someone makes
> > an attempt to actually use the VM.
> > 2. Migration network is NOT being used, which was obviously not the intent
> > of the admin who set it up.
> > 3. There is still an open bug for it ( https://bugzilla.redhat.com/983515 )
> > and it's unclear as to what should happen, but it would be either what
> > happens in case #1 or in case #2.
> > 
> > What I suggest is to have any network with a usage be "required".
> > This will utilize the existing logic for required networks:
> > - Either the network should not be used until it's available on all hosts
> > (reflected in the network status being Non-Operational)
> > - Or the host should be Non-Operational as it's incapable of
> > running/migrating VMs
> > 
> > This reflects the problem to the admin and gives him a chance to fix it
> > properly, instead of hiding the failure until it occurs or behaving in
> > some unexpected way.
> > 
> > I would love to hear your thoughts on the subject.
> 
> Some history first. Once upon a time, we wanted an Up host to mean
> "this host is ready to run any of its cluster's VMs". This meant that if
> a host lost connectivity to one of the cluster networks, it had to be
> taken down.
> 
> Customers did not like our overprotection, so we've introduced
> non-required networks. When an admin uses this option he says "I know
> what I'm doing, let me do stuff on this host even if the network is
> down."

So what you're saying is non-required networks should not protect the user at all?

In this case I say we shouldn't impose any limitations whatsoever in this situation,
and if the VM fails to start/migrate then let it fail.
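
To make the two consistent options concrete, here is a rough sketch in Python.
It is not engine code: the Attachment class and both function names are
hypothetical, made up only to illustrate the choice between my original
proposal (a usage implies "required") and imposing no limitation at all.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attachment:
    """Hypothetical model of a network attached to a cluster."""
    network: str
    required: bool
    usages: frozenset = frozenset()  # e.g. {"display", "migration"}


def validate_strict(att):
    """Original proposal: a network with any usage must be required."""
    if att.usages and not att.required:
        return ["network %r has usages %s but is optional"
                % (att.network, sorted(att.usages))]
    return []


def validate_permissive(att):
    """Alternative: no limitation; problems surface only when a VM
    actually fails to start or migrate."""
    return []
```

With the attachment from my example (network X, "migration", "display",
"optional"), the strict check reports a problem up front, while the permissive
one stays silent and leaves the failure for run time.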

> 
> I think that this request is a valid one, even when a network serves
> other purposes than connecting VMs. When designing migration network,
> we've decided that if it is missing, migration would be attempted over
> the management network, as a fallback. I can imagine an admin who says:
> I don't care much about migrations, most of my VMs are pinned-to-host
> anyway, so if the migration network is gone, don't make a fuss out of
> it.
> 
> The use case for letting a host be Up even if its display network is
> down is less obvious. But then again, I can think of an admin who uses a vdsm
> hook to set the display IP of each VM. He does not care if the display
> network is up or not.

If the admin uses hooks for his networking needs then I don't see why he
even needs this support in oVirt, so your point is not clear to me.

> 
> In my opinion, the meaning and the danger of non-required networks should be
> properly documented and clear to customers, but some of them are
> expected to find it useful.

I agree, if this is our approach then it should be very very well documented.
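
As a starting point for such documentation, the three behaviours I listed in
my original mail could be summarized roughly as below. This is only my reading
of the current behaviour written as a Python sketch, not engine or vdsm code;
the function names and the "management" placeholder for the default network
are assumptions, and the destination-host case is still subject to the open
bug mentioned above.

```python
from typing import Optional, Set

# Placeholder name for the default/management network (assumption).
MANAGEMENT = "management"


def can_run_vm(host_networks: Set[str], display_network: str) -> bool:
    """Point 1: a VM can't run on a host that lacks the display network."""
    return display_network in host_networks


def migration_network_used(src_networks: Set[str],
                           dst_networks: Set[str],
                           migration_network: str) -> Optional[str]:
    """Return the network that carries the migration, or None if it fails."""
    if migration_network not in src_networks:
        # Point 2: silently falls back to the default network.
        return MANAGEMENT
    if migration_network not in dst_networks:
        # Point 3: migration does not work.
        return None
    return migration_network
```

None of these outcomes is visible in the host or network status when the
network is optional, which is exactly what would need to be spelled out.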

> 
> Dan.
> 


