[Users] oVirt and Quantum

Tue May 15 14:34:35 UTC 2012

Hi,
Please see my inline comments. There are quite a few.
Thanks
Gary

On 05/15/2012 01:38 PM, Itamar Heim wrote:
> On 04/29/2012 01:41 PM, Gary Kotton wrote:
>> Hi,
>>
>> As part of a POC we have integrated Quantum
>> (http://wiki.openstack.org/Quantum) into oVirt (http://www.ovirt.org/).
>> This has been tested with the OVS and Linux Bridge plugins.
>> The details of the integration can be found at -
>> https://fedoraproject.org/wiki/Quantum_and_oVirt.
>> Any comments and suggestions would be greatly appreciated.
>> Thanks
>> Gary
>
> Thanks Gary - some questions:
>
> 1. you are using the term 'user' for both an end-user and an admin. I 
> think admin is more appropriate in most places.
>
> 2. host management --> interface
> 2.1 you are suggesting to replace the vlan field with a fabric field?
Yes, this is what I am suggesting. VLAN tagging is one method of network 
isolation. There are more, for example GRE tunneling.
> are we sure this is something which shouldn't be presented in the main 
> view and only via extended properties?
>
> 2.2 is the fabric really a host interface level property, or a cluster 
> wide property (would same logical network be implemented on different 
> hosts in the cluster via both vdsm and quantum)?
> would live migration work in same cluster if one host has quantum 
> fabric and another vdsm fabric for same logical network?
These are all up for discussion. I just wanted to provide something that 
we can start to work with. It would be nice if there was some input from 
product management on this issue (not exactly sure how this works in an 
open source community :))
>
> 2.3 you are describing the subtab of an interface, and mention a 
> quantum pull down menu. is this in the subtab? in a dialog?
> UI mockups are needed here.
I think that prior to getting mockups, we should ensure that we have the 
idea crystallized.
>
> 2.4 also, is this pulldown correct at host level or cluster level 
> (would live migration work between hosts implementing different 
> plugins for same logical network in same cluster?)
>
> on to backend:
> 3.1 "The Quantum Service is a process that runs the Quantum API web 
> server (port 9696)"
>
> how is authentication between engine and quantum service done (we 
> have a client certificate for engine which quantum can verify i guess)
In my opinion this should be running locally on the same host as the 
engine.
>
> 3.2 is there a config of the quantum service uri? should it be a 
> single instance or a per DC property?
> (it sounds like a per DC property, can be a future thing)
I am sorry but I do not understand the question.
>
> 3.3 network creation
>
> semantics: we create a network at DC level. we attach it at cluster 
> level. the UI allows you to "create at DC and attach to cluster" at 
> cluster level.
>
> 3.3.1 what if i create a network with same VLAN in two DC's? what if 
> we attach same logical network to multiple clusters - each will create 
> the logical network in quantum again?
If I understand you correctly then I think that one Quantum network can 
be created. This should work. In my opinion this should be managed by 
the oVirt engine so the actual creation is not an issue. Ensuring that 
each cluster has the correct network management configurations is what 
is important.
> 3.3.2 what if the quantum service is down when engine performs this 
> action or quantum returns an error?
The engine should be able to deal with this - similar to the way in 
which it deals with a network creation when VDSM is down.
> 3.3.3 each creation of a DC creates a rhevm network - I assume these 
> are not created, since "no host in a cluster in the DC yet"?
This functionality can remain the same.
> 3.3.4 what happens on moving of a host to another cluster, or moving 
> of a cluster to another DC (possible if its DC was removed iirc).
The Quantum agents take care of this. On each host there will be a 
Quantum agent that handles the network management.
> 3.3.5 shouldn't this be limited to vm networks, or all networks are 
> relevant?
I think all Quantum networks are relevant
> 3.3.6 what if the host with the quantum fabric is in maint mode?
I do not understand. When a host is in maintenance mode, do VMs receive 
traffic? The Quantum port for the VMs can be set to DOWN.
> 3.3.7 could a host with quantum fabric have a VDSM fabric on another 
> interface (for vm networks? for non-vm networks)?
Yes. This is something that I would like. I would also like the Quantum 
API to be updated so that we can indicate the physical network interface 
that the Quantum network will be running on.
>
> 3.5 network deletion (detach network from a cluster)
> 3.5.1 what happens if quantum is down or returned an error?
The engine should be able to deal with this - similarly to what it does 
today.
>
> 3.6 vm creation
> 3.6.1 what about moving a VM from/to a cluster with a quantum fabric?
I do not see a problem here. The agents running on VDSM will detect this 
and act accordingly.
> 3.6.2 what about import of a VM into a cluster with a quantum fabric?
Same as above
> 3.6.3 you have vm creation/removal - missing vnic addition/removal
> 3.6.4 worth mentioning quantum doesn't care about the vnic model? 
> about the mac address?
The Quantum driver on VDSM does take into account the MAC address. This 
is used in some plugins - for example the openvswitch plugin.
>
> 3.7 vm start
> 3.7.1 why is the plugin parameter required at vm start? 
This is to indicate to VDSM the operations to take place - for example 
updating the openvswitch integration bridge
> can multiple plugins be supported in parallel at vdsm/host level or by 
> the quantum service?
Yes. Multiple agents can run on VDSM. In the engine a little work is 
required to run multiple plugins.
> 3.7.2 aren't network_uuid and port uuid redundant to attachment uuid 
> (I assume quantum service knows from attachment the port and network) 
> - i have no objection to passing them to vdsm, just trying to 
> understand reasoning for this.
> I am missing what happens at vdsm level in this point (even after 
> reading the matching vdsm part)
The network ID and the port ID are returned by the Quantum service. The 
attachment ID is passed to the Quantum server. If this is unique then it 
can be a key to the above (I am currently working on the scale issue 
with Quantum and there are 2 options, the first is that it is unique, 
the second is that all three are passed to the agent). It is a few extra 
bytes passed. I think that it is better to be safe and have all of the 
information on VDSM for future use.
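
To illustrate the "pass all three" option, the identifiers could travel 
together as one small record per vNIC. This is only a sketch: the class, 
the field names and the "quantum" key are hypothetical and are not part 
of the current VDSM API.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class QuantumNicInfo:
    """Hypothetical per-vNIC record sent from the engine to VDSM."""
    plugin: str         # e.g. "openvswitch" or "linuxbridge"
    network_id: str     # returned by the Quantum service
    port_id: str        # returned by the Quantum service
    attachment_id: str  # passed to the Quantum service by the engine


def to_vm_params(nic):
    """Flatten the record into a plain dict that is easy to serialize."""
    return {"quantum": asdict(nic)}
```

If the attachment ID turns out to be unique, the agent could ignore the 
other two fields without any change to the message format.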
>
> 3.8 vm suspend/resume (it's fine to say it behaves like X, but still 
> need to be covered to not cause bugs/regressions)
The Quantum port status can be updated.
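
As a rough sketch of what I mean, the engine could map VM lifecycle 
events to a port state. The mapping itself is my assumption and is open 
for discussion; it also assumes that a Quantum port carries a "state" of 
either "ACTIVE" or "DOWN". The function name is made up.

```python
# Assumption: a Quantum port has a "state" that is "ACTIVE" or "DOWN".
PORT_STATE_FOR_EVENT = {
    "start": "ACTIVE",
    "resume": "ACTIVE",
    "suspend": "DOWN",
    "stop": "DOWN",
}


def port_update_body(vm_event):
    """Return the JSON-ready body for a port state update."""
    try:
        state = PORT_STATE_FOR_EVENT[vm_event]
    except KeyError:
        raise ValueError("unknown VM event: %r" % vm_event)
    return {"port": {"state": state}}
```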
>
> 3.9 vm stop
Same as above
> 3.9.1 need to make sure vm stop when engine is down is handled correctly.
> 3.9.2 what happens if quantum service is down? unlike start vm or 
> network creation the operation in this case cannot be 
> stopped/rolledback, only rolled forward.
We have to ensure that it is up.
>
> 3.10 hot plug nic?
Each new NIC has an attachment ID - the agent knows that a new NIC has 
been added and acts accordingly.
>
> 3.11 vm migration
> 3.11.1 ok, so this is the first time i understand hosts can be mixed 
> in same cluster. worth specifying this in the beginning (still not 
> clear if mixed plugins can exist)
If the networks are configured with the same characteristics then this 
should be OK. As mentioned above we are working on a blueprint to deal 
with connecting to existing networks - i.e. enable the user/admin to 
configure the VLAN tags
> 3.11.2 afair, we don't deal with engine talking to both hosts during 
> live migration. only to host A, who is then communicating with host B.
> so why not always have the VM configuration at vm start (and hot plug 
> nic) have the quantum details so live migration can occur at will 
> without additional information?
I do not understand; can you please explain? VDSM creates the tap device 
and builds the libvirt files. The agents detect the tap device and 
attach it to the network. I do not understand why this is a problem for 
live migration. This is also driven by the libvirt XML files being 
created. Is this correct?
> 3.11.3 "In order to implement the above a REST client needs to be 
> implemented in the oVirt engine. "
> did not understand this statement - please elaborate.
All interaction with the Quantum server is done via REST. In order for 
oVirt to be able to communicate, it will need to send REST messages to 
the server and be able to parse the replies - this is what I meant by 
the REST client.
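
To make this a bit more concrete, here is a minimal sketch of such a 
client. It is written in Python only for brevity (the engine is Java). 
It assumes the Quantum v1.1 URL layout (tenants/networks/ports/attachment) 
and a service on localhost port 9696; the helper names and the tenant 
name are made up. It only builds the requests - sending them and 
handling errors is left out.

```python
import json
import urllib.request

# Assumption: the Quantum service runs on the same host as the engine.
QUANTUM_URL = "http://localhost:9696/v1.1"


def _request(method, path, body=None):
    """Build (but do not send) a JSON request for the Quantum service."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(QUANTUM_URL + path, data=data, method=method)
    req.add_header("Content-Type", "application/json")
    return req


def create_network(tenant, name):
    """Request that creates a logical network for the tenant."""
    return _request("POST", "/tenants/%s/networks" % tenant,
                    {"network": {"name": name}})


def create_port(tenant, network_id):
    """Request that creates a port on an existing network."""
    path = "/tenants/%s/networks/%s/ports" % (tenant, network_id)
    return _request("POST", path, {"port": {"state": "ACTIVE"}})


def plug_attachment(tenant, network_id, port_id, attachment_id):
    """Request that plugs a vNIC (the attachment) into a port."""
    path = "/tenants/%s/networks/%s/ports/%s/attachment" % (
        tenant, network_id, port_id)
    return _request("PUT", path, {"attachment": {"id": attachment_id}})
```

A request built this way could be sent with urllib.request.urlopen(req) 
and the JSON reply parsed with json.loads.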
>
> 4. host management
> 4.1 deployment
> we do not deploy packages from engine to hosts, we can install them 
> from a repo configured to the host. but this is done today only as 
> part of initial bootstrap/installation of host.
> also, it is not relevant for ovirt node which is 'firmware' like.
> any reason to not require the 'plugin installation packages' at vdsm 
> rpm level for plugins we think are good enough to use (until then, 
> responsibility to deploy them is of admin)
>
Correct. I agree with you.
> (what are plugin level packages at host level - aren't these the agents?)
Each plugin has the relevant packages that should be installed.
>
> 4.2 plugin configuration
> per DC? per cluster? per host? per plugin? please provide more details 
> here on configuration details expected and flow of when they are 
> configured and when they are expected to change?
I think that plugin per cluster is a good start. This could limit live 
migration problems.
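
A check along these lines could enforce the "one plugin per cluster" 
rule, for example before a host is added to a cluster. This is a sketch 
only; the function name and the host-to-plugin mapping are hypothetical.

```python
def cluster_plugin(host_plugins):
    """Return the one plugin shared by all hosts in a cluster.

    host_plugins maps a host name to the plugin its agent implements.
    Returns None for an empty cluster and raises ValueError if the
    hosts disagree.
    """
    plugins = set(host_plugins.values())
    if len(plugins) > 1:
        raise ValueError(
            "mixed plugins in cluster: %s" % ", ".join(sorted(plugins)))
    return plugins.pop() if plugins else None
```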
> I think this will merit a per 'ovirt supported' quantum plugin to see 
> it works in a way we can use.
>
> 4.3 connectivity
> again, this requires more details. if needed per plugin.
> what is expected? how authentication/encryption happens? what iptables 
> rules need to change in engine/host if at all, etc.
> I'm fine with this being 'direct access to db from hosts' for POC 
> level of patches, but not for something we actually merge/provide 
> support for later.
I am currently working on a blueprint to ensure better scaling of 
Quantum agents. This will be done by making use of the Nova RPC library, 
which supports Qpid, RabbitMQ, Kombu etc. These have an option of being 
secure. Please clarify whether this suffices.
>
> 5. VDSM
>
> 5.1 s/The agent packe/The agent package/ ?
:)
>
> 5.2 "The agent package can and may be received from the oVirt Engine 
> or can be downloaded via RPM's"
> see 4.1 above - we don't deploy rpm's/code on the fly today.
When I was sitting with the guys from oVirt this is what I was told. I 
guess that I misunderstood.
>
> 5.3 "In addition to the treatment below VDSM should also maintain a 
> health check to the Quantum agent"
> what if the restart didn't help? how should ovirt treat the host wrt 
> networking? to running VMs? to ability to live migrate VMs from the 
> host if needed?
If the agent is down then the oVirt engine should be notified so that 
it can at least raise events for the user.
>
> 5.4 "Logical Network Management"
> please see 3.11 above - we would want live migration to not require 
> additional information, so details should be available even if not 
> immediately acted upon.
>
> 5.5 "The tap device created uses an "ethernet" network device. This 
> means that the creation of the libvirt XML file is a bit different."
Yes, that is correct
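
For reference, the interface element would look roughly like what the 
sketch below generates. The function name is made up and this is not the 
actual libvirtvm.py code. The empty script path is an assumption on my 
part: the intent is that libvirt runs no ifup script and the Quantum 
agent does the plugging instead.

```python
import xml.etree.ElementTree as ET


def ethernet_interface_xml(tap_name, mac, model="virtio"):
    """Build a libvirt <interface type='ethernet'> element as a string."""
    iface = ET.Element("interface", {"type": "ethernet"})
    ET.SubElement(iface, "mac", {"address": mac})
    ET.SubElement(iface, "target", {"dev": tap_name})
    ET.SubElement(iface, "model", {"type": model})
    # Assumption: no ifup script, the Quantum agent attaches the tap device.
    ET.SubElement(iface, "script", {"path": ""})
    return ET.tostring(iface, encoding="unicode")
```

With a bridge based setup the element would instead be of type "bridge" 
with a source bridge, which is the difference in the XML creation.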
> 5.5.1 does this impact stable device addresses somehow?
Not that I am aware of
> 5.5.2 how is live migration possible if the libvirt xml to a non 
> quantum host is different (or is this the binding part only)?
If the libvirt XML is "patched" when the migration takes place then this 
should not be a problem.
>
> 5.6 I assume libvirtvm.py is part of vdsm.
> is quantum.py part of quantum code base or vdsm codebase (it sounds 
> like it should be part of quantum code base)?
> so what exactly would the rpm for this look like? deploy it to 
> /xxx/quantum and have it add a symbolic link to it from vdsm paths?
In my opinion this should be part of VDSM. It would be ideal if each 
plugin could load its own driver - in Nova each driver is part of the 
Nova code. If the API is well defined then all vendors can provide their 
drivers. This is open for discussion and we should try to understand 
what the best way of doing this is. I like the Nova model.
>
> 5.7 "When a communication channel is established between VDSM and the 
> oVirt engine. The VDSM host should notify the oVirt Engine of the 
> Quantum fabric that is supported."
> how does vdsm go about detecting an agent is installed exactly?
The user needs to install a package for the agent - this creates the 
relevant configuration files. The agent can be detected either through 
these files or via "ps".
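
A sketch of both detection methods follows. The function names are made 
up, and any configuration path or agent process name passed in is 
plugin-specific, so treat those as assumptions. Instead of shelling out 
to "ps" it reads /proc directly, so it is Linux only.

```python
import os


def agent_installed(config_path):
    """True if the agent package left its configuration file behind."""
    return os.path.isfile(config_path)


def list_process_cmdlines(proc_root="/proc"):
    """Return the command lines of running processes (Linux only)."""
    cmdlines = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "cmdline"), "rb") as f:
                raw = f.read()
        except OSError:
            continue  # the process exited while we were scanning
        cmdlines.append(
            raw.replace(b"\0", b" ").decode("utf-8", "replace").strip())
    return cmdlines


def agent_running(cmdlines, agent_name):
    """True if any command line mentions the agent executable."""
    return any(agent_name in cmdline for cmdline in cmdlines)
```

VDSM could report the fabric as supported when agent_installed() is 
true, and use agent_running(list_process_cmdlines(), name) as the basis 
of the health check.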
> (especially since deployment wise, most likely is all agents would be 
> deployed via rpm requirements?)
> is any agent a service at host level rather than code only?
> is there a scenario where a vdsm level plugin isn't required?
>
> 6. open issues
> 6.1 worth mentioning if any are in the works
No shortage :). But we have a good start. There are some blueprints in 
the works which solve a large number of problems.
> 6.2 specifying the vlan from ovirt engine is not a gap?
Yes it is. This is currently being addressed. This is not a reason to 
stop us moving ahead.
> 6.3 ok, so this is the first time "no multiple plugins" is mentioned 
> ("Non-uniform technology"). but it sounds like the approach in the 
> wiki assume it is/will be possible to have multiple technologies 
> (agents/plugins) going forward?
My take is that this can be hidden by oVirt engine. It is just an 
implementation detail - this should not affect the user experience, 
which at the end of the day is what counts.
> 6.4 need to discuss which of the open issues should block going 
> forward with merging of code, and expected timeframe for resolution of 
> some of them
Correct. In my opinion there are no major blocking issues at the moment.
>
> 7. other questions/items (some mentioned above as well)
> 7.1 please add ui mockups, db scheme changes and REST api changes
OK
> 7.2 i'm missing the deployment flow:
OK
> - user install engine. is quantum a separate install or bundled going 
> forward?
> - any configuration items by user at this point? what are they?
> - what rpms are deployed at host level? what configuration do they need
> - communication channels and their security are a must to understand
> 7.3 upgrade path / compatibility levels this is relevant to?
> 7.4 monitoring flow - how is monitoring/state of a host affected by 
> quantum service being down, communication problem from host to (where 
> actually?)