[Engine-devel] [RFC] New Connection Management API

Thu Jan 26 10:22:42 UTC 2012

On 25/01/12 23:35, Saggi Mizrahi wrote:
> <SNIP>
> This mail was getting way too long.
> 
> About the clear all verb.
> No.
> Just loop, find the connections YOU OWN and clean them. Just because you don't want to support multiple clients of the VDSM API doesn't mean the engine shouldn't behave like a proper citizen.
> It's the same reason why VDSM tries not to mess with system resources it didn't initiate.

There is a big difference: VDSM living in hybrid mode with other
workloads on the host is a valid use case; having more than one
concurrent manager for VDSM is not.
Generating a disconnect request for each connection does not seem like
the right API to me. Again, think of the simple flow of moving a host
from one data center to another: the engine needs to disconnect all
storage domains (each domain can have a couple of connections
associated with it).

I am giving an example from the engine use cases as it is the main user
of VDSM ATM, but I am sure it will be relevant to any other user of VDSM.

> 
> ------------------------
> 
> As I see it the only point of conflict is the so-called non-persisted connections.
> I will call them transient connections from now on.
> 
> There are 2 use cases being discussed
> 1. Wait until a connection is made, if it fails don't retry and automatically unmanage.
> 2. If the caller of the API forgets or fails to unmanage a connection.
> 

Actually I was not discussing #2 at all.

> Your suggestion as I understand it:
> Transient connections are:
>      - Connection that VDSM will only try to connect to once and will not reconnect to in case of disconnect.

yes

> 
> My problem with this definition is that it does not specify the "end of life" of the connection.
> Meaning it solves only use case 1.

Since this is the only use case I had in mind, it is what I was looking for.

> If all is well, and it usually is, VDSM will not invoke a disconnect.
> So the caller would have to call unmanage if the connection succeeded at the end of the flow.

agree.

> Now, if you are already calling unmanage if connection succeeded you can just call it anyway.

Not exactly. An example I gave earlier in the thread was that VDSM hangs
or has some other error and the engine cannot initiate unmanage.
Instead, let's assume the host is fenced (self-fence or external fence,
it does not matter); in this scenario the engine will not issue unmanage.

> 
> instead of doing: (with your suggestion)
> ----------------
> manage
> wait until succeeds or lastError has value
> try:
>   do stuff
> finally:
>   unmanage
> 
> do: (with the canonical flow)
> ---
> manage
> try:
>   wait until succeeds or lastError has value
>   do stuff
> finally:
>   unmanage
> 
> This is simpler to do than having another connection type.

You are assuming the engine can communicate with VDSM and there are
scenarios where it is not feasible.
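
For reference, the canonical flow quoted above could be sketched in
Python roughly as follows. This is only an illustration: FakeVdsmClient
and its manage/status/unmanage methods are hypothetical in-memory
stand-ins, not the real VDSM API, and the timeout and polling interval
are assumed values.

```python
import time


class FakeVdsmClient:
    """Hypothetical in-memory stand-in for a VDSM client (not the real API)."""

    def __init__(self, fail_connect=False):
        self.managed = {}
        self.fail_connect = fail_connect

    def manage(self, conn_id):
        # Register the connection; the fake resolves immediately.
        self.managed[conn_id] = {
            "connected": not self.fail_connect,
            "lastError": "connect failed" if self.fail_connect else None,
        }

    def status(self, conn_id):
        return self.managed[conn_id]

    def unmanage(self, conn_id):
        self.managed.pop(conn_id, None)


def with_connection(client, conn_id, do_stuff, timeout=5.0, poll=0.01):
    """manage, wait for success or lastError, do stuff, always unmanage."""
    client.manage(conn_id)
    try:
        deadline = time.monotonic() + timeout
        while True:
            st = client.status(conn_id)
            if st["connected"]:
                break
            if st["lastError"] is not None:
                raise RuntimeError(st["lastError"])
            if time.monotonic() >= deadline:
                raise TimeoutError(conn_id)
            time.sleep(poll)
        return do_stuff()
    finally:
        # Runs on success and on failure, but only if the caller is
        # still alive and can still reach VDSM.
        client.unmanage(conn_id)
```

The finally clause covers a failed connect and a failed "do stuff", but
it does nothing for the fenced-host scenario: there the unmanage call
never reaches VDSM at all.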

> 
> Now that we got that out of the way let's talk about the 2nd use case.

Since I did not ask VDSM to clean up after the (engine) user, and you
don't want to do it, I am not sure we need to discuss this.

If you insist we can start the discussion on who should implement the
cleanup mechanism, but I'm afraid I have no strong arguments for VDSM to
do it, so I'd rather not go there ;)

You dropped from the discussion my request for supporting a list of
connections in the manage and unmanage verbs.
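
A minimal sketch of what list-accepting verbs might look like, assuming
each call takes a list of connection IDs and returns a per-connection
status. FakeConnectionManager, the status strings and the requests
counter are all hypothetical and only serve the illustration; they are
not part of the proposed API.

```python
class FakeConnectionManager:
    """Hypothetical stand-in showing manage/unmanage that accept lists."""

    def __init__(self):
        self._managed = set()
        self.requests = 0  # number of API round trips made so far

    def manage(self, conn_ids):
        self.requests += 1
        self._managed.update(conn_ids)
        return {cid: "OK" for cid in conn_ids}

    def unmanage(self, conn_ids):
        # One request covers every connection of every storage domain.
        self.requests += 1
        results = {}
        for cid in conn_ids:
            if cid in self._managed:
                self._managed.discard(cid)
                results[cid] = "OK"
            else:
                results[cid] = "NOT_MANAGED"
        return results


mgr = FakeConnectionManager()
mgr.manage(["nfs-1", "iscsi-1", "iscsi-2"])
results = mgr.unmanage(["nfs-1", "iscsi-1", "iscsi-2"])
```

With this shape, moving a host between data centers costs one unmanage
request instead of one request per connection.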

> API client died in the middle of the operation and unmanage was never called.
> 
> Your suggested definition means that unless there was a problem with the connection VDSM will still have this connection active. The engine will have to clean it anyway.
> 
> The problem is, VDSM has no way of knowing that a client died, forgot or is thinking really hard and will continue on in about 2 minutes.

> 
> Connections that live until they die have a lifecycle that is hard to define and work with. Solving this problem is theoretically simple.
> 
> Have clients hold some sort of session token and force the client to update it at a specified interval. You could bind resources (like domains, VMs, connections) to that session token so when it expires VDSM auto cleans the resources.
> 
> This kind of mechanism is out of the scope of this API change. Furthermore, I think that this mechanism should sit in the engine since the session might actually contain resources from multiple hosts and resources that are not managed by VDSM.
> 
> In GUI flows specifically the user might perform actions that don't even touch the engine, and forcing it to refresh the engine token is simpler than having it refresh the VDSM token.
> 
> I understand that the engine currently has no way of tracking a user session. This, as I said, is also true in the case of VDSM. We can start arguing about which project should implement the session semantics. But as I see it, it's not relevant to the connection management API.
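
The session-token idea quoted above (a token the client must refresh at
an interval, with resources bound to it and cleaned up on expiry) can be
sketched as below. This is only an illustration under assumptions:
SessionRegistry, its method names and the explicit "now" parameter are
hypothetical, and the thread leaves open whether such a mechanism would
live in the engine or in VDSM.

```python
class SessionRegistry:
    """Hypothetical lease table: resources are released when a token expires."""

    def __init__(self, ttl):
        self.ttl = ttl
        self._expiry = {}     # token -> time at which the lease expires
        self._resources = {}  # token -> set of bound resource IDs

    def open(self, token, now):
        self._expiry[token] = now + self.ttl
        self._resources[token] = set()

    def refresh(self, token, now):
        # The client must call this at least once per ttl to stay alive.
        if token not in self._expiry:
            raise KeyError(token)
        self._expiry[token] = now + self.ttl

    def bind(self, token, resource):
        self._resources[token].add(resource)

    def reap(self, now):
        """Drop expired sessions and return the resources to clean up."""
        expired = [t for t, exp in self._expiry.items() if exp <= now]
        released = set()
        for token in expired:
            released |= self._resources.pop(token)
            del self._expiry[token]
        return released


reg = SessionRegistry(ttl=10)
reg.open("client-a", now=0)
reg.bind("client-a", "conn-1")
reg.refresh("client-a", now=5)   # lease now runs until t=15
still_alive = reg.reap(now=12)   # nothing has expired yet
cleaned = reg.reap(now=15)       # lease expired, conn-1 is released
```

Time is passed in explicitly so the expiry logic is easy to follow; a
real implementation would presumably use a monotonic clock.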