[Engine-devel] [vdsm] [RFC] New Connection Management API

Ayal Baron abaron at redhat.com
Sun Jan 29 08:49:15 UTC 2012



----- Original Message -----
> top posting since there was a long thread on this anyway.
> some questions/comments:
> 
> 1. about the CIDs - it sounds like the engine needs to persist this
> info, so it can resume normally in case of a failure/restart (this is
> different than today, when the persisted info is the connection
> details,
> rather than some generated identifier)?

It doesn't have to.
engine can have 2 types of CIDs:
1. for engine internal flows:
- This would mostly be around a storage domain, so the CID can start with the sd name, which lets engine find the relevant CIDs whenever it wants to disconnect the connections of an sd.
Several use cases for this -
a. when moving a host between DCs (and don't tell me engine can't run operations on a host in 'maintenance' mode; either solve this silliness or allow moving a host directly between DCs without it being in maintenance).
b. when removing a connection from the storage domain definition
c. when removing the storage domain from the db
Anyway, in the above cases you have 2 options:
i. just getStorageConnectionList and disconnect anything that starts with this domain name
ii. If the name is constant (e.g. SD_NAME-CONN-IQN) then you can always 'build it'

2. for UI generated flows
I see no reason to persist anything here, just need a connection cleanup flow to get rid of irrelevant connections (you would need this anyway).
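
A minimal sketch of options i and ii for the engine-internal CIDs, in Python. The SD_NAME-CONN-IQN layout is the example given above; the helper names and the shape of the status mapping (CID to a dict with a 'managed' key, following the example return value in the quoted proposal) are assumptions for illustration, not vdsm or engine API.

```python
def build_cid(sd_name, conn_kind, target):
    """Option ii: derive a constant CID from the domain definition."""
    cid = "%s-%s-%s" % (sd_name, conn_kind, target)
    if "/" in cid:
        # The proposal allows any character in a CID except "/".
        raise ValueError("CID may not contain '/': %r" % cid)
    return cid


def cids_for_domain(status_by_cid, sd_name):
    """Option i: pick the managed CIDs that start with the domain name."""
    prefix = sd_name + "-"
    return sorted(
        cid for cid, status in status_by_cid.items()
        if cid.startswith(prefix) and status.get("managed")
    )
```

Matching on the name plus the separator keeps "sd1" from also matching "sd10", but it only works if the separator cannot appear inside sd names themselves.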

> 
> 2. sounds like the engine needs to block in certain cases after a
> manageConnection to make sure it is there and alive before doing an
> operation.

I don't see how this is different from today.
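
For reference, the blocking in question amounts to polling the CID status as the quoted proposal describes: either for a bounded time, or until an error shows up. A rough Python sketch, assuming a caller-supplied get_status callable that returns the CID-to-status mapping; the function name, the parameters, and the 'connected'/'lastError' keys are illustrative (the proposal itself notes that the actual key names may differ).

```python
import time


def wait_for_connection(get_status, cid, timeout=None, interval=1.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Poll a CID until it connects, reports an error, or times out.

    timeout=None gives "try once" semantics: stop at the first reported
    error (and keep polling for as long as no error is reported).
    With a timeout, errors are ignored until the deadline passes.
    Returns a (connected, last_error) tuple.
    """
    deadline = None if timeout is None else clock() + timeout
    while True:
        status = get_status().get(cid)
        if status is None:
            raise KeyError("CID not registered: %r" % cid)
        if status["connected"]:
            return True, 0
        error = status.get("lastError", 0)
        if timeout is None and error:
            return False, error
        if deadline is not None and clock() >= deadline:
            return False, error
        sleep(interval)
```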

> this means now engine has to check a host has all relevant
> connections
> online before choosing it as a target for live migration even for a
> regular VM (all disks on a storage domain).
> worse/uglier (well, imho), in case of a disk based on a direct LUN,
> the
> engine needs to actively connect the target host, poll till it's up,
> and
> only then live migrate (would be much nicer if vdsm migration
> protocol
> would have taken care of this manageConnection call (preserving the
> CID?))

Today all hosts are connected to all the storage connections (that are needed for storage domains) beforehand and when a VM is migrated the connection is assumed to be on the target host.  What is the difference?
You preconnect to make sure the host is a valid target for migration.
If you keep a set of hosts always connected you can better ensure SLAs.
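
As a sketch of the preconnect check, in Python: given each host's CID-to-status mapping, keep only the hosts on which every required connection is up. The function name and the status_by_host structure are assumptions for illustration; only the 'connected' flag comes from the example return value in the quoted proposal.

```python
def hosts_ready_for(required_cids, status_by_host):
    """Return the hosts on which every required CID is connected.

    status_by_host maps a host name to that host's CID -> status mapping.
    A CID missing from a host's mapping counts as not connected.
    """
    ready = []
    for host, statuses in sorted(status_by_host.items()):
        if all(statuses.get(cid, {}).get("connected", False)
               for cid in required_cids):
            ready.append(host)
    return ready
```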

> 
> 3. in unmanageStorageServer(connectionID) below you finish with
> "Returns:
> Success code if VDSM was able to unmanage the connection.
> It will return an error if the CID is not registered with VDSM.
> Disconnect failures are not reported. Active unmanaged connections
> can
> be tracked with getStorageServerList()"
> 
> it is not clear if vdsm will retry to disconnect, and how races
> between
> those retries and new manage connection requests will be handled.
> if the connection only becomes unmanaged, there is no way to track
> and
> clean it up (engine is not supposed to touch the unmanaged
> connections)

Basically disconnect should never fail; if it does, there is a bug somewhere, as it does not require access to a remote host.
If and when this happens, qe will open a bug.
Unless we have a ton of manage ops followed by unmanage ops which fail, this will not be an issue at all.

> 
> 4. I don't think we handle this today, but while we are planning for
> the
> future - what if the host needs one of the connections to exist
> regardless of engine for another need (say it does boot from network
> from same iscsi target - this is an unmanaged connection which you
> will
> disconnect based on the CID refcount concept).
> i.e., what happens if the host has an unmanaged connection, which
> becomes a managed one.
> solving this probably means when adding a connection, need to add an
> unmanaged_existed_before CID for refcount?

So what happens if the order is reversed, i.e. manage is called by ovirt and then the other application calls connect underneath? vdsm would have no idea this happened and would disconnect when unmanage arrives.
The solution is simple: either separate the connections (hybrid mode doesn't mean we share connections), or the other application has to be aware it is running in hybrid mode and go through vdsm for connecting (no app is hybrid aware today).
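
To make the refcount point concrete, here is a toy Python sketch of the CID-to-connection mapping from the quoted proposal. The class and method names are made up for illustration and this is not vdsm's implementation; it only shows that the last unmanage triggers a disconnect based purely on the CIDs vdsm knows about.

```python
class ConnectionRegistry:
    """Toy CID -> connection map; not vdsm's implementation."""

    def __init__(self):
        self._uri_by_cid = {}

    def manage(self, uri, cid):
        """Register a CID for a connection URI."""
        if "/" in cid:
            raise ValueError("CID may not contain '/'")
        if cid in self._uri_by_cid:
            raise ValueError("CID already in use: %r" % cid)
        self._uri_by_cid[cid] = uri

    def unmanage(self, cid):
        """Drop a CID; return the URI to disconnect, or None if another
        CID still references the same connection."""
        uri = self._uri_by_cid.pop(cid)  # KeyError if the CID is unknown
        if uri in self._uri_by_cid.values():
            return None
        return uri
```

A connect made underneath by another application never shows up in the map, so the final unmanage still hands the URI back for disconnect, which is exactly the reversed-order problem.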

> 
> 
> On 01/23/2012 11:54 PM, Saggi Mizrahi wrote:
> > I have begun work at changing how API clients can control storage
> > connections when interacting with VDSM.
> >
> > Currently there are 2 API calls:
> > connectStorageServer() - Will connect to the storage target if the
> > host is not already connected to it.
> > disconnectStorageServer() - Will disconnect from the storage target
> > if the host is connected to it.
> >
> > This API is very simple but is inappropriate when multiple clients
> > and flows try to access the same storage.
> >
> > This is currently solved by trying to synchronize things inside
> > rhevm. This is hard and convoluted. It also brings out issues with
> > other clients using the VDSM API.
> >
> > Another problem is error recovery. Currently ovirt-engine(OE) has
> > no way of monitoring the connections on all the hosts and if a
> > connection disappears it's OE's responsibility to reconnect.
> >
> > I suggest a different concept where VDSM 'manages' the connections.
> > VDSM receives a manage request with the connection information and
> > from that point forward VDSM will try to keep this connection
> > alive. If the connection fails VDSM will automatically try and
> > recover.
> >
> > Every manage request will also have a connection ID(CID). This CID
> > will be used when the same client asks to unmanage the connection.
> > When multiple requests for manage are received to the same
> > connection they all have to have their own unique CID. By
> > internally mapping CIDs to actual connections VDSM can properly
> > disconnect when no CID is addressing the connection. This allows
> > each client and even each flow to have its own CID effectively
> > eliminating connect\disconnect races.
> >
> > The change from (dis)connect to (un)manage also changes the
> > semantics of the calls significantly.
> > Whereas connectStorageServer would have returned when the storage
> > is either connected or failed to connect, manageStorageServer will
> > return once VDSM registered the CID. This means that the
> > connection might not be active immediately as the VDSM tries to
> > connect. The connection might remain down for a long time if the
> > storage target is down or is having issues.
> >
> > This allows for VDSM to receive the manage request even if the
> > storage is having issues and recover as soon as it's operational
> > without user intervention.
> >
> > In order for the client to query the current state of the
> > connections I propose getStorageConnectionList(). This will return
> > a mapping of CID to connection status. The status contains the
> > connection info (excluding credentials), whether the connection is
> > active, whether the connection is managed (unmanaged connections
> > are returned with transient IDs), and, if the connection is down,
> > the last error information.
> >
> > The same actual connection can return multiple times, once for each
> > CID.
> >
> > For cases where an operation requires a connection to be active a
> > user can poll the status of the CID. The user can then choose to
> > poll for a certain amount of time or until an error appears in the
> > error field of the status. This will give you either a timeout or
> > a "try once" semantic depending on the flow's needs.
> >
> > All connections that have been managed persist across VDSM restarts and
> > will be managed until a corresponding unmanage command has been
> > issued.
> >
> > There is no concept of temporary connections as "temporary" is flow
> > dependent and VDSM can't accommodate all interpretations of
> > "temporary". An ad-hoc mechanism can be built using the CID field.
> > For instance a client can manage a connection with
> > "ENGINE_FLOW101_CON1". If the flow got interrupted the client can
> > clean all IDs with certain flow IDs.
> >
> > I think this API gives safety, robustness, and implementation
> > freedom.
> >
> >
> > Nitty Gritty:
> >
> > manageStorageServer
> > ===================
> > Synopsis:
> > manageStorageServer(uri, connectionID):
> >
> > Parameters:
> > uri - a uri pointing to a storage target (eg: nfs://server:export,
> > iscsi://host/iqn;portal=1)
> > connectionID - string with any char except "/".
> >
> > Description:
> > Tells VDSM to start managing the connection. From this moment on
> > VDSM will try and have the connection available when needed. VDSM
> > will monitor the connection and will automatically reconnect on
> > failure.
> > Returns:
> > Success code if VDSM was able to manage the connection.
> > It usually just verifies that the arguments are sane and that the
> > CID is not already in use.
> > This doesn't mean the host is connected.
> > ----
> > unmanageStorageServer
> > =====================
> > Synopsis:
> > unmanageStorageServer(connectionID):
> >
> > Parameters:
> > connectionID - string with any char except "/".
> >
> > Description:
> > Tells VDSM to stop managing the connection. VDSM will try and
> > disconnect from the storage target if this is the last CID
> > referencing the storage connection.
> >
> > Returns:
> > Success code if VDSM was able to unmanage the connection.
> > It will return an error if the CID is not registered with VDSM.
> > Disconnect failures are not reported. Active unmanaged connections
> > can be tracked with getStorageServerList()
> > ----
> > getStorageServerList
> > ====================
> > Synopsis:
> > getStorageServerList()
> >
> > Description:
> > Will return list of all managed and unmanaged connections.
> > Unmanaged connections have temporary IDs and are not guaranteed to
> > be consistent across calls.
> >
> > Results:
> > A mapping between CIDs and the status.
> > example return value (Actual key names may differ)
> >
> > {'conA': {'connected': True, 'managed': True, 'lastError': 0,
> > 'connectionInfo': {
> >      'remotePath': 'server:/export',
> >      'retrans': 3,
> >      'version': 4
> >      }},
> >   'iscsi_session_34': {'connected': False, 'managed': False,
> >   'lastError': 339, 'connectionInfo': {
> >      'hostname': 'dandylopn',
> >      'portal': 1}}
> > }
> > _______________________________________________
> > Engine-devel mailing list
> > Engine-devel at ovirt.org
> > http://lists.ovirt.org/mailman/listinfo/engine-devel
> 
> _______________________________________________
> vdsm-devel mailing list
> vdsm-devel at lists.fedorahosted.org
> https://fedorahosted.org/mailman/listinfo/vdsm-devel
> 


