
On 03/03/2015 12:07 PM, Aline Manera wrote:
On 03/03/2015 04:50, Royce Lv wrote:
OK, let me explain this lock scheme.
1. Every uploading volume needs a lock (thread locking) to make sure no one else is manipulating this volume at the same time to avoid write competition. 2. Consider about we cannot spend too much bandwidth on upload, we need to control concurrent uploading volumes to a reasonable number (here it is assigned to be 5).
If as 1. requires each volume created with a lock, but 2. we just have 5 active uploading volumes, this will be a resource waste.
so we have a lock pool called "upload_lock_pool" to store active uploading locks. This is to say, a volume upload request comes in and allocate a lock, store it in this pool, if a lock for volume "A" is not used any more(all blocking PUT requests are finished and the lock "ref_cnt" is 0), it will be deleted to make room in "upload_lock_pool" for other volumes "PUT" requests.
So scenarios are below: a. An uploading volume request comes in, concurrent thread cnt is less than 5, lock pool does not has this lock for this volume yet. Kimchi will allocate a volume lock for it to use. b. An uploading volume request comes in, concurrent thread cnt is less than 5, lock pool already has this lock for this volume already. This means others is using this lock now, then this thread just add lock ref_cnt, and wait on this lock until it can get it. c. An uploading volume request comes in, concurrent thread cnt exceeds 5, request blocks on semaphore for the pool.
The reason lock pool need a lock is we need to allocate lock in and delete lock from pool, this is a write operation, we need a lock protect pool r/w competition.
Ok - Thanks for explaining! I think I have a better idea on it now.
See my comments below.
On 02/09/2015 09:49 AM, Aline Manera wrote:
On 28/01/2015 11:20, lvroyce@linux.vnet.ibm.com wrote:
From: Royce Lv <lvroyce@linux.vnet.ibm.com>
This lock facility guarentees 5 concurrent volume upload and make sure locks are created and reclaimed when needed.
Signed-off-by: Royce Lv <lvroyce@linux.vnet.ibm.com> --- src/kimchi/i18n.py | 1 + src/kimchi/model/utils.py | 46 ++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 47 insertions(+)
diff --git a/src/kimchi/i18n.py b/src/kimchi/i18n.py index 0a31cd5..af912bd 100644 --- a/src/kimchi/i18n.py +++ b/src/kimchi/i18n.py @@ -211,6 +211,7 @@ messages = { "KCHVOL0022E": _("Unable to access file %(url)s. Please, check it."), "KCHVOL0023E": _("Unable to clone storage volume '%(name)s' in pool '%(pool)s'. Details: %(err)s"), "KCHVOL0024E": _("Upload volume chunk index, size and total size must be integer"),
+ "KCHVOL0026E": _("Inconsistent upload count"),
What does it mean for the user?
If this message will be shown to user we need to detail more it.
It is an internal error, see comments below.
"KCHIFACE0001E": _("Interface %(name)s does not exist"),
diff --git a/src/kimchi/model/utils.py b/src/kimchi/model/utils.py index 9896289..010966a 100644 --- a/src/kimchi/model/utils.py +++ b/src/kimchi/model/utils.py @@ -19,6 +19,7 @@
import libvirt import socket +import threading import urlparse from lxml import etree, objectify from lxml.builder import E, ElementMaker @@ -28,6 +29,11 @@ from kimchi.model.featuretests import FeatureTests
KIMCHI_META_URL = "https://github.com/kimchi-project/kimchi" KIMCHI_NAMESPACE = "kimchi" +UPLOAD_THREADS = 5 # Concurrent upload volume counts at the same time + +upload_semaphore = threading.BoundedSemaphore(value=UPLOAD_THREADS) +upload_lock_pool = dict() +pool_lock = threading.Lock()
def get_vm_name(vm_name, t_name, name_list): @@ -161,3 +167,43 @@ def get_metadata_node(dom, tag, metadata_support, mode="current"): if node is not None: return etree.tostring(node) return "" + + +class UpdateLock(object): + def __init__(self): + self.ref_cnt = 0 + self.lock = threading.Lock() +
+ def get_lock(self): + self.ref_cnt += 1 + return self.lock +
It always return the lock without checking it has the maximum number So what is it for?
The max number verification must be done here and don't return the lock when it is reached.
The upper limit is checked by upload_semaphore.acquire() before calling get_lock, this ref_cnt is to decide if the volume lock need to be reclaim or not.
+ def release_lock(self): + delete = False + self.ref_cnt -= 1 + if (self.ref_cnt == 0): + delete = True + return delete +
So first time I will use release_lock(), self.ref_cnt will be zero so self.ret_cnt -= 1 turns to -1 and it returns False.
+ +def get_vol_update_lock(vol_path): + # upload_semaphore controls the max upload count + upload_semaphore.acquire() + + # pool lock make sure lock list get/store action is atomic + with pool_lock: + vol_lock = upload_lock_pool.get(vol_path)
+ if vol_lock: + return vol_lock.get_lock()
+ if len(upload_lock_pool.keys()) > (UPLOAD_THREADS - 1): + raise OperationFailed("KCHVOL0026E")
This max number verification must be the first thing to do when returning the lock. As I suggested above we should do that in the get_lock() function.
This to check if for some reason, the cnt in upload_semaphore is less than upper limit but the lock cnt is more than it. It means some one called "get_vol_update_lock", but forget to call release_vol_update_lock or un-handled exception raised before release_vol_update_lock.
When the exception will be raised? From my view, it will always return the lock in the "if" right above
Is the lock for the pool or the volume? It is confused to to review.
+ lock = upload_lock_pool[vol_path] = UpdateLock() + return lock.get_lock() + +
+def release_vol_update_lock(vol_path): + with pool_lock: + vol_lock = upload_lock_pool.get(vol_path)
If, for any reason, I call this function prior to get the lock it will fail as vol_lock will be None. I suggest to do this kind of verification to avoid problems.
ACK.
+ if vol_lock.release_lock(): + upload_lock_pool.pop(vol_path, None) + upload_semaphore.release()