On 03/03/2015 12:07 PM, Aline Manera wrote:
On 03/03/2015 04:50, Royce Lv wrote:
> OK, let me explain this lock scheme.
>
> 1. Every uploading volume needs a lock (thread locking) to make sure
> no one else is manipulating this volume at the same time to avoid
> write competition.
> 2. Consider about we cannot spend too much bandwidth on upload, we
> need to control concurrent uploading volumes to a reasonable number
> (here it is assigned to be 5).
>
> If as 1. requires each volume created with a lock, but 2. we just
> have 5 active uploading volumes, this will be a resource waste.
>
> so we have a lock pool called "upload_lock_pool" to store active
> uploading locks. This is to say, a volume upload request comes in and
> allocate a lock, store it in this pool, if a lock for volume "A" is
> not used any more(all blocking PUT requests are finished and the lock
> "ref_cnt" is 0), it will be deleted to make room in
> "upload_lock_pool" for other volumes "PUT" requests.
>
> So scenarios are below:
> a. An uploading volume request comes in, concurrent thread cnt is
> less than 5, lock pool does not has this lock for this volume yet.
> Kimchi will allocate a volume lock for it to use.
> b. An uploading volume request comes in, concurrent thread cnt is
> less than 5, lock pool already has this lock for this volume already.
> This means others is using this lock now, then this thread just add
> lock ref_cnt, and wait on this lock until it can get it.
> c. An uploading volume request comes in, concurrent thread cnt
> exceeds 5, request blocks on semaphore for the pool.
>
> The reason lock pool need a lock is we need to allocate lock in and
> delete lock from pool, this is a write operation, we need a lock
> protect pool r/w competition.
>
Ok - Thanks for explaining!
I think I have a better idea on it now.
See my comments below.
> On 02/09/2015 09:49 AM, Aline Manera wrote:
>>
>> On 28/01/2015 11:20, lvroyce(a)linux.vnet.ibm.com wrote:
>>> From: Royce Lv <lvroyce(a)linux.vnet.ibm.com>
>>>
>>> This lock facility guarentees 5 concurrent volume upload
>>> and make sure locks are created and reclaimed when needed.
>>>
>>> Signed-off-by: Royce Lv <lvroyce(a)linux.vnet.ibm.com>
>>> ---
>>> src/kimchi/i18n.py | 1 +
>>> src/kimchi/model/utils.py | 46
>>> ++++++++++++++++++++++++++++++++++++++++++++++
>>> 2 files changed, 47 insertions(+)
>>>
>>> diff --git a/src/kimchi/i18n.py b/src/kimchi/i18n.py
>>> index 0a31cd5..af912bd 100644
>>> --- a/src/kimchi/i18n.py
>>> +++ b/src/kimchi/i18n.py
>>> @@ -211,6 +211,7 @@ messages = {
>>> "KCHVOL0022E": _("Unable to access file %(url)s. Please,
>>> check it."),
>>> "KCHVOL0023E": _("Unable to clone storage volume
'%(name)s'
>>> in pool '%(pool)s'. Details: %(err)s"),
>>> "KCHVOL0024E": _("Upload volume chunk index, size and
total
>>> size must be integer"),
>>
>>> + "KCHVOL0026E": _("Inconsistent upload count"),
>>
>> What does it mean for the user?
If this message will be shown to user we need to detail more it.
It is an internal
error, see comments below.
>>
>>>
>>> "KCHIFACE0001E": _("Interface %(name)s does not
exist"),
>>>
>>> diff --git a/src/kimchi/model/utils.py b/src/kimchi/model/utils.py
>>> index 9896289..010966a 100644
>>> --- a/src/kimchi/model/utils.py
>>> +++ b/src/kimchi/model/utils.py
>>> @@ -19,6 +19,7 @@
>>>
>>> import libvirt
>>> import socket
>>> +import threading
>>> import urlparse
>>> from lxml import etree, objectify
>>> from lxml.builder import E, ElementMaker
>>> @@ -28,6 +29,11 @@ from kimchi.model.featuretests import FeatureTests
>>>
>>> KIMCHI_META_URL = "https://github.com/kimchi-project/kimchi"
>>> KIMCHI_NAMESPACE = "kimchi"
>>> +UPLOAD_THREADS = 5 # Concurrent upload volume counts at the same
>>> time
>>> +
>>> +upload_semaphore = threading.BoundedSemaphore(value=UPLOAD_THREADS)
>>> +upload_lock_pool = dict()
>>> +pool_lock = threading.Lock()
>>>
>>>
>>> def get_vm_name(vm_name, t_name, name_list):
>>> @@ -161,3 +167,43 @@ def get_metadata_node(dom, tag,
>>> metadata_support, mode="current"):
>>> if node is not None:
>>> return etree.tostring(node)
>>> return ""
>>> +
>>> +
>>> +class UpdateLock(object):
>>> + def __init__(self):
>>> + self.ref_cnt = 0
>>> + self.lock = threading.Lock()
>>> +
>>
>>
>>> + def get_lock(self):
>>> + self.ref_cnt += 1
>>> + return self.lock
>>> +
>>
>> It always return the lock without checking it has the maximum number
>> So what is it for?
The max number verification must be done here and don't return the
lock when it is reached.
The upper limit is checked by upload_semaphore.acquire()
before calling
get_lock, this ref_cnt is to decide if the volume lock need to be
reclaim or not.
>>
>>> + def release_lock(self):
>>> + delete = False
>>> + self.ref_cnt -= 1
>>> + if (self.ref_cnt == 0):
>>> + delete = True
>>> + return delete
>>> +
>>
>> So first time I will use release_lock(), self.ref_cnt will be zero
>> so self.ret_cnt -= 1 turns to -1 and it returns False.
>>
>>> +
>>> +def get_vol_update_lock(vol_path):
>>> + # upload_semaphore controls the max upload count
>>> + upload_semaphore.acquire()
>>> +
>>> + # pool lock make sure lock list get/store action is atomic
>>> + with pool_lock:
>>> + vol_lock = upload_lock_pool.get(vol_path)
>>
>>> + if vol_lock:
>>> + return vol_lock.get_lock()
>>> + if len(upload_lock_pool.keys()) > (UPLOAD_THREADS - 1):
>>> + raise OperationFailed("KCHVOL0026E")
This max number verification must be the first thing to do when
returning the lock.
As I suggested above we should do that in the get_lock() function.
This to check if
for some reason, the cnt in upload_semaphore is less
than upper limit but the lock cnt is more than it. It means some one
called "get_vol_update_lock", but forget to call release_vol_update_lock
or un-handled exception raised before release_vol_update_lock.
>>
>> When the exception will be raised?
>> From my view, it will always return the lock in the "if" right above
>>
>> Is the lock for the pool or the volume?
>> It is confused to to review.
>>
>>> + lock = upload_lock_pool[vol_path] = UpdateLock()
>>> + return lock.get_lock()
>>> +
>>> +
>>> +def release_vol_update_lock(vol_path):
>>> + with pool_lock:
>>> + vol_lock = upload_lock_pool.get(vol_path)
If, for any reason, I call this function prior to get the lock it will
fail as vol_lock will be None.
I suggest to do this kind of verification to avoid problems.
ACK.
>>> + if vol_lock.release_lock():
>>> + upload_lock_pool.pop(vol_path, None)
>>> + upload_semaphore.release()
>>
>