[Kimchi-devel] [kimchi-devel][PATCHv3 3/6] Add lock facility for storage volume upload
Royce Lv
lvroyce at linux.vnet.ibm.com
Wed Mar 4 07:02:06 UTC 2015
On 03/03/2015 12:07 PM, Aline Manera wrote:
>
> On 03/03/2015 04:50, Royce Lv wrote:
>> OK, let me explain this lock scheme.
>>
>> 1. Every uploading volume needs a lock (thread locking) to make sure
>> no one else is manipulating this volume at the same time to avoid
>> write competition.
>> 2. Consider about we cannot spend too much bandwidth on upload, we
>> need to control concurrent uploading volumes to a reasonable number
>> (here it is assigned to be 5).
>>
>> If as 1. requires each volume created with a lock, but 2. we just
>> have 5 active uploading volumes, this will be a resource waste.
>>
>> so we have a lock pool called "upload_lock_pool" to store active
>> uploading locks. This is to say, a volume upload request comes in and
>> allocate a lock, store it in this pool, if a lock for volume "A" is
>> not used any more(all blocking PUT requests are finished and the lock
>> "ref_cnt" is 0), it will be deleted to make room in
>> "upload_lock_pool" for other volumes "PUT" requests.
>>
>> So scenarios are below:
>> a. An uploading volume request comes in, concurrent thread cnt is
>> less than 5, lock pool does not has this lock for this volume yet.
>> Kimchi will allocate a volume lock for it to use.
>> b. An uploading volume request comes in, concurrent thread cnt is
>> less than 5, lock pool already has this lock for this volume already.
>> This means others is using this lock now, then this thread just add
>> lock ref_cnt, and wait on this lock until it can get it.
>> c. An uploading volume request comes in, concurrent thread cnt
>> exceeds 5, request blocks on semaphore for the pool.
>>
>> The reason lock pool need a lock is we need to allocate lock in and
>> delete lock from pool, this is a write operation, we need a lock
>> protect pool r/w competition.
>>
>
> Ok - Thanks for explaining!
> I think I have a better idea on it now.
>
> See my comments below.
>
>> On 02/09/2015 09:49 AM, Aline Manera wrote:
>>>
>>> On 28/01/2015 11:20, lvroyce at linux.vnet.ibm.com wrote:
>>>> From: Royce Lv <lvroyce at linux.vnet.ibm.com>
>>>>
>>>> This lock facility guarentees 5 concurrent volume upload
>>>> and make sure locks are created and reclaimed when needed.
>>>>
>>>> Signed-off-by: Royce Lv <lvroyce at linux.vnet.ibm.com>
>>>> ---
>>>> src/kimchi/i18n.py | 1 +
>>>> src/kimchi/model/utils.py | 46
>>>> ++++++++++++++++++++++++++++++++++++++++++++++
>>>> 2 files changed, 47 insertions(+)
>>>>
>>>> diff --git a/src/kimchi/i18n.py b/src/kimchi/i18n.py
>>>> index 0a31cd5..af912bd 100644
>>>> --- a/src/kimchi/i18n.py
>>>> +++ b/src/kimchi/i18n.py
>>>> @@ -211,6 +211,7 @@ messages = {
>>>> "KCHVOL0022E": _("Unable to access file %(url)s. Please,
>>>> check it."),
>>>> "KCHVOL0023E": _("Unable to clone storage volume '%(name)s'
>>>> in pool '%(pool)s'. Details: %(err)s"),
>>>> "KCHVOL0024E": _("Upload volume chunk index, size and total
>>>> size must be integer"),
>>>
>>>> + "KCHVOL0026E": _("Inconsistent upload count"),
>>>
>>> What does it mean for the user?
>
> If this message will be shown to user we need to detail more it.
It is an internal error, see comments below.
>
>>>
>>>>
>>>> "KCHIFACE0001E": _("Interface %(name)s does not exist"),
>>>>
>>>> diff --git a/src/kimchi/model/utils.py b/src/kimchi/model/utils.py
>>>> index 9896289..010966a 100644
>>>> --- a/src/kimchi/model/utils.py
>>>> +++ b/src/kimchi/model/utils.py
>>>> @@ -19,6 +19,7 @@
>>>>
>>>> import libvirt
>>>> import socket
>>>> +import threading
>>>> import urlparse
>>>> from lxml import etree, objectify
>>>> from lxml.builder import E, ElementMaker
>>>> @@ -28,6 +29,11 @@ from kimchi.model.featuretests import FeatureTests
>>>>
>>>> KIMCHI_META_URL = "https://github.com/kimchi-project/kimchi"
>>>> KIMCHI_NAMESPACE = "kimchi"
>>>> +UPLOAD_THREADS = 5 # Concurrent upload volume counts at the same
>>>> time
>>>> +
>>>> +upload_semaphore = threading.BoundedSemaphore(value=UPLOAD_THREADS)
>>>> +upload_lock_pool = dict()
>>>> +pool_lock = threading.Lock()
>>>>
>>>>
>>>> def get_vm_name(vm_name, t_name, name_list):
>>>> @@ -161,3 +167,43 @@ def get_metadata_node(dom, tag,
>>>> metadata_support, mode="current"):
>>>> if node is not None:
>>>> return etree.tostring(node)
>>>> return ""
>>>> +
>>>> +
>>>> +class UpdateLock(object):
>>>> + def __init__(self):
>>>> + self.ref_cnt = 0
>>>> + self.lock = threading.Lock()
>>>> +
>>>
>>>
>>>> + def get_lock(self):
>>>> + self.ref_cnt += 1
>>>> + return self.lock
>>>> +
>>>
>>> It always return the lock without checking it has the maximum number
>>> So what is it for?
>
> The max number verification must be done here and don't return the
> lock when it is reached.
The upper limit is checked by upload_semaphore.acquire() before calling
get_lock, this ref_cnt is to decide if the volume lock need to be
reclaim or not.
>
>>>
>>>> + def release_lock(self):
>>>> + delete = False
>>>> + self.ref_cnt -= 1
>>>> + if (self.ref_cnt == 0):
>>>> + delete = True
>>>> + return delete
>>>> +
>>>
>>> So first time I will use release_lock(), self.ref_cnt will be zero
>>> so self.ret_cnt -= 1 turns to -1 and it returns False.
>>>
>>>> +
>>>> +def get_vol_update_lock(vol_path):
>>>> + # upload_semaphore controls the max upload count
>>>> + upload_semaphore.acquire()
>>>> +
>>>> + # pool lock make sure lock list get/store action is atomic
>>>> + with pool_lock:
>>>> + vol_lock = upload_lock_pool.get(vol_path)
>>>
>>>> + if vol_lock:
>>>> + return vol_lock.get_lock()
>
>>>> + if len(upload_lock_pool.keys()) > (UPLOAD_THREADS - 1):
>>>> + raise OperationFailed("KCHVOL0026E")
>
> This max number verification must be the first thing to do when
> returning the lock.
> As I suggested above we should do that in the get_lock() function.
This to check if for some reason, the cnt in upload_semaphore is less
than upper limit but the lock cnt is more than it. It means some one
called "get_vol_update_lock", but forget to call release_vol_update_lock
or un-handled exception raised before release_vol_update_lock.
>
>>>
>>> When the exception will be raised?
>>> From my view, it will always return the lock in the "if" right above
>>>
>>> Is the lock for the pool or the volume?
>>> It is confused to to review.
>>>
>>>> + lock = upload_lock_pool[vol_path] = UpdateLock()
>>>> + return lock.get_lock()
>>>> +
>>>> +
>
>>>> +def release_vol_update_lock(vol_path):
>>>> + with pool_lock:
>>>> + vol_lock = upload_lock_pool.get(vol_path)
>
> If, for any reason, I call this function prior to get the lock it will
> fail as vol_lock will be None.
> I suggest to do this kind of verification to avoid problems.
ACK.
>
>>>> + if vol_lock.release_lock():
>>>> + upload_lock_pool.pop(vol_path, None)
>>>> + upload_semaphore.release()
>>>
>>
>
More information about the Kimchi-devel
mailing list