[PATCH] host/partitions: avoid calling disks.get_partitions_names() for each partition

In PartitionModel.lookup(), it calls disks.get_partitions_names() to verify the given partition is eligible to be used in a new logical storage pool as follow. if name not in disks.get_partitions_names(): raise NotFoundError(...) When the front-end issues GET request to host/partitions, the back-end would call get_partitions_names() for each partition. If there are many disks and partitions in the host, get_partitions_names() takes a long time to finish, and calling get_partitions_names() for each partition would make the GET request timeout. This patch is to do the following. When the partition is listed from the URI host/partitions, we should avoid calling get_partitions_names() for each of them. When the partition is listed from the URI host/partitions/sdaX, we should verify sdaX is among the eligible partitions. The solution is that we make use of the resource_args in Partitions collection class and override the _get_resources() method. In the _get_resources() method, it sets resource_args to "[True]", indicating that the listed partitions are already verified and eligible, so PartitionModel.lookup() would skip the check. When we are outside of _get_resources() method, resource_args is set to "[False]", and PartitionModel.lookup() enables the check. Signed-off-by: Zhou Zheng Sheng <zhshzhou@linux.vnet.ibm.com> --- src/kimchi/control/host.py | 26 +++++++++++++++++++++----- src/kimchi/model/host.py | 4 ++-- 2 files changed, 23 insertions(+), 7 deletions(-) diff --git a/src/kimchi/control/host.py b/src/kimchi/control/host.py index ad34919..11673d5 100644 --- a/src/kimchi/control/host.py +++ b/src/kimchi/control/host.py @@ -17,11 +17,14 @@ # License along with this library; if not, write to the Free Software # Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA +import threading + import cherrypy from kimchi.control.base import Collection, Resource, SimpleCollection from kimchi.control.utils import UrlSubNode, validate_method from kimchi.exception import OperationFailed +from kimchi.rollbackcontext import RollbackContext from kimchi.template import render @@ -75,18 +78,31 @@ class Partitions(Collection): def __init__(self, model): super(Partitions, self).__init__(model) self.resource = Partition + self._reset_resource_args() + self._resource_args_lock = threading.Lock() + + def _reset_resource_args(self): + self.resource_args = [False] - # Defining get_resources in order to return list of partitions in UI - # sorted by their path def _get_resources(self, flag_filter): - res_list = super(Partitions, self)._get_resources(flag_filter) - res_list.sort(key=lambda x: x.info['path']) + with RollbackContext() as rollback: + self._resource_args_lock.acquire() + rollback.prependDefer(self._resource_args_lock.release) + + self.resource_args = [True] + rollback.prependDefer(self._reset_resource_args) + + # Defining get_resources in order to return list of partitions in + # UI sorted by their path + res_list = super(Partitions, self)._get_resources(flag_filter) + res_list.sort(key=lambda x: x.info['path']) return res_list class Partition(Resource): - def __init__(self, model, id): + def __init__(self, model, verified, id): super(Partition, self).__init__(model, id) + self.model_args = [verified] + list(self.model_args) @property def data(self): diff --git a/src/kimchi/model/host.py b/src/kimchi/model/host.py index 47b0aa1..ebd6f79 100644 --- a/src/kimchi/model/host.py +++ b/src/kimchi/model/host.py @@ -244,8 +244,8 @@ class PartitionModel(object): def __init__(self, **kargs): pass - def lookup(self, name): - if name not in disks.get_partitions_names(): + def lookup(self, verified, name): + if not verified and name not in disks.get_partitions_names(): raise NotFoundError("KCHPART0001E", {'name': name}) return disks.get_partition_details(name) -- 1.9.0

On 05/06/2014 05:04 PM, Zhou Zheng Sheng wrote:
In PartitionModel.lookup(), it calls disks.get_partitions_names() to verify the given partition is eligible to be used in a new logical storage pool as follow. if name not in disks.get_partitions_names(): raise NotFoundError(...)
When the front-end issues GET request to host/partitions, the back-end would call get_partitions_names() for each partition. If there are many disks and partitions in the host, get_partitions_names() takes a long time to finish, and calling get_partitions_names() for each partition would make the GET request timeout.
This patch is to do the following. When the partition is listed from the URI host/partitions, we should avoid calling get_partitions_names() for each of them. When the partition is listed from the URI host/partitions/sdaX, we should verify sdaX is among the eligible partitions.
The solution is that we make use of the resource_args in Partitions collection class and override the _get_resources() method. In the _get_resources() method, it sets resource_args to "[True]", indicating that the listed partitions are already verified and eligible, so PartitionModel.lookup() would skip the check. When we are outside of _get_resources() method, resource_args is set to "[False]", and PartitionModel.lookup() enables the check.
Signed-off-by: Zhou Zheng Sheng <zhshzhou@linux.vnet.ibm.com> --- src/kimchi/control/host.py | 26 +++++++++++++++++++++----- src/kimchi/model/host.py | 4 ++-- 2 files changed, 23 insertions(+), 7 deletions(-)
diff --git a/src/kimchi/control/host.py b/src/kimchi/control/host.py index ad34919..11673d5 100644 --- a/src/kimchi/control/host.py +++ b/src/kimchi/control/host.py @@ -17,11 +17,14 @@ # License along with this library; if not, write to the Free Software # Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+import threading + import cherrypy
from kimchi.control.base import Collection, Resource, SimpleCollection from kimchi.control.utils import UrlSubNode, validate_method from kimchi.exception import OperationFailed +from kimchi.rollbackcontext import RollbackContext from kimchi.template import render
@@ -75,18 +78,31 @@ class Partitions(Collection): def __init__(self, model): super(Partitions, self).__init__(model) self.resource = Partition + self._reset_resource_args() + self._resource_args_lock = threading.Lock() + + def _reset_resource_args(self): + self.resource_args = [False]
- # Defining get_resources in order to return list of partitions in UI - # sorted by their path def _get_resources(self, flag_filter): - res_list = super(Partitions, self)._get_resources(flag_filter) - res_list.sort(key=lambda x: x.info['path']) + with RollbackContext() as rollback: + self._resource_args_lock.acquire() + rollback.prependDefer(self._resource_args_lock.release) + + self.resource_args = [True] + rollback.prependDefer(self._reset_resource_args) + + # Defining get_resources in order to return list of partitions in + # UI sorted by their path + res_list = super(Partitions, self)._get_resources(flag_filter) yes, verified flag for resource can avoid call get_partitions_names twice. Now Partition resource is a special case, it's __init__ accept extra argument "verified". Maybe there will be more special cases.
Not sure we can find a better way to define the Partition as follow: class Partition(Resource): def __init__(self, model, id, verified=False): .... def lookup(self, name, verified=False): ... from the super _get_resources, it pass the follow args to it's resource. args = self.resource_args + [ident] Not sure, Is it a good idea for extra self.resource_kargs, and self.model_kargs?
+ res_list.sort(key=lambda x: x.info['path']) return res_list
class Partition(Resource): - def __init__(self, model, id): + def __init__(self, model, verified, id): super(Partition, self).__init__(model, id) + self.model_args = [verified] + list(self.model_args)
@property def data(self): diff --git a/src/kimchi/model/host.py b/src/kimchi/model/host.py index 47b0aa1..ebd6f79 100644 --- a/src/kimchi/model/host.py +++ b/src/kimchi/model/host.py @@ -244,8 +244,8 @@ class PartitionModel(object): def __init__(self, **kargs): pass
- def lookup(self, name): - if name not in disks.get_partitions_names(): + def lookup(self, verified, name): + if not verified and name not in disks.get_partitions_names(): raise NotFoundError("KCHPART0001E", {'name': name})
return disks.get_partition_details(name)
-- Thanks and best regards! Sheldon Feng(冯少合)<shaohef@linux.vnet.ibm.com> IBM Linux Technology Center

on 2014/05/06 23:11, Sheldon wrote:
On 05/06/2014 05:04 PM, Zhou Zheng Sheng wrote:
In PartitionModel.lookup(), it calls disks.get_partitions_names() to verify the given partition is eligible to be used in a new logical storage pool as follow. if name not in disks.get_partitions_names(): raise NotFoundError(...)
When the front-end issues GET request to host/partitions, the back-end would call get_partitions_names() for each partition. If there are many disks and partitions in the host, get_partitions_names() takes a long time to finish, and calling get_partitions_names() for each partition would make the GET request timeout.
This patch is to do the following. When the partition is listed from the URI host/partitions, we should avoid calling get_partitions_names() for each of them. When the partition is listed from the URI host/partitions/sdaX, we should verify sdaX is among the eligible partitions.
The solution is that we make use of the resource_args in Partitions collection class and override the _get_resources() method. In the _get_resources() method, it sets resource_args to "[True]", indicating that the listed partitions are already verified and eligible, so PartitionModel.lookup() would skip the check. When we are outside of _get_resources() method, resource_args is set to "[False]", and PartitionModel.lookup() enables the check.
Signed-off-by: Zhou Zheng Sheng <zhshzhou@linux.vnet.ibm.com> --- src/kimchi/control/host.py | 26 +++++++++++++++++++++----- src/kimchi/model/host.py | 4 ++-- 2 files changed, 23 insertions(+), 7 deletions(-)
diff --git a/src/kimchi/control/host.py b/src/kimchi/control/host.py index ad34919..11673d5 100644 --- a/src/kimchi/control/host.py +++ b/src/kimchi/control/host.py @@ -17,11 +17,14 @@ # License along with this library; if not, write to the Free Software # Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+import threading + import cherrypy
from kimchi.control.base import Collection, Resource, SimpleCollection from kimchi.control.utils import UrlSubNode, validate_method from kimchi.exception import OperationFailed +from kimchi.rollbackcontext import RollbackContext from kimchi.template import render
@@ -75,18 +78,31 @@ class Partitions(Collection): def __init__(self, model): super(Partitions, self).__init__(model) self.resource = Partition + self._reset_resource_args() + self._resource_args_lock = threading.Lock() + + def _reset_resource_args(self): + self.resource_args = [False]
- # Defining get_resources in order to return list of partitions in UI - # sorted by their path def _get_resources(self, flag_filter): - res_list = super(Partitions, self)._get_resources(flag_filter) - res_list.sort(key=lambda x: x.info['path']) + with RollbackContext() as rollback: + self._resource_args_lock.acquire() + rollback.prependDefer(self._resource_args_lock.release) + + self.resource_args = [True] + rollback.prependDefer(self._reset_resource_args) + + # Defining get_resources in order to return list of partitions in + # UI sorted by their path + res_list = super(Partitions, self)._get_resources(flag_filter) yes, verified flag for resource can avoid call get_partitions_names twice. Now Partition resource is a special case, it's __init__ accept extra argument "verified". Maybe there will be more special cases.
Not sure we can find a better way to define the Partition as follow: class Partition(Resource): def __init__(self, model, id, verified=False): .... def lookup(self, name, verified=False): ...
from the super _get_resources, it pass the follow args to it's resource. args = self.resource_args + [ident]
Not sure, Is it a good idea for extra self.resource_kargs, and self.model_kargs?
Thanks Sheldon. A problem is that it's an unwritten rule that "ident" or "name" has to be the last one of the resource arguments. It's not a good design but it took hold in the Kimchi MVC framework. I think we can adjust the order to make it the first one, so that we can append variable number of optional arguments for the resource controller class and model.lookup() method. This would involve a big refactor.
+ res_list.sort(key=lambda x: x.info['path']) return res_list
class Partition(Resource): - def __init__(self, model, id): + def __init__(self, model, verified, id): super(Partition, self).__init__(model, id) + self.model_args = [verified] + list(self.model_args)
@property def data(self): diff --git a/src/kimchi/model/host.py b/src/kimchi/model/host.py index 47b0aa1..ebd6f79 100644 --- a/src/kimchi/model/host.py +++ b/src/kimchi/model/host.py @@ -244,8 +244,8 @@ class PartitionModel(object): def __init__(self, **kargs): pass
- def lookup(self, name): - if name not in disks.get_partitions_names(): + def lookup(self, verified, name): + if not verified and name not in disks.get_partitions_names(): raise NotFoundError("KCHPART0001E", {'name': name})
return disks.get_partition_details(name)
-- Zhou Zheng Sheng / 周征晟 E-mail: zhshzhou@linux.vnet.ibm.com Telephone: 86-10-82454397

on 2014/05/06 23:11, Sheldon wrote:
On 05/06/2014 05:04 PM, Zhou Zheng Sheng wrote:
In PartitionModel.lookup(), it calls disks.get_partitions_names() to verify the given partition is eligible to be used in a new logical storage pool as follow. if name not in disks.get_partitions_names(): raise NotFoundError(...)
When the front-end issues GET request to host/partitions, the back-end would call get_partitions_names() for each partition. If there are many disks and partitions in the host, get_partitions_names() takes a long time to finish, and calling get_partitions_names() for each partition would make the GET request timeout.
This patch is to do the following. When the partition is listed from the URI host/partitions, we should avoid calling get_partitions_names() for each of them. When the partition is listed from the URI host/partitions/sdaX, we should verify sdaX is among the eligible partitions.
The solution is that we make use of the resource_args in Partitions collection class and override the _get_resources() method. In the _get_resources() method, it sets resource_args to "[True]", indicating that the listed partitions are already verified and eligible, so PartitionModel.lookup() would skip the check. When we are outside of _get_resources() method, resource_args is set to "[False]", and PartitionModel.lookup() enables the check.
Signed-off-by: Zhou Zheng Sheng <zhshzhou@linux.vnet.ibm.com> --- src/kimchi/control/host.py | 26 +++++++++++++++++++++----- src/kimchi/model/host.py | 4 ++-- 2 files changed, 23 insertions(+), 7 deletions(-)
diff --git a/src/kimchi/control/host.py b/src/kimchi/control/host.py index ad34919..11673d5 100644 --- a/src/kimchi/control/host.py +++ b/src/kimchi/control/host.py @@ -17,11 +17,14 @@ # License along with this library; if not, write to the Free Software # Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+import threading + import cherrypy
from kimchi.control.base import Collection, Resource, SimpleCollection from kimchi.control.utils import UrlSubNode, validate_method from kimchi.exception import OperationFailed +from kimchi.rollbackcontext import RollbackContext from kimchi.template import render
@@ -75,18 +78,31 @@ class Partitions(Collection): def __init__(self, model): super(Partitions, self).__init__(model) self.resource = Partition + self._reset_resource_args() + self._resource_args_lock = threading.Lock() + + def _reset_resource_args(self): + self.resource_args = [False]
- # Defining get_resources in order to return list of partitions in UI - # sorted by their path def _get_resources(self, flag_filter): - res_list = super(Partitions, self)._get_resources(flag_filter) - res_list.sort(key=lambda x: x.info['path']) + with RollbackContext() as rollback: + self._resource_args_lock.acquire() + rollback.prependDefer(self._resource_args_lock.release) + + self.resource_args = [True] + rollback.prependDefer(self._reset_resource_args) + + # Defining get_resources in order to return list of partitions in + # UI sorted by their path + res_list = super(Partitions, self)._get_resources(flag_filter) yes, verified flag for resource can avoid call get_partitions_names twice. Now Partition resource is a special case, it's __init__ accept extra argument "verified". Maybe there will be more special cases.
Not sure we can find a better way to define the Partition as follow: class Partition(Resource): def __init__(self, model, id, verified=False): .... def lookup(self, name, verified=False): ...
from the super _get_resources, it pass the follow args to it's resource. args = self.resource_args + [ident]
Not sure, Is it a good idea for extra self.resource_kargs, and self.model_kargs?
There is another problem in self.resource_args/kargs. It's set as the member variable of the collection controller, and the self.resource_args/kargs is expected to be constant once it's set in the collection controller's __init__() method. This is because during the Kimchi lifecycle, there is only one instance of Partitions collection controller, and this only instance serves every request. There can be two concurrent requests visiting self.resource_args/kargs at the same time, so we either should not change self.resource_args/kargs, or add lock on it. What we really need is a place or mechanism to store per-request and per-thread data. Looks thread local is a good place but that would make it like a global variable. I'm thinking of adding a local variable in the Collection.index() class to store per-request data, and pass this variable down to Collection.create() and Collection.get(). So Collection.get() can again pass this local variable down to Collection._get_resources(), Resource.__init__() and Resource.lookup().
+ res_list.sort(key=lambda x: x.info['path']) return res_list
class Partition(Resource): - def __init__(self, model, id): + def __init__(self, model, verified, id): super(Partition, self).__init__(model, id) + self.model_args = [verified] + list(self.model_args)
@property def data(self): diff --git a/src/kimchi/model/host.py b/src/kimchi/model/host.py index 47b0aa1..ebd6f79 100644 --- a/src/kimchi/model/host.py +++ b/src/kimchi/model/host.py @@ -244,8 +244,8 @@ class PartitionModel(object): def __init__(self, **kargs): pass
- def lookup(self, name): - if name not in disks.get_partitions_names(): + def lookup(self, verified, name): + if not verified and name not in disks.get_partitions_names(): raise NotFoundError("KCHPART0001E", {'name': name})
return disks.get_partition_details(name)
-- Zhou Zheng Sheng / 周征晟 E-mail: zhshzhou@linux.vnet.ibm.com Telephone: 86-10-82454397

on 2014/05/06 23:11, Sheldon wrote:
On 05/06/2014 05:04 PM, Zhou Zheng Sheng wrote:
In PartitionModel.lookup(), it calls disks.get_partitions_names() to verify the given partition is eligible to be used in a new logical storage pool as follow. if name not in disks.get_partitions_names(): raise NotFoundError(...)
When the front-end issues GET request to host/partitions, the back-end would call get_partitions_names() for each partition. If there are many disks and partitions in the host, get_partitions_names() takes a long time to finish, and calling get_partitions_names() for each partition would make the GET request timeout.
This patch is to do the following. When the partition is listed from the URI host/partitions, we should avoid calling get_partitions_names() for each of them. When the partition is listed from the URI host/partitions/sdaX, we should verify sdaX is among the eligible partitions.
The solution is that we make use of the resource_args in Partitions collection class and override the _get_resources() method. In the _get_resources() method, it sets resource_args to "[True]", indicating that the listed partitions are already verified and eligible, so PartitionModel.lookup() would skip the check. When we are outside of _get_resources() method, resource_args is set to "[False]", and PartitionModel.lookup() enables the check.
Signed-off-by: Zhou Zheng Sheng <zhshzhou@linux.vnet.ibm.com> --- src/kimchi/control/host.py | 26 +++++++++++++++++++++----- src/kimchi/model/host.py | 4 ++-- 2 files changed, 23 insertions(+), 7 deletions(-)
diff --git a/src/kimchi/control/host.py b/src/kimchi/control/host.py index ad34919..11673d5 100644 --- a/src/kimchi/control/host.py +++ b/src/kimchi/control/host.py @@ -17,11 +17,14 @@ # License along with this library; if not, write to the Free Software # Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+import threading + import cherrypy
from kimchi.control.base import Collection, Resource, SimpleCollection from kimchi.control.utils import UrlSubNode, validate_method from kimchi.exception import OperationFailed +from kimchi.rollbackcontext import RollbackContext from kimchi.template import render
@@ -75,18 +78,31 @@ class Partitions(Collection): def __init__(self, model): super(Partitions, self).__init__(model) self.resource = Partition + self._reset_resource_args() + self._resource_args_lock = threading.Lock() + + def _reset_resource_args(self): + self.resource_args = [False]
- # Defining get_resources in order to return list of partitions in UI - # sorted by their path def _get_resources(self, flag_filter): - res_list = super(Partitions, self)._get_resources(flag_filter) - res_list.sort(key=lambda x: x.info['path']) + with RollbackContext() as rollback: + self._resource_args_lock.acquire() + rollback.prependDefer(self._resource_args_lock.release) + + self.resource_args = [True] + rollback.prependDefer(self._reset_resource_args) + + # Defining get_resources in order to return list of partitions in + # UI sorted by their path + res_list = super(Partitions, self)._get_resources(flag_filter) yes, verified flag for resource can avoid call get_partitions_names twice. Now Partition resource is a special case, it's __init__ accept extra argument "verified". Maybe there will be more special cases.
Not sure we can find a better way to define the Partition as follow: class Partition(Resource): def __init__(self, model, id, verified=False): .... def lookup(self, name, verified=False): ...
from the super _get_resources, it pass the follow args to it's resource. args = self.resource_args + [ident]
Not sure, Is it a good idea for extra self.resource_kargs, and self.model_kargs?
I gave up refactoring the controller framework. I sent v2 patch to refactor the underlying disks module, and split the eligibility checking code out.
+ res_list.sort(key=lambda x: x.info['path']) return res_list
class Partition(Resource): - def __init__(self, model, id): + def __init__(self, model, verified, id): super(Partition, self).__init__(model, id) + self.model_args = [verified] + list(self.model_args)
@property def data(self): diff --git a/src/kimchi/model/host.py b/src/kimchi/model/host.py index 47b0aa1..ebd6f79 100644 --- a/src/kimchi/model/host.py +++ b/src/kimchi/model/host.py @@ -244,8 +244,8 @@ class PartitionModel(object): def __init__(self, **kargs): pass
- def lookup(self, name): - if name not in disks.get_partitions_names(): + def lookup(self, verified, name): + if not verified and name not in disks.get_partitions_names(): raise NotFoundError("KCHPART0001E", {'name': name})
return disks.get_partition_details(name)
-- Zhou Zheng Sheng / 周征晟 E-mail: zhshzhou@linux.vnet.ibm.com Telephone: 86-10-82454397
participants (2)
-
Sheldon
-
Zhou Zheng Sheng