Hello Everyone,
Problem Statement:
We noticed that getting the list of resources times out when the
command is executed on a system that has approximately 2500+
resources.
Note that both the list of resource identifiers (in get_list()) and the
info for each resource (in lookup()) come from one single command, whose
output has one line of info per resource, and 2500+ resources are
present.
For example, lscss on System z gives output like:
Device Subchan. DevType CU Type Use PIM PAM POM CHPIDs
----------------------------------------------------------------------
0.0.0200 0.0.0000 3390/0a 3990/e9 yes e0 e0 ff b0b10d00 00000000
0.0.0201 0.0.0001 3390/0a 3990/e9 yes e0 e0 ff b0b10d00 00000000
0.0.0202 0.0.0002 3390/0c 3990/e9 e0 e0 ff b0b10d00 00000000
...
...
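Just to illustrate the one-line-per-resource point, parsing a single
lscss line into a dict could look roughly like the sketch below (the
helper name _parse_lscss_line and the dict keys are mine, not from the
current code; the column layout is assumed from the sample above):

def _parse_lscss_line(line):
    # Hypothetical helper: turn one lscss output line into a dict.
    # The 'Use' column may be blank (see the third sample line), so a
    # plain split() yields 9 tokens instead of 10 for such lines.
    fields = line.split()
    if len(fields) == 9:
        fields.insert(4, '')
    device, subchan, devtype, cutype, use, pim, pam, pom = fields[:8]
    return {'device': device, 'subchannel': subchan,
            'devtype': devtype, 'cutype': cutype, 'use': use,
            'pim': pim, 'pam': pam, 'pom': pom,
            'chpids': ' '.join(fields[8:])}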
With the current upstream code:
1) it executes the corresponding command and loops through the output
to get the list of resource identifiers (in the Collection's
get_list());
2) it then loops through that list (the output of get_list()) and, for
each identifier, gets its info details by running the same command
again and looping through the command output to find the info for that
identifier.
So it is O(n^2) in the worst case, where n is the number of lines the
command output has, each line corresponding to one resource.
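To make the cost concrete, the current flow is roughly the following
(a simplified illustration with placeholder helpers, not the actual
upstream code; run_command stands in for whatever executes lscss):

def get_list():
    out = run_command(['lscss'])          # run the command once
    # skip the two header lines, keep only the identifier per line
    return [line.split()[0] for line in out.splitlines()[2:]]

def lookup(ident):
    out = run_command(['lscss'])          # run the same command again...
    for line in out.splitlines()[2:]:     # ...and scan up to n lines
        if line.split()[0] == ident:
            return _parse_lscss_line(line)

# n identifiers, each lookup() scanning up to n lines => O(n^2)
infos = [lookup(ident) for ident in get_list()]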
Options to reduce the Big-O complexity for this scenario:
In the case where a single command gives the full resource info, the
second pass can be avoided in the ways suggested below:
While looping through the command output in get_list(), compose a list
of dicts, where each dict holds the info for one resource, and return
that list of dicts (instead of a list of resource identifiers).
Then, while looping through the get_list() output, each entry is
already the dict for one resource, so just assign it to that resource's
self.info. This avoids the second loop mentioned in #2 above and makes
it O(n), where n is the number of lines the command output has, each
line corresponding to one resource.
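With that change, get_list() could look roughly like this sketch (same
placeholder helpers as above):

def get_list():
    # Single pass: parse every line once and keep the full info,
    # instead of returning only the identifiers.
    out = run_command(['lscss'])
    return [_parse_lscss_line(line) for line in out.splitlines()[2:]]

_get_resources() then only has to assign each returned dict to
res.info, as in the code further below.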
Just FYI: with the current upstream code the curl API call for GET of
the list of all resources was timing out for 2500+ resources; with this
improvement it completes in approximately 1.3 sec.
One way is to override the _get_resources(..) method in the
corresponding control class.
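As a rough sketch of that override (the class name StorageDevices and
the 'device' key are illustrative only; I keep the same call convention
as in the code further below):

class StorageDevices(Collection):
    def _get_resources(self, flag_filter):
        # Build each resource's info directly from the dicts returned
        # by the model's get_list(), skipping the per-resource lookup().
        get_list = getattr(self.model, model_fn(self, 'get_list'))
        res_list = []
        for info in get_list(self.model_args, *flag_filter):
            args = self.resource_args + [info['device']]
            res = self.resource(self.model, *args)
            res.info = info
            res_list.append(res)
        return res_list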
Another way I can think of is to incorporate this option by adding a
boolean parameter to base.py's _get_resources(..) and, depending on
that parameter, either call lookup() for each resource or do what is
suggested above. But as this is called from get(), we need to figure
out how to pass this parameter from the control/model of the
corresponding resource class.
The code changes I am suggesting would look similar to the below:
def _get_resources(self, flag_filter, bulk_list):
    """
    When bulk_list is True, the model's get_list() is supposed to
    return a list of dicts (one dict of resource info per resource)
    instead of a list of identifiers.
    """
    try:
        get_list = getattr(self.model, model_fn(self, 'get_list'))
        idents = get_list(self.model_args, *flag_filter)
        res_list = []
        for ident in idents:
            # internal text, get_list changes ident to unicode for sorted
            args = self.resource_args + [ident]
            res = self.resource(self.model, *args)
            if bulk_list:
                # get_list() already returned the full info dict, so
                # assign it directly and skip the per-resource lookup().
                res.info = ident
            else:
                res.lookup()
            res_list.append(res)
        return res_list
    except AttributeError:
        return []
Please provide your inputs.
Thanks,
Archana Singh
On 10/13/2015 6:43 PM, Aline Manera wrote:
Hi Archana,
All the discussions are done through the ML, so it is better if you
send your suggestion directly here instead of on GitHub.
It is easier for everyone to get involved in the discussion.
Regards,
Aline Manera
On 12/10/2015 14:46, Archana Singh wrote:
> Hello Team,
>
> Raised #739, where we see a timeout in the curl GET of the list of
> resources when 2500+ resources are present.
> I have suggested two ways in the issue comments. Please provide your
> inputs.
>
> Thanks,
> Archana Singh
>
> _______________________________________________
> Kimchi-devel mailing list
> Kimchi-devel(a)ovirt.org
> http://lists.ovirt.org/mailman/listinfo/kimchi-devel
>