ovirt sdk and pipelining

I just read the blog entry about the performance increase for the Python SDK (https://www.ovirt.org/blog/2017/05/higher-performance-for-python-sdk/). I'm quite sceptical about pipelining. A few explanations about that can be found at: https://devcentral.f5.com/articles/http-pipelining-a-security-risk-without-r... https://stackoverflow.com/questions/14810890/what-are-the-disadvantages-of-u... It also talks about multiple connections, but doesn't use pycurl.CurlShare(). I think this might be very helpful, as it allows sharing cookies, see https://curl.haxx.se/libcurl/c/CURLOPT_SHARE.html.

On 06/16/2017 09:52 AM, Fabrice Bacchella wrote:
I just read the blog entry about the performance increase for the Python SDK (https://www.ovirt.org/blog/2017/05/higher-performance-for-python-sdk/).
I'm quite sceptical about pipelining.
A few explanations about that can be found at: https://devcentral.f5.com/articles/http-pipelining-a-security-risk-without-r... https://stackoverflow.com/questions/14810890/what-are-the-disadvantages-of-u...
Did you test it? Can you share the results? In our tests pipelining dramatically increases performance in large scale environments with high latency. In our tests with 4000 virtual machines, 10000 disks, and 150 ms of latency, retrieving the complete inventory is reduced from approximately 1 hour to approximately 2 minutes. Note that the usage of the HTTP protocol in this scenario is very different from the typical usage when a browser retrieves a web page.
It also talks about multiple connections, but doesn't use pycurl.CurlShare(). I think this might be very helpful, as it allows sharing cookies, see https://curl.haxx.se/libcurl/c/CURLOPT_SHARE.html.
The SDK uses the curl "multi" mechanism, which automatically shares the DNS cache. In addition version 4 of the SDK does not use cookies. So this shouldn't be relevant.
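[Editorial aside, not part of the original thread: the mechanism being discussed can be sketched directly with pycurl's "multi" interface. This is a minimal illustration, not the SDK's actual code; the URL and request count are placeholders. Note also that libcurl 7.62 and later removed HTTP/1.1 pipelining support, so on modern versions this option is a no-op.]

```python
import pycurl
from io import BytesIO

# Create the "multi" handle and ask libcurl to pipeline requests
# over a shared connection instead of opening one per request.
multi = pycurl.CurlMulti()
multi.setopt(pycurl.M_PIPELINING, 1)  # enable HTTP/1.1 pipelining

buffers = []
for _ in range(3):  # placeholder: three concurrent requests
    buf = BytesIO()
    easy = pycurl.Curl()
    easy.setopt(pycurl.URL, 'https://engine.example/ovirt-engine/api')  # placeholder URL
    easy.setopt(pycurl.WRITEDATA, buf)
    buffers.append((easy, buf))
    multi.add_handle(easy)

# Drive all transfers; libcurl decides how to schedule them on the
# connection(s). Failed transfers complete too, so this terminates.
while True:
    ret, active = multi.perform()
    if ret != pycurl.E_CALL_MULTI_PERFORM and active == 0:
        break
    multi.select(1.0)
```

With pipelining enabled the multi handle sends several requests before the first response arrives, which is why it helps so much on high-latency links.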

Le 16 juin 2017 à 10:13, Juan Hernández <jhernand@redhat.com> a écrit :
On 06/16/2017 09:52 AM, Fabrice Bacchella wrote:
I just read the blog entry about the performance increase for the Python SDK (https://www.ovirt.org/blog/2017/05/higher-performance-for-python-sdk/).
I'm quite sceptical about pipelining.
In our tests pipelining dramatically increases performance in large scale environments with high latency. In our tests with 4000 virtual machines, 10000 disks, and 150 ms of latency, retrieving the complete inventory is reduced from approximately 1 hour to approximately 2 minutes.
Benchmarks are the ultimate judge. So if it works in many different use cases, that's nice and interesting.
Note that the usage of the HTTP protocol in this scenario is very different from the typical usage when a browser retrieves a web page.
Indeed, all the literature is about interactive usage. A very different use case.
It also talks about multiple connections, but doesn't use pycurl.CurlShare(). I think this might be very helpful, as it allows sharing cookies, see https://curl.haxx.se/libcurl/c/CURLOPT_SHARE.html.
The SDK uses the curl "multi" mechanism, which automatically shares the DNS cache.
This: https://curl.haxx.se/libcurl/c/CURLOPT_DNS_USE_GLOBAL_CACHE.html ? That page carries this warning: "WARNING: this option is considered obsolete. Stop using it. Switch over to using the share interface instead! See CURLOPT_SHARE and curl_share_init."
In addition version 4 of the SDK does not use cookies. So this shouldn't be relevant.
From some of my own code:

    self._share.setopt(pycurl.SH_SHARE, pycurl.LOCK_DATA_COOKIE)
    self._share.setopt(pycurl.SH_SHARE, pycurl.LOCK_DATA_DNS)
    self._share.setopt(pycurl.SH_SHARE, pycurl.LOCK_DATA_SSL_SESSION)

And users' Apache settings can use cookies for custom usages.
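[Editorial aside, not part of the original thread: the fragment above expands to the following self-contained sketch. The handle count is arbitrary; only the share-object setup comes from the quoted code.]

```python
import pycurl

# One share object holds the caches; every easy handle that sets
# pycurl.SHARE to it will reuse them.
share = pycurl.CurlShare()
share.setopt(pycurl.SH_SHARE, pycurl.LOCK_DATA_COOKIE)       # share the cookie jar
share.setopt(pycurl.SH_SHARE, pycurl.LOCK_DATA_DNS)          # share resolved names
share.setopt(pycurl.SH_SHARE, pycurl.LOCK_DATA_SSL_SESSION)  # reuse TLS sessions

handles = []
for _ in range(2):
    easy = pycurl.Curl()
    easy.setopt(pycurl.SHARE, share)  # attach the shared caches
    handles.append(easy)

# The second handle can now reuse the first handle's DNS lookups,
# cookies, and TLS session tickets instead of redoing them.
```

This is the share interface that the curl documentation recommends over the obsolete global DNS cache.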

El 2017-06-16 08:52, Fabrice Bacchella escribió:
I just read the blog entry about the performance increase for the Python SDK (https://www.ovirt.org/blog/2017/05/higher-performance-for-python-sdk/).
I'm quite sceptical about pipelining.
I disagree. Even without reading the post you mention, we had already noticed that since this version everything works much faster than with prior versions. We have a lot of stuff implemented with the Python SDK, but in one of them the effect is quite noticeable: a script checks VMs' permissions and makes decisions based on them. Without pipelining this script took about 5 minutes to execute; with pipelining it doesn't take more than 15 seconds on a ~1000 VM infrastructure. Regards, Nicolás
A few explanations about that can be found at: https://devcentral.f5.com/articles/http-pipelining-a-security-risk-without-r... https://stackoverflow.com/questions/14810890/what-are-the-disadvantages-of-u...
It also talks about multiple connections, but doesn't use pycurl.CurlShare(). I think this might be very helpful, as it allows sharing cookies, see https://curl.haxx.se/libcurl/c/CURLOPT_SHARE.html.
_______________________________________________
Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
participants (3)
- Fabrice Bacchella
- Juan Hernández
- nicolas@devels.es