<html>

  <head>

    <meta http-equiv="content-type" content="text/html; charset=ISO-8859-1">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    Hi,<br>

    <br>

    I have found that it is somewhat difficult to detect I/O bandwidth
    congestion in oVirt storage domains backed by NFS or GlusterFS.<br>

    <br>

    <font size="3"><font color="#000000">For block-based storage,
        congestion is easier to detect, since you can use a tool such as
        iostat. For file-system-based storage, it is much harder.<br>

        <br>

        I investigated existing solutions. vSphere uses average I/O
        latency to detect congestion. I propose a similar scheme in
        <a class="moz-txt-link-freetext" href="http://www.ovirt.org/Features/Design/SLA_for_storage_io_bandwidth">http://www.ovirt.org/Features/Design/SLA_for_storage_io_bandwidth</a>.
        It simplifies the scheme by making the congestion decision on a
        single host, instead of using statistics from all the hosts
        that use the backend storage. It does not need communication
        between hosts; perhaps in phase two we can add communication and
        make a global decision.<br>

        <br>

        For now, it detects congestion from the statistics of the VMs
        on the local host that use that backend storage (this
        information is collected by running iostat inside each VM). It
        collects the I/O latency of those VMs and computes an average
        latency for that backend storage. If the average is higher than
        a threshold, congestion is detected.<br>
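        <br>
        As a rough illustration of that check, here is a minimal Python
        sketch. The function name, the 30 ms threshold and the sample
        latencies are hypothetical values chosen for the example; they
        are not taken from the design page.<br>
<pre>
```python
from statistics import mean

# Hypothetical threshold; the proposal only says "a threshold".
LATENCY_THRESHOLD_MS = 30.0


def is_congested(vm_latencies_ms, threshold_ms=LATENCY_THRESHOLD_MS):
    """Return True when the mean per-VM latency exceeds the threshold.

    vm_latencies_ms maps a VM name to the I/O latency (ms) reported
    from inside that VM for one backend storage domain.
    """
    if not vm_latencies_ms:
        return False  # no local VM uses this storage, nothing to judge
    return mean(vm_latencies_ms.values()) > threshold_ms


# Illustrative latency values in ms, not real measurements.
samples = {"vm1": 12.0, "vm2": 45.0, "vm3": 51.0}
print(is_congested(samples))
```
</pre>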

        <br>

        However, when I tested this policy, I found that lowering the
        I/O limit makes the latency longer. That means that when the
        average latency exceeds the threshold, our automatic tuning
        decreases the I/O limit, which in turn makes the average I/O
        latency even longer. The latency then exceeds the threshold
        again and causes the I/O limit to be decreased further.
        Eventually this drives the I/O limit down to its lower
        bound.</font></font><br>
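    <br>
    The spiral can be reproduced with a toy simulation. Everything in
    this Python sketch is an assumption made for illustration: the
    latency model (a fixed number of queued requests drained at the
    limited rate), the 20% decrease step and all numeric values. It is
    not the actual tuning code, only a way to see why the loop cannot
    recover once the threshold is crossed.<br>
<pre>
```python
# Assumed workload: requests waiting inside the VM (hypothetical value).
QUEUED_REQUESTS = 32


def toy_latency_ms(limit_iops):
    """Toy model: time to drain the queue at the throttled rate.

    A lower limit gives a higher latency, which matches the behaviour
    observed in testing, but the exact formula is an assumption.
    """
    return QUEUED_REQUESTS * 1000.0 / limit_iops


def simulate(limit_iops, threshold_ms, lower_bound_iops, steps=50):
    """Run the tuning loop and return the I/O limit after each step."""
    history = []
    for _ in range(steps):
        latency_ms = toy_latency_ms(limit_iops)
        if latency_ms > threshold_ms:
            # Congestion detected: decrease the limit by 20%,
            # but never go below the configured lower bound.
            limit_iops = max(lower_bound_iops, limit_iops * 0.8)
        history.append(limit_iops)
    return history


# Starting just above the threshold, the limit collapses to the bound.
history = simulate(limit_iops=1000, threshold_ms=30.0, lower_bound_iops=100)
print(history[:3], history[-1])
```
</pre>
    In this model a starting limit of 1000 gives a latency just above
    the threshold, so every step lowers the limit and raises the
    latency until the lower bound is reached. A starting limit of 2000
    stays below the threshold and is never changed.<br>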

    <br>

    This scheme is affected by the following factors:<br>
    1. We collect statistics from the VMs instead of from the host.
    (This is because it is hard to collect such information for remote
    storage such as NFS or GlusterFS.)<br>
    2. The I/O limit affects the latency.<br>
    3. The threshold is a constant.<br>
    4. I also found that iostat's await (latency information) is not a
    good enough signal, since the latency is long both for very light
    I/O and for very heavy I/O.<br>

    <font size="3"><font color="#000000"><br>

        <br>

        Does anybody have an idea about this, or experience with it?
        Suggestions are more than welcome. Thanks in advance.</font></font>

  </body>

</html>