Re: [Engine-devel] [vdsm] Proposal VDSM <=> Engine Data Statistics Retrieval Optimization

30 May 2013

Hi,

I finally have an update to the proposal ready. After some thinking and
throwing some ideas around with colleagues, we came to the conclusion
that the best approach might be as follows (without having dynamic
return values).

I have set up a test engine with 'FakeVDSM' and I am tracking the
data sent. The system has 250 hosts and about 10000 VMs running. I
captured 1.3 GiB of TCP data within about 20 minutes, solely from the
communication between the engine and the fake VDSM host. In a frame of
23 minutes I captured 13560 getAllVmStats and getAllVdsStats
calls, and 67827 calls to list.

The getAllVmStats replies produced about 510 MiB of data transmitted
to the engine in that time. The proposal below would reduce the overall
amount of data significantly, and we are preparing to work on a
prototype for the engine backend as well, so that we can back up this
statement with hard numbers.

Here's the proposal.

    VDSM <=> Engine data retrieval optimization

      Motivation:

Currently the oVirt Engine polls a lot of data from VDSM every 15
seconds. This should be optimized, and the data requested should be
more specific.

For each VM, the data currently contains much more information than is
actually needed, which inflates the size of the XML content
considerably. We could optimize this by splitting the getVmStats reply
into different requests based on the needs of the engine. For this
reason Omer Frenkel and I have split the data into parts based on
their usage.

      Changes

        New Verbs

          getAllRuntimeStats

Get runtime information of all VMs
Returns for each VM a map with UUID and a value of:

  * *@cpuSys* Ratio of CPU time spent by qemu on other than guest time
  * *@cpuUser* Ratio of CPU time spent by the guest VM
  * *@memUsage* The percent of memory in use by the guest
  * *@elapsedTime* The number of seconds that the VM has been running
  * *@status* The current status of the given VM
  * *@statsAge* The age of these statistics in seconds
  * *@hashes* Hashes of several statistics and information around VMs

The hashes consist of:

  * *@info* Hash for VmConfInfo data
  * *@config* Hash of the VM configuration XML
  * *@status* Hash of the VmStatusInfo data
  * *@guestDetails* Hash of the VmGuestDetails data
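
A minimal sketch of how such hashes could be computed on the VDSM side.
The proposal does not specify a hash algorithm or serialization, so
SHA-1 over key-sorted JSON, the helper names section_hash and
build_hashes, and the sample field values are all assumptions made for
illustration only:

```python
import hashlib
import json


def section_hash(section):
    """Return a stable digest for one group of VM fields.

    Keys are sorted so that two dicts with equal content always
    produce the same digest, regardless of insertion order.
    """
    canonical = json.dumps(section, sort_keys=True, separators=(",", ":"))
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()


def build_hashes(conf_info, config_xml, status_info, guest_details):
    """Assemble the @hashes map returned alongside the runtime stats."""
    return {
        "info": section_hash(conf_info),
        "config": hashlib.sha1(config_xml.encode("utf-8")).hexdigest(),
        "status": section_hash(status_info),
        "guestDetails": section_hash(guest_details),
    }


# Hypothetical sample data, for illustration only.
conf = {"displayPort": 5900, "vmType": "kvm"}
hashes = build_hashes(conf, "<domain/>", {"session": "Unknown"}, {"appsList": []})
```

Since the engine only compares the values for equality, any key that
changes whenever the underlying data changes would serve equally well.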

          getStatus

Get status information about a list of VMs
Parameters:

  * *@vmIDs* a list of UUIDs for VMs to query

Returns for each VM in vmIDs a map with UUID and a value of:

  * *timeOffset* The time difference from host to the VM in seconds
  * *monitorResponse* Indicates if the qemu monitor is responsive
  * *clientIp* The IP address of the client connected to the display
  * *username* the username associated with the current session
  * *session* The current state of user interaction with the VM
  * *guestIPs* A space separated string of assigned IPv4 addresses
  * *pauseCode* Indicates the reason a VM has been paused

          getConfInfo

Get configuration information about a list of VMs
Parameters:

  * *@vmIDs* a list of UUIDs for VMs to query

Returns for each VM in vmIDs a map with UUID and a value of:

  * *acpiEnable* Indicates if ACPI is enabled inside the VM
  * *displayPort* The port in use for unencrypted display data
  * *displaySecurePort* The port in use for encrypted display data
  * *displayType* The type of display in use
  * *displayIp* The IP address to use for accessing the VM display
  * *pid* The process ID of the underlying qemu process
  * *vmType* The type of VM
  * *kvmEnable* Indicates if KVM hardware acceleration is enabled
  * *cdrom* /*optional*/ The path to an ISO image used in the VM's
    CD-ROM device
  * *boot* /*optional*/ An alias for the type of device used to boot the VM

          getAllDeviceStats

VM device statistics containing information for getting statistics and 
SLA information
Returns for each VM a map with UUID and a value of:

  * *memoryStats* Memory statistics as reported by the guest agent
  * *balloonInfo* Guest memory balloon information
  * *disksUsage* Info about mounted filesystems as reported by the agent
  * *network* Network bandwidth/utilization statistics
  * *disks* Disk bandwidth/utilization statistics

          getGuestDetails

Get details from the guest OS from a list of VMs
Parameters:

  * *@vmIDs* a list of UUIDs for VMs to query

Returns for each VM in vmIDs a map with UUID and a value of:

  * *appsList* A list of installed applications with their versions
  * *netIfaces* Network device address info as reported by the agent

        Usage

Currently the engine requests the VM list from each vdsm host every 3
seconds, and every 15 seconds all the data mentioned above for all VMs.

The change would be as follows:

The engine requests getAllRuntimeStats from vdsm every 3 seconds, which
gives the engine the most frequently used data. If the hashes returned
by getAllRuntimeStats do not match the ones the engine already has, it
should request the data that changed:

  * if hashes.info changed => request getConfInfo with all vmIDs on
    that host where the hash changed
  * if hashes.status changed => request getStatus with all vmIDs on
    that host where the hash changed
  * if hashes.guestDetails changed => request getGuestDetails with all
    vmIDs on that host where the hash changed
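
The hash comparison described here could look roughly like the
following on the engine side. This is only a sketch in Python (the
engine itself is not written in Python); the function name
plan_requests, the cache layout and the sample hash values are
hypothetical, and hashes.config is left out because no verb is named
for it in this proposal:

```python
# Maps each hash key to the verb that refreshes the corresponding data.
# "config" is omitted because the proposal names no verb for it.
VERB_FOR_HASH = {
    "info": "getConfInfo",
    "status": "getStatus",
    "guestDetails": "getGuestDetails",
}


def plan_requests(cached_hashes, runtime_stats):
    """Return {verb: [vmIDs]} for every VM whose hash differs from the cache.

    cached_hashes: {vmID: {hashKey: hash}} as last seen by the engine.
    runtime_stats: {vmID: {..., "hashes": {hashKey: hash}}} as returned
                   by getAllRuntimeStats for one host.
    """
    plan = {}
    for vm_id, stats in runtime_stats.items():
        known = cached_hashes.get(vm_id, {})
        for key, verb in VERB_FOR_HASH.items():
            if stats["hashes"].get(key) != known.get(key):
                plan.setdefault(verb, []).append(vm_id)
    return plan


# Hypothetical sample: vm-1 changed its status, vm-2 is new to the engine.
cached = {"vm-1": {"info": "a1", "status": "b1", "guestDetails": "c1"}}
reply = {
    "vm-1": {"hashes": {"info": "a1", "status": "b2", "guestDetails": "c1"}},
    "vm-2": {"hashes": {"info": "d1", "status": "e1", "guestDetails": "f1"}},
}
plan = plan_requests(cached, reply)
```

A VM the engine has never seen has no cached hashes, so every one of
its hashes counts as changed and all three verbs are requested for it.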

Request getAllDeviceStats periodically, e.g. every 5 minutes, which
should be sufficient for the DWH. If it is not, the interval could even
be made configurable.
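
One way the two polling intervals could be kept configurable, as a
rough sketch. The helper due_verbs and the INTERVALS table are
hypothetical; only the 3 second and 5 minute values are taken from the
text above:

```python
def due_verbs(now, last_run, intervals):
    """Return the verbs whose polling interval has elapsed, sorted by name."""
    return sorted(
        verb for verb, interval in intervals.items()
        if now - last_run.get(verb, float("-inf")) >= interval
    )


# Intervals in seconds. Keeping them in one table is what would make
# the getAllDeviceStats period adjustable without code changes.
INTERVALS = {"getAllRuntimeStats": 3, "getAllDeviceStats": 300}

# Both verbs last ran at t=0 in this hypothetical example.
last_run = {"getAllRuntimeStats": 0, "getAllDeviceStats": 0}
print(due_verbs(3, last_run, INTERVALS))    # ['getAllRuntimeStats']
print(due_verbs(300, last_run, INTERVALS))  # ['getAllDeviceStats', 'getAllRuntimeStats']
```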

On 03/17/2013 04:30 PM, Dan Kenigsberg wrote:
> On Sun, Mar 17, 2013 at 10:28:15AM -0400, Ayal Baron wrote:
>> ----- Original Message -----
>>> On 17/03/13 15:13, Ayal Baron wrote:
>>>> ----- Original Message -----
>>>>> On 03/13/2013 11:55 PM, Ayal Baron wrote:
>>>>> ...
>>>>>>>>> The only reason we have this problem is because there is this
>>>>>>>>> thing against making multiple calls.
>>>>>>>>>
>>>>>>>>> Just split it up.
>>>>>>>>> getVmRuntimeStats() - transient things like mem and cpu%
>>>>>>>>> getVmInformation() - (semi)static things like disk\networking
>>>>>>>>> layout etc.
>>>>>>>>> Each updated at different intervals.
>>>>>>>> +1 on splitting the data up into 2 separate API calls.
>>>>>>>> You could potentially add a checksum (md5, or any other way)
>>>>>>>> of the "static" data to getVmRuntimeStats and not bother even
>>>>>>>> with polling the VmInformation if this hasn't changed.  Then
>>>>>>>> you could poll as often as you'd like the stats and
>>>>>>>> immediately see if you also need to retrieve VmInfo or not
>>>>>>>> (you rarely would).
>>>>>>> +1 To Ayal's suggestion
>>>>>>> except that instead of the engine hashing the data VDSM sends
>>>>>>> the key which is opaque to the engine.
>>>>>>> This can be a local timestamp or a generation number.
>>>>>> Of course vdsm does the hash, otherwise you'd need to pass all
>>>>>> the data to engine which would beat the purpose.
>>>>> I thought you meant engine will be sending the hash of previous
>>>>> requests per VM to vdsm, then vdsm will reply back with vm's
>>>>> removed, vm's added, and the details for vm's that changed (i.e.,
>>>>> engine would be doing something like if-modified-since-checksum
>>>>> per vm).
>>>>> benefit is reducing a round trip.
>>>>> but first would need to split to calls of stats (always changing)
>>>>> and slowly/never changing data.
>>>> If vdsm accepts the hash then in your method engine would have to
>>>> periodically call getVmInfo(hash).
>>>> What I was suggesting is that getVmStats would return vmInfo hash
>>>> so that we could avoid calling getVmInfo altogether.
>>>> The stats *always* change so there is no need for checking if that
>>>> info has changed.
>>>> What we could do is avoid the split into 2 verbs by calling
>>>> getVmStats(hash) and then have getVmStats return everything if the
>>>> hash has changed or only the stats if it hasn't.  This would be
>>>> the least number of roundtrips and avoid the split.  If you don't
>>>> pass a hash it would return everything so this way it's also fully
>>>> backward compatible.
>>>
>>> For the 'static' data, why is there a need for a hash?
>>> If VDSM sends in each update a timestamp, can't RHEVM just use
>>> if-modified-since with the last timestamp it got from VDSM?
>>> Is it cheaper for VDSM to calculate the hash, than update the
>>> timestamp per change in any of the fields? It doesn't really need
>>> to update the timestamp per change, only for the first change since
>>> last update sent actually (so 'dirty' flag in a way, to signify
>>> data that RHEVM hasn't seen yet).
>>> Y.
>>
>> As Saggi mentioned: "VDSM sends the key which is opaque to the
>> engine. This can be a local timestamp or a generation number."
>>
>> The content doesn't matter, what matters is that it has changed.
>> timestamp assumes that vdsm will track changes and send only delta.
>> Although possible this would be an overkill (for every value in the
>> dict you'd have to hold a timestamp of last change and send only
>> those which have changed since the timestamp which was passed by the
>> user).
>
> If we're in the spirit of quoting Saggi, this suggestion is not
> compatible with "...mak[ing] the return value differ according to input
> ... is a big no no when talking about type safe APIs.".
>
> Dan.
-- 
Regards,

Vinzenz Feenstra | Senior Software Engineer
RedHat Engineering Virtualization R & D
Phone: +420 532 294 625
IRC: vfeenstr or evilissimo

Better technology. Faster innovation. Powered by community collaboration.
See how it works at redhat.com
