[Engine-devel] [vdsm] Proposal VDSM <=> Engine Data Statistics Retrieval Optimization

Ayal Baron abaron at redhat.com
Sun Mar 17 14:28:15 UTC 2013



----- Original Message -----
> On 17/03/13 15:13, Ayal Baron wrote:
> >
> > ----- Original Message -----
> >> On 03/13/2013 11:55 PM, Ayal Baron wrote:
> >> ...
> >>>>>> The only reason we have this problem is because there is this
> >>>>>> thing against making multiple calls.
> >>>>>>
> >>>>>> Just split it up.
> >>>>>> getVmRuntimeStats() - transient things like mem and cpu%
> >>>>>> getVmInformation() - (semi)static things like disk\networking
> >>>>>> layout
> >>>>>> etc.
> >>>>>> Each updated at different intervals.
> >>>>> +1 on splitting the data up into 2 separate API calls.
> >>>>> You could potentially add a checksum (md5, or any other way) of
> >>>>> the
> >>>>> "static" data to getVmRuntimeStats and not bother even with
> >>>>> polling
> >>>>> the VmInformation if this hasn't changed.  Then you could poll
> >>>>> as
> >>>>> often as you'd like the stats and immediately see if you also
> >>>>> need
> >>>>> to retrieve VmInfo or not (you rarely would).
> >>>> +1 To Ayal's suggestion
> >>>> except that instead of the engine hashing the data VDSM sends
> >>>> the
> >>>> key which is opaque to the engine.
> >>>> This can be a local timestamp or a generation number.
> >>> Of course vdsm does the hash, otherwise you'd need to pass all
> >>> the
> >>> data to engine which would beat the purpose.
> >> I thought you meant engine will be sending the hash of previous
> >> requests
> >> per VM to vdsm, then vdsm will reply back with vm's removed, vm's
> >> added,
> >> and the details for vm's that changed (i.e., engine would be doing
> >> something like if-modified-since-checksum per vm).
> >> benefit is reducing a round trip.
> >> but first would need to split to calls of stats (always changing)
> >> and
> >> slowly/never changing data.
> > If vdsm accepts the hash then in your method engine would have to
> > periodically call getVmInfo(hash).
> > What I was suggesting is that getVmStats would return vmInfo hash
> > so that we could avoid calling getVmInfo altogether.
> > The stats *always* change so there is no need for checking if that
> > info has changed.
> > What we could do is avoid the split into 2 verbs by calling
> > getVmStats(hash) and then have getVmStats return everything if the
> > hash has changed or only the stats if it hasn't.  This would be
> > the least number of roundtrips and avoid the split.  If you don't
> > pass a hash it would return everything so this way it's also fully
> > backward compatible.
> 
> For the 'static' data, why is there a need for a hash?
> If VDSM sends in each update a timestamp, can't RHEVM just use
> if-modified-since with the last timestamp it got from VDSM?
> Is it cheaper for VDSM to calculate the hash, than update the
> timestamp
> per change in any of the fields? It doesn't really need to update the
> timestamp per change, only for the first change since last update
> sent
> actually (so 'dirty' flag in a way, to signify data that RHEVM hasn't
> seen yet).
> Y.

As Saggi mentioned: "VDSM sends the key which is opaque to the engine. This can be a local timestamp or a generation number."

The content of the key doesn't matter; what matters is whether it has changed.
Using a timestamp with if-modified-since assumes that vdsm will track changes and send only the delta.  Although possible, this would be overkill: for every value in the dict you'd have to hold a timestamp of its last change and send only those values which have changed since the timestamp passed by the user.
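
To make the difference concrete, here is a minimal Python sketch of the single opaque key approach, as opposed to one timestamp per value. It is only an illustration: the class name VmInfoStore and its methods are made up for this example and are not existing vdsm code. One counter covers the whole dict of rarely changing data and is bumped only when a value really changes, so the caller only ever compares the key for equality.

```python
class VmInfoStore:
    """Rarely changing VM data plus one opaque generation number.

    Hypothetical helper for illustration only, not part of vdsm.
    """

    def __init__(self, info):
        self._info = dict(info)
        self.generation = 0  # opaque to the caller; only equality matters

    def update(self, key, value):
        # Bump the generation only when a value is new or really changes.
        if key not in self._info or self._info[key] != value:
            self._info[key] = value
            self.generation += 1

    def snapshot(self):
        # Return a copy so callers cannot modify the stored data.
        return dict(self._info)


store = VmInfoStore({"displayPort": 5902, "guestOs": "Win 8"})
before = store.generation
store.update("displayPort", 5902)   # same value, generation is unchanged
unchanged = store.generation
store.update("displayPort", 5910)   # real change, generation is bumped
after = store.generation
```

No per-value bookkeeping is needed here; the price is that a changed key tells the caller to fetch the whole dict again rather than a delta.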

Either way, I don't care what the 'hash' is.  The point is that there is a simple way to keep a single API call, keep backward compatibility, and toggle between returning all data or just the statistics (data that changes frequently) since the last time the user checked, while minimizing API calls.
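
A rough Python sketch of the single-call idea, under stated assumptions: the function names info_hash and get_vm_stats, the known_hash parameter and the infoHash reply field are all hypothetical and only serve the illustration. The reply always carries the stats and the current key; the rarely changing data is added only when the caller passed no key (an old client) or a stale one.

```python
import hashlib
import json


def info_hash(info):
    """Checksum of the rarely changing data; the caller treats it as opaque."""
    blob = json.dumps(info, sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


def get_vm_stats(stats, info, known_hash=None):
    """Return only the stats if known_hash is current, else stats plus info."""
    current = info_hash(info)
    reply = dict(stats)
    reply["infoHash"] = current
    if known_hash != current:
        # No hash passed (old client) or stale hash: send everything.
        reply.update(info)
    return reply


stats = {"cpuSys": 2.32, "cpuUser": 1.34, "memUsage": 30}
info = {"displayPort": 5902, "guestOs": "Win 8"}

full = get_vm_stats(stats, info)                      # first call, no hash
short = get_vm_stats(stats, info, full["infoHash"])   # hash still current

info["displayPort"] = 5910                            # rarely changing data changed
refreshed = get_vm_stats(stats, info, full["infoHash"])
```

Any other opaque key (a generation number, a timestamp of the last change) would work the same way, since the caller only hands back what it received.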

> 
> >
> >>>> But, we might want to consider that when we add events polling
> >>>> becomes (much) less frequent so maybe it'll be an overkill.
> >>> You'd still need to compare versions of the data in vdsm and send
> >>> only if it changed.  If you don't persist what was received last
> >>> then potentially you could have a Monday morning effect where,
> >>> upon system startup, you'd be sending everything.  So I still think
> >>> you'd want to have the hash.
> >>>
> >>>
> >>>>>> ----- Original Message -----
> >>>>>>> From: "Vinzenz Feenstra" <vfeenstr at redhat.com>
> >>>>>>> To: vdsm-devel at lists.fedorahosted.org, engine-devel at ovirt.org
> >>>>>>> Sent: Thursday, March 7, 2013 6:25:54 AM
> >>>>>>> Subject: [Engine-devel] Proposal VDSM <=> Engine Data
> >>>>>>> Statistics
> >>>>>>> Retrieval	Optimization
> >>>>>>>
> >>>>>>>
> >>>>>>> Please find the prettier version on the wiki:
> >>>>>>> http://www.ovirt.org/Proposal_VDSM_-_Engine_Data_Statistics_Retrieval
> >>>>>>>
> >>>>>>> Proposal VDSM - Engine Data Statistics Retrieval
> >>>>>>> VDSM <=> Engine data retrieval optimization
> >>>>>>> Motivation:
> >>>>>>>
> >>>>>>>
> >>>>>>> Currently the RHEVM engine is polling a lot of data from VDSM
> >>>>>>> every 15 seconds. This should be optimized and the amount of
> >>>>>>> data requested should be more specific.
> >>>>>>>
> >>>>>>> For each VM the data currently contains much more information
> >>>>>>> than actually needed, which considerably inflates the size of
> >>>>>>> the XML content. We could optimize this by splitting the
> >>>>>>> getVmStats reply into sections, based on what the engine
> >>>>>>> requests. For this reason Omer Frenkel and I have split up the
> >>>>>>> data into parts based on their usage.
> >>>>>>>
> >>>>>>> This data can and usually does change during the lifetime of
> >>>>>>> the
> >>>>>>> VM.
> >>>>>>> Rarely Changed:
> >>>>>>>
> >>>>>>>
> >>>>>>> This data does not change very frequently, and it should be
> >>>>>>> enough to update it only once in a while. Most commonly this
> >>>>>>> data changes after changes made in the UI or after a migration
> >>>>>>> of the VM to another host.
> >>>>>>>
> >>>>>>> Status = Running
> >>>>>>> acpiEnable = true
> >>>>>>> vmType = kvm
> >>>>>>> guestName = W864GUESTAGENTT
> >>>>>>> displayType = qxl
> >>>>>>> guestOs = Win 8
> >>>>>>> kvmEnable = true # this should be constant and never changed
> >>>>>>> pauseCode = NOERR
> >>>>>>> monitorResponse = 0
> >>>>>>> session = Locked # unused
> >>>>>>> netIfaces = [{'name': 'Realtek RTL8139C+ Fast Ethernet NIC',
> >>>>>>>   'inet6': ['fe80::490c:92bb:bbcc:9f87'],
> >>>>>>>   'inet': ['10.34.60.148'], 'hw': '00:1a:4a:22:3c:db'}]
> >>>>>>> appsList = ['RHEV-Tools 3.2.4', 'RHEV-Agent64 3.2.3',
> >>>>>>>   'RHEV-Serial64 3.2.3', 'RHEV-Network64 3.2.2',
> >>>>>>>   'RHEV-Network64 3.2.3', 'RHEV-Block64 3.2.3',
> >>>>>>>   'RHEV-Balloon64 3.2.3', 'RHEV-Balloon64 3.2.2',
> >>>>>>>   'RHEV-Agent64 3.2.2', 'RHEV-USB 3.2.3', 'RHEV-Block64 3.2.2',
> >>>>>>>   'RHEV-Serial64 3.2.2']
> >>>>>>> pid = 11314
> >>>>>>> guestIPs = 10.34.60.148 # duplicated info
> >>>>>>> displayIp = 0
> >>>>>>> displayPort = 5902
> >>>>>>> displaySecurePort = 5903
> >>>>>>> username = user at W864GUESTAGENTT
> >>>>>>> clientIp =
> >>>>>>> lastLogin = 1361976900.67
> >>>>>>>
> >>>>>>> Often Changed:
> >>>>>>>
> >>>>>>>
> >>>>>>> This data changes quite often; however, it is not necessary to
> >>>>>>> update it every 15 seconds. As this is cumulative data that
> >>>>>>> reflects the current status, it does not need to be snapshotted
> >>>>>>> every 15 seconds to retrieve statistics. The data can be
> >>>>>>> retrieved in much more generous time slices (e.g. every 5
> >>>>>>> minutes).
> >>>>>>>
> >>>>>>> network = {'vnet1': {'macAddr': '00:1a:4a:22:3c:db',
> >>>>>>>   'rxDropped': '0', 'txDropped': '0', 'rxErrors': '0',
> >>>>>>>   'txRate': '0.0', 'rxRate': '0.0', 'txErrors': '0',
> >>>>>>>   'state': 'unknown', 'speed': '100', 'name': 'vnet1'}}
> >>>>>>> disksUsage = [{'path': 'c:\\', 'total': '64055406592',
> >>>>>>>   'fs': 'NTFS', 'used': '19223846912'}, {'path': 'd:\\',
> >>>>>>>   'total': '3490912256', 'fs': 'UDF', 'used': '3490912256'}]
> >>>>>>> timeOffset = 14422
> >>>>>>> elapsedTime = 68591
> >>>>>>> hash = 2335461227228498964
> >>>>>>> statsAge = 0.09 # unused
> >>>>>>>
> >>>>>>> Often Changed but unused
> >>>>>>>
> >>>>>>>
> >>>>>>> This data does not seem to be used in the engine at all. It is
> >>>>>>> not even used in the data warehouse.
> >>>>>>>
> >>>>>>> memoryStats = {'swap_out': '0', 'majflt': '0',
> >>>>>>>   'mem_free': '1466884', 'swap_in': '0', 'pageflt': '0',
> >>>>>>>   'mem_total': '2096736', 'mem_unused': '1466884'}
> >>>>>>> balloonInfo = {'balloon_max': 2097152, 'balloon_cur': 2097152}
> >>>>>>> disks = {'vda': {'readLatency': '0',
> >>>>>>>   'apparentsize': '64424509440', 'writeLatency': '1754496',
> >>>>>>>   'imageID': '28abb923-7b89-4638-84f8-1700f0b76482',
> >>>>>>>   'flushLatency': '156549', 'readRate': '0.00',
> >>>>>>>   'truesize': '18855059456', 'writeRate': '952.05'},
> >>>>>>>   'hdc': {'readLatency': '0', 'apparentsize': '0',
> >>>>>>>   'writeLatency': '0', 'flushLatency': '0', 'readRate': '0.00',
> >>>>>>>   'truesize': '0', 'writeRate': '0.00'}}
> >>>>>>>
> >>>>>>> Very frequent updates needed by webadmin portal:
> >>>>>>>
> >>>>>>>
> >>>>>>> This data is mostly needed for the webadmin portal and might be
> >>>>>>> required to be updated quite often. An exception here is the
> >>>>>>> statsAge field, which seems to be unused by the Engine. This
> >>>>>>> data could be requested every 15 seconds to keep things as they
> >>>>>>> are now.
> >>>>>>>
> >>>>>>> cpuSys = 2.32
> >>>>>>> cpuUser = 1.34
> >>>>>>> memUsage = 30
> >>>>>>>
> >>>>>>> Proposed Solution for VDSM & Engine:
> >>>>>>>
> >>>>>>>
> >>>>>>> We will introduce new optional parameters to getVmStats,
> >>>>>>> getAllVmStats and list to allow a finer grained specification
> >>>>>>> of
> >>>>>>> data which should be included.
> >>>>>>>
> >>>>>>> Parameter: statsType = <string> (getVmStats, getAllVmStats
> >>>>>>> only)
> >>>>>>> Allowed values:
> >>>>>>>
> >>>>>>>       * full (default to keep backwards compatibility)
> >>>>>>>       * app-list (Just send the application list)
> >>>>>>>       * rare (include everything from rarely changed to very
> >>>>>>>       frequent)
> >>>>>>>       * often (include everything from often changed to very
> >>>>>>>       frequent)
> >>>>>>>       * frequent (only send the very frequently changed
> >>>>>>>       items)
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> Parameter: clientId = <string> The client id is specified by
> >>>>>>> the client; it should be unique, but used consistently across
> >>>>>>> requests.
> >>>>>>>
> >>>>>>> Parameter: diff = <boolean> In combination with the clientId
> >>>>>>> VDSM
> >>>>>>> will send only differences to the previous request from the
> >>>>>>> named
> >>>>>>> clientId. (if diff=true)
> >>>>>>>
> >>>>>>>
> >>>>>>> Additional Change:
> >>>>>>>
> >>>>>>>
> >>>>>>> Besides the introduction of the new parameters for list,
> >>>>>>> getVmStats and getAllVmStats, it might make sense to include a
> >>>>>>> hash for the appList in the rarely changed section of the
> >>>>>>> response. This would allow the client to identify changes, so
> >>>>>>> that the complete appList is sent only when the hash known to
> >>>>>>> the client is outdated rather than every so often.
> >>>>>>>
> >>>>>>> Note: The appList (Application List) reported by the guest
> >>>>>>> agent could be fully implemented on request only, as long as
> >>>>>>> the installed guest agent supports this. As there seems to be a
> >>>>>>> request to have the complete list of installed applications on
> >>>>>>> all guests, this data could be quite extensive. On the other
> >>>>>>> hand this data is only rarely visible and therefore it should
> >>>>>>> be requested only on demand, not all the time.
> >>>>>>>
> >>>>>>> Improvement of the Guest Agent:
> >>>>>>>
> >>>>>>>
> >>>>>>> As part of the proposed solution it is necessary to improve the
> >>>>>>> guest agent as well. For the full application list a caching
> >>>>>>> system should be implemented which is fully reactive and does
> >>>>>>> not, for example, poll the application list all the time. The
> >>>>>>> guest can create a prepared data file containing all data in
> >>>>>>> the JSON format (as used for the communication with VDSM via
> >>>>>>> VIO), so that it just has to read that file from disk and send
> >>>>>>> it directly to VDSM. However, it is quite possible that this
> >>>>>>> list is too big and has to be chunked into pieces (multiple
> >>>>>>> messages, which would then have to be supported by VDSM as
> >>>>>>> well). The solution for this is to make VDSM request this data,
> >>>>>>> so that it retrieves the necessary data on request only.
> >>>>>>>
> >>>>>>> --
> >>>>>>> Regards,
> >>>>>>>
> >>>>>>> Vinzenz Feenstra | Senior Software Engineer
> >>>>>>> RedHat Engineering Virtualization R & D
> >>>>>>> Phone: +420 532 294 625
> >>>>>>> IRC: vfeenstr or evilissimo
> >>>>>>>
> >>>>>>> Better technology. Faster innovation. Powered by community
> >>>>>>> collaboration.
> >>>>>>> See how it works at redhat.com
> >>>>>>> _______________________________________________
> >>>>>>> Engine-devel mailing list
> >>>>>>> Engine-devel at ovirt.org
> >>>>>>> http://lists.ovirt.org/mailman/listinfo/engine-devel
> >>>>>>>
> >>>>>> _______________________________________________
> >>>>>> vdsm-devel mailing list
> >>>>>> vdsm-devel at lists.fedorahosted.org
> >>>>>> https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel
> >>>>>>
> 
> 


