[Engine-devel] [vdsm] Proposal VDSM <=> Engine Data Statistics Retrieval Optimization

Saggi Mizrahi smizrahi at redhat.com
Wed Mar 13 21:50:49 UTC 2013



----- Original Message -----
> From: "Ayal Baron" <abaron at redhat.com>
> To: "Saggi Mizrahi" <smizrahi at redhat.com>
> Cc: engine-devel at ovirt.org, vdsm-devel at lists.fedorahosted.org, "Vinzenz Feenstra" <vfeenstr at redhat.com>
> Sent: Wednesday, March 13, 2013 5:39:24 PM
> Subject: Re: [vdsm] [Engine-devel] Proposal VDSM <=> Engine Data Statistics Retrieval Optimization
> 
> 
> 
> ----- Original Message -----
> > I am completely against this.
> > It makes the return value differ according to the input, which
> > is a big no-no when talking about type-safe APIs.
> > 
> > The only reason we have this problem is because there is this
> > thing against making multiple calls.
> > 
> > Just split it up.
> > getVmRuntimeStats() - transient things like mem and cpu%
> > getVmInformation() - (semi)static things like disk/networking
> > layout, etc.
> > Each updated at different intervals.
> 
> +1 on splitting the data up into 2 separate API calls.
> You could potentially add a checksum (md5, or any other way) of the
> "static" data to getVmRuntimeStats and not bother even with polling
> the VmInformation if this hasn't changed.  Then you could poll as
> often as you'd like the stats and immediately see if you also need
> to retrieve VmInfo or not (you rarely would).
+1 to Ayal's suggestion, except that instead of the engine hashing
the data, VDSM sends a key which is opaque to the engine.
This can be a local timestamp or a generation number.
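
A minimal sketch of the split calls with an opaque key, assuming a
generation counter. The class names and the `infoGeneration` and
`generation` fields are hypothetical and exist only for illustration;
this is not existing VDSM or engine code.

```python
class VmStatsSource:
    """Hypothetical VDSM-side holder for one VM's data."""

    def __init__(self, info):
        self._info = dict(info)    # (semi)static data: disk/network layout
        self._generation = 1       # opaque key, bumped on every info change
        self._runtime = {"cpuUser": 0.0, "cpuSys": 0.0, "memUsage": 0}

    def update_info(self, **changes):
        self._info.update(changes)
        self._generation += 1

    def update_runtime(self, **changes):
        self._runtime.update(changes)

    def getVmRuntimeStats(self):
        # Transient stats plus the key; the reply shape never varies.
        stats = dict(self._runtime)
        stats["infoGeneration"] = self._generation
        return stats

    def getVmInformation(self):
        return {"generation": self._generation, "info": dict(self._info)}


class EnginePoller:
    """Hypothetical engine-side poller; it only compares keys for equality."""

    def __init__(self, source):
        self._source = source
        self._known_generation = None
        self.info = None
        self.info_fetches = 0

    def poll(self):
        stats = self._source.getVmRuntimeStats()
        if stats["infoGeneration"] != self._known_generation:
            # Key changed (or first poll): fetch the static data once.
            reply = self._source.getVmInformation()
            self.info = reply["info"]
            self._known_generation = reply["generation"]
            self.info_fetches += 1
        return stats
```

The engine never interprets the key, so VDSM is free to switch between
a counter and a timestamp without an API change.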

But we might want to consider that once we add events, polling
becomes (much) less frequent, so this might be overkill.

> 
> > 
> > ----- Original Message -----
> > > From: "Vinzenz Feenstra" <vfeenstr at redhat.com>
> > > To: vdsm-devel at lists.fedorahosted.org, engine-devel at ovirt.org
> > > Sent: Thursday, March 7, 2013 6:25:54 AM
> > > Subject: [Engine-devel] Proposal VDSM <=> Engine Data Statistics
> > > > Retrieval Optimization
> > > 
> > > 
> > > Please find the prettier version on the wiki:
> > > http://www.ovirt.org/Proposal_VDSM_-_Engine_Data_Statistics_Retrieval
> > > 
> > > Proposal VDSM - Engine Data Statistics Retrieval
> > > VDSM <=> Engine data retrieval optimization
> > > Motivation:
> > > 
> > > 
> > > > Currently the RHEVM engine polls a lot of data from VDSM every
> > > > 15 seconds. This should be optimized, and the data requested
> > > > should be more specific.
> > > 
> > > > For each VM the data currently contains much more information
> > > > than is actually needed, which inflates the size of the XML
> > > > content considerably. We could optimize this by splitting the
> > > > getVmStats reply into sections, based on what the engine
> > > > requests. For this reason Omer Frenkel and I have split the
> > > > data into parts based on their usage.
> > > 
> > > > This data can and usually does change during the lifetime of the
> > > > VM.
> > > > 
> > > > Rarely Changed:
> > > > 
> > > > 
> > > > This data does not change very frequently, and it should be
> > > > enough to update it only once in a while. Most commonly this
> > > > data changes after changes made in the UI or after a migration
> > > > of the VM to another host.
> > > > 
> > > >     Status = Running
> > > >     acpiEnable = true
> > > >     vmType = kvm
> > > >     guestName = W864GUESTAGENTT
> > > >     displayType = qxl
> > > >     guestOs = Win 8
> > > >     kvmEnable = true  # this should be constant and never changed
> > > >     pauseCode = NOERR
> > > >     monitorResponse = 0
> > > >     session = Locked  # unused
> > > >     netIfaces = [{'name': 'Realtek RTL8139C+ Fast Ethernet NIC',
> > > >                   'inet6': ['fe80::490c:92bb:bbcc:9f87'],
> > > >                   'inet': ['10.34.60.148'],
> > > >                   'hw': '00:1a:4a:22:3c:db'}]
> > > >     appsList = ['RHEV-Tools 3.2.4', 'RHEV-Agent64 3.2.3',
> > > >                 'RHEV-Serial64 3.2.3', 'RHEV-Network64 3.2.2',
> > > >                 'RHEV-Network64 3.2.3', 'RHEV-Block64 3.2.3',
> > > >                 'RHEV-Balloon64 3.2.3', 'RHEV-Balloon64 3.2.2',
> > > >                 'RHEV-Agent64 3.2.2', 'RHEV-USB 3.2.3',
> > > >                 'RHEV-Block64 3.2.2', 'RHEV-Serial64 3.2.2']
> > > >     pid = 11314
> > > >     guestIPs = 10.34.60.148  # duplicated info
> > > >     displayIp = 0
> > > >     displayPort = 5902
> > > >     displaySecurePort = 5903
> > > >     username = user at W864GUESTAGENTT
> > > >     clientIp =
> > > >     lastLogin = 1361976900.67
> > > > 
> > > > Often Changed:
> > > 
> > > 
> > > > This data changes quite often; however, it is not necessary to
> > > > update it every 15 seconds. Since it is cumulative data that
> > > > reflects the current status, it does not need to be snapshotted
> > > > every 15 seconds to derive statistics. It can be retrieved in
> > > > much more generous time slices (e.g. every 5 minutes).
> > > > 
> > > >     network = {'vnet1': {'macAddr': '00:1a:4a:22:3c:db',
> > > >                          'rxDropped': '0', 'txDropped': '0',
> > > >                          'rxErrors': '0', 'txRate': '0.0',
> > > >                          'rxRate': '0.0', 'txErrors': '0',
> > > >                          'state': 'unknown', 'speed': '100',
> > > >                          'name': 'vnet1'}}
> > > >     disksUsage = [{'path': 'c:\\', 'total': '64055406592',
> > > >                    'fs': 'NTFS', 'used': '19223846912'},
> > > >                   {'path': 'd:\\', 'total': '3490912256',
> > > >                    'fs': 'UDF', 'used': '3490912256'}]
> > > >     timeOffset = 14422
> > > >     elapsedTime = 68591
> > > >     hash = 2335461227228498964
> > > >     statsAge = 0.09  # unused
> > > > 
> > > > Often Changed but unused:
> > > 
> > > 
> > > > This data does not seem to be used in the engine at all. It is
> > > > not even used in the data warehouse.
> > > > 
> > > >     memoryStats = {'swap_out': '0', 'majflt': '0',
> > > >                    'mem_free': '1466884', 'swap_in': '0',
> > > >                    'pageflt': '0', 'mem_total': '2096736',
> > > >                    'mem_unused': '1466884'}
> > > >     balloonInfo = {'balloon_max': 2097152, 'balloon_cur': 2097152}
> > > >     disks = {'vda': {'readLatency': '0',
> > > >                      'apparentsize': '64424509440',
> > > >                      'writeLatency': '1754496',
> > > >                      'imageID': '28abb923-7b89-4638-84f8-1700f0b76482',
> > > >                      'flushLatency': '156549', 'readRate': '0.00',
> > > >                      'truesize': '18855059456',
> > > >                      'writeRate': '952.05'},
> > > >              'hdc': {'readLatency': '0', 'apparentsize': '0',
> > > >                      'writeLatency': '0', 'flushLatency': '0',
> > > >                      'readRate': '0.00', 'truesize': '0',
> > > >                      'writeRate': '0.00'}}
> > > > 
> > > > Very frequent updates needed by webadmin portal:
> > > 
> > > 
> > > > This data is mostly needed for the webadmin portal and might
> > > > need to be updated quite often. An exception here is the
> > > > statsAge field, which seems to be unused by the Engine. This
> > > > data could be requested every 15 seconds to keep things as they
> > > > are now.
> > > > 
> > > >     cpuSys = 2.32
> > > >     cpuUser = 1.34
> > > >     memUsage = 30
> > > > 
> > > > Proposed Solution for VDSM & Engine:
> > > 
> > > 
> > > We will introduce new optional parameters to getVmStats,
> > > getAllVmStats and list to allow a finer grained specification of
> > > data which should be included.
> > > 
> > > Parameter: statsType = <string> (getVmStats, getAllVmStats only)
> > > Allowed values:
> > > 
> > >     * full (default to keep backwards compatibility)
> > >     * app-list (Just send the application list)
> > >     * rare (include everything from rarely changed to very
> > >     frequent)
> > >     * often (include everything from often changed to very
> > >     frequent)
> > >     * frequent (only send the very frequently changed items)
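
A minimal sketch of the tiers listed above, assuming a small subset of
the sample keys per tier. The `filter_stats` helper and the exact key
sets are hypothetical; the proposal does not say which tier carries the
"often changed but unused" fields, so they only appear under `full`
here.

```python
# Assumed key subsets, taken from the sample data in the proposal.
RARE = {"guestName", "displayType", "netIfaces", "appsList"}
OFTEN = {"network", "disksUsage", "timeOffset", "elapsedTime"}
FREQUENT = {"cpuSys", "cpuUser", "memUsage"}

KEYS_BY_TYPE = {
    "app-list": {"appsList"},
    "rare": RARE | OFTEN | FREQUENT,   # rarely changed up to very frequent
    "often": OFTEN | FREQUENT,         # often changed up to very frequent
    "frequent": FREQUENT,              # only the very frequent items
}


def filter_stats(all_stats, stats_type="full"):
    """Return the subset of all_stats selected by stats_type."""
    if stats_type == "full":
        # Default: unchanged reply, for backwards compatibility.
        return dict(all_stats)
    if stats_type not in KEYS_BY_TYPE:
        raise ValueError("unknown statsType: %r" % (stats_type,))
    wanted = KEYS_BY_TYPE[stats_type]
    return {k: v for k, v in all_stats.items() if k in wanted}
```

Note that the set of keys in the reply depends on the `statsType`
argument, which is the property the type-safety objection is about.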
> > > 
> > > 
> > > 
> > > > Parameter: clientId = <string> The client id is chosen by the
> > > > client; it should be unique, and the client should keep using
> > > > the same id on every request.
> > > 
> > > Parameter: diff = <boolean> In combination with the clientId VDSM
> > > will send only differences to the previous request from the named
> > > clientId. (if diff=true)
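
A minimal sketch of the `clientId` / `diff` behaviour described above,
assuming VDSM remembers the last reply per client. `DiffTracker` is a
hypothetical name; the sketch does not report keys that disappeared
between two requests, which a real design would have to address.

```python
class DiffTracker:
    """Hypothetical per-client bookkeeping for diff=true requests."""

    def __init__(self):
        self._last_sent = {}   # clientId -> stats as of the last request

    def reply(self, client_id, current, diff=False):
        previous = self._last_sent.get(client_id)
        # Remember the full current state for this client's next request.
        self._last_sent[client_id] = dict(current)
        if not diff or previous is None:
            # First request from this client, or diff not requested.
            return dict(current)
        # Only keys that are new or whose value changed.
        return {k: v for k, v in current.items()
                if k not in previous or previous[k] != v}
```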
> > > 
> > > 
> > > Additional Change:
> > > 
> > > 
> > > > Besides the introduction of the new parameters for list,
> > > > getVmStats and getAllVmStats, it might make sense to include a
> > > > hash of the appList in the rarely changed section of the
> > > > response. This would allow the client to detect changes, so
> > > > that the complete appList need not be sent every so often but
> > > > only when the hash known to the client is outdated.
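
A minimal sketch of such an appList hash, assuming SHA-256 over a
sorted, JSON-encoded list. The `app_list_hash` helper and the choice of
digest are assumptions; the proposal does not name an algorithm.

```python
import hashlib
import json


def app_list_hash(apps):
    """Return a stable hex digest for a list of application names."""
    # Sort first so the digest does not depend on reporting order.
    canonical = json.dumps(sorted(apps), separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def needs_app_list(known_hash, reported_hash):
    """Client side: fetch the full appList only when the hash differs."""
    return known_hash != reported_hash
```

With this, the client would request `statsType=app-list` only when
`needs_app_list` returns true.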
> > > 
> > > > Note: The appList (Application List) reported by the guest agent
> > > > could be provided on request only, as long as the installed
> > > > guest agent supports this. As there seems to be a request to
> > > > have the complete list of installed applications on all guests,
> > > > this data could be quite extensive. On the other hand, this data
> > > > is only rarely visible, and therefore it should not be requested
> > > > all the time but only on demand.
> > > > 
> > > > Improvement of the Guest Agent:
> > > 
> > > 
> > > > As part of the proposed solution it is necessary to improve the
> > > > guest agent as well. For the full application list a caching
> > > > system should be implemented which is fully reactive and does
> > > > not, for example, poll the application list all the time. The
> > > > guest can create a prepared data file containing all data in the
> > > > JSON format (as used for the communication with VDSM via VIO)
> > > > and then just has to read that file from disk and send it
> > > > directly to VDSM. However, it is quite possible that this list
> > > > is too big and has to be chunked into pieces (multiple
> > > > messages, which would then have to be supported by VDSM as
> > > > well). The solution for this is to make VDSM request this data,
> > > > so that it retrieves the necessary data on request only.
> > > > 
> > > > --
> > > Regards,
> > > 
> > > Vinzenz Feenstra | Senior Software Engineer
> > > RedHat Engineering Virtualization R & D
> > > Phone: +420 532 294 625
> > > IRC: vfeenstr or evilissimo
> > > 
> > > Better technology. Faster innovation. Powered by community
> > > collaboration.
> > > See how it works at redhat.com
> > 
> 


