----- Original Message -----
> From: "Ayal Baron" <abaron(a)redhat.com>
> To: "Itamar Heim" <iheim(a)redhat.com>
> Cc: engine-devel(a)ovirt.org, vdsm-devel(a)lists.fedorahosted.org
> Sent: Sunday, March 17, 2013 3:13:09 PM
> Subject: Re: [Engine-devel] [vdsm] Proposal VDSM <=> Engine Data
> Statistics Retrieval Optimization
>
>
>
> ----- Original Message -----
> > On 03/13/2013 11:55 PM, Ayal Baron wrote:
> > ...
> > >>>> The only reason we have this problem is because there is
> > >>>> this
> > >>>> thing against making multiple calls.
> > >>>>
> > >>>> Just split it up.
> > >>>> getVmRuntimeStats() - transient things like mem and cpu%
> > >>>> getVmInformation() - (semi)static things like
> > >>>> disk\networking
> > >>>> layout
> > >>>> etc.
> > >>>> Each updated at different intervals.
> > >>>
> > >>> +1 on splitting the data up into 2 separate API calls.
> > >>> You could potentially add a checksum (md5, or any other way)
> > >>> of
> > >>> the
> > >>> "static" data to getVmRuntimeStats and not bother even with
> > >>> polling
> > >>> the VmInformation if this hasn't changed. Then you could
> > >>> poll
> > >>> as
> > >>> often as you'd like the stats and immediately see if you also
> > >>> need
> > >>> to retrieve VmInfo or not (you rarely would).
> > >> +1 To Ayal's suggestion
> > >> except that instead of the engine hashing the data VDSM sends
> > >> the
> > >> key which is opaque to the engine.
> > >> This can be a local timestamp or a generation number.
> > >
> > > Of course vdsm does the hash, otherwise you'd need to pass all
> > > the
> > > > data to engine which would defeat the purpose.
> >
> > I thought you meant engine will be sending the hash of previous
> > requests
> > per VM to vdsm, then vdsm will reply back with vm's removed, vm's
> > added,
> > and the details for vm's that changed (i.e., engine would be
> > doing
> > something like if-modified-since-checksum per vm).
> > benefit is reducing a round trip.
> > but first would need to split to calls of stats (always changing)
> > and
> > slowly/never changing data.
>
> If vdsm accepts the hash then in your method engine would have to
> periodically call getVmInfo(hash).
> What I was suggesting is that getVmStats would return vmInfo hash
> so
> that we could avoid calling getVmInfo altogether.
> The stats *always* change so there is no need for checking if that
> info has changed.
> What we could do is avoid the split into 2 verbs by calling
> getVmStats(hash) and then have getVmStats return everything if the
> hash has changed or only the stats if it hasn't. This would be the
> least number of roundtrips and avoid the split. If you don't pass
> a
> hash it would return everything so this way it's also fully
> backward
> compatible.
Actually, I assume we can pass hash 0 (to have vdsm return
"everything"). I assume that the chances for md5 on "real data" (i.e.,
real data that is known to engine) to be 0 are very slim.
We'd need to support hash=None to keep backward compatibility, plus there are no
assumptions this way on hash algorithm so why bother with hash=0?
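
For illustration, here is a rough Python sketch of the getVmStats(hash)
idea discussed above: the reply always carries the stats, and carries the
(semi)static info only when the caller's hash is missing or stale. All
names here (get_vm_stats, info_hash, and the 'stats', 'info' and
'infoHash' keys) are made up for the example and are not the actual VDSM
API. SHA-256 is an arbitrary choice, since the token is meant to be
opaque to the engine.

```python
import hashlib
import json


def info_hash(vm_info):
    """Return an opaque token for the (semi)static VM data (hypothetical helper)."""
    payload = json.dumps(vm_info, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def get_vm_stats(vm_info, vm_stats, known_hash=None):
    """Hypothetical VDSM-side handler.

    Always returns the stats and the current info hash. The static info
    is included only if known_hash does not match the current hash, so
    omitting the hash (None) yields the full, backward-compatible reply.
    """
    current = info_hash(vm_info)
    reply = {"stats": dict(vm_stats), "infoHash": current}
    if known_hash != current:
        reply["info"] = dict(vm_info)
    return reply


# Engine-side usage: remember the last hash and pass it on the next poll.
info = {"vmType": "kvm", "displayType": "qxl"}
stats = {"cpuSys": 2.32, "cpuUser": 1.34, "memUsage": 30}
first = get_vm_stats(info, stats)                       # no hash: everything
second = get_vm_stats(info, stats, first["infoHash"])   # unchanged: stats only
```

With this shape, None and 0 both fall through to the full reply, because
neither can equal a hex digest string.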
>
> >
> > >
> > >>
> > > >> But, we might want to consider that when we add events, polling
> > > >> becomes (much) less frequent, so maybe it'll be overkill.
> > >
> > > You'd still need to compare versions of the data in vdsm and
> > > send
> > > only if it changed. If you don't persist what was received
> > > last
> > > > then potentially you could have a Monday morning effect where
> > > > upon
> > > > system startup you'd be sending everything. So I still
> > > think
> > > you'd want to have the hash.
> > >
> > >
> > >>
> > >>>
> > >>>>
> > >>>> ----- Original Message -----
> > > >>>>> From: "Vinzenz Feenstra" <vfeenstr(a)redhat.com>
> > >>>>> To: vdsm-devel(a)lists.fedorahosted.org,
> > >>>>> engine-devel(a)ovirt.org
> > >>>>> Sent: Thursday, March 7, 2013 6:25:54 AM
> > > >>>>> Subject: [Engine-devel] Proposal VDSM <=> Engine Data
> > > >>>>> Statistics Retrieval Optimization
> > >>>>>
> > >>>>>
> > >>>>> Please find the prettier version on the wiki:
> > >>>>>
http://www.ovirt.org/Proposal_VDSM_-_Engine_Data_Statistics_Retrieval
> > >>>>>
> > >>>>> Proposal VDSM - Engine Data Statistics Retrieval
> > >>>>> VDSM <=> Engine data retrieval optimization
> > >>>>> Motivation:
> > >>>>>
> > >>>>>
> > > >>>>> Currently the RHEVM engine is polling a lot of data
> > >>>>> from
> > >>>>> VDSM
> > >>>>> every 15 seconds. This should be optimized and the amount
> > >>>>> of
> > >>>>> data
> > >>>>> requested should be more specific.
> > >>>>>
> > > >>>>> For each VM the data currently contains much more
> > > >>>>> information than actually needed, which inflates the size
> > > >>>>> of the XML content considerably. We could optimize this by
> > > >>>>> splitting the getVmStats reply into sections, based on the
> > > >>>>> engine's request. For this reason Omer Frenkel and I have
> > > >>>>> split up the data into parts based on their usage.
> > >>>>>
> > >>>>> This data can and usually does change during the lifetime
> > >>>>> of
> > >>>>> the
> > >>>>> VM.
> > > >>>>> Rarely Changed:
> > > >>>>>
> > > >>>>>
> > > >>>>> This data does not change very frequently, and it should be
> > > >>>>> enough to update it only once in a while. Most commonly this
> > > >>>>> data changes after changes made in the UI or after a
> > > >>>>> migration of the VM to another Host.
> > > >>>>>
> > > >>>>> Status = Running
> > > >>>>> acpiEnable = true
> > > >>>>> vmType = kvm
> > > >>>>> guestName = W864GUESTAGENTT
> > > >>>>> displayType = qxl
> > > >>>>> guestOs = Win 8
> > > >>>>> kvmEnable = true # this should be constant and never changed
> > > >>>>> pauseCode = NOERR
> > > >>>>> monitorResponse = 0
> > > >>>>> session = Locked # unused
> > > >>>>> netIfaces = [{'name': 'Realtek RTL8139C+ Fast Ethernet NIC',
> > > >>>>>   'inet6': ['fe80::490c:92bb:bbcc:9f87'],
> > > >>>>>   'inet': ['10.34.60.148'], 'hw': '00:1a:4a:22:3c:db'}]
> > > >>>>> appsList = ['RHEV-Tools 3.2.4', 'RHEV-Agent64 3.2.3',
> > > >>>>>   'RHEV-Serial64 3.2.3', 'RHEV-Network64 3.2.2',
> > > >>>>>   'RHEV-Network64 3.2.3', 'RHEV-Block64 3.2.3',
> > > >>>>>   'RHEV-Balloon64 3.2.3', 'RHEV-Balloon64 3.2.2',
> > > >>>>>   'RHEV-Agent64 3.2.2', 'RHEV-USB 3.2.3',
> > > >>>>>   'RHEV-Block64 3.2.2', 'RHEV-Serial64 3.2.2']
> > > >>>>> pid = 11314
> > > >>>>> guestIPs = 10.34.60.148 # duplicated info
> > > >>>>> displayIp = 0
> > > >>>>> displayPort = 5902
> > > >>>>> displaySecurePort = 5903
> > > >>>>> username = user@W864GUESTAGENTT
> > > >>>>> clientIp =
> > > >>>>> lastLogin = 1361976900.67
> > > >>>>>
> > > >>>>> Often Changed:
> > >>>>>
> > >>>>>
> > > >>>>> This data changes quite often; however, it is not necessary
> > > >>>>> to update it every 15 seconds. As this is cumulative data
> > > >>>>> that reflects the current status, it does not need to be
> > > >>>>> snapshotted every 15 seconds to retrieve statistics. The
> > > >>>>> data can be retrieved in much more generous time slices
> > > >>>>> (e.g. every 5 minutes).
> > > >>>>>
> > > >>>>> network = {'vnet1': {'macAddr': '00:1a:4a:22:3c:db',
> > > >>>>>   'rxDropped': '0', 'txDropped': '0', 'rxErrors': '0',
> > > >>>>>   'txRate': '0.0', 'rxRate': '0.0', 'txErrors': '0',
> > > >>>>>   'state': 'unknown', 'speed': '100', 'name': 'vnet1'}}
> > > >>>>> disksUsage = [{'path': 'c:\\', 'total': '64055406592',
> > > >>>>>   'fs': 'NTFS', 'used': '19223846912'},
> > > >>>>>   {'path': 'd:\\', 'total': '3490912256', 'fs': 'UDF',
> > > >>>>>   'used': '3490912256'}]
> > > >>>>> timeOffset = 14422
> > > >>>>> elapsedTime = 68591
> > > >>>>> hash = 2335461227228498964
> > > >>>>> statsAge = 0.09 # unused
> > > >>>>>
> > > >>>>> Often Changed but unused
> > >>>>>
> > >>>>>
> > > >>>>> This data does not seem to be used in the engine at all. It
> > > >>>>> is not even used in the data warehouse.
> > > >>>>>
> > > >>>>> memoryStats = {'swap_out': '0', 'majflt': '0',
> > > >>>>>   'mem_free': '1466884', 'swap_in': '0', 'pageflt': '0',
> > > >>>>>   'mem_total': '2096736', 'mem_unused': '1466884'}
> > > >>>>> balloonInfo = {'balloon_max': 2097152,
> > > >>>>>   'balloon_cur': 2097152}
> > > >>>>> disks = {'vda': {'readLatency': '0',
> > > >>>>>   'apparentsize': '64424509440',
> > > >>>>>   'writeLatency': '1754496',
> > > >>>>>   'imageID': '28abb923-7b89-4638-84f8-1700f0b76482',
> > > >>>>>   'flushLatency': '156549', 'readRate': '0.00',
> > > >>>>>   'truesize': '18855059456', 'writeRate': '952.05'},
> > > >>>>>   'hdc': {'readLatency': '0', 'apparentsize': '0',
> > > >>>>>   'writeLatency': '0', 'flushLatency': '0',
> > > >>>>>   'readRate': '0.00', 'truesize': '0',
> > > >>>>>   'writeRate': '0.00'}}
> > > >>>>>
> > > >>>>> Very frequent updates needed by webadmin portal:
> > >>>>>
> > >>>>>
> > > >>>>> This data is mostly needed for the webadmin portal and
> > > >>>>> might need to be updated quite often. An exception here is
> > > >>>>> the statsAge field, which seems to be unused by the Engine.
> > > >>>>> This data could be requested every 15 seconds to keep
> > > >>>>> things as they are now.
> > > >>>>>
> > > >>>>> cpuSys = 2.32
> > > >>>>> cpuUser = 1.34
> > > >>>>> memUsage = 30
> > > >>>>>
> > > >>>>> Proposed Solution for VDSM & Engine:
> > >>>>>
> > >>>>>
> > >>>>> We will introduce new optional parameters to getVmStats,
> > >>>>> getAllVmStats and list to allow a finer grained
> > >>>>> specification
> > >>>>> of
> > >>>>> data which should be included.
> > >>>>>
> > > >>>>> Parameter: statsType = <string> (getVmStats, getAllVmStats
> > > >>>>> only)
> > > >>>>> Allowed values:
> > > >>>>>
> > > >>>>> * full (default to keep backwards compatibility)
> > > >>>>> * app-list (Just send the application list)
> > > >>>>> * rare (include everything from rarely changed to very
> > > >>>>>   frequent)
> > > >>>>> * often (include everything from often changed to very
> > > >>>>>   frequent)
> > > >>>>> * frequent (only send the very frequently changed items)
> > >>>>>
> > >>>>>
> > >>>>>
> > > >>>>> Parameter: clientId = <string> The client id is specified
> > > >>>>> by the client and should be unique but used consistently
> > > >>>>> across requests.
> > > >>>>>
> > > >>>>> Parameter: diff = <boolean> In combination with the
> > > >>>>> clientId, VDSM will send only the differences from the
> > > >>>>> previous request made by the named clientId (if diff=true).
> > >>>>>
> > >>>>>
> > >>>>> Additional Change:
> > >>>>>
> > >>>>>
> > > >>>>> Besides the introduction of the new parameters for list,
> > > >>>>> getVmStats and getAllVmStats, it might make sense to include
> > > >>>>> a hash for the appList in the rarely changed section of the
> > > >>>>> response. This would make it possible to identify changes
> > > >>>>> and to send the complete appList only if the hash known to
> > > >>>>> the client is outdated, rather than every so often.
> > >>>>>
> > > >>>>> Note: The appList (Application List) reported by the guest
> > > >>>>> agent could be fully implemented on request only, as long
> > > >>>>> as the installed guest agent supports this. As there seems
> > > >>>>> to be a request to have the complete list of installed
> > > >>>>> applications on all guests, this data could be quite
> > > >>>>> extensive. On the other hand, this data is only rarely
> > > >>>>> visible and therefore it should not be requested all the
> > > >>>>> time, only on demand.
> > > >>>>>
> > > >>>>> Improvement of the Guest Agent:
> > >>>>>
> > >>>>>
> > > >>>>> As part of the proposed solution it is necessary to improve
> > > >>>>> the guest agent as well. For the full application list, a
> > > >>>>> caching system should be implemented which is fully
> > > >>>>> reactive and does not, for example, poll the application
> > > >>>>> list all the time. The guest can create a prepared data
> > > >>>>> file containing all data in the JSON format (as used for
> > > >>>>> the communication with VDSM via VIO) and then only has to
> > > >>>>> read that file from disk and send it directly to VDSM.
> > > >>>>> However, it is quite possible that this list is too big and
> > > >>>>> it might have to be chunked into pieces (multiple messages,
> > > >>>>> which would then have to be supported by VDSM as well). The
> > > >>>>> solution for this is to make VDSM request this data,
> > > >>>>> retrieving the necessary data on request only.
> > > >>>>>
> > > >>>>> --
> > >>>>> Regards,
> > >>>>>
> > >>>>> Vinzenz Feenstra | Senior Software Engineer
> > >>>>> RedHat Engineering Virtualization R & D
> > >>>>> Phone: +420 532 294 625
> > >>>>> IRC: vfeenstr or evilissimo
> > >>>>>
> > > >>>>> Better technology. Faster innovation. Powered by community
> > > >>>>> collaboration.
> > > >>>>> See how it works at redhat.com
> > >>>>> _______________________________________________
> > >>>>> Engine-devel mailing list
> > >>>>> Engine-devel(a)ovirt.org
> > >>>>>
http://lists.ovirt.org/mailman/listinfo/engine-devel
> > >>>>>
> > >>>> _______________________________________________
> > >>>> vdsm-devel mailing list
> > >>>> vdsm-devel(a)lists.fedorahosted.org
> > >>>>
https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel
> > >>>>
> > >>>
> > >>
> >
> >