[Users] Fwd: Re: Memory usage

Martin Sivak msivak at redhat.com
Fri Feb 21 14:10:44 UTC 2014


Hi,

I do not see any reference to a bug in this thread; I will try to find it in BZ.

> What do you think it is then?
Integer overflow. 35% of 12G is about 4G.

For some reason we have had a weird cast to int in the code. It has been there for years. See http://gerrit.ovirt.org/#/c/24859/ if you want to see the source change. I guess nobody had tried getting the stats with so much RAM.
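
As a minimal sketch of the overflow (not the actual engine code; the gerrit change above has that), the Python below simulates a cast to a signed 32-bit int. The helper names to_int32 and memory_used_bytes are hypothetical, and the formula (balloon size in KiB times a usage percentage) is an assumption, chosen because it reproduces the memory.used datum reported earlier in this thread.

```python
def to_int32(value):
    """Simulate a cast to a signed 32-bit int (two's complement wraparound)."""
    value &= 0xFFFFFFFF              # keep only the low 32 bits
    if value >= 0x80000000:          # sign bit set: the value reads as negative
        value -= 0x100000000
    return value


def memory_used_bytes(mem_size_kib, usage_percent):
    """Hypothetical formula: used bytes = size in bytes * percent / 100."""
    return mem_size_kib * 1024 * usage_percent // 100


# balloon_cur in the getVmStats output quoted below is 12582912 KiB (12 GiB).
# The 26% usage figure is an assumption picked to match the reported datum.
exact = memory_used_bytes(12582912, 26)
wrapped = to_int32(exact)
print(exact, wrapped)
```

In this sketch the result turns negative once the byte count reaches 2^31. Where exactly the real code overflows depends on the expression in the source change.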

I just tested it on the host simulator with a VM having about 13G of RAM and a randomized load, and the issue did not reproduce.

Best Regards

Martin Sivak

--
Martin Sivák
msivak at redhat.com
Red Hat Czech
RHEV-M SLA / Brno, CZ


----- Original Message -----
> I think someone created one. Don't know for sure... What do you think it is
> then?
> On Feb 21, 2014 11:24 AM, "Martin Sivak" <msivak at redhat.com> wrote:
> 
> > Hi
> >
> > I think I found the issue. I suspect it will happen when you get over 35%
> > of used memory.
> >
> > Can you please create a bug for this issue? Or is there one already?
> >
> > --
> > Martin Sivák
> > msivak at redhat.com
> > Red Hat Czech
> > RHEV-M SLA / Brno, CZ
> >
> > ----- Original Message -----
> > > I'm waiting for the next alarm. In the meanwhile this is the output now:
> > > vdsClient -s saturnus2 getVmStats 3b9aa245-75ff-42e8-b921-1c9ce61826bf
> > >
> > > 3b9aa245-75ff-42e8-b921-1c9ce61826bf
> > >     Status = Running
> > >     guestFQDN = galatea.***.****
> > >     memUsage = 32
> > >     acpiEnable = true
> > >     netIfaces = [{'inet6': [], 'hw': '52:54:00:49:11:9d', 'inet':
> > > ['192.168.99.19'], 'name': 'eth1'}, {'inet6': [], 'hw':
> > > '00:1a:4a:67:c4:b3', 'inet': ['10.110.X.X'], 'name': 'eth2'}, {'inet6': [],
> > > 'hw': '52:54:00:f1:22:48', 'inet': ['192.168.122.1'], 'name': 'virbr0'}]
> > >     pid = 38144
> > >     session = Unknown
> > >     vmType = kvm
> > >     timeOffset = 0
> > >     balloonInfo = {'balloon_max': '12582912', 'balloon_min': '8388608',
> > > 'balloon_target': '12582912', 'balloon_cur': '12582912'}
> > >     pauseCode = NOERR
> > >     disksUsage = [{'path': '/', 'total': '37655093248', 'fs': 'ext4',
> > > 'used': '21765771264'}, {'path': '/boot', 'total': '507744256', 'fs':
> > > 'ext4', 'used': '49356800'}]
> > >     network = {'vnet0': {'macAddr': '00:1a:4a:67:c4:b3', 'rxDropped': '0',
> > > 'txDropped': '0', 'rxErrors': '0', 'txRate': '0.0', 'rxRate': '0.0',
> > > 'txErrors': '0', 'state': 'unknown', 'speed': '1000', 'name': 'vnet0'},
> > > 'vnet1': {'macAddr': '52:54:00:49:11:9d', 'rxDropped': '0', 'txDropped':
> > > '0', 'rxErrors': '0', 'txRate': '0.0', 'rxRate': '0.0', 'txErrors': '0',
> > > 'state': 'unknown', 'speed': '1000', 'name': 'vnet1'}, 'vnet2': {'macAddr':
> > > '52:54:00:66:d3:aa', 'rxDropped': '0', 'txDropped': '0', 'rxErrors': '0',
> > > 'txRate': '0.0', 'rxRate': '0.0', 'txErrors': '0', 'state': 'unknown',
> > > 'speed': '1000', 'name': 'vnet2'}}
> > >     memoryStats = {'swap_out': '0', 'majflt': '0', 'mem_free': '8520656',
> > > 'swap_in': '0', 'pageflt': '396', 'mem_total': '12197520', 'mem_unused':
> > > '1580896'}
> > >     guestName = galatea.brusselsairport.aero
> > >     elapsedTime = 2251149
> > >     displayType = qxl
> > >     cpuSys = 12.76
> > >     appsList = ['ovirt-guest-agent-1.0.8-1.el6', 'kernel-2.6.32-71.el6']
> > >     guestOs = 2.6.32-71.el6.x86_64
> > >     username = vandenpt
> > >     hash = -3779086589437073991
> > >     displayIp = 0
> > >     displayPort = 5900
> > >     guestIPs = 192.168.99.19 10.110.50.84 192.168.122.1
> > >     kvmEnable = true
> > >     disks = {'vda': {'readLatency': '0', 'apparentsize': '45097156608',
> > > 'writeLatency': '1313701', 'imageID':
> > > '528a377f-4f98-4023-92de-ce52c394d4a9', 'flushLatency': '187524',
> > > 'readRate': '0.00', 'truesize': '45097156608', 'writeRate': '8502.90'},
> > > 'hdc': {'readLatency': '0', 'apparentsize': '0', 'writeLatency': '0',
> > > 'flushLatency': '0', 'readRate': '0.00', 'truesize': '0', 'writeRate':
> > > '0.00'}}
> > >     monitorResponse = 0
> > >     statsAge = 0.47
> > >     cpuUser = 68.45
> > >     lastLogin = 1392707812.29
> > >     clientIp =
> > >     displaySecurePort = 5901
> > >
> > >
> > >
> > > 2014-02-17 14:00 GMT+01:00 Martin Sivak <msivak at redhat.com>:
> > >
> > > > Hi Koen,
> > > >
> > > > can you try the mentioned vdsClient command (vdsClient getVmStats
> > > > <affected vm>) as soon as you get the warning about negative memory? The
> > > > one David sent to us does not contain the negative number and we weren't
> > > > able to find the issue based on it.
> > > >
> > > > Thanks
> > > >
> > > > --
> > > > Martin Sivák
> > > > msivak at redhat.com
> > > > Red Hat Czech
> > > > RHEV-M SLA / Brno, CZ
> > > >
> > > > ----- Original Message -----
> > > > > Any updates about the issue? Because just now, the alarm went off again :-)
> > > > > Kind regards,
> > > > >
> > > > > Koen
> > > > >
> > > > >
> > > > > 2014-02-13 16:55 GMT+01:00 Martin Sivak <msivak at redhat.com>:
> > > > >
> > > > > > Thank you.
> > > > > >
> > > > > > I wonder why mem_total is smaller than balloon_cur (and max). I will have
> > > > > > to dig a bit more to see what happened.
> > > > > >
> > > > > > --
> > > > > > Martin Sivák
> > > > > > msivak at redhat.com
> > > > > > Red Hat Czech
> > > > > > RHEV-M SLA / Brno, CZ
> > > > > >
> > > > > > ----- Original Message -----
> > > > > > > ---------- Forwarded message ----------
> > > > > > > From: "david van zeebroeck" < david at analytics.brusselsairport.be>
> > > > > > > Date: Feb 13, 2014 4:31 PM
> > > > > > > Subject: Re: [Users] Memory usage
> > > > > > > To: "Martin Sivak" < msivak at redhat.com >
> > > > > > > Cc: "Doron Fediuck" < dfediuck at redhat.com >, "Koen Vanoppen" <
> > > > > > > vanoppen.koen at gmail.com >, < users at ovirt.org >
> > > > > > >
> > > > > > > Attached is the output of vdsClient.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Thu, Feb 13, 2014 at 2:26 PM, Martin Sivak < msivak at redhat.com > wrote:
> > > > > > >
> > > > > > >
> > > > > > > Hi everybody,
> > > > > > >
> > > > > > > would it be possible to add the output of vdsClient getVmStats <affected vm
> > > > > > > id> from the affected host here and to the bug?
> > > > > > >
> > > > > > > Also can we please get the bug number that tracks this issue? I could not
> > > > > > > find it in BZ.
> > > > > > >
> > > > > > > --
> > > > > > > Martin Sivák
> > > > > > > msivak at redhat.com
> > > > > > > Red Hat Czech
> > > > > > > RHEV-M SLA / Brno, CZ
> > > > > > >
> > > > > > > ----- Original Message -----
> > > > > > > >
> > > > > > > >
> > > > > > > > ----- Original Message -----
> > > > > > > > > From: "Koen Vanoppen" < vanoppen.koen at gmail.com >
> > > > > > > > > To: users at ovirt.org
> > > > > > > > > Sent: Thursday, February 13, 2014 12:18:24 PM
> > > > > > > > > Subject: Re: [Users] Memory usage
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > We can't turn off the memory balloon option because we are running
> > > > > > > > > 3.3.2 with the memory balloon bug. Thanks for the help!
> > > > > > > >
> > > > > > > > Which balloon bug?
> > > > > > > >
> > > > > > > > > On Feb 13, 2014 10:51 AM, "Doron Fediuck" < dfediuck at redhat.com> wrote:
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > ----- Original Message -----
> > > > > > > > > > From: "René Koch" < rkoch at linuxland.at >
> > > > > > > > > > To: "Doron Fediuck" < dfediuck at redhat.com >
> > > > > > > > > > Cc: "Koen Vanoppen" < vanoppen.koen at gmail.com >,
> > > > users at ovirt.org ,
> > > > > > > > > > "Martin
> > > > > > > > > > Sivak" < msivak at redhat.com >
> > > > > > > > > > Sent: Thursday, February 13, 2014 11:16:27 AM
> > > > > > > > > > Subject: Re: [Users] Memory usage
> > > > > > > > > >
> > > > > > > > > > On Thu, 2014-02-13 at 03:49 -0500, Doron Fediuck wrote:
> > > > > > > > > > >
> > > > > > > > > > > ----- Original Message -----
> > > > > > > > > > > > From: "René Koch" < rkoch at linuxland.at >
> > > > > > > > > > > > To: "Doron Fediuck" < dfediuck at redhat.com >
> > > > > > > > > > > > Cc: "Koen Vanoppen" < vanoppen.koen at gmail.com >,
> > > > > > users at ovirt.org ,
> > > > > > > > > > > > "Martin
> > > > > > > > > > > > Sivak" < msivak at redhat.com >
> > > > > > > > > > > > Sent: Wednesday, February 12, 2014 7:31:47 PM
> > > > > > > > > > > > Subject: Re: [Users] Memory usage
> > > > > > > > > > > >
> > > > > > > > > > > > On Wed, 2014-02-12 at 11:22 -0500, Doron Fediuck wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > ----- Original Message -----
> > > > > > > > > > > > > > From: "René Koch" < rkoch at linuxland.at >
> > > > > > > > > > > > > > To: "Koen Vanoppen" < vanoppen.koen at gmail.com >
> > > > > > > > > > > > > > Cc: users at ovirt.org
> > > > > > > > > > > > > > Sent: Wednesday, February 12, 2014 4:18:37 PM
> > > > > > > > > > > > > > Subject: Re: [Users] Memory usage
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Wed, 2014-02-12 at 15:14 +0100, Koen Vanoppen
> > wrote:
> > > > > > > > > > > > > > > In The GUI, it says it's using 25% of the memory.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I guess it's the real value, right?
> > > > > > > > > > > > > > The same happened for the memcached vm, someone reported to me -
> > > > > > > > > > > > > > negative value in REST-API, but correct graph in oVirt webadmin GUI.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I fear I have no idea how this can happen - so maybe someone else can
> > > > > > > > > > > > > > help you troubleshoot this issue.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 2014-02-12 15:10 GMT+01:00 Koen Vanoppen <
> > > > > > > > > > > > > > > vanoppen.koen at gmail.com >:
> > > > > > > > > > > > > > > Thanks for the quick response, but there is no memcached
> > > > > > > > > > > > > > > running on that VM.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Kind regards
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 2014-02-12 15:06 GMT+01:00 René Koch <
> > > > > > rkoch at linuxland.at >:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Wed, 2014-02-12 at 14:55 +0100, Koen Vanoppen
> > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > Dear all,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > When we monitored one of our machines, we noticed that there was one VM
> > > > > > > > > > > > > > > > that was constantly giving a memory usage error. But when we took a
> > > > > > > > > > > > > > > > look at it, there was actually nothing wrong with it. So we looked
> > > > > > > > > > > > > > > > further than that. We looked at the API of the machine and noticed
> > > > > > > > > > > > > > > > something very strange:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > <statistic
> > > > > > > > > > > > > > > > href="/api/vms/3b9aa245-75ff-42e8-b921-1c9ce61826bf/statistics/b7499508-c1c3-32f0-8174-c1783e57bb08"
> > > > > > > > > > > > > > > > id="b7499508-c1c3-32f0-8174-c1783e57bb08"><name>memory.used</name><description>Memory
> > > > > > > > > > > > > > > > used (agent)</description><values
> > > > > > > > > > > > > > > > type="INTEGER"><value><datum>-944892806</datum></value></values><type>GAUGE</type><unit>BYTES</unit>
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > It's a negative...
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Do you have memcached running in this vm?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I heard about this issue with memcached, but never tested memcached in
> > > > > > > > > > > > > > > my oVirt environment. You get the real usage value with
> > > > > > > > > > > > > > > memory.used = memory.installed + memory.used
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Regards,
> > > > > > > > > > > > > > > René
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > What could be the problem?
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Kind regards,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > koen
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Guys,
> > > > > > > > > > > > > these values are usually a result of overcommitment mechanism usage.
> > > > > > > > > > > > > For example, if KSM is effective, it will free a lot of memory pages,
> > > > > > > > > > > > > and total-free-committed becomes negative.
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks a lot for the information.
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > This was reported in
> > > > > > > > > > > > > https://bugzilla.redhat.com/show_bug.cgi?id=977758
> > > > > > > > > > > > > and the engine is using memFree reported by vdsm, which is more
> > > > > > > > > > > > > accurate.
> > > > > > > > > > > > >
> > > > > > > > > > > > > The API reports the old version due to backwards compatibility.
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > I just had a look at the bugzilla report and the RHEV documentation
> > > > > > > > > > > > which says "Current memory in bytes used by the virtual machine.".
> > > > > > > > > > > > So this means, the reported values are totally useless for monitoring
> > > > > > > > > > > > memory usage of a virtual machine if KSM is active. I would expect to
> > > > > > > > > > > > get the memory usage of a virtual machine and not how much memory is
> > > > > > > > > > > > consumed on the hypervisor (this is pretty useless information for me).
> > > > > > > > > > > >
> > > > > > > > > > > > Is it planned to report the memory usage in a virtual machine in the
> > > > > > > > > > > > API as well?
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Regards,
> > > > > > > > > > > > René
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > René,
> > > > > > > > > > > started digging into this, but it may take some time.
> > > > > > > > > > >
> > > > > > > > > > > Note that there are 2 memory usage reports; one for the host,
> > > > > > > > > > > and another for each VM.
> > > > > > > > > > > My response is related to the host, while the original question was
> > > > > > > > > > > for the VM. I still think the root cause is the same (i.e. related to
> > > > > > > > > > > overcommitment; for example the VM may have a balloon inflated), but
> > > > > > > > > > > would like to properly check and fix if needed.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Yes, I know. So my question was if we can add another value for the
> > > > > > > > > > usage in the vm in addition to the usage of the vm on the host.
> > > > > > > > > >
> > > > > > > > > We'll know once we understand the cause, i.e. is it a bug that needs
> > > > > > > > > fixing or should we indeed add another value.
> > > > > > > > >
> > > > > > > > > If you have a setup with this issue, try removing the balloon device
> > > > > > > > > and see if it helps / changes.
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > In order to make sure we keep track of it, please open a bug with all
> > > > > > > > > > > the relevant info.
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > I'll open a bug for this.
> > > > > > > > > >
> > > > > > > > > Thanks
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > Thanks and keep reporting!
> > > > > > > > > > > Doron
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > _______________________________________________
> > > > > > > > > Users mailing list
> > > > > > > > > Users at ovirt.org
> > > > > > > > > http://lists.ovirt.org/mailman/listinfo/users
> > > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> 


