I'm waiting for the next alarm. In the meantime, this is the current output:
vdsClient -s saturnus2 getVmStats 3b9aa245-75ff-42e8-b921-1c9ce61826bf
3b9aa245-75ff-42e8-b921-1c9ce61826bf
Status = Running
guestFQDN = galatea.***.****
memUsage = 32
acpiEnable = true
netIfaces = [{'inet6': [], 'hw': '52:54:00:49:11:9d', 'inet': ['192.168.99.19'], 'name': 'eth1'},
             {'inet6': [], 'hw': '00:1a:4a:67:c4:b3', 'inet': ['10.110.X.X'], 'name': 'eth2'},
             {'inet6': [], 'hw': '52:54:00:f1:22:48', 'inet': ['192.168.122.1'], 'name': 'virbr0'}]
pid = 38144
session = Unknown
vmType = kvm
timeOffset = 0
balloonInfo = {'balloon_max': '12582912', 'balloon_min': '8388608', 'balloon_target': '12582912', 'balloon_cur': '12582912'}
pauseCode = NOERR
disksUsage = [{'path': '/', 'total': '37655093248', 'fs': 'ext4', 'used': '21765771264'},
              {'path': '/boot', 'total': '507744256', 'fs': 'ext4', 'used': '49356800'}]
network = {'vnet0': {'macAddr': '00:1a:4a:67:c4:b3', 'rxDropped': '0', 'txDropped': '0', 'rxErrors': '0', 'txRate': '0.0', 'rxRate': '0.0', 'txErrors': '0', 'state': 'unknown', 'speed': '1000', 'name': 'vnet0'},
           'vnet1': {'macAddr': '52:54:00:49:11:9d', 'rxDropped': '0', 'txDropped': '0', 'rxErrors': '0', 'txRate': '0.0', 'rxRate': '0.0', 'txErrors': '0', 'state': 'unknown', 'speed': '1000', 'name': 'vnet1'},
           'vnet2': {'macAddr': '52:54:00:66:d3:aa', 'rxDropped': '0', 'txDropped': '0', 'rxErrors': '0', 'txRate': '0.0', 'rxRate': '0.0', 'txErrors': '0', 'state': 'unknown', 'speed': '1000', 'name': 'vnet2'}}
memoryStats = {'swap_out': '0', 'majflt': '0', 'mem_free': '8520656', 'swap_in': '0', 'pageflt': '396', 'mem_total': '12197520', 'mem_unused': '1580896'}
guestName = galatea.brusselsairport.aero
elapsedTime = 2251149
displayType = qxl
cpuSys = 12.76
appsList = ['ovirt-guest-agent-1.0.8-1.el6', 'kernel-2.6.32-71.el6']
guestOs = 2.6.32-71.el6.x86_64
username = vandenpt
hash = -3779086589437073991
displayIp = 0
displayPort = 5900
guestIPs = 192.168.99.19 10.110.50.84 192.168.122.1
kvmEnable = true
disks = {'vda': {'readLatency': '0', 'apparentsize': '45097156608', 'writeLatency': '1313701', 'imageID': '528a377f-4f98-4023-92de-ce52c394d4a9', 'flushLatency': '187524', 'readRate': '0.00', 'truesize': '45097156608', 'writeRate': '8502.90'},
         'hdc': {'readLatency': '0', 'apparentsize': '0', 'writeLatency': '0', 'flushLatency': '0', 'readRate': '0.00', 'truesize': '0', 'writeRate': '0.00'}}
monitorResponse = 0
statsAge = 0.47
cpuUser = 68.45
lastLogin = 1392707812.29
clientIp =
displaySecurePort = 5901
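
A quick way to check the relation Martin asked about earlier (mem_total versus balloon_cur) is to parse this output. The following is only a minimal sketch in Python. It assumes each `key = value` pair has been joined back onto a single line and that both figures use the same unit (KiB is my assumption). `parse_vm_stats` and `balloon_gap_kib` are hypothetical helper names, not part of vdsClient.

```python
import ast

# Excerpt of the getVmStats output above, one "key = value" pair per line.
SAMPLE = """\
Status = Running
memUsage = 32
balloonInfo = {'balloon_max': '12582912', 'balloon_min': '8388608', 'balloon_target': '12582912', 'balloon_cur': '12582912'}
memoryStats = {'swap_out': '0', 'majflt': '0', 'mem_free': '8520656', 'swap_in': '0', 'pageflt': '396', 'mem_total': '12197520', 'mem_unused': '1580896'}
"""


def parse_vm_stats(text):
    """Turn 'key = value' lines into a dict (hypothetical helper)."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(" = ")
        if not sep:
            continue  # header line (the VM id) or a key without a value
        try:
            # Dicts, lists and numbers are printed as Python literals.
            stats[key] = ast.literal_eval(value)
        except (ValueError, SyntaxError):
            # Bare words such as "Running" stay plain strings.
            stats[key] = value
    return stats


def balloon_gap_kib(stats):
    """Return balloon_cur minus mem_total (assumed to share one unit)."""
    mem_total = int(stats["memoryStats"]["mem_total"])
    balloon_cur = int(stats["balloonInfo"]["balloon_cur"])
    return balloon_cur - mem_total


if __name__ == "__main__":
    print(balloon_gap_kib(parse_vm_stats(SAMPLE)))
```

With the values above, balloon_cur exceeds mem_total by 385392.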
2014-02-17 14:00 GMT+01:00 Martin Sivak <msivak(a)redhat.com>:
Hi Koen,
can you try the mentioned vdsClient command (vdsClient getVmStats
<affected vm>) as soon as you get the warning about negative memory? The
one David sent to us does not contain the negative number and we weren't
able to find the issue based on it.
Thanks
--
Martin Sivák
msivak(a)redhat.com
Red Hat Czech
RHEV-M SLA / Brno, CZ
----- Original Message -----
> Any updates about the issue? Because just now, the alarm went off again :-)
> Kind regards,
>
> Koen
>
>
> 2014-02-13 16:55 GMT+01:00 Martin Sivak <msivak(a)redhat.com>:
>
> > Thank you.
> >
> > I wonder why mem_total is smaller than balloon_cur (and max). I will have
> > to dig a bit more to see what happened.
> >
> > --
> > Martin Sivák
> > msivak(a)redhat.com
> > Red Hat Czech
> > RHEV-M SLA / Brno, CZ
> >
> > ----- Original Message -----
> > > ---------- Forwarded message ----------
> > > From: "david van zeebroeck" < david(a)analytics.brusselsairport.be >
> > > Date: Feb 13, 2014 4:31 PM
> > > Subject: Re: [Users] Memory usage
> > > To: "Martin Sivak" < msivak(a)redhat.com >
> > > Cc: "Doron Fediuck" < dfediuck(a)redhat.com >, "Koen Vanoppen" <
> > > vanoppen.koen(a)gmail.com >, < users(a)ovirt.org >
> > >
> > > Attached is the output of vdsClient.
> > >
> > >
> > >
> > >
> > > On Thu, Feb 13, 2014 at 2:26 PM, Martin Sivak < msivak(a)redhat.com > wrote:
> > >
> > >
> > > Hi everybody,
> > >
> > > would it be possible to add the output of vdsClient getVmStats <affected vm id>
> > > from the affected host here and to the bug?
> > >
> > > Also can we please get the bug number that tracks this issue? I could not
> > > find it in BZ.
> > >
> > > --
> > > Martin Sivák
> > > msivak(a)redhat.com
> > > Red Hat Czech
> > > RHEV-M SLA / Brno, CZ
> > >
> > > ----- Original Message -----
> > > >
> > > >
> > > > ----- Original Message -----
> > > > > From: "Koen Vanoppen" < vanoppen.koen(a)gmail.com >
> > > > > To: users(a)ovirt.org
> > > > > Sent: Thursday, February 13, 2014 12:18:24 PM
> > > > > Subject: Re: [Users] Memory usage
> > > > >
> > > > >
> > > > >
> > > > > We can't turn off the memory balloon option because we are running
> > > > > 3.3.2, which has the memory balloon bug. Thx for the help!
> > > >
> > > > Which balloon bug?
> > > >
> > > > > On Feb 13, 2014 10:51 AM, "Doron Fediuck" < dfediuck(a)redhat.com> wrote:
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > ----- Original Message -----
> > > > > > From: "René Koch" < rkoch(a)linuxland.at >
> > > > > > To: "Doron Fediuck" < dfediuck(a)redhat.com >
> > > > > > Cc: "Koen Vanoppen" < vanoppen.koen(a)gmail.com >, users(a)ovirt.org ,
> > > > > > "Martin Sivak" < msivak(a)redhat.com >
> > > > > > Sent: Thursday, February 13, 2014 11:16:27 AM
> > > > > > Subject: Re: [Users] Memory usage
> > > > > >
> > > > > > On Thu, 2014-02-13 at 03:49 -0500, Doron Fediuck wrote:
> > > > > > >
> > > > > > > ----- Original Message -----
> > > > > > > > From: "René Koch" < rkoch(a)linuxland.at >
> > > > > > > > To: "Doron Fediuck" < dfediuck(a)redhat.com >
> > > > > > > > Cc: "Koen Vanoppen" < vanoppen.koen(a)gmail.com >, users(a)ovirt.org ,
> > > > > > > > "Martin Sivak" < msivak(a)redhat.com >
> > > > > > > > Sent: Wednesday, February 12, 2014 7:31:47 PM
> > > > > > > > Subject: Re: [Users] Memory usage
> > > > > > > >
> > > > > > > > On Wed, 2014-02-12 at 11:22 -0500, Doron Fediuck wrote:
> > > > > > > > >
> > > > > > > > > ----- Original Message -----
> > > > > > > > > > From: "René Koch" < rkoch(a)linuxland.at >
> > > > > > > > > > To: "Koen Vanoppen" < vanoppen.koen(a)gmail.com >
> > > > > > > > > > Cc: users(a)ovirt.org
> > > > > > > > > > Sent: Wednesday, February 12, 2014 4:18:37 PM
> > > > > > > > > > Subject: Re: [Users] Memory usage
> > > > > > > > > >
> > > > > > > > > > On Wed, 2014-02-12 at 15:14 +0100, Koen Vanoppen wrote:
> > > > > > > > > > > In the GUI, it says it's using 25% of the memory.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > I guess it's the real value, right?
> > > > > > > > > > The same happened for the memcached VM someone reported to me:
> > > > > > > > > > a negative value in the REST API, but a correct graph in the oVirt
> > > > > > > > > > webadmin GUI.
> > > > > > > > > >
> > > > > > > > > > I fear I have no idea how this can happen, so maybe someone else
> > > > > > > > > > can help you troubleshoot this issue.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > 2014-02-12 15:10 GMT+01:00 Koen Vanoppen < vanoppen.koen(a)gmail.com >:
> > > > > > > > > > > Thanks for the quick response, but there is no memcached
> > > > > > > > > > > running on that VM.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Kind regards
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > 2014-02-12 15:06 GMT+01:00 René Koch < rkoch(a)linuxland.at >:
> > > > > > > > > > >
> > > > > > > > > > > On Wed, 2014-02-12 at 14:55 +0100, Koen Vanoppen wrote:
> > > > > > > > > > > > Dear all,
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > While monitoring one of our machines, we noticed that there was one VM
> > > > > > > > > > > > that was constantly giving a memory usage error. But when we took a
> > > > > > > > > > > > look at it, there was actually nothing wrong with it. So we looked
> > > > > > > > > > > > further than that: we looked at the API of the machine and noticed
> > > > > > > > > > > > something very strange:
> > > > > > > > > > > >
> > > > > > > > > > > > <statistic href="/api/vms/3b9aa245-75ff-42e8-b921-1c9ce61826bf/statistics/b7499508-c1c3-32f0-8174-c1783e57bb08" id="b7499508-c1c3-32f0-8174-c1783e57bb08">
> > > > > > > > > > > > <name>memory.used</name>
> > > > > > > > > > > > <description>Memory used (agent)</description>
> > > > > > > > > > > > <values type="INTEGER"><value><datum>-944892806</datum></value></values>
> > > > > > > > > > > > <type>GAUGE</type>
> > > > > > > > > > > > <unit>BYTES</unit>
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > It's a negative...
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Do you have memcached running in this VM?
> > > > > > > > > > >
> > > > > > > > > > > I heard about this issue with memcached, but never tested memcached in
> > > > > > > > > > > my oVirt environment. You get the real usage value with
> > > > > > > > > > > memory.used = memory.installed + memory.used
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Regards,
> > > > > > > > > > > René
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > What could be the problem?
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Kind regards,
> > > > > > > > > > > >
> > > > > > > > > > > > koen
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > >
> > > > > > > > > Guys,
> > > > > > > > > these values are usually a result of overcommitment mechanism usage.
> > > > > > > > > For example, if KSM is effective, it will free a lot of memory pages,
> > > > > > > > > and total-free-committed becomes negative.
> > > > > > > >
> > > > > > > >
> > > > > > > > Thanks a lot for the information.
> > > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > > This was reported in
> > > > > > > > > https://bugzilla.redhat.com/show_bug.cgi?id=977758
> > > > > > > > > and the engine is using memFree reported by vdsm, which is more
> > > > > > > > > accurate.
> > > > > > > > >
> > > > > > > > > The API reports the old version due to backwards compatibility.
> > > > > > > >
> > > > > > > >
> > > > > > > > I just had a look at the bugzilla report and the RHEV documentation
> > > > > > > > which says "Current memory in bytes used by the virtual machine.".
> > > > > > > > So this means the reported values are totally useless for monitoring
> > > > > > > > memory usage of a virtual machine if KSM is active. I would expect to
> > > > > > > > get the memory usage of a virtual machine and not how much memory is
> > > > > > > > consumed on the hypervisor (this is pretty useless information for me).
> > > > > > > >
> > > > > > > > Is it planned to report the memory usage in a virtual machine in the
> > > > > > > > API as well?
> > > > > > > >
> > > > > > > >
> > > > > > > > Regards,
> > > > > > > > René
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > > > René,
> > > > > > > started digging into this, but it may take some time.
> > > > > > >
> > > > > > > Note that there are 2 memory usage reports; one for the host,
> > > > > > > and another for each VM.
> > > > > > > My response is related to the host, while the original question was
> > > > > > > for the VM. I still think the root cause is the same (i.e. related to
> > > > > > > overcommitment; for example the VM may have a balloon inflated), but
> > > > > > > would like to properly check and fix if needed.
> > > > > >
> > > > > >
> > > > > > Yes, I know. So my question was if we can add another value for the
> > > > > > usage in the vm in addition to the usage of the vm on the host.
> > > > > >
> > > > > We'll know once we understand the cause, i.e. whether it is a bug that
> > > > > needs fixing or whether we should indeed add another value.
> > > > >
> > > > > If you have a setup with this issue, try removing the balloon device
> > > > > and see if it helps / changes.
> > > > >
> > > > > >
> > > > > > >
> > > > > > > In order to make sure we keep track of it, please open a bug with all
> > > > > > > the relevant info.
> > > > > > >
> > > > > >
> > > > > > I'll open a bug for this.
> > > > > >
> > > > > Thanks
> > > > >
> > > > > >
> > > > > > > Thanks and keep reporting!
> > > > > > > Doron
> > > > > >
> > > > > >
> > > > > >
> > > > >
> > > > > _______________________________________________
> > > > > Users mailing list
> > > > > Users(a)ovirt.org
> > > > > http://lists.ovirt.org/mailman/listinfo/users
> > > > >
> >
>
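
For anyone who wants to catch this from a monitoring script, here is a rough sketch in Python that scans a statistics document from the REST API for negative values. The sample is the memory.used statistic quoted in the original report, with a closing `</statistic>` tag added so that it parses and the href attribute left out for brevity. `negative_statistics` is a made-up helper name, and fetching the XML from the engine is not shown.

```python
import xml.etree.ElementTree as ET

# The memory.used statistic from the original report. The closing tag is an
# addition (assumption) so that the fragment is well-formed XML.
SAMPLE = (
    '<statistic id="b7499508-c1c3-32f0-8174-c1783e57bb08">'
    "<name>memory.used</name>"
    "<description>Memory used (agent)</description>"
    '<values type="INTEGER"><value><datum>-944892806</datum></value></values>'
    "<type>GAUGE</type><unit>BYTES</unit>"
    "</statistic>"
)


def negative_statistics(xml_text):
    """Return (name, value) pairs for every datum below zero."""
    root = ET.fromstring(xml_text)
    found = []
    # iter() also yields the root itself, so a lone <statistic> element and
    # a wrapper element containing several of them are both handled.
    for stat in root.iter("statistic"):
        name = stat.findtext("name")
        for datum in stat.iter("datum"):
            value = float(datum.text)
            if value < 0:
                found.append((name, value))
    return found


if __name__ == "__main__":
    for name, value in negative_statistics(SAMPLE):
        print(f"{name} is negative: {value:.0f}")
```

A check like this only flags the symptom; it says nothing about whether the cause is the balloon, KSM, or something else.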