From budic at onholyground.com Thu Sep 10 22:29:34 2015
From: Darrell Budic
To: users at ovirt.org
Subject: Re: [ovirt-users] vdsm high mem usage
Date: Thu, 10 Sep 2015 21:29:30 -0500
Message-ID: <2459DE56-EB7D-42B8-92B5-6DA97C988A02@onholyground.com>
In-Reply-To: CAORA4xDoOsJVgboaBxD3JF9Q9-tfZdiAWck3L5rBGZ3Cmf-Tfw@mail.gmail.com

If you're using NFS mounts (even if they are gluster based), it's safe to restart vdsmd. You'll see it change status in oVirt, but your VMs will continue running. If you're mounting gluster based storage as glusterfs shares directly (not over NFS), there's another issue that will cause all your VMs to pause, and the only way to recover is to stop them and restart them. But that's going to happen to them anyway when vdsmd runs out of RAM and crashes… The best solution in this case is to migrate them yourself, then restart and migrate back. Or live migrate them to NFS mounted storage so they don't lock up when vdsm crashes, and clean up after you've had an opportunity to upgrade or patch.

Upgrade to 3.5.3 or later at your earliest opportunity; the mem leak is resolved there. Sounds like you already found the patch you can apply if upgrading isn't an option, but it will still require you to restart your vdsms.

  -Darrell

> On Sep 10, 2015, at 1:45 PM, Michael Kleinpaste <michael.kleinpaste(a)sharperlending.com> wrote:
>
> Hi everybody.
>
> So I ran into that high mem usage thing. The problem I have with patching is that this is a live system so I can't do it mid day. Can anybody tell me if it is possible to just restart the vdsm service or does the host have to be in "maintenance mode" before restarting it? It is using gluster storage, if that makes a difference as well.
>
> Thanks,
>
> --
> Michael Kleinpaste
> Senior Systems Administrator
> SharperLending, LLC.
> www.SharperLending.com
> Michael.Kleinpaste(a)SharperLending.com
> (509) 324-1230 Fax: (509) 324-1234
> _______________________________________________
> Users mailing list
> Users(a)ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
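
The reply treats NFS mounts and directly mounted glusterfs shares differently, so it helps to know which kind a host is using before restarting vdsmd. Below is a minimal sketch, not an oVirt or vdsm tool. It assumes the storage shows up in /proc/mounts-style text with filesystem type `nfs` or `nfs4` for NFS and `fuse.glusterfs` for native glusterfs mounts. The function names, the advice strings, and the sample host names and paths are all hypothetical.

```python
from collections import Counter

# Filesystem types as they appear in the third field of /proc/mounts.
NFS_TYPES = {"nfs", "nfs4"}
GLUSTER_FUSE_TYPE = "fuse.glusterfs"


def classify_mounts(mounts_text):
    """Count NFS and native glusterfs mounts in /proc/mounts-style text."""
    counts = Counter()
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        fstype = fields[2]
        if fstype in NFS_TYPES:
            counts["nfs"] += 1
        elif fstype == GLUSTER_FUSE_TYPE:
            counts["glusterfs"] += 1
    return counts


def restart_advice(mounts_text):
    """Map the mount types found to the advice given in the reply (hypothetical wording)."""
    counts = classify_mounts(mounts_text)
    if counts["glusterfs"]:
        # Native glusterfs mounts: the reply warns that VMs will pause.
        return "migrate VMs off this host before restarting vdsmd"
    if counts["nfs"]:
        return "restarting vdsmd should leave VMs running"
    return "no NFS or glusterfs mounts found"


# Hypothetical sample; on a real host, pass the contents of /proc/mounts instead.
SAMPLE = (
    "nas01:/export/data /mnt/data nfs rw,relatime 0 0\n"
    "gl01:/vmstore /mnt/vmstore fuse.glusterfs rw,relatime 0 0\n"
)
print(restart_advice(SAMPLE))
```

The sketch only reports which case applies; it does not restart anything or check the installed vdsm version.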
