vdsm high mem usage

Hi everybody.

So I ran into that high mem usage thing. The problem I have with patching is that this is a live system so I can't do it mid day. Can anybody tell me if it is possible to just restart the vdsm service or does the host have to be in "maintenance mode" before restarting it? It is using gluster storage, if that makes a difference as well.

Thanks,

--
*Michael Kleinpaste*
Senior Systems Administrator
SharperLending, LLC.
www.SharperLending.com
Michael.Kleinpaste@SharperLending.com
(509) 324-1230 Fax: (509) 324-1234

If you’re using nfs mounts (even if they are gluster based), it’s safe to restart vdsmd, you’ll see it change status in ovirt, but your VMs will continue running. If you’re mounting gluster based storage as glusterfs shares directly (not over nfs), there’s another issue that will cause all your VMs to pause and the only way to recover is to stop them and restart them, but that’s going to happen to them anyway when vdsmd runs out of ram and crashes… Best solution is to migrate them yourself in this case, then restart and migrate back. Or live migrate them to NFS mounted storage so when vdsm crashes they don’t lock up, and clean up after you’ve had an opportunity to upgrade or patch.

Upgrade to 3.5.3 or later at your earliest opportunity, the mem leak is resolved there. Sounds like you already found the patch you can apply if upgrading isn’t an option, but it will still require you to restart your vdsms.

-Darrell

On Sep 10, 2015, at 1:45 PM, Michael Kleinpaste <michael.kleinpaste@sharperlending.com> wrote:
> So I ran into that high mem usage thing. The problem I have with patching is that this is a live system so I can't do it mid day. Can anybody tell me if it is possible to just restart the vdsm service or does the host have to be in "maintenance mode" before restarting it? It is using gluster storage, if that makes a difference as well.

_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users
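
Since the advice above depends on whether the storage is mounted over NFS or as native glusterfs, it may help to check what a host actually has mounted. The following is only a sketch in Python 3: it assumes that NFS mounts show up in /proc/mounts with the type `nfs` or `nfs4` and native gluster mounts with `fuse.glusterfs`, and that storage domains are mounted under `/rhev/data-center/mnt` (adjust the `prefix` argument if your hosts differ). The host names and paths in `SAMPLE` are made up for illustration.

```python
# Filesystem types as they appear in the third field of /proc/mounts.
NFS_TYPES = {"nfs", "nfs4"}
GLUSTER_TYPES = {"fuse.glusterfs"}

# Illustrative input only; these servers and volumes are hypothetical.
SAMPLE = """\
gl1:/vmstore /rhev/data-center/mnt/glusterSD/gl1:_vmstore fuse.glusterfs rw,relatime 0 0
nas1:/export/vms /rhev/data-center/mnt/nas1:_export_vms nfs rw,relatime,vers=3 0 0
/dev/sda1 /boot ext4 rw 0 0
"""


def classify_mounts(mounts_text, prefix="/rhev/data-center/mnt"):
    """Map each mount point under `prefix` to "nfs" or "glusterfs".

    `prefix` is an assumption about where storage domains are mounted;
    pass "" to consider every mount on the host.
    """
    result = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        mount_point, fstype = fields[1], fields[2]
        if not mount_point.startswith(prefix):
            continue
        if fstype in NFS_TYPES:
            result[mount_point] = "nfs"
        elif fstype in GLUSTER_TYPES:
            result[mount_point] = "glusterfs"
    return result


for mount_point, kind in sorted(classify_mounts(SAMPLE).items()):
    print(kind, mount_point)
```

On a real host you would pass the contents of /proc/mounts instead of `SAMPLE`. Any mount reported as `glusterfs` falls into the "migrate the VMs first" case described above.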

Hello Michael,

I ran into the issue myself and can confirm that restarting vdsm with NFS mitigates the issue. I even had a cron job for that.

On 11.09.2015 04:30, Darrell Budic wrote:
> If you’re using nfs mounts (even if they are gluster based), it’s safe to restart vdsmd, you’ll see it change status in ovirt, but your VMs will continue running. If you’re mounting gluster based storage as glusterfs shares directly (not over nfs), there’s another issue that will cause all your VMs to pause and the only way to recover is to stop them and restart them, but that’s going to happen to them anyway when vdsmd runs out of ram and crashes… Best solution is to migrate them yourself in this case, then restart and migrate back.

This is what I have done. The easiest way to do so is to set the host in maintenance, wait for the migration to finish and then restart vdsm. You should do this on only one host at a time and then wait a while, so you do not run into OOM on the whole cluster at once.

> Or live migrate them to NFS mounted storage so when vdsm crashes they don’t lock up, and clean up after you’ve had an opportunity to upgrade or patch.
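
The one-host-at-a-time sequence described above (maintenance, wait for the migrations, restart vdsm, wait a while, next host) could be sketched roughly as follows in Python 3. This is not an oVirt API: `to_maintenance`, `is_empty`, `restart_vdsm` and `activate` are hypothetical callables that you would have to implement yourself (for example with the engine UI, its API, or ssh), and the default wait times are arbitrary assumptions.

```python
import time


def rolling_restart(hosts, to_maintenance, is_empty, restart_vdsm, activate,
                    settle_seconds=600, poll_seconds=10, sleep=time.sleep):
    """Restart vdsm on each host in turn, never on two hosts at once.

    All four callables take a host name and are hypothetical hooks:
      to_maintenance(host)  ask the engine to put the host in maintenance
      is_empty(host)        True once no VMs are left running on the host
      restart_vdsm(host)    restart the vdsm service on the host
      activate(host)        bring the host back out of maintenance
    There is no timeout here; a stuck migration would wait forever.
    """
    done = []
    for index, host in enumerate(hosts):
        to_maintenance(host)
        while not is_empty(host):      # wait for the migrations to finish
            sleep(poll_seconds)
        restart_vdsm(host)
        activate(host)
        done.append(host)
        if index < len(hosts) - 1:     # no need to wait after the last host
            sleep(settle_seconds)      # let the cluster settle before moving on
    return done
```

Passing `sleep` as a parameter is only there so the sequence can be dry-run with fake hooks before pointing it at real hosts.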
> Upgrade to 3.5.3 or later at your earliest opportunity, the mem leak is resolved there. Sounds like you already found the patch you can apply if upgrading isn’t an option, but it will still require you to restart your vdsms.

I can confirm 3.5.3 finally solved the issue for us and VDSM stays below 100 MB RSS.
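
For anyone who wants to watch the RSS figure on their own hosts, here is a small Python 3 sketch that reads the `VmRSS` line from the text of /proc/<pid>/status. Treating 100 MiB as a restart threshold is my assumption based on the number above, not a documented limit, and `SAMPLE_STATUS` is made-up data; finding the vdsm PID is left out on purpose.

```python
def rss_mib_from_status(status_text):
    """Return the resident set size in MiB from /proc/<pid>/status text.

    Returns None if there is no VmRSS line (e.g. for kernel threads).
    """
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # The line looks like "VmRSS:     84312 kB".
            return int(line.split()[1]) / 1024
    return None


def needs_restart(status_text, limit_mib=100):
    """True if RSS is above limit_mib (the 100 MiB default is an assumption)."""
    rss = rss_mib_from_status(status_text)
    return rss is not None and rss > limit_mib


# Illustrative status text only, not taken from a real host.
SAMPLE_STATUS = "Name:\tvdsm\nVmPeak:\t 1200000 kB\nVmRSS:\t   84312 kB\nThreads:\t40\n"

print(rss_mib_from_status(SAMPLE_STATUS), needs_restart(SAMPLE_STATUS))
```

A cron job like the one mentioned earlier could call `needs_restart` on the real status file and only restart vdsm when it returns True.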
--
Daniel Helgenberger
m box bewegtbild GmbH
P: +49/30/2408781-22
F: +49/30/2408781-10
ACKERSTR. 19
D-10115 BERLIN
www.m-box.de www.monkeymen.tv
Geschäftsführer: Martin Retschitzegger / Michaela Göllner
Handelsregister: Amtsgericht Charlottenburg / HRB 112767
participants (3)
- Daniel Helgenberger
- Darrell Budic
- Michael Kleinpaste