[ovirt-users] Can I reduce the Java heap size of engine-backup???

Yedidyah Bar David didi at redhat.com
Thu Dec 31 07:48:33 UTC 2015


On Wed, Dec 30, 2015 at 7:50 PM, John Florian <jflorian at doubledog.org> wrote:
> On 12/29/2015 02:02 AM, Yedidyah Bar David wrote:
>> On Tue, Dec 29, 2015 at 12:51 AM, John Florian <jflorian at doubledog.org> wrote:
>>> I'm trying to run the engine-backup script via a Bacula job using the
>>> RunScript option so that the engine-backup dumps its output someplace
>>> where Bacula will collect it once engine-backup finishes.  However the
>>> job is failing and with enough digging I eventually learned the script
>>> was writing the following in /tmp/hs_err_pid5789.log:
>>>
>>> #
>>> # There is insufficient memory for the Java Runtime Environment to continue.
>>> # Native memory allocation (mmap) failed to map 2555904 bytes for committing reserved memory.
>>> # Possible reasons:
>>> #   The system is out of physical RAM or swap space
>>> #   In 32 bit mode, the process size limit was hit
>>> # Possible solutions:
>>> #   Reduce memory load on the system
>>> #   Increase physical memory or swap space
>>> #   Check if swap backing store is full
>>> #   Use 64 bit Java on a 64 bit OS
>>> #   Decrease Java heap size (-Xmx/-Xms)
>>> #   Decrease number of Java threads
>>> #   Decrease Java thread stack sizes (-Xss)
>>> #   Set larger code cache with -XX:ReservedCodeCacheSize=
>>> # This output file may be truncated or incomplete.
>>> #
>>> #  Out of Memory Error (os_linux.cpp:2627), pid=5789, tid=140709998221056
>>> #
>>> # JRE version:  (8.0_65-b17) (build )
>>> # Java VM: OpenJDK 64-Bit Server VM (25.65-b01 mixed mode linux-amd64 compressed oops)
>>> # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
>>> #
>>>
>>>
>>> So is there any good way to reduce the Java heap size?  I mean I know
>>> what -Xmx does, but where might I try setting it, ideally so that it
>>> affects the engine-backup only?  Any idea of good setting for a very
>>> small environment with a dozen VMs?
>> engine-backup does not directly call nor need java.
>>
>> AFAICS it only calls it indirectly as part of some other initialization
>> by running java-home [1], which is a script that decides what JAVA_HOME
>> to use for the engine. This script only runs 'java -version', which imo
>> should not need that much memory. Perhaps there is something else I do
>> not fully understand, such as bacula severely limiting available resources
>> for the process it runs, or something like that.
>>
>> If you only want to debug it, and not as a recommended final solution,
>> you can create a script [2] which only outputs the needed java home.
>> Simply run [1] and make [2] echo the same thing. If [2] exists, [1] will
>> only run it and nothing else, as you can see inside it.
>>
>> I do not think this will work - quite likely engine-backup will fail
>> shortly afterwards, if indeed it gets access to so little memory. Please
>> report back. Thanks and good luck,
>>
>> [1] /usr/share/ovirt-engine/bin/java-home
>> [2] /usr/share/ovirt-engine/bin/java-home.local
> Thanks for the info and response Didi.  Doing the above did allow the
> backup to run successfully.

OK.
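
For others who want to reproduce this, a minimal sketch of the workaround
follows. The helper name write_java_home_local and the JAVA_HOME value in the
example are made up for illustration; use whatever [1] prints on your own
system. As I wrote, this is meant for debugging, not as a recommended final
solution.

```shell
#!/bin/sh
# Hypothetical helper: create an override script that prints a fixed
# JAVA_HOME, so that java-home does not need to start a JVM at all.
write_java_home_local() {
    target="$1"     # where to write the override script
    java_home="$2"  # the directory that java-home printed when run by hand
    # Assumes the path contains no quotes, '$' or backticks.
    printf '#!/bin/sh\necho "%s"\n' "$java_home" > "$target"
    chmod +x "$target"
}

# Example (as root, on the engine machine; the JAVA_HOME value is a placeholder):
# write_java_home_local /usr/share/ovirt-engine/bin/java-home.local /usr/lib/jvm/jre
```

Remove the .local file again once you are done debugging, so that the normal
detection in [1] is used.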

>  I had also replaced the Bacula RunScript
> with "bash -c ulimit", which reported unlimited, but I don't play with
> those types of limits enough to know whether that correctly reports the
> limits under which engine-backup is constrained.

And was this enough?
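
Note that in bash, plain "ulimit" with no option only reports the file-size
limit (-f), so "unlimited" there does not say anything about memory. A rough
sketch of a RunScript replacement that prints a few memory-related limits
instead (the function name report_limits is just for illustration, and I am
only guessing that one of these limits is the relevant one):

```shell
#!/bin/sh
# Print the soft limits that seem most likely to matter for a JVM
# that fails while mapping memory. Which one applies here is a guess.
report_limits() {
    echo "virtual memory: $(ulimit -v)"
    echo "data segment:   $(ulimit -d)"
    echo "stack size:     $(ulimit -s)"
    echo "file size:      $(ulimit -f)"
}

report_limits
```

If any of the first three is not "unlimited" when run from the file daemon,
that would be a good thing to mention when asking on the Bacula lists.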

>  It did occur to me that perhaps a
> better way to learn of any such constraints would be to query Bacula's
> file daemon (the only necessary Bacula component running on client
> systems that are getting backed up) since I suspect it must be this
> component that's actually spawning the RunScript client side.  From the
> Bacula Director (server side) I queried the status of the client which
> is my oVirt engine and it reports:
>
> europa.doubledog.org-fd Version: 5.2.13 (19 February 2013)
> x86_64-redhat-linux-gnu redhat (Core)
> Daemon started 28-Dec-15 16:08. Jobs: run=2 running=0.
>  Heap: heap=32,768 smbytes=190,247 max_bytes=1,599,864 bufs=100 max_bufs=6,758
>  Sizeof: boffset_t=8 size_t=8 debug=0 trace=0
>
> Alas, I know of no way to increase any of the bacula-fd limits.  If I
> dead-end here, perhaps I'll query the Bacula mailing lists.

For both yourself and for others, I think it's best to continue with
this route.

Also note that I have no idea how much memory pg_dump might need on
a larger database, also including dwh which tends to get larger faster
than the engine's.

>
> Meanwhile I tried the following for a more permanent solution but this
> failed same as before:
>
> # diff -u java-home.orig-3.6.1.3 java-home
> --- java-home.orig-3.6.1.3      2015-12-10 13:07:44.000000000 -0500
> +++ java-home   2015-12-30 12:12:45.779462769 -0500
> @@ -13,7 +13,7 @@
>         local ret=1
>
>         if [ -x "${dir}/bin/java" ]; then
> -               local version="$("${dir}/bin/java" -version 2>&1 | sed \
> +               local version="$("${dir}/bin/java" -Xmx 8 -version 2>&1 | sed \
>                         -e 's/^openjdk version "1\.8\.0.*/VERSION_OK/' \
>                         -e 's/^java version "1\.7\.0.*/VERSION_OK/' \
>                         -e 's/^OpenJDK .*(.*).*/VENDOR_OK/' \

No idea here. You might try passing other options, tracing with
strace/valgrind/etc., monitoring with other (including Java-specific) tools,
or asking Java experts (I am not one). Adding Juan.
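
One thing that might be worth checking in the patch itself: as far as I know,
the JVM wants the size attached to the flag, with a unit, e.g. -Xmx64m. With
"-Xmx 8" the 8 is passed as a separate argument. A small sketch for testing
this by hand, outside of java-home; the function names and the 64m default
are arbitrary choices of mine, not something taken from the engine scripts,
and I have no idea if a smaller heap is enough to make java start under
bacula:

```shell
#!/bin/sh
# Hypothetical helpers, not part of java-home.

# Build the heap option in the attached form the JVM expects (-Xmx<size>).
small_heap_opts() {
    echo "-Xmx${1:-64m}"
}

# Run 'java -version' for a given java binary with a small maximum heap.
java_version_small() {
    java_bin="$1"
    "$java_bin" "$(small_heap_opts "$2")" -version 2>&1
}

# Only try it if some java is found in PATH.
if command -v java >/dev/null 2>&1; then
    java_version_small "$(command -v java)" 64m
fi
```

Running this from the Bacula RunScript and comparing with a run from a normal
shell should at least show whether the heap size makes any difference.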

>
>
> If this script is merely checking the validity of  the JRE/JDK, should
> it not be possible to have a test on the rpm details first and only
> proceed as it does now if that doesn't work?  The current tests should
> work w/o much regard for how the JRE/JDK got installed, but if it was
> installed via rpm it seems a simpler test could be used as a shortcut.

Patches are welcome :-)

Note that the current code is designed to be compatible with many environments,
including different el/fedora versions, upgrades inside them etc., and
the $0.local was added mainly to allow supporting other systems (including
gentoo) where $0.local will also be shipped/packaged by the distribution.
Obviously we can add similar patches to make it even more complex, but as
I wrote above, I am not sure it's worth it - because if memory is your only
problem, you might simply be postponing it this way.

Best,
-- 
Didi


