Can I reduce the Java heap size of engine-backup???

I'm trying to run the engine-backup script via a Bacula job using the RunScript option so that the engine-backup dumps its output someplace where Bacula will collect it once engine-backup finishes. However the job is failing and with enough digging I eventually learned the script was writing the following in /tmp/hs_err_pid5789.log:
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 2555904 bytes for committing reserved memory.
# Possible reasons:
#   The system is out of physical RAM or swap space
#   In 32 bit mode, the process size limit was hit
# Possible solutions:
#   Reduce memory load on the system
#   Increase physical memory or swap space
#   Check if swap backing store is full
#   Use 64 bit Java on a 64 bit OS
#   Decrease Java heap size (-Xmx/-Xms)
#   Decrease number of Java threads
#   Decrease Java thread stack sizes (-Xss)
#   Set larger code cache with -XX:ReservedCodeCacheSize=
# This output file may be truncated or incomplete.
#
# Out of Memory Error (os_linux.cpp:2627), pid=5789, tid=140709998221056
#
# JRE version: (8.0_65-b17) (build )
# Java VM: OpenJDK 64-Bit Server VM (25.65-b01 mixed mode linux-amd64 compressed oops)
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
So is there any good way to reduce the Java heap size? I mean I know what -Xmx does, but where might I try setting it, ideally so that it affects the engine-backup only? Any idea of good setting for a very small environment with a dozen VMs?
-- John Florian

On Tue, Dec 29, 2015 at 12:51 AM, John Florian <jflorian@doubledog.org> wrote:
So is there any good way to reduce the Java heap size? I mean I know what -Xmx does, but where might I try setting it, ideally so that it affects the engine-backup only? Any idea of good setting for a very small environment with a dozen VMs?
engine-backup does not directly call nor need java.
AFAICS it only calls it indirectly as part of some other initialization by running java-home [1], which is a script that decides what JAVA_HOME to use for the engine. This script only runs 'java -version', which imo should not need that much memory. Perhaps there is something else I do not fully understand, such as bacula severely limiting available resources for the process it runs, or something like that.
If you only want to debug it, and not as a recommended final solution, you can create a script [2] which only outputs the needed java home. Simply run [1] and make [2] echo the same thing. If [2] exists, [1] will only run it and nothing else, as you can see inside it.
I do not think this will work - quite likely engine-backup will fail shortly later, if indeed it gets access to so little memory. Please report back.
Thanks and good luck,
[1] /usr/share/ovirt-engine/bin/java-home
[2] /usr/share/ovirt-engine/bin/java-home.local
-- Didi
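The debugging workaround described above can be sketched as follows. This is only an illustration under stated assumptions: BIN_DIR is a scratch directory standing in for /usr/share/ovirt-engine/bin, and JAVA_HOME_VALUE is a placeholder for whatever java-home [1] actually prints on the engine host. Whether [1] requires the override to be executable is not stated in the thread, so the sketch sets the bit to be safe.

```shell
#!/bin/sh
# Sketch only. BIN_DIR and JAVA_HOME_VALUE are placeholders (assumptions):
# on a real engine host BIN_DIR would be /usr/share/ovirt-engine/bin and
# JAVA_HOME_VALUE would be the path printed by running java-home [1] there.
BIN_DIR=$(mktemp -d)
JAVA_HOME_VALUE=/usr/lib/jvm/jre

# Write the override script [2]; all it does is print a fixed Java home.
cat > "$BIN_DIR/java-home.local" <<EOF
#!/bin/sh
echo "$JAVA_HOME_VALUE"
EOF
chmod +x "$BIN_DIR/java-home.local"

# Check that the override prints exactly the value that was configured.
override_output=$(sh "$BIN_DIR/java-home.local")
echo "$override_output"
```

With the override in place no JVM is started to detect the Java home, which is why this is useful for narrowing down the failure but, as noted above, not a recommended final solution.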

On 12/29/2015 02:02 AM, Yedidyah Bar David wrote:
If you only want to debug it, and not as a recommended final solution, you can create a script [2] which only outputs the needed java home. Simply run [1] and make [2] echo the same thing. If [2] exists, [1] will only run it and nothing else, as you can see inside it.
[1] /usr/share/ovirt-engine/bin/java-home
[2] /usr/share/ovirt-engine/bin/java-home.local
Thanks for the info and response Didi. Doing the above did allow the backup to run successfully. I had also replaced the Bacula RunScript with "bash -c ulimit" which reported unlimited but I don't play with those types of limits enough to know if that's correctly reporting to what engine-backup is constrained.
It did occur to me that perhaps a better way to learn of any such constraints would be to query Bacula's file daemon (the only necessary Bacula component running on client systems that are getting backed up) since I suspect it must be this component that's actually spawning the RunScript client side. From the Bacula Director (server side) I queried the status of the client which is my oVirt engine and it reports:
europa.doubledog.org-fd Version: 5.2.13 (19 February 2013) x86_64-redhat-linux-gnu redhat (Core)
Daemon started 28-Dec-15 16:08. Jobs: run=2 running=0.
Heap: heap=32,768 smbytes=190,247 max_bytes=1,599,864 bufs=100 max_bufs=6,758
Sizeof: boffset_t=8 size_t=8 debug=0 trace=0
Alas, I know of no way to increase any of the bacula-fd limits. If I dead-end here, perhaps I'll query the Bacula mailing lists.
Meanwhile I tried the following for a more permanent solution but this failed same as before:
# diff -u java-home.orig-3.6.1.3 java-home
--- java-home.orig-3.6.1.3 2015-12-10 13:07:44.000000000 -0500
+++ java-home 2015-12-30 12:12:45.779462769 -0500
@@ -13,7 +13,7 @@
 local ret=1
 if [ -x "${dir}/bin/java" ]; then
- local version="$("${dir}/bin/java" -version 2>&1 | sed \
+ local version="$("${dir}/bin/java" -Xmx 8 -version 2>&1 | sed \
 -e 's/^openjdk version "1\.8\.0.*/VERSION_OK/' \
 -e 's/^java version "1\.7\.0.*/VERSION_OK/' \
 -e 's/^OpenJDK .*(.*).*/VENDOR_OK/' \
If this script is merely checking the validity of the JRE/JDK, should it not be possible to have a test on the rpm details first and only proceed as it does now if that doesn't work? The current tests should work w/o much regard for how the JRE/JDK got installed, but if it was installed via rpm it seems a simpler test could be used as a shortcut.
-- John Florian
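On the "bash -c ulimit" check mentioned above: in bash, ulimit with no option reports only the file size limit (the same as ulimit -f), so "unlimited" there says nothing about memory. A minimal sketch of a script that could be used as the RunScript command instead is shown below. The function name report_limits and the output labels are made up for this illustration, and the set of limits chosen is only a guess at what matters for a JVM that fails to map memory.

```shell
#!/bin/sh
# report_limits (hypothetical helper): print a few per-process limits of the
# environment this script runs in, one name=value pair per line.
report_limits() {
    echo "virtual_memory=$(ulimit -v)"   # address space, in KiB or "unlimited"
    echo "data_segment=$(ulimit -d)"     # data segment size
    echo "stack=$(ulimit -s)"            # stack size
    echo "file_size=$(ulimit -f)"        # what a bare "ulimit" reports in bash
}

report_limits
```

Running this both from an interactive shell and from the Bacula job, and comparing the two outputs, would show whether the file daemon imposes tighter limits on the processes it spawns.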

On Wed, Dec 30, 2015 at 7:50 PM, John Florian <jflorian@doubledog.org> wrote:
Thanks for the info and response Didi. Doing the above did allow the backup to run successfully.
OK.
I had also replaced the Bacula RunScript with "bash -c ulimit" which reported unlimited but I don't play with those types of limits enough to know if that's correctly reporting to what engine-backup is constrained.
And was this enough?
It did occur to me that perhaps a better way to learn of any such constraints would be to query Bacula's file daemon (the only necessary Bacula component running on client systems that are getting backed up) since I suspect it must be this component that's actually spawning the RunScript client side. From the Bacula Director (server side) I queried the status of the client which is my oVirt engine and it reports:
europa.doubledog.org-fd Version: 5.2.13 (19 February 2013) x86_64-redhat-linux-gnu redhat (Core)
Daemon started 28-Dec-15 16:08. Jobs: run=2 running=0.
Heap: heap=32,768 smbytes=190,247 max_bytes=1,599,864 bufs=100 max_bufs=6,758
Sizeof: boffset_t=8 size_t=8 debug=0 trace=0
Alas, I know of no way to increase any of the bacula-fd limits. If I dead-end here, perhaps I'll query the Bacula mailing lists.
For both yourself and for others, I think it's best to continue with this route. Also note that I have no idea how much memory pg_dump might need on a larger database, also including dwh which tends to get larger faster than the engine's.
Meanwhile I tried the following for a more permanent solution but this failed same as before:
# diff -u java-home.orig-3.6.1.3 java-home
--- java-home.orig-3.6.1.3 2015-12-10 13:07:44.000000000 -0500
+++ java-home 2015-12-30 12:12:45.779462769 -0500
@@ -13,7 +13,7 @@
 local ret=1
 if [ -x "${dir}/bin/java" ]; then
- local version="$("${dir}/bin/java" -version 2>&1 | sed \
+ local version="$("${dir}/bin/java" -Xmx 8 -version 2>&1 | sed \
 -e 's/^openjdk version "1\.8\.0.*/VERSION_OK/' \
 -e 's/^java version "1\.7\.0.*/VERSION_OK/' \
 -e 's/^OpenJDK .*(.*).*/VENDOR_OK/' \
No idea here, you might try passing other options, and/or strace/valgrind/etc, and/or monitor with other (including java-specific) tools, etc., and/or ask Java experts (I am not one). Adding Juan.
If this script is merely checking the validity of the JRE/JDK, should it not be possible to have a test on the rpm details first and only proceed as it does now if that doesn't work? The current tests should work w/o much regard for how the JRE/JDK got installed, but if it was installed via rpm it seems a simpler test could be used as a shortcut.
Patches are welcome :-)
Note that current code is designed to be compatible with many environments, including different el/fedora versions, upgrades inside them etc., and the $0.local was added mainly to allow supporting other systems (including gentoo) where $0.local will also be shipped/packaged by the distribution. Obviously we can add similar patches to make it even more complex, but as I wrote above, not sure it's worth it - because if memory is your only problem, you might simply postpone it this way.
Best,
-- Didi

On 12/31/2015 08:48 AM, Yedidyah Bar David wrote:
No idea here, you might try passing other options, and/or strace/valgrind/etc, and/or monitor with other (including java-specific) tools, etc., and/or ask Java experts (I am not one). Adding Juan.
I believe that this isn't really a memory problem, as the amount of memory that the Java virtual machine is requesting is very small, less than 3 MiB. It is probably related to the fact that the Bacula daemon that runs the script runs in its own SELinux "bacula_t" context. You can quickly verify this by temporarily disabling SELinux, trying to perform the backup, and then enabling it again:
# setenforce 0
# Perform the backup
# setenforce 1
You should also see a description of the problem in the /var/log/audit/audit.log file. When I tried it I saw this:
type=AVC msg=audit(1451571576.334:336): avc: denied { execmem } for pid=4622 comm="java" scontext=system_u:system_r:bacula_t:s0 tcontext=system_u:system_r:bacula_t:s0 tclass=process
That message says that the Java virtual machine is trying to map an area of memory that is both writeable and executable. That makes sense, it is probably an area used by the HotSpot compiler, that generates code during runtime. But this happens to be forbidden for the "bacula_t" SELinux context.
You have several alternatives here. The more drastic one is to disable SELinux permanently, setting the SELINUX variable in /etc/selinux/config to permissive or disabled. This is a bad idea in general, and if I remember correctly oVirt doesn't work well with SELinux disabled.
You can also just disable SELinux for the bacula daemon, removing the "bacula" policy module, and then restarting the daemon:
# semodule -r bacula
# systemctl restart bacula-fd
This isn't a good idea either, as it will remove the "bacula.pp" file, which isn't a configuration file and will come back when you upgrade the SELinux RPMs.
Another thing you can do is set only the "bacula_t" type to permissive:
# semanage permissive -a bacula_t
This service won't then enjoy the SELinux protection, but the others will. This is probably the better choice.
Finally, you can also create your own policy module, allowing the "execmem" operation for the "bacula_t" context. The easiest way to do this is to use the "audit2allow" tool, which generates the policy module from the audit log:
# audit2allow -M mypolicy <<.
type=AVC msg=audit(1451571576.334:336): avc: denied { execmem } for pid=4622 comm="java" scontext=system_u:system_r:bacula_t:s0 tcontext=system_u:system_r:bacula_t:s0 tclass=process
.
This will generate a "mypolicy.pp" file that allows that operation. You can then activate it like this:
# semodule -i mypolicy.pp
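To make the link between the denial and the resulting policy rule more concrete, here is a small sketch that pulls the relevant fields out of the AVC record quoted above and prints the allow rule they correspond to. It is not a replacement for audit2allow, which should be used to build the real module; the helper names field and type_of are invented for this illustration, and the sketch only handles a single-permission record like this one.

```shell
#!/bin/sh
# Sample denial copied from the message above.
avc='type=AVC msg=audit(1451571576.334:336): avc: denied { execmem } for pid=4622 comm="java" scontext=system_u:system_r:bacula_t:s0 tcontext=system_u:system_r:bacula_t:s0 tclass=process'

# field NAME RECORD (hypothetical helper): print the value of " NAME=value".
field() {
    printf '%s\n' "$2" | sed -n "s/.* $1=\([^ ]*\).*/\1/p"
}

# type_of CONTEXT (hypothetical helper): the type is the third part of
# user:role:type:level.
type_of() {
    printf '%s\n' "$1" | cut -d: -f3
}

# The denied permission is the word between the braces.
perm=$(printf '%s\n' "$avc" | sed -n 's/.*denied { \([a-z_ ]*\) }.*/\1/p')
source_type=$(type_of "$(field scontext "$avc")")
target_type=$(type_of "$(field tcontext "$avc")")
tclass=$(field tclass "$avc")

# When source and target are the same type, policy rules refer to it as "self".
if [ "$source_type" = "$target_type" ]; then
    target_type=self
fi

rule="allow $source_type $target_type:$tclass $perm;"
echo "$rule"
```

Reading the record this way shows that the module only needs to let processes of type bacula_t map executable memory within themselves, which is a much narrower change than making the whole type permissive.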

On 12/31/2015 10:42 AM, Juan Hernández wrote:
I believe that this isn't really a memory problem, as the amount of memory that the Java virtual machine is requesting is very small, less than 3 MiB. It is probably related to the fact that the Bacula daemon that runs the script runs in its own SELinux "bacula_t" context. You can quickly verify this by temporarily disabling SELinux, trying to perform the backup, and then enabling it again:
# setenforce 0
# Perform the backup
# setenforce 1
You should also see a description of the problem in the /var/log/audit/audit.log file. When I tried it I saw this:
type=AVC msg=audit(1451571576.334:336): avc: denied { execmem } for pid=4622 comm="java" scontext=system_u:system_r:bacula_t:s0 tcontext=system_u:system_r:bacula_t:s0 tclass=process
That message says that the Java virtual machine is trying to map an area of memory that is both writeable and executable. That makes sense, it is probably an area used by the HotSpot compiler, that generates code during runtime. But this happens to be forbidden for the "bacula_t" SELinux context.
Bingo! I almost discovered this last night. My original RunScript sent the output of engine-backup to /tmp for simplicity but my Bacula file set ignores /tmp so I had to target elsewhere. That led to AVCs and I dug into the policy to discover that /var/bacula would be an acceptable, writable location per SEL policy and still be included in my file set. It had occurred to me at that time that perhaps SEL was interfering with the engine-backup also but I failed to go back and look for that.
You have several alternatives here. The more drastic one is to disable SELinux permanently, setting the SELINUX variable in /etc/selinux/config to permissive or disabled. This is a bad idea in general, and if I remember correctly oVirt doesn't work well with SELinux disabled.
You can also just disable SELinux for the bacula daemon, removing the "bacula" policy module, and then restarting the daemon:
# semodule -r bacula
# systemctl restart bacula-fd
This isn't a good idea either, as it will remove the "bacula.pp" file, which isn't a configuration file and will come back when you upgrade the SELinux RPMs.
Another thing you can do is set only the "bacula_t" type to permissive:
# semanage permissive -a bacula_t
Oh cool, I was unaware you could disable selectively like that.
This service won't then enjoy the SELinux protection, but the others will. This is probably the better choice.
Finally, you can also create your own policy module, allowing the "execmem" operation for the "bacula_t" context. The easiest way to do this is to use the "audit2allow" tool, which generates the policy module from the audit log:
# audit2allow -M mypolicy <<.
type=AVC msg=audit(1451571576.334:336): avc: denied { execmem } for pid=4622 comm="java" scontext=system_u:system_r:bacula_t:s0 tcontext=system_u:system_r:bacula_t:s0 tclass=process
.
This will generate a "mypolicy.pp" file that allows that operation. You can then activate it like this:
# semodule -i mypolicy.pp
This is the route I went and it seems to work perfectly. Thanks for the excellent write up. Now I'm glad I forgot to follow through on my SEL investigation as I wouldn't have come up with so correct a solution.
This seems like a good case for a new SE Boolean, so I've submitted: https://bugs.centos.org/view.php?id=10052
I'm relatively new to CentOS so hopefully this will get addressed as fast as most SEL issues reported for Fedora.
Thanks Juan and Didi for the excellent help! Best wishes for 2016. :-)
-- John Florian
participants (3)
- John Florian
- Juan Hernández
- Yedidyah Bar David