JDK-5072161 : OutOfMemoryError while using GZIPOutputStream

Details
Type:
Bug
Submit Date:
2004-07-08
Status:
Closed
Updated Date:
2006-03-20
Project Name:
JDK
Resolved Date:
2006-03-20
Component:
core-libs
OS:
generic
Sub-Component:
java.util.jar
CPU:
generic
Priority:
P2
Resolution:
Duplicate
Affected Versions:
1.3.1_12
Fixed Versions:

Related Reports
Duplicate:
Relates:

Sub Tasks

Description
The customer is running the SAP J2EE Engine on JDK 1.3.1_12. The engine hosts the Enterprise Portal and serves compressed output. The java.util.zip.GZIPOutputStream it uses calls into the open-source compression library zlib (www.gzip.org/zlib), which we ship (even in Tiger!) in version 1.1.3 from 9 July 1998. After some hours the server crashes with an OutOfMemoryError (OOM).

- The latest zlib is 1.2.1, from 17 Nov. 2003; check the ChangeLog, there are MANY fixes.
- Switching to a pure-Java compression implementation makes the crash disappear.
- SAP swapped the zlib sources in the JDK for the latest ones (just to narrow this issue down!) and built their own VM: no OOM anymore.
###@###.### 10/16/04 01:02 GMT

I've changed this bug from
Synopsis: OutOfMemoryError while using GZIPOutputStream, zlib memleak
to
(ref) Finalizer thread sometimes can't keep up with other threads

to more accurately reflect the underlying problem.
###@###.### 2005-1-10 23:30:21 GMT

                                    

Comments
EVALUATION

In the prior 2 weeks, feedback (to me via private email) was that this bug can be closed.  Since the underlying issue still seems to persist (e.g. 4797189 and 6293787) I'm marking it a duplicate.
                                     
2006-03-20
EVALUATION

Since this bug has been in the incomplete state for a year, has received no updates during that time, and there are other equally descriptive bugs (e.g. 4797189, 6293787), it will be closed in 2 weeks unless further information is provided that can move it out of the incomplete/need-more-info state.
                                     
2006-03-03
EVALUATION

Upgrading zlib to 1.2.1 is a good idea, for a post-Tiger release.
The zlib shipped with the JDK is not a plain 1.1.3; bug fixes have been
applied, so the upgrade is non-trivial.

We are not aware of any memory leaks in the jar/zip code relating to
our use of zlib, despite efforts in the past to find such leaks. 
We need a small reproducible test case, preferably one that can
be reproduced using the latest Tiger bits.

###@###.### 2004-07-09

Here is a test program based on a JDC comment by "hubt":
----------------------------------------------------------------
import java.io.*;
import java.util.zip.*;

public class Bug {
    public static void main(String args[]) throws Throwable {
	boolean gc = args.length > 0 && args[0].equals("gc");
	boolean close = args.length > 0 && args[0].equals("close");
	byte[] bytes = "Hello World".getBytes();
	int i = 0;
	ByteArrayOutputStream os = new ByteArrayOutputStream(1000);
	for(i = 0; i < 1000 * 1000 ; i++ ) {
	    os.reset();
	    GZIPOutputStream gos = new GZIPOutputStream(os);
	    gos.write(bytes,0,bytes.length);
	    gos.finish();
	    if (close)
		gos.close();
	    if (i % 1000 == 0) {
		if (gc) {
		    System.gc();
		    System.runFinalization();
		}
		System.out.println("Iterations: " + i);
	    }
	}
    }
}
----------------------------------------------------------------
The program fails when called with no arguments, while succeeding
when called with either the "gc" or "close" argument, showing
that this is very likely an instance of...

"I think we're seeing the usual problem with Java's inability to
collect non-heap-memory resources in a timely fashion.
The finalizable objects which can unmap the memory will be collected,
but the GC is not aware of the urgency, since they appear to be 
simple small Java objects."

In the absence of a general solution to the non-heap resource
exhaustion problem, users of classes with close() methods should
make sure to call them as soon as possible.
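
A minimal sketch of that advice follows, using a try/finally block so that the
stream is closed even if the write fails. The class and method names
(CloseEarly, gzip) are illustrative and do not come from this report; the
benefit also assumes a release in which close() ends the underlying deflater.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper: compress a buffer and close the stream right away
// instead of leaving the cleanup to the finalizer thread.
class CloseEarly {
    static byte[] gzip(byte[] input) throws IOException {
        ByteArrayOutputStream os = new ByteArrayOutputStream();
        GZIPOutputStream gos = new GZIPOutputStream(os);
        try {
            gos.write(input, 0, input.length);
            gos.finish();   // flush the remaining compressed data and the gzip trailer
        } finally {
            gos.close();    // runs even if write() or finish() throws
        }
        return os.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] out = gzip("Hello World".getBytes());
        System.out.println("Compressed to " + out.length + " bytes");
    }
}
```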

The problem is more thoroughly explained in

5092131: using 1 MB pagesize throws "not enough space" error with 32bit JVM

We don't know whether this is the original submitter's problem, but
I suspect that this is the root cause of all "Zip out of memory"
problems in Tiger.

###@###.### 2004-09-04

To begin with, I produced a smaller version of the customer's
ZipedContentGenerator.java (from the attachments)

---------------------------------------------------------------------
import java.io.ByteArrayOutputStream;
import java.util.Random;
import java.util.zip.DeflaterOutputStream;

public class ZipedContentGenerator {
    public static void main(String[] args) {
	for (int i=0; i<20; i++)
	    new CompressingThread().start();
    }
}

class CompressingThread extends Thread {

    final private static String OUTPUT_CHARS =
	"AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz";

    public void run() {
	while (true) {
	    ByteArrayOutputStream baos = new ByteArrayOutputStream();
	    DeflaterOutputStream dos = new DeflaterOutputStream(baos);
	    try {
		dos.write(generateContent().getBytes());
		dos.close();
		dos = null;
		baos = null;
		// ATTENTION !!!!!!!!!!!!!!!!!!!
		// IF you run the finalizers here, the Problem goes away:
		// System.runFinalization();
	    } catch (Exception e) {
		e.printStackTrace();
	    }
	}
    }

    private String generateContent() {
	StringBuffer result = new StringBuffer(100000);
	addRandomOutput(result, 100000, 5);
	return result.toString();
    }

    public void addRandomOutput(StringBuffer output, int size, int iWsPaddingRatio)
    {
	Random rnd = new Random();
	char key;
	char newChar;
	int ratioCounter = 0;
	for (int i = 0; i < size; i++) {
	    if (ratioCounter == iWsPaddingRatio) {
		newChar = OUTPUT_CHARS.charAt(rnd.nextInt(52));
		ratioCounter = 0;
	    } else {
		newChar = ' ';
		ratioCounter ++;
	    }
	    output.append(newChar);
	}
    }
}
---------------------------------------------------------------------
(Even though the code correctly calls the close() method...)
this causes OutOfMemoryErrors when run, for example, using
java -Xmx32m ZipedContentGenerator 
on solaris-sparc.

Then I reduced this to eliminate any complex classes, using
a class with a simple finalize method:

---------------------------------------------------------------------
public class OOM {
    public static void main(String[] args) {
	for (int i=0; i<20; i++)
	    new Thread() { public void run() {
		for (int j=0; ; j++) {
		    new HasFinalizer();
		    //System.runFinalization();
		}}}.start();
    }
}

class HasFinalizer {
    private static int i = 0;
    protected void finalize() { i++; }
}
---------------------------------------------------------------------

This causes an OOME in the same way.  The OOME can be eliminated by
commenting out the finalize() method or by uncommenting the call to
runFinalization().

So... this is a generic GC problem.  There are 20 threads doing
nothing but generating finalizable objects, and only one thread
running the finalizers.  No wonder it can't keep up.

I'll transfer this to the GC team for comment.
###@###.### 2004-12-20 19:43:33 GMT

This is not a GC issue.

We're not going to fix this bug by changing the finalizer infrastructure.
Doing so risks introducing all sorts of nasty performance and scalability
issues.  (We're certainly not going to redesign the finalizer infrastructure in
an update release in order to fix an escalated bug!)

Finalizers are fundamentally broken; the best we can do is to encourage people
not to rely upon them.  This has been our standing policy for many years.

Are you absolutely sure that there isn't a memory leak in our copy of zlib?
How do you know?  Have you tried substituting a newer version of zlib, as SAP
did, and if so did that help?

If user code is correctly closing deflaters/inflaters/zip*streams using
try/finally blocks then finalization should not even be an issue.  If it is an
issue then perhaps we're doing something wrong in our Java code, and somehow
keeping a heap object that refers to native memory alive longer than actually
necessary.  There are well-known techniques for dealing with such problems.
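
For code that uses a Deflater directly, the try/finally pattern mentioned above
might look like the following sketch. The class and method names (DeflateOnce,
deflate) are made up for illustration; only the java.util.zip.Deflater calls
are real API.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.Deflater;

// Hypothetical helper: compress one buffer with an explicitly owned Deflater
// and end it in a finally block rather than relying on finalization.
class DeflateOnce {
    static byte[] deflate(byte[] input) {
        Deflater def = new Deflater();
        try {
            def.setInput(input);
            def.finish();                       // no more input will follow
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[512];
            while (!def.finished()) {
                int n = def.deflate(buf);       // fills buf with compressed bytes
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } finally {
            def.end();                          // release the deflater's resources now
        }
    }

    public static void main(String[] args) {
        byte[] packed = deflate("Hello World".getBytes());
        System.out.println("Deflated to " + packed.length + " bytes");
    }
}
```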

I'm moving this bug back to classes_util_jarzip.

###@###.### 2005-1-11 04:01:05 GMT

This is not a finalization issue at all, at least not for the given test code.

The test runs out of Java heap memory because it spends most of its time with
the garbage collector disabled.  The Deflater class passes its input and output
buffers directly to zlib after acquiring them via the GetPrimitiveArrayCritical
JNI call.  In HotSpot this effectively disables collection for all threads
until the corresponding ReleasePrimitiveArrayCritical calls are made.  When a
thread that's not in one of these critical sections tries to allocate heap
memory while collection is disabled it receives an OutOfMemoryError.  Most of
the time in this program is spent in these critical sections in the native zlib
deflateBytes method, hence the time until the first OutOfMemoryError is thrown
is essentially linear in the amount of memory allocated by the VM at startup.

Invoking the System.runFinalization method at the end of each iteration in each
thread appears to fix the problem only because it introduces some noise into
the system and allows collections to happen more frequently.  The same OOMEs
are thrown even if the finalize method in the Deflater class is removed.

If the test code is an accurate reflection of what's going on in production
then I'm afraid the options are limited.

There's an open RFE (6186200) that would change the behavior of HotSpot in this
situation so that a thread would wait until a collection can be attempted
rather than throw an OOME when the GC is disabled due to another thread being
in a critical section.  That change is, however, highly unlikely to be suitable
for a 1.3.1 update release.

The best workaround I've come up with is to cause each thread to sleep a little
bit after each write to the ByteArrayOutputStream in order to allow other
threads a chance to invoke the collector if they need to.  This can be done by
wrapping a fairly simple filtering stream around the ByteArrayOutputStream,
e.g.,

    static class DallyingOutputStream extends OutputStream {

	private OutputStream dst;

	public DallyingOutputStream(OutputStream dst) {
	    this.dst = dst;
	}

	public void write(int b) throws IOException {
	    dst.write(b);
	}

	public void write(byte[] b, int off, int len) throws IOException {
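	    try {
		// Pause briefly so that other threads get a chance to run the collector.
		Thread.sleep(50);
	    } catch (InterruptedException x) {
		// Ignored: the sleep only exists to yield time to other threads.
	    }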
	    dst.write(b, off, len);
	}

	public void flush() throws IOException {
	    dst.flush();
	}

	public void close() throws IOException {
	    dst.close();
	}

    }

    ...

        ByteArrayOutputStream baos = new ByteArrayOutputStream();
	DallyingOutputStream dyos = new DallyingOutputStream(baos);
	DeflaterOutputStream dos = new DeflaterOutputStream(dyos);

    ...

With this change I've been able to run the test program for several minutes
without failure in a 64MB heap.  This workaround reduces throughput, obviously,
but at least it avoids the OutOfMemoryErrors.

At this point I'm wondering if the test code really is an accurate reflection
of what's going on in production.  Answers to the following questions would be
helpful:

  (1) What is the actual OutOfMemoryError, and stack trace, seen in production?

  (2) Does the production code close each DeflaterOutputStream when finished?

  (3) Does the production code actually convert and compress 100k characters at
      a time?

  (4) Does the production code accumulate output in a ByteArrayOutputStream,
      as in the test, or is the output written to, say, a network connection?

  (5) Does the production code do all this with this many threads (20), or
      more, or less?  On how many processors does it run?  On what OS?

  (6) Exactly which DeflaterOutputStream constructor does the production code
      use, and with what arguments?

###@###.### 2005-1-24 15:51:51 GMT

While looking into this problem two bugs in the DeflaterOutputStream and
Deflater classes were discovered; see 6223075 and 6223076.  Fixing these bugs
appears to help the customer's test case run for a longer time, but still not
indefinitely.  Nevertheless it might be worthwhile to send a test build to the
customer to see if fixing these two bugs helps in production.

###@###.### 2005-1-28 20:27:35 GMT

The customer answered the six questions thus:

  (1) What is the actual OutOfMemoryError, and stack trace, seen in production?

	Please see the attachment (Q1.ZIP). It contains the logs from a
	production environment and some screen shots of the consoles taken just
	after the out of memory error was reproduced.
	Note: When the screen shots were made the server was running with JDK
	1.3.1.11

  (2) Does the production code close each DeflaterOutputStream when finished?

	Yes, it does.

  (3) Does the production code actually convert and compress 100k characters at
      a time?

	Yes, it does.

  (4) Does the production code accumulate output in a ByteArrayOutputStream,
      as in the test, or is the output written to, say, a network connection?

	To be absolutely precise the web container uses its GzipResponseStream,
	which extends javax.servlet.ServletOutputStream, which extends
	java.io.OutputStream
	The inheritance in the provided test-case is similar.
	java.io.ByteArrayOutputStream extends java.io.OutputStream

  (5) Does the production code do all this with this many threads (20), or
      more, or less?  On how many processors does it run?  On what OS?

	Here I divided your question into three sub-questions:

	(5.1) Does the production code do all this with this many threads (20), or
	      more, or less?
	
	By default the number of application threads (each of which can serve
	one request) is set to 40. This means that under heavy load up to 40
	threads may be active (if the default value is not changed).
	
	(5.2) On how many processors does it run?

	SAP J2EE Application Server 6.20 runs only on JDK 1.3.1. One application
	server node could scale, if it uses from 1 to 4 CPUs. There are no
	customer cases where we have more application server nodes than CPUs.

	(5.3) On what OS? 

	The detailed list of operating systems supported by the J2EE Engine 6.20
	can be found in the PAM portal. Follow this link:
	https://websmp105.sap-ag.de/pam and then, from the left menu, choose in
	sequence:

	1. SAP NetWeaver
	2. SAP NetWeaver Components (< SAP NW 04)
	3. SAP WebAS\SAP WEB AS 6.20
	
	Then, from the tab menu on the left, choose JSE Platforms. A list of all
	supported operating systems will appear.

	The configuration described in answer 5.1 can be used on any of these
	operating systems. However, the problem discussed in this CSN was
	observed only on Windows and HP-UX platforms.


  (6) Exactly which DeflaterOutputStream constructor does the production code
      use, and with what arguments?

	In the production code we use an instance of
	java.util.zip.GZIPOutputStream, which extends
	java.util.zip.DeflaterOutputStream, which extends
	java.io.FilterOutputStream
	Here is a part of the production code showing the instantiation of the
	zipping output stream (FilterOutputStream)

	private void instantiateStream() throws IOException {
		if (SBasic.gzipImpConstructor == null) {
			gzipstream = new GZIPOutputStream(servletoutput);
		} else {
		// initialize the server to use an external zipping library
		}
	}
	
	gzipstream instance is of class java.io.FilterOutputStream
	Actually what you are interested in is the code in the if statement (the
	one in else statement is executed in case the server is going to use an 
	external zipping library).
	Consequently the answer to your question about the constructor is:
	FilterOutputStream gzipstream = new GZIPOutputStream(servletoutput);
	the parameter is a ServletOutputStream

###@###.### 2005-2-04 11:42:36 GMT

The above answers confirm that the supplied test code is not an accurate
reflection of the customer's application.  In particular the customer's code
uses GZIPOutputStream while the test code uses DeflaterOutputStream.

This may be a critical difference.  It turns out that the GZIPOutputStream
class in 1.3.1 has the same problem as the DeflaterOutputStream class in that
release, namely that invoking the close() method will not cause the underlying
deflater's end() method to be invoked since the private usesDefaultDeflater
field is never set.  I've updated 6223075 accordingly.  I suggest sending a new
build to the customer containing fixes for both 6223075 and 6223076 to see if
this makes a difference in production.
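
Until such a build is available, one conceivable user-level workaround is a
subclass that ends the deflater itself when the stream is closed. The sketch
below is an assumption based on the analysis above, not part of the fix for
6223075 and not tested against 1.3.1. The class name EndingGZIPOutputStream is
hypothetical. It relies on the protected def field inherited from
DeflaterOutputStream.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical workaround class (not part of the JDK): make close() end the
// underlying deflater explicitly instead of leaving it to finalization.
class EndingGZIPOutputStream extends GZIPOutputStream {
    EndingGZIPOutputStream(OutputStream out) throws IOException {
        super(out);
    }

    public void close() throws IOException {
        try {
            super.close();   // finishes the gzip stream and closes the wrapped stream
        } finally {
            def.end();       // 'def' is the Deflater held by DeflaterOutputStream
        }
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        OutputStream gz = new EndingGZIPOutputStream(sink);
        try {
            gz.write("Hello World".getBytes());
        } finally {
            gz.close();
        }
        System.out.println("Wrote " + sink.size() + " compressed bytes");
    }
}
```

On releases where close() already ends the default deflater, the extra end()
call should be harmless, since ending an already-ended Deflater does nothing.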

###@###.### 2005-2-07 04:51:19 GMT

-----

I tested a 1.3.1xx-based binary containing the fixes for CRs 6223075 and
6223076 on both dual- and single-CPU machines, using the standalone test case
supplied by the customer. The test ran for more than 48 hours and no OOM error
was observed. I used the same command-line options that the customer used.

Here is partial GC output:
/home2/rai/java/1.3.1-14/build/solaris-sparc/bin/java -Xms1024m -Xmx1024m -XX:SurvivorRatio=2 -XX:TargetSurvivorRatio=90 -XX:NewRatio=6 -verbose:gc ZipedContentGenerator
[GC 1158K->1083K(1047936K), 0.0059966 secs]
[GC 5981K->4805K(1047936K), 0.0038203 secs]
.............................................
.............................................
[GC 1043797K->1042921K(1047936K), 0.1272579 secs]
[GC 1044177K->1043667K(1047936K), 0.1291067 secs]
[GC 1044941K->1044419K(1047936K), 0.1716337 secs]
[GC 1045659K->1045035K(1047936K), 0.0235835 secs]
[Full GC 1046484K->9207K(1047936K), 9.4994934 secs]
[GC 10406K->9964K(1047936K), 0.0544921 secs]
[GC 11534K->10689K(1047936K), 0.0254829 secs]
[GC 11811K->11240K(1047936K), 0.0819868 secs]

The minor collections reclaimed very little memory, while the one major
collection ([Full GC]) reclaimed a large amount. Running the same test case
with everything identical except for a JDK binary that does not have the fixes
reproduces the OOM error.
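
To put numbers on that observation, the amount reclaimed by each collection can
be read off the -verbose:gc lines as "before minus after". The following is a
small sketch with a hypothetical class name (GcLine); it assumes the exact line
format shown in the output above and nothing more.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: compute how many kilobytes one collection reclaimed
// from a -verbose:gc line of the form shown in the output above.
class GcLine {
    // Example: "[GC 1158K->1083K(1047936K), 0.0059966 secs]"
    private static final Pattern LINE = Pattern.compile(
        "\\[(Full GC|GC) (\\d+)K->(\\d+)K\\((\\d+)K\\), ([0-9.]+) secs\\]");

    static long reclaimedKb(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognized GC line: " + line);
        }
        long before = Long.parseLong(m.group(2));   // heap in use before the collection
        long after = Long.parseLong(m.group(3));    // heap in use after the collection
        return before - after;
    }

    public static void main(String[] args) {
        String[] lines = {
            "[GC 1045659K->1045035K(1047936K), 0.0235835 secs]",
            "[Full GC 1046484K->9207K(1047936K), 9.4994934 secs]"
        };
        for (int i = 0; i < lines.length; i++) {
            System.out.println(lines[i] + " reclaimed " + reclaimedKb(lines[i]) + "K");
        }
    }
}
```

For the two sample lines this gives 624K for the minor collection and 1037277K
for the full collection.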



###@###.### 2005-2-12 00:27:28 GMT
                                     
2005-02-12
PUBLIC COMMENTS

-
                                     
2004-09-06
SUGGESTED FIX

Use a newer (ideally the latest) zlib.
                                     
2004-09-06


