FULL PRODUCT VERSION :
1.5.0_06 as well as 1.4.2
A DESCRIPTION OF THE PROBLEM :
An input stream obtained from a URLConnection that is reading an HTTP 1.1 chunked response body reads data into memory whenever #available() is called. If the client of the URLConnection calls #available() repeatedly while reading only small amounts of data, the whole response ends up in memory.
This happens in real life if you use a combination of BufferedInputStream, InputStreamReader and BufferedReader to read the response stream line by line, as the testcase below demonstrates. It is important to use a buffer size larger than the default with the BufferedInputStream.
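A minimal sketch of the underlying pattern (repeated #available() calls combined with small reads); the URL is only a placeholder and must point to a server producing a large chunked response:

import java.io.IOException;
import java.io.InputStream;
import java.net.URL;

public class AvailableDrivenRead {
    public static void main(String[] args) throws IOException {
        // Placeholder URL; replace with an endpoint known to return a large chunked response.
        InputStream in = new URL("http://localhost:8001/chunked").openStream();
        byte[] one = new byte[1];
        // Consume only one byte per iteration while polling available().
        // With the faulty chunked-stream implementation, each available() call
        // buffers more of the response, so the whole body accumulates on the heap.
        while (in.read(one) != -1) {
            in.available();
        }
        in.close();
    }
}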
STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Replace the URL in the testcase with a URL known to point to a large file and known to produce a chunked response. A good way to do this is a servlet that does not set the content length header and simply writes a lot of data; a sketch of such a servlet is shown after the steps.
Run the test case.
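A minimal sketch of such a servlet, assuming the standard javax.servlet API is on the classpath; the class name and the amount of output are illustrative only:

import java.io.IOException;
import java.io.OutputStream;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class LargeChunkedResponseServlet extends HttpServlet {
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws IOException {
        // No setContentLength() call, so the container must use chunked transfer encoding
        // once the response exceeds its internal buffer.
        OutputStream out = response.getOutputStream();
        byte[] line = "0123456789012345678901234567890123456789\n".getBytes();
        for (int i = 0; i < 1000000; i++) {
            out.write(line);
        }
        out.flush();
    }
}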
EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
The file should be downloaded with minimal heap usage.
ACTUAL -
The heap usage keeps growing and full GCs become more and more frequent until an OutOfMemoryError is thrown.
REPRODUCIBILITY :
This bug can be reproduced always.
---------- BEGIN SOURCE ----------
import java.io.*;
import java.net.URL;

public class HttpUrlConnectionRunsOutOfMemory {

    public static void main(String[] args) throws IOException {
        // Replace this URL with one pointing to a large file served as a chunked response.
        InputStream inputStream = new URL("http://localhost:8001/demo/servlet/reportlog?mode=get&name=1151054069229").openStream();
        // The default buffer size will not trigger the bug; a larger buffer is required.
        BufferedInputStream bufferedInputStream = new BufferedInputStream(inputStream, 8192);
        InputStreamReader inputStreamReader = new InputStreamReader(bufferedInputStream);
        BufferedReader reader = new BufferedReader(inputStreamReader);
        // Read the whole response line by line, discarding each line.
        while (reader.readLine() != null);
    }
}
---------- END SOURCE ----------
CUSTOMER SUBMITTED WORKAROUND :
Use a smaller buffer size with the BufferedInputStream.
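A sketch of the workaround, based on the testcase above; the URL is again only a placeholder:

import java.io.*;
import java.net.URL;

public class HttpUrlConnectionWorkaround {
    public static void main(String[] args) throws IOException {
        InputStream inputStream = new URL("http://localhost:8001/chunked").openStream(); // placeholder URL
        // Rely on the default (smaller) buffer size, or pass a small size explicitly,
        // e.g. new BufferedInputStream(inputStream, 2048).
        BufferedInputStream bufferedInputStream = new BufferedInputStream(inputStream);
        BufferedReader reader = new BufferedReader(new InputStreamReader(bufferedInputStream));
        // With the smaller buffer the excessive read-ahead does not occur.
        while (reader.readLine() != null);
        reader.close();
    }
}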