Bug ID: JDK-6364346 GZIPOutputStream is slower on 1.4.2_11-b02 than on 1.4.2

Type: Bug
Component: core-libs
Sub-Component: java.util.jar
Affected Version: 1.4.2_11,5.0,6

Priority: P3
Status: Resolved
Resolution: Fixed
OS: generic,windows_xp
CPU: generic,x86

Submitted: 2005-12-16
Updated: 2011-03-03
Resolved: 2006-04-11

Other	JDK 6
1.4.2_12 b02 Fixed	6 Fixed

It seems that the regression reported in bug 6356456 affects not only GZIPInputStream but also GZIPOutputStream. The regression for GZIPInputStream has been fixed in 1.4.2_11, but there is still a performance problem with GZIPOutputStream in 1.4.2_11.

Details in the comments section.
Removing the Regression keyword from the synopsis, as this is a matter of performance.

SUGGESTED FIX Deflate no more than stride bytes at a time. This avoids excess copying in deflateBytes.
22-03-2006
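
A minimal sketch of the stride idea described in the suggested fix, written against the public java.util.zip.Deflater API rather than the JDK's internal code. The class name StrideDeflate, the helper names, and the 512-byte stride are illustrative assumptions (512 matches the default buffer size mentioned in the evaluation); this is not the actual patch.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.Random;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class StrideDeflate {
    // Compress input, handing the Deflater at most `stride` bytes per setInput call,
    // so no single call has to deal with the whole (possibly huge) client array.
    public static byte[] deflateInStrides(byte[] input, int stride) {
        if (stride <= 0) throw new IllegalArgumentException("stride must be positive");
        Deflater def = new Deflater();
        byte[] buf = new byte[512];
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            for (int i = 0; i < input.length; i += stride) {
                def.setInput(input, i, Math.min(stride, input.length - i));
                // Drain the deflater until it has consumed this slice.
                while (!def.needsInput()) {
                    int n = def.deflate(buf, 0, buf.length);
                    out.write(buf, 0, n);
                }
            }
            def.finish();
            while (!def.finished()) {
                int n = def.deflate(buf, 0, buf.length);
                out.write(buf, 0, n);
            }
        } finally {
            def.end();
        }
        return out.toByteArray();
    }

    // Helper used only to check the round trip.
    public static byte[] inflateAll(byte[] compressed) {
        Inflater inf = new Inflater();
        byte[] buf = new byte[512];
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            inf.setInput(compressed);
            while (!inf.finished()) {
                int n = inf.inflate(buf);
                if (n == 0 && (inf.needsInput() || inf.needsDictionary())) {
                    throw new IllegalStateException("truncated or unsupported stream");
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException(e);
        } finally {
            inf.end();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] data = new byte[1024 * 1024];
        new Random().nextBytes(data);
        byte[] roundTrip = inflateAll(deflateInStrides(data, 512));
        if (!Arrays.equals(data, roundTrip)) throw new Error("round trip failed");
    }
}
```

The point of the sketch is only that the amount of input pending inside the Deflater is bounded by the stride, whatever the size of the caller's array.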
EVALUATION The problem is that when the client code invokes DeflaterOutputStream.write() with a byte[] that is much larger than the Deflater's buffer size (512 by default), the client's byte[] can get copied many times in Deflater.c's deflateBytes function.
04-03-2006
EVALUATION Use of Deflater and Inflater becomes inefficient if the input buffer is very large compared to the output buffer, due to repeated O(N^2) copying. Here is, in my opinion, a better benchmark that clearly illustrates the loss of performance without using unnecessary higher-level classes like DeflaterOutputStream:
----------------------------------------------------------------
import java.util.*;
import java.util.zip.*;

public class Bench {
    // Double the array until it can hold at least `capacity` bytes.
    private static byte[] grow(byte[] a, int capacity) {
        while (a.length < capacity) {
            byte[] a2 = new byte[a.length * 2];
            System.arraycopy(a, 0, a2, 0, a.length);
            a = a2;
        }
        return a;
    }

    private static byte[] trim(byte[] a, int length) {
        byte[] res = new byte[length];
        System.arraycopy(a, 0, res, 0, length);
        return res;
    }

    // Large input, deliberately tiny (32-byte) output buffer.
    private static byte[] deflate(byte[] in) throws Throwable {
        final Deflater flater = new Deflater();
        flater.setInput(in);
        flater.finish();
        final byte[] smallBuffer = new byte[32];
        byte[] flated = new byte[32];
        int count = 0;
        int n;
        while ((n = flater.deflate(smallBuffer)) > 0) {
            flated = grow(flated, count + n);
            System.arraycopy(smallBuffer, 0, flated, count, n);
            count += n;
        }
        return trim(flated, count);
    }

    private static byte[] inflate(byte[] in) throws Throwable {
        final Inflater flater = new Inflater();
        flater.setInput(in);
        final byte[] smallBuffer = new byte[32];
        byte[] flated = new byte[32];
        int count = 0;
        int n;
        while ((n = flater.inflate(smallBuffer)) > 0) {
            flated = grow(flated, count + n);
            System.arraycopy(smallBuffer, 0, flated, count, n);
            count += n;
        }
        return trim(flated, count);
    }

    public static void main(String[] args) throws Throwable {
        byte[] data = new byte[1024*1024];
        new Random().nextBytes(data);
        byte[] deflated = deflate(data);
        byte[] inflated = inflate(deflated);
        if (! Arrays.equals(data, inflated))
            throw new Error();
    }
}
----------------------------------------------------------------
1.4.2_08
==> javac -source 1.4 Bench.java
==> java -esa -ea Bench
jver $v jr Bench  2.27s user 0.53s system 41% cpu 6.802 total

1.4.2_09
==> javac -source 1.4 Bench.java
==> java -esa -ea Bench
jver $v jr Bench  2.30s user 0.42s system 28% cpu 9.479 total

1.4.2_10
==> javac -source 1.4 Bench.java
==> java -esa -ea Bench
jver $v jr Bench  197.51s user 0.51s system 75% cpu 4:20.80 total

1.4.2_11
==> javac -source 1.4 Bench.java
==> java -esa -ea Bench
jver $v jr Bench  47.22s user 0.56s system 72% cpu 1:05.47 total

I agree with the submitter that this performance problem is important to fix.
01-03-2006
WORK AROUND Workaround and complete explanation from Dave Bristor (PDE).

The customer's code is:

    dout = new DataOutputStream(new GZIPOutputStream(sout));

Summary: We can probably solve the performance issues by changing the above to:

    dout = new DataOutputStream(
        new BufferedOutputStream(new GZIPOutputStream(sout), 4096));

I have to guess, but from the fact that they're using a DataOutputStream and from the stack trace you sent earlier, some pretty small things might be written. By adding buffering, I hope to see a performance improvement.

What follows is maybe more detail than you need to know. But I had to investigate this some, and so did some testing.

As an example of the performance differences, see the attached test. It's an extreme case, doing I/O byte-at-a-time. Here are some results with the same JDK versions for which the customer reported timings. The "bytes" number is byte-at-a-time (no buffering), "sized" means that the GZIP stream was created with a 2nd size parameter, "buffered" means that the GZIP stream was wrapped in a Buffered stream, e.g. new BufferedOutputStream(new GZIPOutputStream(...)), and "reversed" means that the wrapping was done backwards, e.g. new GZIPOutputStream(new BufferedOutputStream(...)).

% jver 1.4.2_09 java GZIPTest;jver 1.4.2_09 java GZIPTest
writing bytes: 9769, for sized: 9491, for buffered: 970, for reversed: 9517
reading bytes: 5993, for sized: 5450, for buffered: 316, for reversed: 5516
writing bytes: 9650, for sized: 9687, for buffered: 964, for reversed: 9572
reading bytes: 5878, for sized: 5513, for buffered: 495, for reversed: 5548

% jver 1.4.2_11 java GZIPTest;jver 1.4.2_11 java GZIPTest
writing bytes: 13853, for sized: 13997, for buffered: 1014, for reversed: 13877
reading bytes: 8727, for sized: 8563, for buffered: 345, for reversed: 8461
writing bytes: 14116, for sized: 13842, for buffered: 1026, for reversed: 14215
reading bytes: 8660, for sized: 8633, for buffered: 338, for reversed: 8642

% jver 1.5.0_06 java GZIPTest;jver 1.5.0_06 java GZIPTest
writing bytes: 9538, for sized: 9655, for buffered: 1030, for reversed: 9848
reading bytes: 6476, for sized: 6353, for buffered: 369, for reversed: 6429
writing bytes: 9698, for sized: 9660, for buffered: 1033, for reversed: 9689
reading bytes: 6459, for sized: 6518, for buffered: 347, for reversed: 6415

(FWIW: GZIPTest was compiled with 1.4.2_09 for all runs, which were done on a SunBlade 150.)

Providing a size to GZIP stream constructors doesn't make much difference (the "sized" times). Also note that wrapping a GZIP stream with a Buffered stream gives about a 10x improvement, and that with buffering the differences across JDK versions are small.

Importantly, note that the "reversed" case is no better than byte-at-a-time. This means that even though the customer's streams which are being passed to the GZIP stream's constructor have some buffering, performance is still dominated by the deflating and inflating. I.e., I think that the "reversed" case is what the customer is getting with their code.

Why? Consider only GZIPOutputStream. Each time GZIPOutputStream.write(byte[], int, int) is invoked, data is compressed. The goal is to invoke that as few times as possible. Only wrapping the GZIP stream with a Buffered stream accomplishes that.
05-01-2006
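
The reasoning in the workaround can be illustrated by counting how often GZIPOutputStream.write(byte[], int, int) is reached with and without the BufferedOutputStream wrapper. This is a sketch, not the attached GZIPTest: the class WrapOrder, the CountingGzip subclass, and the helper names are hypothetical, and it counts calls rather than measuring time.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class WrapOrder {
    // GZIP stream that counts invocations of write(byte[], int, int).
    static class CountingGzip extends GZIPOutputStream {
        int calls;
        CountingGzip(OutputStream out) throws IOException { super(out); }
        @Override
        public synchronized void write(byte[] b, int off, int len) throws IOException {
            calls++;
            super.write(b, off, len);
        }
    }

    // Writes nBytes one byte at a time through a DataOutputStream and
    // returns how many times the GZIP stream's bulk write was invoked.
    public static int countCalls(boolean buffered, int nBytes, ByteArrayOutputStream sink) {
        try {
            CountingGzip gz = new CountingGzip(sink);
            OutputStream target = buffered ? new BufferedOutputStream(gz, 4096) : gz;
            try (DataOutputStream dout = new DataOutputStream(target)) {
                for (int i = 0; i < nBytes; i++) {
                    dout.writeByte(i);
                }
            }
            return gz.calls;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Decompresses a complete gzip stream; used to check both variants agree.
    public static byte[] gunzip(byte[] compressed) {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) > 0) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ByteArrayOutputStream plain = new ByteArrayOutputStream();
        ByteArrayOutputStream wrapped = new ByteArrayOutputStream();
        System.out.println("unbuffered calls: " + countCalls(false, 10000, plain));
        System.out.println("buffered calls:   " + countCalls(true, 10000, wrapped));
        System.out.println("same payload:     "
            + Arrays.equals(gunzip(plain.toByteArray()), gunzip(wrapped.toByteArray())));
    }
}
```

In the unbuffered case every writeByte reaches the GZIP stream as its own write, whereas the buffered case hands it a few 4096-byte chunks; both produce streams that decompress to the same payload.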
WORK AROUND Use 1.4.2_09 rather than 1.4.2_11-b02. The customer can't use JRE 5.0 for the SAP software.
16-12-2005

Duplicate :	JDK-6381820 - Performance regression observed for DaCapo chart benchmark
Duplicate :	JDK-6448468 - ZipOutputStream very very slow in 1.5.0_07
Relates :	JDK-6348045 - REGRESSION: serious performance degradation as GZIPInputStream is slower
Relates :	JDK-6356456 - REGRESSION: GZIPInputStream is slower on 1.4.2_10 than on 1.4.2_09
Relates :	JDK-6399199 - Improve performance of Deflater
Relates :	JDK-6975829 - Perf. of gzip in existing JDKs is too slower than in 1.3.1
Relates :	JDK-6507183 - REGRESSION: slower performance of GZIPInputStream for 5u10 (single threaded testcase)