JDK-6438254 : Very slow performance from GZIPInputStream in JDK 1.5.0_07-b03
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.util.jar
  • Affected Version: 5.0u7
  • Priority: P3
  • Status: Closed
  • Resolution: Not an Issue
  • OS: linux
  • CPU: x86
  • Submitted: 2006-06-14
  • Updated: 2010-04-02
  • Resolved: 2007-11-08
Description
FULL PRODUCT VERSION :
java version "1.5.0_07"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_07-b03)
Java HotSpot(TM) Server VM (build 1.5.0_07-b03, mixed mode)

ADDITIONAL OS VERSION INFORMATION :
  Verified with Fedora Core 3 and CentOS 4.3 on Xeon and Opteron:

Linux fedoraxeon 2.6.12-1.1378_FC3smp #1 SMP Wed Sep 14 04:52:36 EDT 2005 i686 i686 i386 GNU/Linux

Linux centosxeon 2.6.9-34.0.1.ELsmp #1 SMP Wed May 24 08:14:29 CDT 2006 i686 i686 i386 GNU/Linux

Linux centosopteron 2.6.9-34.0.1.ELsmp #1 SMP Wed May 24 05:28:30 CDT 2006 x86_64 x86_64 x86_64 GNU/Linux

A DESCRIPTION OF THE PROBLEM :
Using GZIPInputStream to decompress a 60 MB file takes 5 or 6 times longer with JDK 1.5.0_07-b03 than with 1.5.0_06-b05 on the same machine.

This looks like Bug ID 6348045 which had introduced slowness in 1.5.0_06 builds 03 and 04.


STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Write a simple Java program to decompress a file (previously written with GZIPOutputStream) and compare its running time on the current 1.5.0_07-b03 JDK with its running time on 1.5.0_06-b05.
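The submitter's program is not attached, so the following is only a minimal sketch of what such a timing program might look like. The class name GunzipTimer, the method name, and the 1024-byte read size are assumptions made for illustration, and the sketch uses try-with-resources, so it needs Java 7 or later rather than the 1.5 releases discussed here.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;

// Hypothetical timing harness; not the submitter's actual program.
class GunzipTimer {

    // Decompresses the gzip stream and returns the number of bytes produced.
    static long countDecompressedBytes(InputStream compressed) throws IOException {
        byte[] buf = new byte[1024]; // read size is an assumption
        long total = 0;
        try (GZIPInputStream in = new GZIPInputStream(compressed)) {
            int n;
            while ((n = in.read(buf)) != -1) {
                total += n;
            }
        }
        return total;
    }

    // Usage: java GunzipTimer <file.gz>
    public static void main(String[] args) throws IOException {
        long start = System.nanoTime();
        long bytes = countDecompressedBytes(new FileInputStream(args[0]));
        long millis = (System.nanoTime() - start) / 1_000_000;
        System.out.println(bytes + " bytes decompressed in " + millis + " ms");
    }
}
```

Running the same class against the same .gz file under each JDK build gives a rough comparison. A single run includes JIT warm-up and file-cache effects, so repeated runs are advisable.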


REPRODUCIBILITY :
This bug can be reproduced always.

Release Regression From : 5.0u6
The above release value was the last known release where this 
bug was not reproducible. Since then there has been a regression.

Comments
EVALUATION I was inspired to investigate after reading the description of 6623164. For reasons not entirely clear to me, I can now reproduce this bug to a certain extent. When reading data, there is some degradation, but not the 5-6x slowdown noted in the Description. When writing data, there is a clear degradation of about 6x. The degradation results from the fix for 6206933, which caused some copying of client data to occur instead of the previous memory pinning. Fortunately, the workaround is straightforward: add buffering. See the attached GZipTestRandomBuffered.java.

Below are results comparing the original test and the attached buffered test on 1.5.0_06 and 1.5.0_07. Note that the original test was changed to print the time needed to create the file, and that this is substantially worse without buffering.

% jver 1.5.0_06 java GzipTestRandom 10
write: 930 147 102 100
% jver 1.5.0_07 java GzipTestRandom 10
write: 6445 189 129 129
% jver 1.5.0_06 java GzipTestRandomBuffered 10
write: 938 132 75 77
% jver 1.5.0_07 java GzipTestRandomBuffered 10
write: 976 146 106 102
08-11-2007

WORK AROUND Wrap GZIP streams with buffering streams.
08-11-2007
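A minimal sketch of the workaround, assuming the buffering goes on the application side of the GZIP stream so that many small reads or writes are batched before they reach the inflater or deflater. The attached GZipTestRandomBuffered.java is not reproduced here and may differ; the class and method names below are hypothetical.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper illustrating the workaround; not the attached test.
class BufferedGzip {

    // Small writes accumulate in the BufferedOutputStream and reach the
    // GZIPOutputStream in larger blocks.
    static OutputStream compressTo(OutputStream sink) throws IOException {
        return new BufferedOutputStream(new GZIPOutputStream(sink));
    }

    // Small reads are served from the BufferedInputStream, which refills
    // itself from the GZIPInputStream in larger blocks.
    static InputStream decompressFrom(InputStream source) throws IOException {
        return new BufferedInputStream(new GZIPInputStream(source));
    }
}
```

Closing the returned output stream flushes the buffer and then closes the underlying GZIPOutputStream, which writes the gzip trailer, so callers only need to close the outermost stream.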

EVALUATION Closing this bug due to inability to reproduce and lack of further information from submitter.
11-07-2006

EVALUATION As per current policy, we would like to close this bug in 2 weeks if the submitter does not provide a reproducible testcase. In this particular instance, please include a test file that demonstrates the performance regression.
26-06-2006

EVALUATION I have been unable to reproduce the problem. Please provide a test case and data file with which we can reproduce the problem. It's not that we haven't tried... I have run comparisons on solaris-sparc, windows-i586, linux-i586 (using a 2.4.9 kernel), and linux-amd64 (2.4.19), with 1.5.0_06, 1.5.0_07, 1.5.0_08, and 1.6.0 (a couple of different builds), reading files of both 16 and 60 MB, and see a minor, anticipated decrease in performance in some cases, but nothing like the 5-6 times longer noted in the description. I used tests that read the data 1024 bytes at a time, and 1 byte at a time. For data, I first created a gzip of rt.jar (resulting in a 16 MB file), and also a gzip containing 4 copies of rt.jar (~ 60 MB). Thinking perhaps more random data was necessary, I created the attached test program (which creates and then reads a .gz file containing random data), but it too shows no regression.
20-06-2006
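The attached test program is not included in this report. The sketch below only illustrates the approach described above: it gzips random data and then reads it back 1 byte at a time and 1024 bytes at a time. The class name, the 4 MB data size, and the use of in-memory buffers instead of a .gz file are assumptions made to keep the example self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Random;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical stand-in for the attached random-data test.
class GzipReadGranularity {

    // Returns a gzip-compressed copy of `size` pseudo-random bytes.
    static byte[] gzipRandom(int size, long seed) throws IOException {
        byte[] data = new byte[size];
        new Random(seed).nextBytes(data);
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bos)) {
            out.write(data);
        }
        return bos.toByteArray();
    }

    // Decompresses with single-byte read() calls; returns bytes read.
    static long readByteAtATime(byte[] gz) throws IOException {
        long total = 0;
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(gz))) {
            while (in.read() != -1) {
                total++;
            }
        }
        return total;
    }

    // Decompresses with read(byte[]) calls of the given size; returns bytes read.
    static long readInChunks(byte[] gz, int chunkSize) throws IOException {
        byte[] buf = new byte[chunkSize];
        long total = 0;
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(gz))) {
            int n;
            while ((n = in.read(buf)) != -1) {
                total += n;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] gz = gzipRandom(4 * 1024 * 1024, 42L);
        long t0 = System.nanoTime();
        readByteAtATime(gz);
        long t1 = System.nanoTime();
        readInChunks(gz, 1024);
        long t2 = System.nanoTime();
        System.out.println("1 byte at a time:     " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("1024 bytes at a time: " + (t2 - t1) / 1_000_000 + " ms");
    }
}
```

Reading from memory removes disk I/O from the measurement, so the timings reflect only the cost of the GZIPInputStream calls, which is narrower than what a file-based test would measure.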

EVALUATION Looks like a dup:
6348045: REGRESSION: serious performance degradation as GZIPInputStream is slower
6356456: REGRESSION: GZIPInputStream is slower on 1.4.2_10 than on 1.4.2_09
14-06-2006