JDK-8242864 : Increase the default, internal buffer size of the Streams in java.util.zip
  • Type: Enhancement
  • Component: performance
  • Sub-Component: libraries
  • Affected Version: 15
  • Priority: P4
  • Status: Open
  • Resolution: Unresolved
  • Submitted: 2020-04-15
  • Updated: 2022-10-19
Description
DeflaterInputStream, GZIPInputStream, GZIPOutputStream, InflaterInputStream, DeflaterOutputStream all use an internal byte buffer of 512 bytes by default.

While this buffer size is configurable, increasing the default to a larger size (e.g. 4096 bytes) can speed up read and write operations on these streams considerably (by up to 50%) when they are created with the default buffer size.
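
Until the default changes, callers can already pass a larger buffer through the existing constructors. The following is a minimal sketch of that workaround using GZIPOutputStream and GZIPInputStream. The class name, the helper methods, and the choice of 8192 bytes are assumptions made for illustration and are not taken from the JDK sources or from the benchmark.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class ZipBufferSizeExample {
    // Assumption: 8192 is an illustrative value, not the default of these streams (which is 512).
    static final int BUFFER_SIZE = 8192;

    // Compresses data with a GZIPOutputStream whose internal buffer size is set explicitly.
    public static byte[] compress(byte[] data) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(sink, BUFFER_SIZE)) {
            gzip.write(data);
        }
        return sink.toByteArray();
    }

    // Decompresses data with a GZIPInputStream whose internal buffer size is set explicitly.
    public static byte[] decompress(byte[] compressed) throws IOException {
        try (GZIPInputStream gzip =
                 new GZIPInputStream(new ByteArrayInputStream(compressed), BUFFER_SIZE)) {
            return gzip.readAllBytes();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] original = "hello, java.util.zip ".repeat(1000).getBytes(StandardCharsets.UTF_8);
        byte[] roundTripped = decompress(compress(original));
        System.out.println("round trip ok: " + Arrays.equals(original, roundTripped));
    }
}
```

The same pattern applies to the other streams listed above, since each has a constructor that accepts a buffer size. Any actual speedup depends on the workload and should be measured.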

Some performance numbers for InflaterOutputStream can be found in the JMH benchmark attached to JDK-8242848.

Comments
A 50% performance gain sounds like a strong case for pursuing this change to the default, so I'd like to kindly ask: what is the current status of the evaluation of the concerns raised? If nothing was found that speaks against changing this default (really, it is just a default!), then I would like to set up a PR for this. :-)
19-10-2022

I'll note that ZipFile.getInputStream already explicitly sets the buffer to either 4096 or 8192 when deflating zip entries:

    case DEFLATED:
        // Inflater likes a bit of slack
        // MORE: Compute good size for inflater stream:
        long size = CENLEN(zsrc.cen, pos) + 2;
        if (size > 65536) {
            size = 8192;
        }
        if (size <= 0) {
            size = 4096;
        }
        InputStream is = new ZipFileInflaterInputStream(in, res, (int)size);
23-04-2020
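
To make the sizing logic quoted in the comment above easier to follow, here is a small standalone sketch of the same heuristic. The class and method names are hypothetical. The real code reads the compressed length from the central directory via CENLEN, which is replaced here by a plain parameter.

```java
public class InflaterBufferSizing {
    // Hypothetical helper mirroring the quoted ZipFile heuristic; not part of the JDK API.
    public static int bufferSizeFor(long compressedLength) {
        long size = compressedLength + 2; // "Inflater likes a bit of slack"
        if (size > 65536) {
            size = 8192;                  // large entries fall back to a fixed 8192-byte buffer
        }
        if (size <= 0) {
            size = 4096;                  // unknown or invalid lengths fall back to 4096 bytes
        }
        return (int) size;
    }

    public static void main(String[] args) {
        System.out.println(bufferSizeFor(100));     // small entry: compressed length plus 2
        System.out.println(bufferSizeFor(100_000)); // large entry: 8192
        System.out.println(bufferSizeFor(-2));      // non-positive size: 4096
    }
}
```

Under this reading, entries whose compressed length plus 2 falls between 1 and 65536 get a buffer of exactly that size, and only the two fallback cases produce 4096 or 8192.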

Sure, there's no need to hurry, although doing such a change sometime at the beginning of a release cycle is probably best to give some more test exposure.
17-04-2020

Please hold off on this change for a few days to give time for a detailed analysis of its implications. Sadly, it will require digging into ancient history and issues that pre-date OpenJDK to uncover some of the rationale for the existing code.
17-04-2020

What are your specific concerns? How can existing code rely on these implementation details (except performance-wise, which should improve after we make these changes)? The fact that these buffer sizes have always been configurable by the user through the corresponding constructor makes it very unlikely, in my opinion, that another default value can break anything. Claes has proposed to align this with the other internal buffers (e.g. in InputStream, BufferedInputStream, Files, etc.), which currently have a size of 8192 bytes (https://mail.openjdk.java.net/pipermail/core-libs-dev/2020-April/065856.html). This seems to be a reasonable approach to me.
17-04-2020

This is a risky change and will need a lot of eyes to make sure that it won't break existing code.
16-04-2020