JDK-8340729 : GZIPInputStream readTrailer uses faulty available() test for end-of-stream
  • Type: CSR
  • Component: core-libs
  • Sub-Component: java.util.jar
  • Priority: P4
  • Status: Closed
  • Resolution: Approved
  • Fix Versions: 11-pool,17-pool,21-pool
  • Submitted: 2024-09-24
  • Updated: 2025-06-25
  • Resolved: 2025-04-28
Description
This CSR for 21, 17, and 11 is the same as the original CSR for 23, JDK-8327489.

Summary
-------

Update `java.util.zip.GZIPInputStream` so that it doesn't rely on the `java.io.InputStream.available()` method to decide whether or not to read a concatenated GZIP stream from the underlying input stream.

Problem
-------

The `GZIPInputStream` class takes an `InputStream` from which it reads compressed GZIP data. The GZIP format allows multiple GZIP streams to be concatenated. An undocumented feature of the implementation in `GZIPInputStream` is that it supports reading such concatenated GZIP streams. This is possible because the GZIP format defines an 8-byte trailer that marks the end of an individual GZIP stream.
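
To make the stream layout concrete, here is a minimal sketch that builds two GZIP streams with `GZIPOutputStream`, concatenates them, and decodes the trailer fields of one of them. The class and method names (`ConcatenatedGzip`, `gzip`, `concat`, `trailerCrc`, `trailerSize`) are hypothetical and exist only for this illustration. It assumes the trailer layout from RFC 1952: a little-endian CRC-32 followed by a little-endian uncompressed size (ISIZE).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

class ConcatenatedGzip {
    // Compress a string into one complete GZIP stream (header, data, trailer).
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bytes)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return bytes.toByteArray();
    }

    // Concatenation is plain byte-wise appending; no separator is needed.
    static byte[] concat(byte[] first, byte[] second) {
        byte[] all = new byte[first.length + second.length];
        System.arraycopy(first, 0, all, 0, first.length);
        System.arraycopy(second, 0, all, first.length, second.length);
        return all;
    }

    // First 4 bytes of the 8-byte trailer: CRC-32 of the uncompressed data.
    static long trailerCrc(byte[] stream) {
        return readUInt(stream, stream.length - 8);
    }

    // Last 4 bytes of the 8-byte trailer: uncompressed size modulo 2^32.
    static long trailerSize(byte[] stream) {
        return readUInt(stream, stream.length - 4);
    }

    private static long readUInt(byte[] data, int offset) {
        return ByteBuffer.wrap(data, offset, 4)
                .order(ByteOrder.LITTLE_ENDIAN)
                .getInt() & 0xFFFFFFFFL;
    }

    public static void main(String[] args) throws IOException {
        byte[] first = gzip("hello");
        byte[] second = gzip("world!");
        byte[] both = concat(first, second);
        System.out.println("ISIZE of first stream: " + trailerSize(first));
        System.out.println("Concatenated length: " + both.length);
    }
}
```

Because the trailer only says where one stream ends, a reader has to look at the bytes that follow to find out whether another stream begins.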

`GZIPInputStream` has a public `read(byte[] buf, int off, int len)` method which returns the uncompressed data read from the underlying, possibly concatenated, GZIP streams. After reading an 8-byte trailer from the underlying stream, the current implementation of this method calls `java.io.InputStream.available()` on the underlying stream to decide whether or not a concatenated GZIP stream follows. If `available()` returns `0`, the implementation in `GZIPInputStream.read()` does not read any additional data and marks the `GZIPInputStream` as having reached the end of the compressed input stream. Any subsequent calls to `read()` return `-1`, indicating the end of stream.

Relying on the return value of `InputStream.available()` is not appropriate because, as that method's API javadoc states, the return value is merely an estimate of the number of bytes available. That method's API javadoc further states:
```
Note that while some implementations of {@code InputStream} will return the total number of bytes in the stream, many will not.
```
As a result, the current implementation of `GZIPInputStream.read()`, which relies on the underlying `InputStream`'s `available()` method, can incorrectly conclude that the GZIP stream has reached end of stream even when a concatenated GZIP stream follows. `GZIPInputStream.read()` then ignores, and never returns, the uncompressed data of any remaining GZIP streams.
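
The following sketch illustrates why `available()` is an unreliable end-of-stream test. `ZeroAvailableInputStream` is a hypothetical wrapper written for this example; it always reports `0` from `available()`, which the `InputStream` contract permits, yet `read()` still returns every byte of the wrapped stream.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical wrapper: a legal InputStream that never reports available bytes.
class ZeroAvailableInputStream extends FilterInputStream {
    ZeroAvailableInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int available() {
        // An estimate of 0 only means "a read might block", not "end of stream".
        return 0;
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ZeroAvailableInputStream(
                new ByteArrayInputStream(new byte[] {10, 20, 30}));
        int count = 0;
        while (in.read() != -1) {
            // available() stays at 0 for the whole loop, yet bytes keep arriving.
            count++;
        }
        System.out.println("available() = " + in.available()
                + ", bytes actually read = " + count);
    }
}
```

Wrapping concatenated GZIP data in a stream like this reproduces the situation described above: the `available()` test sees `0` after a trailer even though another GZIP stream may follow.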

Solution
--------

`GZIPInputStream.read()` will be updated to remove the check on `InputStream.available()`. After reading an 8-byte GZIP stream trailer, the implementation will now attempt to read a GZIP stream header from the underlying input stream. If the additional `read()` calls on the underlying input stream return enough bytes and those bytes represent a GZIP stream header, then `GZIPInputStream.read()` will treat the data as a concatenated GZIP stream and will continue to return uncompressed data from it. If, however, the `read()` calls on the underlying input stream don't return enough bytes, or the returned bytes don't represent a GZIP stream header, then the `GZIPInputStream` will be marked as having reached the end of the compressed input stream.
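
The idea of probing for a header instead of asking `available()` can be sketched as follows. This is a simplified illustration, not the JDK implementation: `GzipHeaderProbe` and `startsWithGzipHeader` are hypothetical names, and the check covers only the fixed 10-byte part of the header defined by RFC 1952 (the magic bytes `0x1f 0x8b` and compression method `8`), ignoring the optional header fields.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPOutputStream;

class GzipHeaderProbe {
    // Length of the mandatory part of a GZIP header (RFC 1952).
    static final int FIXED_HEADER_LENGTH = 10;

    // Returns true only if enough bytes arrive AND they look like a GZIP header.
    // Note: this consumes the bytes it inspects.
    static boolean startsWithGzipHeader(InputStream in) throws IOException {
        byte[] header = new byte[FIXED_HEADER_LENGTH];
        // readNBytes keeps reading until the array is full or end of stream.
        int n = in.readNBytes(header, 0, header.length);
        if (n < header.length) {
            return false; // not enough bytes: treat as end of compressed input
        }
        return (header[0] & 0xff) == 0x1f   // ID1
                && (header[1] & 0xff) == 0x8b // ID2
                && header[2] == 8;            // CM = deflate
    }

    // Helper for the demo: one complete GZIP stream.
    static byte[] gzip(byte[] data) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bytes)) {
            out.write(data);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] next = gzip(new byte[] {1, 2, 3});
        System.out.println("GZIP data follows: "
                + startsWithGzipHeader(new ByteArrayInputStream(next)));
        System.out.println("Nothing follows:   "
                + startsWithGzipHeader(new ByteArrayInputStream(new byte[0])));
    }
}
```

The decision depends only on what `read()` actually delivers, so it gives the same answer for any underlying stream regardless of what that stream reports from `available()`.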

Specification
-------------
There are no specification changes.



Comments
21-pool needed to be added; I missed adding it by mistake. Thanks [~darcy]
25-06-2025

Updating fixVersion field to be consistent with text of the CSR.
24-06-2025

[~rreddy] can you take a look at the Fix Version labels here? They are inconsistent with the text. I can't seem to take any further steps to fix this without the `21-pool` label.
24-06-2025

It looks like an accidental omission that version 21 is not mentioned in the Fix Version/s list. [~darcy] can you please confirm?
24-06-2025

The description of this CSR notes that it covers 21, 17, and 11, but the "Fix Versions" field has 17, 11, and 8. Is that discrepancy intentional? I'd like to perform the backports and want to ensure it can move ahead for 21.
17-06-2025

Moving to Approved.
28-04-2025

As a reminder, the assignee of a CSR is responsible for advancing it when it is in Provisional state.
22-04-2025

Noting here the discussion I had with Ravi some days back about this backport. My opinion is that the backport of this change into update releases is OK. However, I would suggest that we wait a few more months (3 maybe) before we initiate this backport. The original change that is being backported was done in JDK 23, which has just been released. Although the change is isolated and, as noted, the compatibility risk is low, it would be good to allow applications to start using Java 23 and report any issues related to this change.
04-10-2024

Moving to Provisional. [~jpai] or [~rriggs], please review this backport CSR. [~rreddy], please Finalize this CSR after it has been reviewed.
27-09-2024

This CSR is the same as the original CSR, JDK-8327489
24-09-2024