Bug ID: JDK-4262583 GZIPInputStream throws IOException for very large files

Type: Bug
Component: core-libs
Sub-Component: java.util.jar
Affected Version: 1.1.7,1.4.0,1.4.2,5.0

Priority: P4
Status: Resolved
Resolution: Fixed
OS: generic,windows_2000,windows_xp
CPU: generic,x86

Submitted: 1999-08-15
Updated: 2005-07-07
Resolved: 2003-09-12

Versions (Unresolved/Resolved/Fixed)

Other :	5.0 tiger - Fixed

Related Reports

Relates :	JDK-5092263 - GZIPInputStream spuriously reports "Corrupt GZIP trailer" for sizes > 2GB
Relates :	JDK-4418997 - Files between 2 Gb and 4Gb (excluded) are not accepted in a zip file.
Relates :	JDK-6373678 - GZIPInputStream throws Corrupt GZIP Trailer on large files in 1.5.0
Relates :	JDK-6419239 - jar can add files to an archive that it can't later extract
Relates :	JDK-6599383 - Unable to open zip files more than 2GB in size
Relates :	JDK-4795136 - CRC check fails for files over 2 GB

Description

Name: dbT83986			Date: 08/15/99


Uncompressing a file using java.util.zip.GZIPInputStream fails if the uncompressed file is 2GB or larger (it works for files up to 2,147,483,647 bytes).  I suspect that the problem is an overflow, presumably of a variable somewhere in GZIPInputStream that should be declared as long, but is declared as int.

Here is the program that exhibits the problem:

import java.io.FileInputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;

public class TestGZIP {
    public static void main(String[] args) {
        try {
            FileInputStream fis = new FileInputStream(args[0]);
            GZIPInputStream gis = new GZIPInputStream(fis);
            while (gis.read() != -1) {}
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

Here is a program for producing a large file (full of zeroes) of a specific length:

import java.io.FileOutputStream;
import java.io.IOException;

public class WriteZeros {
    public static void main(String[] args) throws IOException {
        byte[] buf = new byte[8192];
        long n = Long.parseLong(args[0]);
        FileOutputStream fos = new FileOutputStream(args[1]);
        long m = n / buf.length;
        for (long i = 0; i < m; i++) {
            fos.write(buf, 0, buf.length);
        }
        fos.write(buf, 0, (int)(n % buf.length));
        fos.close();
    }
}

% java WriteZeros 2147483647 twogigminusone
% gzip twogigminusone
% java TestGZIP twogigminusone.gz
% java WriteZeros 2147483648 twogig
% gzip twogig
% java TestGZIP twogig.gz
java.io.IOException: Corrupt GZIP trailer
        at java.util.zip.GZIPInputStream.readTrailer (GZIPInputStream.java:173) (pc 85)
        at java.util.zip.GZIPInputStream.read (GZIPInputStream.java:90) (pc 23)
        at java.util.zip.InflaterInputStream.read                (pc 8)
        at TestGZIP.main      (TestGZIP.java:10)      (pc 24)
(Review ID: 93899) 
======================================================================
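
The overflow suspected above can be illustrated with a minimal sketch. It is not the actual GZIPInputStream code; the class and helper names are hypothetical. It assumes a byte counter held in a Java int and compares it with the size field of the gzip trailer, which stores the uncompressed length modulo 2^32.

```java
final class CounterOverflowSketch {
    // Hypothetical: the value a 32-bit int byte counter holds after totalBytes bytes.
    static int countAsInt(long totalBytes) {
        return (int) totalBytes; // keeps only the low 32 bits, as wraparound would
    }

    // Size as stored in the gzip trailer: uncompressed length modulo 2^32.
    static long trailerSize(long totalBytes) {
        return totalBytes & 0xFFFFFFFFL;
    }

    // Naive check: the int is sign-extended to long before the comparison.
    static boolean naiveMatch(long totalBytes) {
        return countAsInt(totalBytes) == trailerSize(totalBytes);
    }

    public static void main(String[] args) {
        long[] sizes = {2147483647L, 2147483648L};
        for (long size : sizes) {
            System.out.println(size + " bytes: counter=" + countAsInt(size)
                    + " trailer=" + trailerSize(size)
                    + " match=" + naiveMatch(size));
        }
    }
}
```

In this sketch the check passes at 2147483647 bytes and fails at 2147483648 bytes, because the counter becomes -2147483648 while the trailer value stays positive. That matches the boundary reported above, although the real cause inside GZIPInputStream may differ in detail.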

Comments

EVALUATION We've realized (to our dismay) that in fact the problem described was not completely fixed. However, the changes in fixing 5092263 do resolve this problem. 5092263's fixes are integrated into 1.4.2_12, 1.5.0_03, and mustang.
27-03-2006
CONVERTED DATA BugTraq+ Release Management Values
COMMIT TO FIX: tiger
FIXED IN: tiger
INTEGRATED IN: tiger tiger-b20
14-06-2004
WORK AROUND Name: dbT83986 Date: 08/15/99
I don't have a workaround
======================================================================
11-06-2004
PUBLIC COMMENTS Add support for zip files and jar files between 2GB and 4GB in size.
10-06-2004
EVALUATION The minimum fix would require API changes. The current Java zip API only supports data sizes up to Integer.MAX_VALUE, although the native zlib implementation supports sizes in unsigned longs. A partial fix to this problem would be to change the return values of the Inflater access methods to Java longs instead of Java ints.
###@###.### 1999-09-08

File sizes in zip files are generally stored as unsigned 32-bit quantities, which do not fit into a Java "int". A "long" should have been used instead. For example,
    public synchronized int getTotalOut()
should have been
    public synchronized long getTotalOut()
However, it is too late to change the API, especially for such a "small" fix as to increase the size from 2GB to 4GB.

Long term, we want to support file sizes larger than 4GB, and this will require major changes to the Zip File format. PKZIP implements such an extension, and we should investigate their file format.

However, we _can_ get support for sizes up to 4GB. Although the return types of getTotalOut and getTotalIn are semantically incorrect, there is no data loss. We simply have to reinterpret the value as unsigned. So it looks like we can fix this without making any API changes. Most of the code in the zip package already uses "long" to represent unsigned 32-bit values, so few changes are necessary.

One very good reason to fix this is that currently a user can use the "jar" command to create a jar file that cannot be read by "jar".
###@###.### 2003-04-29
29-04-2003
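
The reinterpretation described in the 2003 evaluation can be sketched as follows. This is an illustration under assumptions, not the code that was integrated; the class and the helper asUnsigned32 are hypothetical names. It masks an int, such as the value returned by getTotalOut(), into a long so that the same 32 bits are read as an unsigned quantity.

```java
final class UnsignedTotalSketch {
    // Hypothetical helper: read the 32 bits of an int (for example the
    // result of Inflater.getTotalOut()) as an unsigned value in a long.
    static long asUnsigned32(int value) {
        return value & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        // Below 2GB the int is already non-negative and is unchanged.
        System.out.println(asUnsigned32(Integer.MAX_VALUE)); // 2147483647
        // At exactly 2GB the int has wrapped negative; the mask recovers the size.
        System.out.println(asUnsigned32(Integer.MIN_VALUE)); // 2147483648
        // The largest representable size is 4GB minus one byte.
        System.out.println(asUnsigned32(-1)); // 4294967295
        // At 4GB the low 32 bits are all zero, so the information is lost.
        System.out.println(asUnsigned32((int) 4294967296L)); // 0
    }
}
```

The last line shows why this approach only extends the limit to 4GB: beyond that, the 32 bits themselves no longer carry the full size, which is consistent with the evaluation's remark that larger files would need format changes.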