JDK-6599383 : Unable to open zip files more than 2GB in size
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.util.jar
  • Affected Version: 6u2
  • Priority: P2
  • Status: Resolved
  • Resolution: Fixed
  • OS: generic
  • CPU: other
  • Submitted: 2007-08-30
  • Updated: 2010-04-05
  • Resolved: 2009-03-05
Versions: 6u12 (Fixed), 7 b50 (Fixed)

java version "1.6.0_02"
Java(TM) SE Runtime Environment (build 1.6.0_02-b06)
Java HotSpot(TM) Client VM (build 1.6.0_02-b06, mixed mode)


When JDK 6 is used for unzipping a zip file > 2GB in size, the following exception is thrown:

java.util.zip.ZipException: error in opening zip file
  at java.util.zip.ZipFile.open(Native Method)
  at java.util.zip.ZipFile.<init>(ZipFile.java:114)
  at java.util.zip.ZipFile.<init>(ZipFile.java:75)
  at UnZip.main(UnZip.java:65)

Exact steps to reproduce
  1. Create a zip file > 2GB in size.
  2. Write a simple Java program that unzips the file.
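
Step 1 could be done with something like the following sketch. It is not the BigZip attachment mentioned in the evaluation; the class name MakeBigZip, the entry names, and the output file name big.zip are all made up for illustration. It writes several entries of random (and therefore nearly incompressible) data so that the archive as a whole passes 2GB while each entry stays well under that size.

```java
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.Random;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, not part of the original report.
class MakeBigZip {
    // Writes entryCount entries of entryBytes random bytes each and
    // returns the size of the resulting zip file in bytes.
    static long write(File target, int entryCount, long entryBytes) throws IOException {
        Random random = new Random(42);   // fixed seed so runs are repeatable
        byte[] chunk = new byte[64 * 1024];
        ZipOutputStream zip = new ZipOutputStream(
                new BufferedOutputStream(new FileOutputStream(target)));
        try {
            for (int i = 0; i < entryCount; i++) {
                zip.putNextEntry(new ZipEntry("entry" + i + ".bin"));
                long remaining = entryBytes;
                while (remaining > 0) {
                    random.nextBytes(chunk);   // random data barely compresses
                    int n = (int) Math.min(chunk.length, remaining);
                    zip.write(chunk, 0, n);
                    remaining -= n;
                }
                zip.closeEntry();
            }
        } finally {
            zip.close();
        }
        return target.length();
    }

    public static void main(String[] args) throws IOException {
        // 5 entries x 512 MB is about 2.5 GB, past the 2 GB mark.
        long size = write(new File("big.zip"), 5, 512L * 1024 * 1024);
        System.out.println("Wrote " + size + " bytes");
    }
}
```

Running MakeBigZip and then passing big.zip to the testcase should exercise the large-file path, assuming the failure depends only on the total archive size.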

The following testcase was obtained from another Sun bug:

import java.io.*;
import java.util.*;
import java.util.zip.*;

public class unzip {
    // Copy everything from in to out, then close both streams.
    public static final void copyInputStream(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[1024];
        int    len;
        while ((len = in.read(buffer)) >= 0)
            out.write(buffer, 0, len);
        in.close();
        out.close();
    }

    public static final void main(String[] args) {
        Enumeration entries;
        ZipFile     zipFile = null;
        if (args.length != 1) {
            System.err.println("Usage: Unzip zipfile");
            return;
        }
        try {
            System.out.println("Constructing ZipFile...");
            zipFile = new ZipFile(args[0]);
        } catch (IOException ioe) {
            System.out.println("error opening the zipfile");
            ioe.printStackTrace();
            return;   // nothing to extract if the open failed
        }
        try {
            ZipEntry entry = null;
            entries = zipFile.entries();
            while (entries.hasMoreElements()) {
                entry = (ZipEntry) entries.nextElement();
                if (entry.isDirectory()) {
                    // Assume directories are stored parents first then children
                    System.err.println("Extracting directory: " + entry.getName());
                    // This is not robust, just for demonstration purposes.
                    (new File(entry.getName())).mkdir();
                    continue;
                }
                System.err.println("Extracting file: " + entry.getName());
                copyInputStream(zipFile.getInputStream(entry), new BufferedOutputStream(new FileOutputStream(entry.getName())));
            }
            zipFile.close();
            System.out.print("Check memory usage and press 'Enter'...");
            System.in.read();   // wait for Enter so memory usage can be inspected
        } catch (IOException ioe) {
            ioe.printStackTrace();
        }
    }
}

EVALUATION Unfortunately #define _LARGEFILE64_SOURCE doesn't do the trick. Its definition results in availability of some new types e.g. off64_t. Lo and behold, zip_util.c already has that defined, via sys/feature_tests.h (AFAICT). I've verified that changing occurrences of off_t to off64_t and mmap to mmap64 allows the bug's testcase to pass (existing regression tests also pass); given that this is in zip_util I'll run more tests.

EVALUATION It is very possible that zip_util.c, by virtue of using simple off_t, will fail to open large zip files because of lack of large file support. This can be enabled on most modern 32-bit OSes by appropriate preprocessor hackery. The situation is a little bit confused, but the proper fix might be as simple as adding

    #undef _FILE_OFFSET_BITS
    #define _FILE_OFFSET_BITS 64

to the top of zip_util.c. I'm not sure why this particular approach has never been used before. Other places in the JDK define _LARGEFILE64_SOURCE and explicitly access 64-bit variants of basic IO functions, but do not change the meaning of things like off_t directly. (Perhaps this should be done consistently throughout the JDK C source?)

A safer option is

    #undef _LARGEFILE64_SOURCE
    #define _LARGEFILE64_SOURCE 1

and then changing types and/or functions to the 64-bit versions. E.g. start by changing these occurrences of off_t to off64_t. (That cast to off_t looks bad)

    139:ZFILE_Lseek(ZFILE zfd, off_t offset, int whence) {
    219:    if (ZFILE_Lseek(zfd, (off_t) offset, SEEK_SET) == -1) {
    480:    off_t offset;

Anyways, this problem has been solved before in the JDK. I'll move this to Fix Understood, since it is likely that this is the cause.
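
To make the off_t concern concrete, here is a small Java sketch of what happens when an offset past 2GB is squeezed into a signed 32-bit value, which is the width off_t typically has on a 32-bit OS without large file support. The actual code is C in zip_util.c; the class OffsetWidth and its methods are hypothetical and only mimic the arithmetic. Java's (int) cast wraps around, which is also what C compilers typically do for such a cast, although C leaves that result implementation-defined.

```java
// Hypothetical illustration only; the real code lives in zip_util.c.
class OffsetWidth {
    // Mimics storing a file offset in a signed 32-bit off_t.
    static int asOff32(long offset) {
        return (int) offset;
    }

    // True when the offset survives the round trip through 32 bits.
    static boolean fitsInOff32(long offset) {
        return asOff32(offset) == offset;
    }

    public static void main(String[] args) {
        long threeGb = 3L * 1024 * 1024 * 1024;               // 3221225472
        System.out.println(fitsInOff32(Integer.MAX_VALUE));   // prints true
        System.out.println(fitsInOff32(threeGb));             // prints false
        System.out.println(asOff32(threeGb));                 // prints -1073741824
    }
}
```

A negative offset like this, handed to a seek call, would plausibly fail, which is consistent with the "error in opening zip file" exception in the description. With a 64-bit type such as off64_t the same offset is stored without loss.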

EVALUATION Unable to reproduce with 1.6.0_02-b05 (where did you get 1.6.0_02-b06?) when using the attached BigZip to create a 3GB zip file; subsequently running the code provided in the Description works OK. Please provide detailed instructions on how to create a zip file that causes the problem.

EVALUATION I thought that all the issues with zip files between 2GB and 4GB had been addressed in previous fixes:
  • 4262583: GZIPInputStream throws IOException for very large files
  • 4418997: Files between 2 Gb and 4Gb (excluded) are not accepted in a zip file.
  • 4795136: CRC check fails for files over 2 GB
  • 5092263: GZIPInputStream spuriously reports "Corrupt GZIP trailer" for sizes > 2GB