JDK-4828461 : Support Zip files with more than 64k entries
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.util.jar
  • Affected Version: 1.3.1,1.4.1,1.4.2,5.0u14,6
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • OS:
    linux,solaris_2.6,solaris_8,windows_2000,windows_xp
  • CPU: x86,sparc
  • Submitted: 2003-03-06
  • Updated: 2005-02-19
  • Resolved: 2005-02-19
JDK 6: 6 b25 (Fixed)
Related Reports
Duplicate :  
Duplicate :  
Duplicate :  
Description
Name: gm110360			Date: 03/06/2003


FULL PRODUCT VERSION :
java version "1.4.1_02"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.1_02-b06)
Java HotSpot(TM) Client VM (build 1.4.1_02-b06, mixed mode)

appears also in 1.4.1_01 and 1.3.1

FULL OS VERSION :
Linux fettsack 2.4.20kpkg #1 Sun Feb 16 15:46:41 CET 2003 i686 unknown unknown GNU/Linux


A DESCRIPTION OF THE PROBLEM :
When I try to read a .zip file that has more than 65536 entries, ZipFile cannot read all of the entries. Testing shows that the entry count overflows every 65536 entries. Reading the same file via ZipInputStream gives the correct result.

STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
1.) Build a zip file which has more than 65536 entries.
2.) Try to read the number of entries in the zip file, or any entry whose position is beyond 65536.

EXPECTED VERSUS ACTUAL BEHAVIOR :
I expected ZipFile.entries() to return the correct number of zip entries, as long as the zip file format itself imposes no such restriction.

REPRODUCIBILITY :
This bug can always be reproduced.

---------- BEGIN SOURCE ----------
package zipbug;

import java.io.BufferedOutputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class ZipFileOverflowBug
{
	private static final int NUM_FILES = 70000;
	private static final String TEST_FILE = "test.zip";

	public static void main(String[] args) throws IOException
	{
		// create a test zip file which has enough entries to let ZipFile fail
		ZipOutputStream zOut =
			new ZipOutputStream(
				new BufferedOutputStream(new FileOutputStream(TEST_FILE)));

		for (int i = 1; i <= NUM_FILES; i++)
		{
			zOut.putNextEntry(new ZipEntry(Integer.toString(i)));
			zOut.write(Integer.toString(i).getBytes());
			zOut.closeEntry();
		}
		zOut.close();

		int count = 0;
		System.out.println(
			"expected amount of entries in the zipfile: " + NUM_FILES);

		// test sequential with ZipInputStream to show the correct result
		ZipInputStream in = new ZipInputStream(new FileInputStream(TEST_FILE));
		while (in.getNextEntry() != null)
		{
			count++;
		}
		in.close();
		System.out.println("ZipInputStream got " + count + " entries");

		// now read the same file via ZipFile
		count = 0;
		ZipFile file = new ZipFile(TEST_FILE);
		Enumeration e = file.entries();
		while (e.hasMoreElements())
		{
			ZipEntry entry = (ZipEntry) e.nextElement();
			count++;
		}
		System.out.println("ZipFile got " + count + " entries");
		System.out.println(
			"overflow at "
				+ NUM_FILES
				+ "-"
				+ count
				+ " = "
				+ (NUM_FILES - count));
		file.close();
	}
}

---------- END SOURCE ----------
(Review ID: 182197) 
======================================================================
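
The "overflow every 65536 entries" mentioned in this report can be illustrated with a little arithmetic. The sketch below assumes that the writer stores only the low 16 bits of the entry count in a 2-byte field and that the reader trusts that value; the class and method names are hypothetical and are not part of the JDK.

```java
// Illustration only: how an entry count wraps when squeezed into 16 bits.
final class EndTotWrap {

    /** Returns the value a 2-byte unsigned field would hold for the given count. */
    static int storedCount(int actualEntries) {
        return actualEntries & 0xFFFF;
    }

    public static void main(String[] args) {
        System.out.println(storedCount(70000));   // prints 4464
        System.out.println(storedCount(107169));  // prints 41633
    }
}
```

The second value, 41633, equals the number of entries that jar.entries() listed for the 107169-entry jar file in the report incorporated from 6190518 further down.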


Another customer writes:

Windows 2000
Professional edition
Service Pack 3

java version "1.4.1_01"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.1_01-b01)
Java HotSpot(TM) Client VM (build 1.4.1_01-b01, mixed mode)

java version "1.3.0"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.3.0)
Classic VM (build 1.3.0, J2RE 1.3.0 IBM build cn130-20010502 (JIT enabled:
jitc))

I have an application which produces a lot of XML files, each with a different
name.  I put all of these XML files into a zip.  This works very well until the
number of XML files exceeds 16696.  After that, no more XML files are added to
the zip, yet no exception is thrown and the code "pretends" to keep adding them.

To explain this a little better:  I had 80000 records, which my application
turns into 80000 XML files, one by one.  When one XML file is finished I put it
into the zip, and this works fine.  Then the next XML file is made and put into
the zip, and so on.  When it comes to the 16697th XML file, the application
creates it and "puts" it into the zip (I have no clue where it goes) with no
exceptions, and this continues until all 80000 XML files have been made and put
into the zip and the zip is closed.  The size of the zip file on disk is about
right for 80000 XML files, ca. 83 MB.  When I then open the zip in a zip program
(PKZIP and WinZip), only the first 16696 XML files are in there, and after I
close the zip program the size of the zip file on disk has changed to 12 MB.
This is of course because there are only 12 MB of zipped files in the zip file.

To work around the problem, I close the zip file and start a new one after
every 10000 XML files.  At the end I have 8 zip files containing all the XML
files, and everything is all right.
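
The workaround described by this customer could be sketched roughly as follows. This is a minimal sketch, not the customer's actual code: the class name RollingZipWriter, the "-N.zip" file naming, and the maxEntries parameter are all assumptions made for illustration.

```java
import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper: starts a new archive once maxEntries entries were written.
final class RollingZipWriter implements AutoCloseable {
    private final String baseName;
    private final int maxEntries;
    private ZipOutputStream current;
    private int entriesInCurrent;
    private int archiveIndex;

    RollingZipWriter(String baseName, int maxEntries) {
        if (maxEntries < 1) {
            throw new IllegalArgumentException("maxEntries must be positive");
        }
        this.baseName = baseName;
        this.maxEntries = maxEntries;
    }

    /** Adds one entry, rolling over to baseName-N.zip when the current archive is full. */
    void add(String entryName, byte[] data) throws IOException {
        if (current == null || entriesInCurrent == maxEntries) {
            close();
            archiveIndex++;
            current = new ZipOutputStream(new BufferedOutputStream(
                    new FileOutputStream(baseName + "-" + archiveIndex + ".zip")));
            entriesInCurrent = 0;
        }
        current.putNextEntry(new ZipEntry(entryName));
        current.write(data);
        current.closeEntry();
        entriesInCurrent++;
    }

    /** Number of archives started so far. */
    int archivesWritten() {
        return archiveIndex;
    }

    @Override
    public void close() throws IOException {
        if (current != null) {
            current.close();
            current = null;
        }
    }
}
```

With maxEntries set to 10000, writing 80000 entries through add() would produce 8 archives, which matches the numbers in the description above.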

---------------------------------------------------------------------------
Incorporated from 6190518:
JarFile class is not working properly on a large jar file.
  - Entries past a certain point in the file do not get found.
  - Does not list entries above a certain number

The "jar" command itself works fine on this big jar file.

1) JarTest.java

Looks for BackupHistory.xml which is the last entry in file,
 and never finds it.

javac JarTest.java
java JarTest Main_COR_Session_FULL_8SEP2004_043614_1.jar

Output:
zip is null
Exception in thread "main" java.lang.NullPointerException
        at java.util.zip.ZipFile.getInputStream(ZipFile.java:201)
        at java.util.jar.JarFile.getInputStream(JarFile.java:359)
        at JarTest.main(JarTest.java:29)


2) ListJar.java

List contents of jar file using enumeration jar.entries()
FAILS - only lists 41633 entries, instead of 107169 entries

javac ListJar.java
java ListJar Main_COR_Session_FULL_8SEP2004_043614_1.jar > biglist

wc -l biglist
   => 41633 biglist

jar tvf <bigjarfile>
  => 107169 entries

3) ListJar2.java

List Jar file contents using ZipInputStream and getNextEntry
WORKS - lists all entries

javac ListJar2.java
java ListJar2 Main_COR_Session_FULL_8SEP2004_043614_1.jar > correctbiglist

 wc -l correctbiglist
  => 107170 correctbiglist (includes filename at top)


**************************************************************
import java.util.*;
import java.util.zip.*;
import java.util.jar.*;
import java.io.*;

// Look up entry in huge JarFile.  BackupHistory.xml is at end.  Doesn't find it.

public class JarTest {

 public static void main(String[] args) throws Exception {

    String jarFile = args[0];

    JarFile jar = new JarFile(jarFile);

    // Does not work: this entry cannot be found even though it is present in the huge jar file.
    String jarEntry = "BackupHistory.xml";

    ZipEntry zip = jar.getEntry(jarEntry);
    if (zip == null)
        System.out.println ("zip is null");

    InputStream in = jar.getInputStream(zip);

    System.out.println ("getInputStream returns " + in);
  }
}
**************************************************************
import java.util.*;
import java.util.zip.*;
import java.util.jar.*;
import java.io.*;

// List contents of JarFile using jar.entries() enumeration
// FAILS - only lists 41633 entries, instead of 107169

public class ListJar {
  public static void main(String[] args) throws Exception {
    String jarFile = args[0];

    JarFile jar = new JarFile(jarFile);

    Enumeration entries = null;

    entries = jar.entries();

    ZipEntry entry;

    while(entries.hasMoreElements()){

        entry = (ZipEntry)entries.nextElement();

        String entryName = entry.getName();

        System.out.println("  " + entryName);

    }
  }
}
**********************************************************
import java.util.*;
import java.util.zip.*;
import java.util.jar.*;
import java.io.*;

// List Jar file contents using ZipInputStream and getNextEntry
// WORKS - lists all entries

public class ListJar2 {
  public static void main(String[] args) throws Exception {

        String jarFile = args[0];
        System.out.println("Jar File = '" + jarFile + "'");

        String jarEntry = "BackupHistory.xml";

        FileInputStream in = new FileInputStream(jarFile);
        if (in == null) {System.out.println("FileInputStream 'in' is null");};

        ZipInputStream zip = new ZipInputStream(in);
        if (zip == null) {System.out.println("ZipInputStream 'zip' is null");};

        ZipEntry entry;
        while ((entry = zip.getNextEntry()) != null) {
            String entryName = entry.getName();
            System.out.println("  " + entryName);
        }
    }
}
**********************************************************
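
Since the ZipInputStream listing above sees every entry, an affected application could, as a stopgap, look up a single entry by scanning the stream instead of calling JarFile.getEntry(). The following is a minimal sketch under that assumption; the class name StreamLookup and the method readEntry are hypothetical, and a linear scan over a jar file of this size will be slow.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical fallback: find one entry by scanning the archive sequentially.
final class StreamLookup {

    /** Returns the entry's bytes, or null if no entry with that name exists. */
    static byte[] readEntry(String archive, String name) throws IOException {
        try (ZipInputStream zin = new ZipInputStream(
                new BufferedInputStream(new FileInputStream(archive)))) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                if (entry.getName().equals(name)) {
                    ByteArrayOutputStream out = new ByteArrayOutputStream();
                    byte[] buf = new byte[8192];
                    int n;
                    while ((n = zin.read(buf)) != -1) {
                        out.write(buf, 0, n);
                    }
                    return out.toByteArray();
                }
            }
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = readEntry(args[0], args[1]);
        System.out.println(data == null
                ? "entry not found"
                : "read " + data.length + " bytes");
    }
}
```

Unlike JarTest above, the caller checks for null before using the result, so a missing entry is reported rather than ending in a NullPointerException.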

Please note that Main_COR_Session_FULL_8SEP2004_043614_1.jar is a jar file of 806510725 bytes. It is located at:

/java/jle_build/sko/hp/jarfile_bug/Main_COR_Session_FULL_8SEP2004_043614_1.jar


Comments
SUGGESTED FIX
Treat the ENDTOT field in the end header as a hint, or simply ignore it. When reading the central directory, keep track of the count. If necessary, grow dynamic data structures such as the comments array. These changes should be localizable to zip_util.[ch].
15-01-2005

EVALUATION
The overflow isn't occurring at the Java level; it's in the zip file format, which allows 2 bytes for the number of entries in the central directory. (2003-03-06)

There is a 2-byte field in the zip file (ENDTOT) containing the number of entries. This field cannot hold the number of entries in this case. A zip implementation does not need to use this field, since it can simply count the entries in the "central directory". The central directory must be read anyway as part of initializing a JarFile, so computing the count independently is not a hardship. In fact, the count is already being computed in the code, but limited by ENDTOT. The fix is to ignore ENDTOT, or to use it simply as a hint.

Strictly speaking, zip files with more than 64k entries are not valid zip files, but zip implementations can support such zip files without too great a hardship. The Info-Zip implementation is one implementation that does this. It seems that the de facto zip file format does not in fact have the limit on the number of entries, and Java should be fixed up to match.
15-01-2005
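
The idea in the evaluation, counting central directory entries instead of trusting ENDTOT, can be sketched in Java as below. This is only an illustration of the technique, not the actual fix, which the suggested fix places in the native zip_util.[ch]. The class and method names are hypothetical, the field offsets follow the standard END and central header layouts, and archives whose central directory offset does not fit in the 4-byte ENDOFF field are not handled.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.zip.ZipException;

// Sketch only: counts central directory headers without relying on ENDTOT.
final class CentralDirectoryCounter {
    private static final long ENDSIG = 0x06054b50L; // END header signature
    private static final long CENSIG = 0x02014b50L; // central header signature
    private static final int ENDHDR = 22;           // fixed END header size
    private static final int CENHDR = 46;           // fixed central header size

    private static int le16(byte[] b, int off) {
        return (b[off] & 0xFF) | ((b[off + 1] & 0xFF) << 8);
    }

    private static long le32(byte[] b, int off) {
        return le16(b, off) | ((long) le16(b, off + 2) << 16);
    }

    private static byte[] readAt(RandomAccessFile f, long pos, int n) throws IOException {
        byte[] b = new byte[n];
        f.seek(pos);
        f.readFully(b);
        return b;
    }

    /** Finds the END header by searching backwards past any archive comment. */
    private static long findEnd(RandomAccessFile f) throws IOException {
        long len = f.length();
        int tail = (int) Math.min(len, ENDHDR + 0xFFFF);
        byte[] buf = readAt(f, len - tail, tail);
        for (int i = tail - ENDHDR; i >= 0; i--) {
            if (le32(buf, i) == ENDSIG) {
                return len - tail + i;
            }
        }
        throw new ZipException("END header not found");
    }

    /** Returns the 2-byte ENDTOT value exactly as stored in the file. */
    static int storedEndTot(String path) throws IOException {
        try (RandomAccessFile f = new RandomAccessFile(path, "r")) {
            return le16(readAt(f, findEnd(f), ENDHDR), 10);
        }
    }

    /** Walks the central directory from ENDOFF and counts headers, ignoring ENDTOT. */
    static int countEntries(String path) throws IOException {
        try (RandomAccessFile f = new RandomAccessFile(path, "r")) {
            long endPos = findEnd(f);
            long pos = le32(readAt(f, endPos, ENDHDR), 16); // ENDOFF
            int count = 0;
            while (pos + CENHDR <= endPos) {
                byte[] hdr = readAt(f, pos, CENHDR);
                if (le32(hdr, 0) != CENSIG) {
                    break; // no longer looking at a central header
                }
                // Skip the fixed header plus file name, extra field and comment.
                pos += CENHDR + le16(hdr, 28) + le16(hdr, 30) + le16(hdr, 32);
                count++;
            }
            return count;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("ENDTOT says " + storedEndTot(args[0]));
        System.out.println("counted     " + countEntries(args[0]));
    }
}
```

Run against an archive with more than 65535 entries, such as the test.zip produced by the first report's program, the two numbers would be expected to differ, with the counted value being the real number of entries.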