JDK-8226530 : ZipFile reads wrong entry size from ZIP64 entries
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.util.jar
  • Affected Version: 9,11,13,14
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2019-06-20
  • Updated: 2021-06-29
  • Resolved: 2019-08-07
JDK 11: 11.0.11.0.2-oracle (Fixed)
Description
There is a regression in ZipFile support caused by JDK-8145260.

The simplest reproducer is this:

import java.io.*;
import java.util.*;
import java.util.zip.*;

public class ZipFileTest {
  public static void main(String[] args) throws Exception {
    File file = new File(args[0]);

    System.out.println("Trying with ZipInputStream:");
    try (FileInputStream fis = new FileInputStream(file);
      ZipInputStream zis = new ZipInputStream(fis)) {
      ZipEntry entry;
      while ((entry = zis.getNextEntry()) != null) {
        System.out.println(entry.getName() + ": " + entry.getSize());
      }
    }

    System.out.println("Trying with ZipFile:");
    try (ZipFile zip = new ZipFile(file)) {
      Enumeration<? extends ZipEntry> entries = zip.entries();
      while (entries.hasMoreElements()) {
        ZipEntry entry = entries.nextElement();
        System.out.println(entry.getName() + ": " + entry.getSize());
      }
    }
  }
}

This is how you run it:
 $ dd if=/dev/zero of=bigfile bs=1K count=5M
 $ dd if=/dev/zero of=smallfile bs=1K count=1K
 $ zip -1 test.zip bigfile smallfile
 $ javac ZipFileTest.java
 $ java ZipFileTest test.zip 

Trying with ZipInputStream:
bigfile: 5368709120
smallfile: 1048576
Trying with ZipFile:
bigfile: 4294967295 <--- bad!
smallfile: 1048576

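That 4294967295 is actually ZIP64_MAGICVAL (0xFFFFFFFF), the placeholder stored in a 4-byte size field when the real value is recorded in the ZIP64 extended information extra field. ZipFile.getEntry does not resolve it, in contrast to ZipInputStream, which reads the extra field to get the real size.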
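
To illustrate the lookup described above, here is a minimal sketch of resolving the placeholder from a raw extra field. The class and method names are hypothetical and this is not the JDK's code; it only assumes the extra-field layout from the ZIP specification (2-byte header ID, 2-byte data length, then data, all little-endian, with ZIP64 using header ID 0x0001).

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, not JDK code: resolves the uncompressed size of one entry.
final class Zip64SizeSketch {
    static final long ZIP64_MAGICVAL = 0xFFFFFFFFL; // placeholder in the 4-byte size field
    static final int EXTID_ZIP64 = 0x0001;          // header ID of the ZIP64 extra block

    /**
     * @param size32 value of the 4-byte uncompressed-size field
     * @param extra  raw extra field of the same header (may be null)
     */
    static long uncompressedSize(long size32, byte[] extra) {
        if (size32 != ZIP64_MAGICVAL || extra == null) {
            return size32; // nothing to resolve, or nothing to resolve it with
        }
        ByteBuffer buf = ByteBuffer.wrap(extra).order(ByteOrder.LITTLE_ENDIAN);
        // The extra field is a sequence of blocks: 2-byte ID, 2-byte length, data.
        while (buf.remaining() >= 4) {
            int id = buf.getShort() & 0xFFFF;
            int len = buf.getShort() & 0xFFFF;
            if (len > buf.remaining()) {
                break; // truncated block, stop parsing
            }
            if (id == EXTID_ZIP64 && len >= 8) {
                // The size field is the placeholder, so the block starts
                // with the real 8-byte uncompressed size.
                return buf.getLong();
            }
            buf.position(buf.position() + len); // skip this block
        }
        return size32; // no usable ZIP64 block found; report the field as stored
    }

    public static void main(String[] args) {
        // ZIP64 block carrying uncompressed size 5368709120 and compressed size 5210000.
        ByteBuffer extra = ByteBuffer.allocate(20).order(ByteOrder.LITTLE_ENDIAN);
        extra.putShort((short) EXTID_ZIP64).putShort((short) 16)
             .putLong(5368709120L).putLong(5210000L);
        System.out.println(uncompressedSize(ZIP64_MAGICVAL, extra.array())); // 5368709120
        System.out.println(uncompressedSize(1048576L, null));                // 1048576
    }
}
```

The compressed size 5210000 in the example is made up for illustration; only the 8-byte uncompressed size matters to the lookup.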

Remarkably, the internal ZipFileInputStream inside ZipFile.java does handle ZIP64 correctly, so this hack works:

diff -r 12e8433e2581 src/java.base/share/classes/java/util/zip/ZipFile.java
--- a/src/java.base/share/classes/java/util/zip/ZipFile.java    Thu Jun 20 08:02:41 2019 +0000
+++ b/src/java.base/share/classes/java/util/zip/ZipFile.java    Thu Jun 20 19:55:16 2019 +0200
@@ -681,10 +681,15 @@
                 e.comment = zc.toStringUTF8(cen, start, clen);
             } else {
                 e.comment = zc.toString(cen, start, clen);
             }
         }
+
+        // Hack: ZipFileInputStream knows how to deal with ZIP64.
+        ZipFileInputStream zfis = new ZipFileInputStream(cen, pos);
+        e.size = zfis.size;
+
         lastEntryName = e.name;
         lastEntryPos = pos;
         return e;
     }
 
$ java ZipFileTest test.zip 
Trying with ZipInputStream:
bigfile: 5368709120
smallfile: 1048576
Trying with ZipFile:
bigfile: 5368709120 <--- good!
smallfile: 1048576
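
Since the entry's input stream handles ZIP64 correctly, application code on an affected JDK could in principle sidestep the bad value without patching ZipFile. The sketch below is an assumption-laden workaround, not part of any fix: the class and method names are hypothetical, and it decompresses the whole entry just to count bytes, which is slow for multi-gigabyte entries.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical workaround helper; not part of the JDK or of the eventual fix.
final class EntrySizeWorkaround {
    static final long ZIP64_MAGICVAL = 0xFFFFFFFFL;

    /** Returns getSize() unless it is the ZIP64 placeholder, in which case the bytes are counted. */
    static long realSize(ZipFile zip, ZipEntry entry) throws IOException {
        long reported = entry.getSize();
        if (reported != ZIP64_MAGICVAL) {
            return reported; // trust anything that is not the placeholder
        }
        try (InputStream in = zip.getInputStream(entry)) {
            return countBytes(in); // fall back to reading the entry
        }
    }

    /** Reads the stream to its end and returns the number of bytes seen. */
    static long countBytes(InputStream in) throws IOException {
        long total = 0;
        byte[] buf = new byte[1 << 16];
        int n;
        while ((n = in.read(buf)) != -1) {
            total += n;
        }
        return total;
    }
}
```

In the reproducer, this would mean printing `EntrySizeWorkaround.realSize(zip, entry)` instead of `entry.getSize()` in the ZipFile loop.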


We have observed it with smaller archives as well. The bug only requires that an entry's sizes be recorded with ZIP64 extensions. Linux zip seems to write the old 4-byte sizes for entries that fit, but a writer may use ZIP64 regardless of size, which the spec explicitly allows:

      4.3.9.2 When compressing files, compressed and uncompressed sizes 
      SHOULD be stored in ZIP64 format (as 8 byte values) when a 
      file's size exceeds 0xFFFFFFFF.   However ZIP64 format MAY be 
      used regardless of the size of a file.  When extracting, if 
      the zip64 extended information extra field is present for 
      the file the compressed and uncompressed sizes will be 8
      byte values.  
Comments
Fix Request (13u) Same reason as for 11: This solves a serious ZIP64 regression. Patch applies cleanly to 11u. New test fails without the patch, passes with it. Original tests from description start to pass with the patch. The whole thing passes tier1 and tier2 suites.
09-08-2019

Fix Request (11u) This solves a serious ZIP64 regression. Patch applies cleanly to 11u. New test fails without the patch, passes with it. Original tests from description start to pass with the patch. The whole thing passes tier1 and tier2 suites. We'll wait a bit for jdk/jdk to get more testing and/or discover any follow-up issues.
08-08-2019

URL: https://hg.openjdk.java.net/jdk/jdk/rev/cffcc4c5a5ba User: lancea Date: 2019-08-07 18:04:44 +0000
07-08-2019

The issue is caused by setExtra0 always assuming that it is called for a LOC header and that the EXTID_ZIP64 block is 16 bytes long, which may not be the case for a CEN header. setExtra0 was updated to keep the existing code when processing a LOC header; when processing a CEN header, it now applies the same validation as ZIP FS, which allows the test to pass. ZipInputStream reported the correct size where ZipFile did not because ZipFile reads the CEN, whereas ZipInputStream reads the LOC. ZipFileInputStream does not use setExtra0; it does checking somewhat similar to ZIP FS and still reads the CEN.
02-08-2019
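
The LOC/CEN difference described in the comment above can be modelled roughly as follows. This is a simplified sketch with hypothetical names, not the code from the changeset: it assumes the ZIP specification's rule that a LOC ZIP64 block carries both 8-byte sizes, while a CEN ZIP64 block carries an 8-byte value only for each field whose 4-byte slot holds the placeholder.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

// Illustrative only: hypothetical names, not the code from the actual changeset.
final class Zip64BlockSketch {
    static final long ZIP64_MAGICVAL = 0xFFFFFFFFL;

    /** LOC rule: the block must hold both sizes (16 bytes). Returns {size, csize}. */
    static long[] fromLoc(byte[] block, long size, long csize) {
        if (block.length < 16) {
            return new long[] {size, csize}; // too short for a LOC block: keep 4-byte values
        }
        ByteBuffer b = ByteBuffer.wrap(block).order(ByteOrder.LITTLE_ENDIAN);
        return new long[] {b.getLong(), b.getLong()};
    }

    /** CEN rule: a value is stored only for placeholder fields. Returns {size, csize, locOffset}. */
    static long[] fromCen(byte[] block, long size, long csize, long locOffset) {
        ByteBuffer b = ByteBuffer.wrap(block).order(ByteOrder.LITTLE_ENDIAN);
        long[] fields = {size, csize, locOffset};
        for (int i = 0; i < fields.length; i++) {
            if (fields[i] != ZIP64_MAGICVAL) {
                continue; // the 4-byte value is real, so nothing was stored for it
            }
            if (b.remaining() < 8) {
                break; // block shorter than the header implies: stop parsing
            }
            fields[i] = b.getLong();
        }
        return fields;
    }

    public static void main(String[] args) {
        // CEN block for an entry where only the uncompressed size overflowed: 8 bytes.
        byte[] block = ByteBuffer.allocate(8).order(ByteOrder.LITTLE_ENDIAN)
                                 .putLong(5368709120L).array();
        // Applying the LOC assumption to this block leaves the placeholder in place.
        System.out.println(Arrays.toString(fromLoc(block, ZIP64_MAGICVAL, 5210000L)));
        // prints [4294967295, 5210000]
        System.out.println(Arrays.toString(fromCen(block, ZIP64_MAGICVAL, 5210000L, 0L)));
        // prints [5368709120, 5210000, 0]
    }
}
```

The compressed size 5210000 is an invented value. The first call shows how a 16-byte assumption applied to an 8-byte CEN block yields the 4294967295 seen in the description.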

Test + hack fix is here: http://cr.openjdk.java.net/~shade/8226530/webrev.00/
03-07-2019