JDK-8347712 : IllegalStateException on multithreaded ZipFile access with non-UTF8 charset
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.util.jar
  • Affected Version: 15
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • OS: linux_ubuntu
  • CPU: x86_64
  • Submitted: 2025-01-14
  • Updated: 2025-05-21
  • Resolved: 2025-05-14
Fix Version
JDK 25: 25 b23 (Fixed)
Description
ADDITIONAL SYSTEM INFORMATION :
Windows is likely affected, too.

observed on Linux 6.8.0-51-generic #52~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Mon Dec  9 15:00:52 UTC 2 x86_64 x86_64 x86_64 GNU/Linux

A DESCRIPTION OF THE PROBLEM :
If multiple threads each instantiate their own java.util.zip.ZipFile instance for the same file, using a java.nio.charset.Charset other than UTF_8, then they share the same java.nio.charset.CharsetDecoder instance in Java 17, 21, 23, and 25 Build 5 (2025/1/9). The shared decoder can be found by following the private fields res, zsrc, zc, and dec, starting from the ZipFile instance. Iterating over the entries from both threads often yields an exception with the following stack trace (Java 17 in this case):

java.lang.IllegalStateException: Current state = RESET, new state = FLUSHED
	at java.base/java.nio.charset.CharsetDecoder.throwIllegalStateException(CharsetDecoder.java:996)
	at java.base/java.nio.charset.CharsetDecoder.flush(CharsetDecoder.java:679)
	at java.base/java.nio.charset.CharsetDecoder.decode(CharsetDecoder.java:808)
	at java.base/java.util.zip.ZipCoder.toString(ZipCoder.java:59)
	at java.base/java.util.zip.ZipFile.getZipEntry(ZipFile.java:658)
	at java.base/java.util.zip.ZipFile$ZipEntryIterator.next(ZipFile.java:517)
	at java.base/java.util.zip.ZipFile$ZipEntryIterator.nextElement(ZipFile.java:505)
	at java.base/java.util.zip.ZipFile$ZipEntryIterator.nextElement(ZipFile.java:483)

Attached to this bug report is a minimal example based on Charset.forName("IBM437"). Note that the example creates a temporary zip file using java.nio.file.Files#createTempFile. If everything works, both threads keep iterating forever. On affected Java versions, however, the above IllegalStateException can be reproduced within a few seconds, and the program then terminates.

REGRESSION : Last worked in version 11.0.25

STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Run the following command with an affected Java version (17, 21, 23, or 25 Build 5 (2025/1/9)) on Linux:

java ZipFileMultithreadingExample.java

Windows is likely to be affected, too (untested).

EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
The program runs forever (as it does with unaffected Java 11).
ACTUAL -
The program prints the following stack trace and stops:

java.lang.IllegalStateException: Current state = RESET, new state = FLUSHED
	at java.base/java.nio.charset.CharsetDecoder.throwIllegalStateException(CharsetDecoder.java:996)
	at java.base/java.nio.charset.CharsetDecoder.flush(CharsetDecoder.java:679)
	at java.base/java.nio.charset.CharsetDecoder.decode(CharsetDecoder.java:808)
	at java.base/java.util.zip.ZipCoder.toString(ZipCoder.java:59)
	at java.base/java.util.zip.ZipFile.getZipEntry(ZipFile.java:658)
	at java.base/java.util.zip.ZipFile$ZipEntryIterator.next(ZipFile.java:517)
	at java.base/java.util.zip.ZipFile$ZipEntryIterator.nextElement(ZipFile.java:505)
	at java.base/java.util.zip.ZipFile$ZipEntryIterator.nextElement(ZipFile.java:483)
	at ZipFileMultithreadingExample.iterateZipEntries(ZipFileMultithreadingExample.java:118)
	at ZipFileMultithreadingExample.keepIteratingZipEntries(ZipFileMultithreadingExample.java:105)
	at ZipFileMultithreadingExample.lambda$main$0(ZipFileMultithreadingExample.java:49)
	at java.base/java.lang.Thread.run(Thread.java:840)

---------- BEGIN SOURCE ----------
import static java.lang.Thread.currentThread;
import static java.nio.charset.StandardCharsets.UTF_8;
import static java.nio.file.Files.createTempFile;
import static java.util.Arrays.asList;
import static java.util.Collections.synchronizedList;
import static java.util.stream.Collectors.toList;

import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.lang.reflect.Field;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.UUID;
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.stream.Stream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

public class ZipFileMultithreadingExample {
  private static final Charset IBM_437 = Charset.forName("IBM437");

  /**
   * @apiNote Tested with Java 17, 21, 23, and 25 Build 5 (2025/1/9). It is NOT required to
   *     reproduce the bug, but it helps to understand what is going on. Needs the following VM
   *     option: --add-opens java.base/java.util.zip=ALL-UNNAMED
   */
  private static final boolean PRINT_DEBUG_INFO_SHOWING_MULTITHREADED_USAGE_OF_DECODER = false;

  public static void main(String[] args) throws Exception {
    int numberOfThreads = 2;
    int numberOfEntries = 64;

    Path zip = createTempFile(ZipFileMultithreadingExample.class.getSimpleName() + ".", ".zip");
    zip.toFile().deleteOnExit();
    fillZipWithEntries(zip, numberOfEntries);

    CyclicBarrier before = new CyclicBarrier(numberOfThreads + 1);
    CyclicBarrier after = new CyclicBarrier(numberOfThreads + 1);
    List<Throwable> problems = synchronizedList(new ArrayList<>());

    Runnable runnable = () -> keepIteratingZipEntries(before, after, zip, problems);

    List<Thread> threads =
        Stream.generate(() -> new Thread(runnable)).limit(numberOfThreads).collect(toList());

    for (Thread thread : threads) thread.start();

    int counter = 0;
    while (problems.isEmpty()) {
      before.await();
      after.await();
      System.out.println(counter + " " + problems.size());
      ++counter;
    }

    for (Thread thread : threads) thread.interrupt();
    for (Thread thread : threads) thread.join();

    for (Throwable problem : problems) {
      problem.printStackTrace();
    }
  }

  private static void fillZipWithEntries(Path dstFile, int numberOfEntries) throws IOException {
    FileTime now = FileTime.from(Instant.now());
    try (OutputStream outputStream = Files.newOutputStream(dstFile);
        ZipOutputStream zipOutputStream = new ZipOutputStream(outputStream, IBM_437)) {
      while (numberOfEntries-- > 0) {
        String uuidString = UUID.randomUUID().toString();

        String name =
            String.join(
                "/",
                asList(
                    uuidString.substring(0, 2), //
                    uuidString.substring(2, 4),
                    uuidString.substring(4) + ".txt"));

        ZipEntry zipEntry = new ZipEntry(name);
        zipEntry.setCreationTime(now); // Java needs times for proper UTF-8 support
        zipEntry.setLastModifiedTime(now);
        zipEntry.setLastAccessTime(now);

        zipOutputStream.putNextEntry(zipEntry);
        zipOutputStream.write(uuidString.getBytes(UTF_8));
        zipOutputStream.write('\n');
        zipOutputStream.closeEntry();
      }
    }
  }

  private static void keepIteratingZipEntries(
      CyclicBarrier before, CyclicBarrier after, Path zip, List<Throwable> problems) {
    while (!currentThread().isInterrupted()) {
      if (!tryAwait(before)) return;
      try {
        iterateZipEntries(zip);
      } catch (Throwable t) {
        problems.add(t);
      }
      if (!tryAwait(after)) return;
    }
  }

  private static void iterateZipEntries(Path zip) {
    try (ZipFile zipFile = new ZipFile(zip.toFile(), IBM_437)) {
      if (PRINT_DEBUG_INFO_SHOWING_MULTITHREADED_USAGE_OF_DECODER) printDebugInfo(zipFile);
      Enumeration<? extends ZipEntry> entries = zipFile.entries();
      while (entries.hasMoreElements()) {
        entries.nextElement();
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  @SuppressWarnings("BooleanMethodIsAlwaysInverted")
  private static boolean tryAwait(CyclicBarrier after) {
    try {
      after.await();
      return true;
    } catch (BrokenBarrierException e) {
      return false;
    } catch (InterruptedException e) {
      currentThread().interrupt();
      return false;
    }
  }

  private static void printDebugInfo(ZipFile zipFile) {
    try {
      Field resField = ZipFile.class.getDeclaredField("res");
      resField.setAccessible(true);
      Object res = resField.get(zipFile);
      Field zsrcField = res.getClass().getDeclaredField("zsrc");
      zsrcField.setAccessible(true);
      Object zsrc = zsrcField.get(res);
      Field zcField = zsrc.getClass().getDeclaredField("zc");
      zcField.setAccessible(true);
      Object zc = zcField.get(zsrc);
      Field decField = zc.getClass().getDeclaredField("dec");
      decField.setAccessible(true);
      Object dec = decField.get(zc);
      System.out.println(currentThread() + " uses " + dec);
    } catch (Throwable t) {
      System.out.println(
          "Failed to print debug info. In Java 17, 21, 23, and 25 Build 5 (2025/1/9), the following"
              + " VM option is required: --add-opens java.base/java.util.zip=ALL-UNNAMED"
              + (" - Original error: " + t));
    }
  }
}

---------- END SOURCE ----------

CUSTOMER SUBMITTED WORKAROUND :
There are multiple possibilities for avoiding this bug:
- use Java 11
- use StandardCharsets#UTF_8 as the second argument of the ZipFile constructor (or omit the argument)
- externally synchronize access to all ZipFile instances

I believe that the following also helps (due to internal synchronization based on ZipFile.this):
- share the same ZipFile instance among all participating threads. Note that in the example, each thread creates its own ZipFile instance for the same zip file on disk.
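
The external-synchronization workaround could look roughly like the sketch below. It is only an illustration and not part of the original report: the class name SynchronizedZipListing, the lock ZIP_LOCK, and the method listEntryNames are hypothetical, and the sketch assumes that every ZipFile access in the process (open, iterate, close) goes through the same lock.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

final class SynchronizedZipListing {
  // Hypothetical process-wide lock. Assumption: all code that touches ZipFile
  // instances for the same file synchronizes on this one object.
  private static final Object ZIP_LOCK = new Object();

  /** Opens the zip, reads all entry names, and closes it while holding the lock. */
  static List<String> listEntryNames(File zip, Charset charset) throws IOException {
    synchronized (ZIP_LOCK) {
      try (ZipFile zipFile = new ZipFile(zip, charset)) {
        List<String> names = new ArrayList<>();
        Enumeration<? extends ZipEntry> entries = zipFile.entries();
        while (entries.hasMoreElements()) {
          names.add(entries.nextElement().getName());
        }
        return names;
      }
    }
  }
}
```

This serializes all zip access, so it trades throughput for safety and is probably only reasonable as a temporary measure on affected versions.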


FREQUENCY : often



Comments
Changeset: 2c4e8d21
Branch: master
Author: Jaikiran Pai <jpai@openjdk.org>
Date: 2025-05-14 01:53:19 +0000
URL: https://git.openjdk.org/jdk/commit/2c4e8d211a030c85488e656a9a851d10dd0f9c11
14-05-2025

A pull request was submitted for review.
Branch: master
URL: https://git.openjdk.org/jdk/pull/23986
Date: 2025-03-11 14:23:02 +0000
11-03-2025

Additional Information from submitter:
I observe this bug in a real-world application where each of multiple threads uses its own ZipFile instance for reading from the same ZIP file on disk. Counterintuitively, the real-world application passes a stress test if all threads share the same ZipFile instance. Inspecting the code, I see a synchronized(ZipFile.this) block that helps in the shared case, but is clearly useless when each thread has its own exclusive ZipFile instance. So in my use case, sharing the ZipFile is a viable temporary workaround (although ZipFile is not officially thread-safe).
The analysis regarding sharing of a ZipFile$Source across multiple ZipFile instances is correct. The performance optimizations of JDK-8243469 also added the decoder to that shared resource, which also caused the related bug JDK-8260010 (the same as this one, but for UTF8, and already solved).
17-01-2025
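
The shared-instance workaround described in the comment above might be sketched as follows. This is an assumption-laden illustration, not code from the submitter's application: the class SharedZipFileSketch and the method countEntriesConcurrently are hypothetical names, and the sketch relies on the submitter's observation that entry iteration is synchronized on the ZipFile instance, which is not a documented thread-safety guarantee.

```java
import java.io.File;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

final class SharedZipFileSketch {
  /** Lets several threads iterate the entries of ONE shared ZipFile instance. */
  static List<Integer> countEntriesConcurrently(File zip, Charset charset, int threads)
      throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try (ZipFile shared = new ZipFile(zip, charset)) {
      List<Future<Integer>> futures = new ArrayList<>();
      for (int i = 0; i < threads; i++) {
        futures.add(
            pool.submit(
                () -> {
                  int count = 0;
                  // Each task gets its own enumeration over the shared ZipFile.
                  Enumeration<? extends ZipEntry> entries = shared.entries();
                  while (entries.hasMoreElements()) {
                    entries.nextElement();
                    count++;
                  }
                  return count;
                }));
      }
      // Wait for all tasks before the try block closes the shared ZipFile.
      List<Integer> counts = new ArrayList<>();
      for (Future<Integer> future : futures) {
        counts.add(future.get());
      }
      return counts;
    } finally {
      pool.shutdown();
    }
  }
}
```

Every task should report the same entry count. Any exception thrown during iteration surfaces from future.get() wrapped in an ExecutionException.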

Tested on Windows:
- 24 - Failed
- 23.0.1 - Failed
- 21.0.5 - Failed
- 17.0.13 - Failed
- 11.0.25 - Passed
- 8u431 - Passed
16-01-2025

Looking at the test and experimenting with a similar test locally, I think this is a genuine issue. The problem here is that even if individual threads of the application are working on different ZipFile instances, they could still run into this issue, if those instances all work against the same underlying ZIP file. This is due to an internal detail in ZipFile$Source and the way we construct and use a ZipCoder (and thus the CharsetDecoder and CharsetEncoder). We end up reusing the same decoder/encoder instance (which aren't thread safe) across ZipFile instances.
15-01-2025
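
To illustrate why a shared decoder is a problem, the sketch below replays on a single thread one interleaving that two threads could produce on the same CharsetDecoder. It is a simplified model, not the JDK's ZipCoder code and not the fix from the changeset: the class SharedDecoderSketch and its methods are hypothetical, and the "thread A" and "thread B" labels are only comments. CharsetDecoder tracks an internal state, so a reset() issued by one caller between another caller's decode and flush steps leaves the decoder in a state where flush is rejected.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;

final class SharedDecoderSketch {
  /** Returns true if the simulated interleaving makes flush() throw. */
  static boolean interleavedFlushFails(Charset charset) {
    CharsetDecoder shared = charset.newDecoder();
    ByteBuffer in = ByteBuffer.wrap(new byte[] {'a', '.', 't', 'x', 't'});
    CharBuffer out = CharBuffer.allocate(16);

    shared.reset();               // "thread A" begins a decode operation
    shared.decode(in, out, true); // "thread A" decodes with endOfInput = true
    shared.reset();               // "thread B" begins its own decode on the same instance
    try {
      shared.flush(out);          // "thread A" resumes; the decoder was reset underneath it
      return false;
    } catch (IllegalStateException e) {
      return true;
    }
  }

  /** One way to avoid sharing state: create a fresh decoder for each call. */
  static String decodeWithOwnDecoder(Charset charset, byte[] bytes)
      throws CharacterCodingException {
    return charset.newDecoder().decode(ByteBuffer.wrap(bytes)).toString();
  }
}
```

Creating a decoder per call costs an allocation each time. Whether that cost is acceptable is a design choice, and the changeset linked in the comments should be consulted for how the JDK actually resolved the issue.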

ZipFile is not specified to be thread-safe, but there is likely code in the ecosystem that assumes it is, so this will require attention, as Charset encoders/decoders are not thread-safe.
14-01-2025