JDK-8156484 : ZipFile retains too much native memory from caching Inflaters
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.util.jar
  • Priority: P3
  • Status: In Progress
  • Resolution: Unresolved
  • Submitted: 2016-05-06
  • Updated: 2016-05-19
Description
When a ZipFile creates an Inflater to read a compressed zip entry, the Inflater is retained in a cache for future use.  Each Inflater retains around 32 KB of native memory in its zlib data structures.  Even when the cache is in heavy use, such as when iterating through a compressed zip file with many small entries, the optimization yields only a small speedup.  For its intended use of speeding up class loading, the benefit is not really noticeable.  On the other hand, for modern Java applications with many small jar files, the retained memory is a serious problem.  We see classpaths on the order of 10,000 jar files, each of which has a long-lived ZipFile object, resulting in memory bloat on the order of 1 GB.
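
For a rough sense of scale, the figures above can be turned into a back-of-the-envelope estimate. This is only a sketch: the class and method names are hypothetical, the 32 KB per Inflater is the approximate figure quoted above, and the number of Inflaters each ZipFile ends up caching is treated as a free parameter because the report does not state it.

```java
// Back-of-the-envelope estimate of native memory retained by cached Inflaters.
// Hypothetical helper for illustration; not part of the JDK.
final class InflaterFootprint {
    // Assumption: ~32 KB of zlib state per Inflater, as quoted in the description.
    static final long BYTES_PER_INFLATER = 32 * 1024;

    // Retained bytes = ZipFiles * cached Inflaters per ZipFile * bytes per Inflater.
    static long estimateBytes(long zipFiles, long cachedPerZipFile, long bytesPerInflater) {
        return zipFiles * cachedPerZipFile * bytesPerInflater;
    }

    public static void main(String[] args) {
        long zipFiles = 10_000; // classpath size mentioned in the description
        for (int cached = 1; cached <= 4; cached++) {
            long bytes = estimateBytes(zipFiles, cached, BYTES_PER_INFLATER);
            System.out.printf("%d cached per ZipFile -> %.1f MiB%n",
                              cached, bytes / (1024.0 * 1024.0));
        }
    }
}
```

Under this simple model, one cached Inflater per ZipFile comes to about 312 MiB across 10,000 jars, so bloat on the order of 1 GB would correspond to roughly three retained Inflaters per ZipFile.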

It's reasonable to remove the Inflater cache entirely, but it's also reasonable to have a small cache to make ZipFile iteration a wee bit faster.  But native resource cleanup is always a headache.
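
To make the "small cache" option concrete, here is a minimal sketch of a bounded Inflater pool. It is not the ZipFile implementation and not the proposed fix; the class name BoundedInflaterCache and its methods are hypothetical. It relies only on the public Inflater API: reset() to reuse an instance and end() to free its native zlib memory right away.

```java
import java.util.ArrayDeque;
import java.util.zip.Inflater;

// Hypothetical bounded Inflater pool; an illustration, not the JDK's ZipFile code.
final class BoundedInflaterCache {
    private final ArrayDeque<Inflater> cache = new ArrayDeque<>();
    private final int maxCached;
    private boolean closed;

    BoundedInflaterCache(int maxCached) {
        this.maxCached = maxCached;
    }

    // Hand out a cached Inflater if one is available, otherwise create a new one.
    synchronized Inflater acquire() {
        Inflater inf = cache.poll();
        // nowrap = true: zip entries hold raw deflate data without a zlib header.
        return (inf != null) ? inf : new Inflater(true);
    }

    // Take an Inflater back: keep at most maxCached, free the rest immediately.
    // Callers must not release an Inflater that has already been ended.
    synchronized void release(Inflater inf) {
        if (!closed && cache.size() < maxCached) {
            inf.reset();      // make it reusable for the next entry
            cache.push(inf);
        } else {
            inf.end();        // release the native zlib memory now
        }
    }

    // Free everything still cached, e.g. when the owning zip file is closed.
    synchronized void close() {
        closed = true;
        Inflater inf;
        while ((inf = cache.poll()) != null) {
            inf.end();
        }
    }

    synchronized int size() {
        return cache.size();
    }
}
```

With maxCached set to 1, a single-threaded iteration still reuses one Inflater for every entry, while the retained native memory per zip file is capped at one Inflater's worth. The headache mentioned above remains: any Inflater that is handed out and never released is not covered by close().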
Comments
A microbenchmark trying to show the best possible case for Inflater caching.  Removing Inflater caching completely makes this about 25% slower.  But note that all the entries here are tiny.  In the real world (class loading) the regression is almost unnoticeable.  Let's not pay too high a price for the cache in complexity and memory bloat.

import java.io.*;
import java.nio.file.*;
import java.util.*;
import java.util.concurrent.*;
import java.util.zip.*;

public class Benchmark60kMulti {
    public static void main(String[] args) throws Throwable {
        final int REPS = 20;
        final int THREADS = 8;
        final byte[] data = "123456789".getBytes("UTF-8");

        // Create a zip file with 60,000 tiny deflated entries, if not already present.
        final File file = new File("60k.zip");
        if (!file.exists()) {
            try (FileOutputStream fos = new FileOutputStream(file);
                 BufferedOutputStream bos = new BufferedOutputStream(fos);
                 ZipOutputStream zos = new ZipOutputStream(bos)) {
                for (int i = 0; i < 60_000; i++) {
                    ZipEntry ze = new ZipEntry(i + ".txt");
                    ze.setMethod(ZipEntry.DEFLATED);
                    ze.setSize(0);
                    ze.setCrc(0);
                    zos.putNextEntry(ze);
                    zos.write(data);
                }
            }
        }

        // Give each thread its own symlink to the same zip file.
        Path target = Paths.get("60k.zip");
        for (int i = 0; i < THREADS; i++) {
            Path link = Paths.get("symlink-" + i + ".zip");
            if (!link.toFile().exists())
                Files.createSymbolicLink(link, target);
        }

        ExecutorService pool = Executors.newFixedThreadPool(THREADS);
        ArrayList<Future<?>> futures = new ArrayList<>();
        for (int i = 0; i < THREADS; i++) {
            final Path link = Paths.get("symlink-" + i + ".zip");
            // Each task opens its ZipFile once and reads every entry REPS times.
            final Runnable task = () -> {
                try (ZipFile zipFile = new ZipFile(link.toFile())) {
                    for (int j = 0; j < REPS; j++) {
                        byte[] buf = new byte[100];
                        Enumeration<? extends ZipEntry> entries = zipFile.entries();
                        while (entries.hasMoreElements()) {
                            ZipEntry zentry = entries.nextElement();
                            try (InputStream is = zipFile.getInputStream(zentry)) {
                                is.read(buf);
                            }
                        }
                    }
                } catch (Throwable t) {
                    throw new Error(t);
                }
            };
            futures.add(pool.submit(task));
        }

        // Wait for all tasks; get() rethrows any failure from a task.
        pool.shutdown();
        for (Future<?> future : futures)
            future.get();
        pool.awaitTermination(1, TimeUnit.DAYS);
    }
}
19-05-2016

http://cr.openjdk.java.net/~martin/webrevs/openjdk9/ZipFile-Inflater-cache/
19-05-2016