JDK-8243469 : Lazily encode name in ZipFile.getEntryPos
  • Type: Enhancement
  • Component: core-libs
  • Sub-Component: java.util.jar
  • Priority: P4
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2020-04-23
  • Updated: 2025-03-11
  • Resolved: 2020-04-27
Fix Version: JDK 15 (build 15 b21, Fixed)
Description
The current implementation of ZipFile.getEntryPos takes the encoded byte[] of the String we're looking up, which means that when looking up entries across multiple jar files, we allocate the byte[] over and over, once for each jar file searched.

By refactoring the ZipFile hash table to save a String-normalized hash value rather than a hash of the encoded value, we can avoid this allocation most of the time when the entry is not found in the jar/zip, while also getting the benefit of the caching of String.hashCode. 
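The idea above can be illustrated with a toy index. This is a minimal sketch, not the JDK code: the class LazyEncodeSketch, its HashMap-backed table, and the encodeCount counter are all hypothetical names introduced for illustration. It assumes UTF-8 names and only shows the control flow: probe with the cached String.hashCode first, and encode the name to a byte[] only when a bucket with that hash exists.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy index; the real ZipFile hash table is laid out differently.
public class LazyEncodeSketch {
    // Maps a String-compatible hash to the encoded names stored under it.
    private final Map<Integer, List<byte[]>> table = new HashMap<>();
    // Counts how often a lookup had to encode the name (illustration only).
    int encodeCount = 0;

    LazyEncodeSketch(String... names) {
        for (String name : names) {
            table.computeIfAbsent(name.hashCode(), k -> new ArrayList<>())
                 .add(name.getBytes(StandardCharsets.UTF_8));
        }
    }

    boolean contains(String name) {
        // String.hashCode is cached on the String, so repeated lookups of the
        // same name across many indexes do not recompute it.
        List<byte[]> bucket = table.get(name.hashCode());
        if (bucket == null) {
            return false; // miss: the name was never encoded to a byte[]
        }
        // Possible hit: only now pay for the encoding, then compare bytes.
        encodeCount++;
        byte[] encoded = name.getBytes(StandardCharsets.UTF_8);
        for (byte[] candidate : bucket) {
            if (Arrays.equals(candidate, encoded)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        LazyEncodeSketch jar = new LazyEncodeSketch("a/B.class", "META-INF/MANIFEST.MF");
        System.out.println("found missing entry: " + jar.contains("missing/C.class"));
        System.out.println("found a/B.class: " + jar.contains("a/B.class"));
        System.out.println("encodings performed: " + jar.encodeCount);
    }
}
```

When the same class name is looked up across many jar files, most probes are misses, so under this scheme most of them would return before any encoding happens.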

For jar files, or any zip using UTF-8 encoding, calculating such hashes during open (initCEN) can easily be made just as fast for ASCII entries, and the cost could be made negligible for non-ASCII entries.
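A minimal sketch of why the ASCII case is cheap, assuming UTF-8 encoded names: for pure ASCII bytes, each byte equals the corresponding char, so the String.hashCode polynomial (h = 31 * h + c) can be evaluated directly over the bytes without creating a String. The class and method names here (AsciiHashSketch, stringHash) are hypothetical, and the decode-on-first-non-ASCII fallback is just one simple option, not necessarily what the patch does.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper; not the JDK implementation.
public class AsciiHashSketch {
    // Returns the same value as
    // new String(bytes, off, len, StandardCharsets.UTF_8).hashCode().
    static int stringHash(byte[] bytes, int off, int len) {
        int h = 0;
        for (int i = off; i < off + len; i++) {
            byte b = bytes[i];
            if (b < 0) {
                // High bit set means a non-ASCII UTF-8 sequence: fall back to
                // decoding the whole name (slower, but expected to be rare).
                return new String(bytes, off, len, StandardCharsets.UTF_8).hashCode();
            }
            // ASCII byte value equals the char value, so this matches
            // String.hashCode's h = 31 * h + c step.
            h = 31 * h + b;
        }
        return h;
    }

    public static void main(String[] args) {
        String name = "META-INF/MANIFEST.MF";
        byte[] encoded = name.getBytes(StandardCharsets.UTF_8);
        System.out.println(stringHash(encoded, 0, encoded.length) == name.hashCode());
    }
}
```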

Experimental patch: http://cr.openjdk.java.net/~redestad/scratch/zipfile_string_hash.01

Instrumenting the cost of ZipFile.getEntry shows a ~120ms improvement on Spring PetClinic, roughly 2.5-3% of the total.
Comments
URL: https://hg.openjdk.java.net/jdk/jdk/rev/b570f0fe081e User: redestad Date: 2020-04-27 15:26:34 +0000