JDK-6496274 : jar seems to use more CPU than it should
  • Type: Bug
  • Component: tools
  • Sub-Component: jar
  • Affected Version: 6,7
  • Priority: P3
  • Status: Closed
  • Resolution: Fixed
  • OS: generic
  • CPU: generic
  • Submitted: 2006-11-21
  • Updated: 2011-07-15
  • Resolved: 2011-07-15
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 6 JDK 7 Other
6u18Fixed 7 b44Fixed OpenJDK6Fixed
Related Reports
Relates :  
Description
While debugging a problem recently I unjarred rt.jar, copied over a few classes, and rejarred it with

  jar cv0fm ../rt-new.jar META-INF/MANIFEST.MF com java javax ...

Doing this I found that the CPU went to 100% busy for quite some time, perhaps a minute or more on my laptop, before starting to hit the disk to create the jar archive. This behavior occurs pretty identically on both Windows and Solaris. On Solaris a couple of thread dumps from JDK 1.4.2 while the CPU is busy look like

"main" prio=5 tid=0x00032e98 nid=0x1 runnable [0xffbfd000..0xffbfe510]
        at java.io.UnixFileSystem.getBooleanAttributes0(Native Method)
        at java.io.UnixFileSystem.getBooleanAttributes(UnixFileSystem.java:228)
        at java.io.File.isFile(File.java:725)
        at sun.tools.jar.Main.expand(Main.java:365)
        at sun.tools.jar.Main.expand(Main.java:376)
        at sun.tools.jar.Main.expand(Main.java:376)
        at sun.tools.jar.Main.expand(Main.java:346)
        at sun.tools.jar.Main.run(Main.java:143)
        - locked <0xf1ff0e78> (a sun.tools.jar.Main)
        at sun.tools.jar.Main.main(Main.java:904)

and

"main" prio=5 tid=0x00032e98 nid=0x1 runnable [0xffbfd000..0xffbfe510]
        at java.io.UnixFileSystem.compare(UnixFileSystem.java:290)
        at java.io.File.compareTo(File.java:1463)
        at java.io.File.equals(File.java:1509)
        at java.util.Hashtable.contains(Hashtable.java:274)
        - locked <0xf1ff0ed8> (a java.util.Hashtable)
        at sun.tools.jar.Main.expand(Main.java:366)
        at sun.tools.jar.Main.expand(Main.java:376)
        at sun.tools.jar.Main.expand(Main.java:376)
        at sun.tools.jar.Main.expand(Main.java:376)
        at sun.tools.jar.Main.expand(Main.java:376)
        at sun.tools.jar.Main.expand(Main.java:376)
        at sun.tools.jar.Main.expand(Main.java:346)
        at sun.tools.jar.Main.run(Main.java:143)
        - locked <0xf1ff0e78> (a sun.tools.jar.Main)
        at sun.tools.jar.Main.main(Main.java:904) 

It seems to me that whatever stat()s and hashes are occurring are taking longer than necessary. I would expect jar to behave much more like tar in starting to hit the disk and create the archive more quickly. Given that we have found inefficient algorithms in portions of the platform before, I think the above area of code warrants some more investigation to see whether there is an inefficient algorithm in use. This high CPU use for a long time period has been present since 1.4.2 and appears in 5.0 and 6 as well.

Comments
EVALUATION jar cf is much faster now, especially for larget jar file such as rt.jar. The speed of jaring the rt.jar with cf0m is roughly boosted from 1:20+ to under 15s on my "old" sparc & linux machine. Though still slower than tar cf:-) should consider to backport to 6ux release.
18-12-2008

EVALUATION lots of places to improve, the biggest bottleneck is the usage of java.util.Hashtable.contains() in expand(...), we can walk the file tree (collect the list of files) much, much faster if we add a simple HashSet just for contains lookup. Experiment shows the speed of "expand" can be 40 times faster when jar/create the rt.jar (not including the file read/write time, but the time we now spend on "expand" is about 30%-50% of the tatal time, which is unacceptable).
11-12-2008