United StatesChange Country, Oracle Worldwide Web Sites Communities I am a... I want to...
Bug ID: JDK-6496274 jar seems to use more CPU than it should
JDK-6496274 : jar seems to use more CPU than it should

Details
Type:
Bug
Submit Date:
2006-11-21
Status:
Closed
Updated Date:
2011-07-15
Project Name:
JDK
Resolved Date:
2011-07-15
Component:
tools
OS:
generic
Sub-Component:
jar
CPU:
generic
Priority:
P3
Resolution:
Fixed
Affected Versions:
6,7
Fixed Versions:

Related Reports
Backport:
Backport:
Relates:

Sub Tasks

Description
While debugging a problem recently I unjarred rt.jar, copied over a few classes, and rejarred it with

  jar cv0fm ../rt-new.jar META-INF/MANIFEST.MF com java javax ...

Doing this I found that the CPU went to 100% busy for quite some time, perhaps a minute or more on my laptop, before starting to hit the disk to create the jar archive. This behavior occurs pretty identically on both Windows and Solaris. On Solaris a couple of thread dumps from JDK 1.4.2 while the CPU is busy look like

"main" prio=5 tid=0x00032e98 nid=0x1 runnable [0xffbfd000..0xffbfe510]
        at java.io.UnixFileSystem.getBooleanAttributes0(Native Method)
        at java.io.UnixFileSystem.getBooleanAttributes(UnixFileSystem.java:228)
        at java.io.File.isFile(File.java:725)
        at sun.tools.jar.Main.expand(Main.java:365)
        at sun.tools.jar.Main.expand(Main.java:376)
        at sun.tools.jar.Main.expand(Main.java:376)
        at sun.tools.jar.Main.expand(Main.java:346)
        at sun.tools.jar.Main.run(Main.java:143)
        - locked <0xf1ff0e78> (a sun.tools.jar.Main)
        at sun.tools.jar.Main.main(Main.java:904)

and

"main" prio=5 tid=0x00032e98 nid=0x1 runnable [0xffbfd000..0xffbfe510]
        at java.io.UnixFileSystem.compare(UnixFileSystem.java:290)
        at java.io.File.compareTo(File.java:1463)
        at java.io.File.equals(File.java:1509)
        at java.util.Hashtable.contains(Hashtable.java:274)
        - locked <0xf1ff0ed8> (a java.util.Hashtable)
        at sun.tools.jar.Main.expand(Main.java:366)
        at sun.tools.jar.Main.expand(Main.java:376)
        at sun.tools.jar.Main.expand(Main.java:376)
        at sun.tools.jar.Main.expand(Main.java:376)
        at sun.tools.jar.Main.expand(Main.java:376)
        at sun.tools.jar.Main.expand(Main.java:376)
        at sun.tools.jar.Main.expand(Main.java:346)
        at sun.tools.jar.Main.run(Main.java:143)
        - locked <0xf1ff0e78> (a sun.tools.jar.Main)
        at sun.tools.jar.Main.main(Main.java:904) 

It seems to me that whatever stat()s and hashes are occurring are taking longer than necessary. I would expect jar to behave much more like tar in starting to hit the disk and create the archive more quickly. Given that we have found inefficient algorithms in portions of the platform before, I think the above area of code warrants some more investigation to see whether there is an inefficient algorithm in use. This high CPU use for a long time period has been present since 1.4.2 and appears in 5.0 and 6 as well.

                                    

Comments
EVALUATION

lots of places to improve, the biggest bottleneck is the usage of java.util.Hashtable.contains() in expand(...), we can walk the file tree (collect the list of files) much, much faster if we add a simple HashSet just for contains lookup. Experiment shows the speed of "expand" can be 40 times faster when jar/create the rt.jar (not including the file read/write time, but the time we now spend on "expand" is about 30%-50% of the tatal time, which is unacceptable).
                                     
2008-12-11
EVALUATION

jar cf is much faster now, especially for larget jar file such as rt.jar. The speed of jaring the rt.jar with cf0m is roughly boosted from 1:20+ to under 15s on my "old" sparc & linux machine. Though still slower than tar cf:-)

should consider to backport to 6ux release.
                                     
2008-12-18



Hardware and Software, Engineered to Work Together