Bug ID: JDK-4770745 Iterating over ZipFile entries is very slow when using network file systems

Details
Type: Enhancement
Submit Date: 2002-10-29
Status: Resolved
Updated Date: 2005-02-19
Project Name: JDK
Resolved Date: 2005-02-19
Component: core-libs
OS: windows_2000
Sub-Component: java.util.jar
CPU: x86
Priority: P4
Resolution: Fixed
Affected Versions: 1.4.1
Fixed Versions:

Description
On Windows 2000 SP3, javac compilation from a network-mapped drive is much slower with JDK 1.3.1 and JDK 1.4.1 than with JDK 1.2.2.

After the customer moved from JDK 1.2.2 to 1.3.1, the compile time for their code increased from 4 hours (JDK 1.2.2) to 17-18 hours (JDK 1.3.1).
 
A simple test case consisting of 14 ".java" files is attached (JavaTest.zip)


The following are the test results for the ".java" files in the test case on a Windows 2000 SP3 machine, using JDK 1.2.2, JDK 1.3.1 and JDK 1.4.1:
---------------------------------------------------
                      Local          Network Mapped
JDK-1.2.2_013         18994ms        13092ms
JDK1.3.1_05-b02       9864ms         19712ms
JDK1.4.1-b21          15766ms        35064ms
---------------------------------------------------

The javac -verbose output details are attached (Logging.zip).

                                    

Comments
WORK AROUND

Using a local drive resolves the issue.
However, because many people work together on the project, using a local drive is not feasible.

--------------------------------------------
It is sufficient to copy the rt.jar to a local drive.
When performing many compilations, one might consider modifying makefiles
to do:
cp $NETWORKDIR/rt.jar $TMPDIR/rt.jar
javac -bootclasspath $TMPDIR/rt.jar ...
javac -bootclasspath $TMPDIR/rt.jar ...
javac -bootclasspath $TMPDIR/rt.jar ...
...
rm $TMPDIR/rt.jar

###@###.### 2003-04-29
                                     
2003-04-29
PUBLIC COMMENTS

javac is slow when installed on a network drive.
                                     
2004-06-10
CONVERTED DATA

BugTraq+ Release Management Values

COMMIT TO FIX:
dragon


                                     
2004-06-14
EVALUATION

OK, I'll look at this.  We intend to overhaul I/O in 1.5 to use
the new java.nio APIs.

###@###.### 2002-10-29

According to ###@###.###, the problem we are interested
in is why loading rt.jar takes so long in 1.3.x and 1.4.x javac
when rt.jar is stored on a network drive. When compiling the test files
with javac -verbose, 1.4.x javac pauses for about 6-7 seconds while it is
loading rt.jar (as observed with filemon), whereas 1.2.x javac
pauses for only 2 seconds.

The performance difference seems to be due to the following:
- The new v8 javac tries to get a whole listing of the jar file
  by using calls like "e = ZipFile.entries(), e.nextElement()".
  The old javac doesn't seem to do this; I guess it uses ZipFile.getEntry(String)
  instead.

- The "e = ZipFile.entries(), e.nextElement()" way of listing a
  zip file seems to be slower in 1.4.x than in 1.2.x.
  Running the attached simple Java program (ZipList.java) on the same
  networked rt.jar file,
  1.4.x takes about 6 seconds and 1.2.2 takes 2 seconds.
  This seems to be caused by a change made in 1.3.x in zip.dll, such that
  getting the listing now requires reading the local header for
  each compressed file in the zip. In 1.2.2, reading only the
  central directory was enough.
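
To make the two access patterns above concrete, here is a minimal sketch that
times a full listing via ZipFile.entries() against a single lookup via
ZipFile.getEntry(String). The class name ZipAccessPatterns and its helper
methods are hypothetical and written for illustration only; this is not the
attached ZipList.java and not javac's actual code.

```java
import java.io.IOException;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical illustration class; not part of the JDK or of the bug's attachments.
public class ZipAccessPatterns {

    // Full listing: visit every entry in the archive and count them.
    static int countEntries(String path) throws IOException {
        try (ZipFile zip = new ZipFile(path)) {
            int count = 0;
            Enumeration<? extends ZipEntry> e = zip.entries();
            while (e.hasMoreElements()) {
                e.nextElement();
                count++;
            }
            return count;
        }
    }

    // Targeted lookup: ask for one entry by name without enumerating the rest.
    static boolean hasEntry(String path, String name) throws IOException {
        try (ZipFile zip = new ZipFile(path)) {
            return zip.getEntry(name) != null;
        }
    }

    // Usage: java ZipAccessPatterns <archive> <entry-name>
    public static void main(String[] args) throws IOException {
        long t0 = System.nanoTime();
        int count = countEntries(args[0]);
        long t1 = System.nanoTime();
        boolean found = hasEntry(args[0], args[1]);
        long t2 = System.nanoTime();
        System.out.println("entries(): " + count + " entries in "
                + (t1 - t0) / 1000000 + " ms");
        System.out.println("getEntry(\"" + args[1] + "\"): found=" + found + " in "
                + (t2 - t1) / 1000000 + " ms");
    }
}
```

Running it against the same archive on a local and on a network drive should
show whether the enumeration path is the one that degrades; the timings will
depend on the JDK version and on file system caching.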

###@###.### 2003-04-15

Issue (1) is necessary for correctness.  However, the second issue should
be investigated in the context of the jar/zip libraries.

###@###.### 2003-04-16

It appears that the slowdown iterating through a jar file
on a slow network drive is a result of
4218006: indexing rt.jar costs 400 K
which improved startup footprint.

It should be possible to get both small startup footprint for
general java programs as well as efficient traversal of Jar files
for use by programs like javac by careful coding.

javac only needs the names of the entries in a jar file, and that can
be obtained by accessing only the "Central directory" of the jar files.
Accessing the LOC headers should not be necessary.
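
As an illustration of the point above, the following sketch lists entry names
by reading only the end-of-central-directory record and the central directory,
without touching any LOC header. It is an assumption-laden example, not the
fix that went into the JDK: the class name CentralDirectoryNames is
hypothetical, names are decoded as UTF-8, and Zip64 and multi-disk archives
are not handled.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; not the JDK's implementation.
public class CentralDirectoryNames {
    private static final int EOCD_SIG = 0x06054b50;  // end of central directory record
    private static final int CEN_SIG = 0x02014b50;   // central directory file header
    private static final int EOCD_MIN = 22;          // EOCD size without archive comment
    private static final int CEN_FIXED = 46;         // fixed part of a central header

    static List<String> listNames(String path) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            // Read the tail of the file: the EOCD plus the largest possible comment.
            long len = raf.length();
            int tailLen = (int) Math.min(len, EOCD_MIN + 0xFFFF);
            byte[] tail = new byte[tailLen];
            raf.seek(len - tailLen);
            raf.readFully(tail);
            ByteBuffer t = ByteBuffer.wrap(tail).order(ByteOrder.LITTLE_ENDIAN);

            // Scan backwards for the EOCD signature.
            int eocd = -1;
            for (int i = tailLen - EOCD_MIN; i >= 0; i--) {
                if (t.getInt(i) == EOCD_SIG) {
                    eocd = i;
                    break;
                }
            }
            if (eocd < 0) {
                throw new IOException("no end-of-central-directory record found");
            }
            int total = t.getShort(eocd + 10) & 0xFFFF;        // total entry count
            long cenSize = t.getInt(eocd + 12) & 0xFFFFFFFFL;  // central directory size
            long cenOff = t.getInt(eocd + 16) & 0xFFFFFFFFL;   // central directory offset

            // One contiguous read covers the whole central directory.
            byte[] cen = new byte[(int) cenSize];
            raf.seek(cenOff);
            raf.readFully(cen);
            ByteBuffer c = ByteBuffer.wrap(cen).order(ByteOrder.LITTLE_ENDIAN);

            List<String> names = new ArrayList<>(total);
            int pos = 0;
            for (int n = 0; n < total; n++) {
                if (c.getInt(pos) != CEN_SIG) {
                    throw new IOException("bad central directory header at " + pos);
                }
                int nameLen = c.getShort(pos + 28) & 0xFFFF;
                int extraLen = c.getShort(pos + 30) & 0xFFFF;
                int commentLen = c.getShort(pos + 32) & 0xFFFF;
                names.add(new String(cen, pos + CEN_FIXED, nameLen, StandardCharsets.UTF_8));
                pos += CEN_FIXED + nameLen + extraLen + commentLen;
            }
            return names;
        }
    }

    public static void main(String[] args) throws IOException {
        for (String name : listNames(args[0])) {
            System.out.println(name);
        }
    }
}
```

The whole listing costs two reads near the end of the file, which is why this
access pattern is attractive on a slow network drive.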

###@###.### 2003-04-29


Sorry, it doesn't look like we're going to be able to get this into
Tiger.  It's pretty risky to do it late.  The good news is that
a performance fix like this can be done in 1.5.1.
###@###.### 2003-11-04

I now understand how to fix this, based on an investigation of
4927845: Footprint regressions in Tiger b20
The analysis of that bug contains the strategy for fixing this one.
It's too late for Tiger, however.

###@###.### 2004-01-26

This should give dramatic performance improvements for any app, like javac, that
extracts metadata like file names from zip files.
Performance improvement for samba drives is an order of magnitude;
for NFS the improvements are more modest; perhaps 30%.
###@###.### 2005-1-19 07:25:34 GMT

Here's some more data:

A javac-like app can be reduced down to this canonical pattern:
 
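   import java.util.Enumeration;
   import java.util.jar.JarFile;
   import java.util.zip.ZipEntry;

   public class JarList {
       public static void main(String[] args) throws Exception {
           // Open the jar named on the command line and print every entry name.
           JarFile jar = new JarFile(args[0]);
           Enumeration<? extends ZipEntry> entries = jar.entries();

           while (entries.hasMoreElements()) {
               ZipEntry entry = entries.nextElement();
               System.out.println(entry.getName());
           }
           jar.close();
       }
   }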

If you run this code against an rt.jar on a Samba drive, you get

JAR=//MYHOST/root//java/re/jdk/1.4.2/archive/fcs/binaries/solaris-sparc/jre/lib/rt.jar; repeat 3 time jws foot jr -source 1.5 JarList.java $JAR >/dev/null; repeat 3 time jws mustang jr -source 1.5 JarList.java $JAR /dev/null
jws foot jr -source 1.5 JarList.java $JAR  0.28s user 0.06s system 14% cpu 2.302 total
jws foot jr -source 1.5 JarList.java $JAR  0.23s user 0.09s system 19% cpu 1.637 total
jws foot jr -source 1.5 JarList.java $JAR  0.23s user 0.09s system 19% cpu 1.665 total
jws mustang jr -source 1.5 JarList.java $JAR  0.23s user 0.06s system 2% cpu 12.531 total
jws mustang jr -source 1.5 JarList.java $JAR  0.23s user 0.11s system 4% cpu 7.594 total
jws mustang jr -source 1.5 JarList.java $JAR  0.23s user 0.07s system 3% cpu 7.792 total

which gives you a factor of 5 performance improvement on a warm start.

On Solaris NFS, results are more modest:

JAR=/java/re/jdk/1.4.2/archive/fcs/binaries/solaris-sparc/jre/lib/rt.jar; repeat 3 time jws foot jr -source 1.5 JarList.java $JAR >/dev/null; repeat 3 time jws mustang jr -source 1.5 JarList.java $JAR >/dev/null
jws foot jr -source 1.5 JarList.java $JAR > /dev/null  3.51s user 0.44s system 98% cpu 4.019 total
jws foot jr -source 1.5 JarList.java $JAR > /dev/null  3.69s user 0.33s system 99% cpu 4.042 total
jws foot jr -source 1.5 JarList.java $JAR > /dev/null  3.53s user 0.38s system 102% cpu 3.831 total
jws mustang jr -source 1.5 JarList.java $JAR > /dev/null  3.74s user 0.63s system 63% cpu 6.881 total
jws mustang jr -source 1.5 JarList.java $JAR > /dev/null  3.95s user 0.43s system 91% cpu 4.775 total
jws mustang jr -source 1.5 JarList.java $JAR > /dev/null  3.75s user 0.57s system 83% cpu 5.182 total

But 25% is still not a bad result.
###@###.### 2005-1-20 02:54:37 GMT
                                     
2005-01-20


