JDK-4927845 : Footprint regressions in Tiger b20
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.util.jar
  • Affected Version: 5.0
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • OS: solaris_9
  • CPU: generic
  • Submitted: 2003-09-25
  • Updated: 2013-11-01
  • Resolved: 2004-02-06
  • Fixed in: 5.0 b38
Description
There is a 2% regression in footprint and 1% regression in startup. 

http://alacrity.sfbay/nq/nes.jsp?base=1842,1843,1844,1845,1846,1847,1848,1850&build=2424,2425,2426,2427,2428,2430,2431,2461&pval=0.01

The regression could be caused by the J2D integration, which pulls in new listeners via the AWT class ComponentAccessory.

/net/hyori/export/logs/tiger/b20/b19b20.xframer.pdf

Comments
CONVERTED DATA
BugTraq+ Release Management Values
COMMIT TO FIX: tiger-beta2
FIXED IN: tiger-beta2
INTEGRATED IN: tiger-b38 tiger-beta2
14-06-2004

SUGGESTED FIX

--- /u/martin/ws/jzcell/webrev/src/share/native/java/util/zip/zip_util.h- 2004-01-26 21:49:16.577802000 -0800
+++ zip_util.h 2004-01-26 21:49:12.425860000 -0800
@@ -125,19 +125,19 @@
  * every entry in every active JAR.
  * Note that in order to save space we don't keep the name in memory,
  * but merely remember a 32 bit hash.
  */
 typedef struct jzcell {
-    jlong pos;                  /* Offset of LOC within ZIP file */
+    unsigned int pos;           /* Offset of LOC within ZIP file */
     unsigned int hash;          /* 32 bit hashcode on name */
     unsigned short nelen;       /* length of name and extra data */
     unsigned short next;        /* hash chain: index into jzfile->entries */
-    jlong size;                 /* Uncompressed size */
-    jlong csize;                /* Compressed size */
+    unsigned int size;          /* Uncompressed size */
+    unsigned int csize;         /* Compressed size */
     jint crc;
     unsigned short elen;        /* length of extra data in CEN */
-    jlong cenpos;               /* Offset of file headers in CEN */
+    unsigned int cenpos;        /* Offset of file headers in CEN */
 } jzcell;
 
 /*
  * Descriptor for a ZIP file.
  */
11-06-2004
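
To make the effect of the suggested fix concrete, here is a minimal sketch that declares both layouts side by side and computes the table footprint for a given number of entries. The names jzcell_wide, jzcell_narrow, jlong_t, jint_t, cell_table_bytes and narrowing_savings are hypothetical and do not appear in zip_util.h; jlong and jint are modeled with int64_t and int32_t, which is an assumption about the JNI typedefs. Exact struct sizes depend on the ABI's padding rules, so the sketch computes them with sizeof rather than hard-coding them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the JNI typedefs (assumption: jlong is 64-bit, jint is 32-bit). */
typedef int64_t jlong_t;
typedef int32_t jint_t;

/* Layout before the suggested fix: offsets and sizes are 64-bit. */
struct jzcell_wide {
    jlong_t pos;            /* Offset of LOC within ZIP file */
    unsigned int hash;      /* 32 bit hashcode on name */
    unsigned short nelen;   /* length of name and extra data */
    unsigned short next;    /* hash chain: index into jzfile->entries */
    jlong_t size;           /* Uncompressed size */
    jlong_t csize;          /* Compressed size */
    jint_t crc;
    unsigned short elen;    /* length of extra data in CEN */
    jlong_t cenpos;         /* Offset of file headers in CEN */
};

/* Layout after the suggested fix: the four jlong fields become unsigned int. */
struct jzcell_narrow {
    unsigned int pos;
    unsigned int hash;
    unsigned short nelen;
    unsigned short next;
    unsigned int size;
    unsigned int csize;
    jint_t crc;
    unsigned short elen;
    unsigned int cenpos;
};

/* Bytes used by a table of `entries` cells of `cell_size` bytes each. */
static size_t cell_table_bytes(size_t cell_size, size_t entries) {
    return cell_size * entries;
}

/* Bytes saved across `entries` cells by switching to the narrow layout. */
static size_t narrowing_savings(size_t entries) {
    assert(sizeof(struct jzcell_narrow) <= sizeof(struct jzcell_wide));
    return cell_table_bytes(sizeof(struct jzcell_wide), entries)
         - cell_table_bytes(sizeof(struct jzcell_narrow), entries);
}
```

Calling narrowing_savings(12000) approximates the saving for a jar with about 12k entries, the figure the evaluation quotes for rt.jar. Each of the four narrowed fields gives up one 32-bit word, so the result should be in the neighborhood of four times the "50k per word" estimate, subject to padding.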

EVALUATION

According to the nightly trends, the regression appeared around 9/12, which makes the T&L putback the most likely cause of the footprint regression. The startup regression is much smaller and harder to locate, so it should be filed as a separate bug.
###@###.### 2003-09-26

It is very likely that my fix for large zip files introduced the footprint regression:

4910572: jarFile.getInputStream dumps core if JarFile is closed. [iag]
4262583: GZIPInputStream throws IOException for very large files [iag]
4418997: Files between 2 Gb and 4Gb (excluded) are not accepted in a zip file. [iag]
4795136: CRC check fails for files over 2 GB [iag]

When I implemented that fix, I was not aware that a struct jzcell is created for every entry in every jar file at startup time. Every 32-bit word of overhead adds 50k to the size of the internal jar entry hash table, since rt.jar currently contains 12k entries. Although all quantities related to file sizes and file offsets should logically be jlong, for Tiger we should revert just the fields in struct jzcell (not jzentry) to "unsigned int".

There are other steps we can take as well to reduce the overhead of jzcell, with increasing levels of risk and effort. For example, we don't really need to store the uncompressed size, since it can be obtained easily if we know the compressed size.
###@###.### 2003-12-31

There is no need for struct jzcell to contain as much information as it does, since it replicates information in the CEN header. I have a fix that shrinks struct jzcell from 32 bytes to 12 bytes, a saving of 20 bytes * number of jzcell entries. That's 250k just for rt.jar. Alacrity on linux-i586 measures an 8% footprint improvement, confirming that this is indeed the fix.

By having ZipFile.entries().getNextEntry() read the CEN header rather than the LOC header, we also get huge wins in file read locality. This is not noticeable if the zip file is already cached in memory, but it is a huge win if the zip file is on a non-cached slow network drive, such as a Samba drive accessed from Windows. So this is also a fix for 4770745.
###@###.### 2004-01-04

Fixing this requires replacing the readLOC function. En passant, it will fix this memory leak:

FREE_AND_RETURN_NULL:
#ifndef USE_MMAP
    if (ze != NULL) {
        if (ze->extra != NULL) free(ze->extra);
        if (ze->name != NULL) free(ze->name);
        free(ze);
    }
    if (locbuf != NULL) free(locbuf);
#endif

This will leak memory if USE_MMAP is defined, because the cleanup is then compiled out entirely.
###@###.### 2004-01-24

It is too risky to fix all of these problems together in Tiger, so we are making small, safe fixes for Tiger and will make a more comprehensive change, based on the ideas above, when this code is reworked for Mustang.

The simple memory leak bug above has been assigned to bug 4983832: Memory leak on Windows when zip errors occur in zip_util.c:readLOC().

The footprint regression will be fixed by changing the "jlong"s in struct jzcell to "unsigned int". This gives us a win of about 2% in the "footprint" benchmark used internally at Sun. "struct jzcell" is still bigger than it needs to be; that will be addressed later.
###@###.### 2004-01-26
26-01-2004
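
Reverting the jzcell fields to "unsigned int" rather than to a signed 32-bit type presumably matters for the large-zip bugs listed in the evaluation: an unsigned 32-bit field can still represent offsets and sizes between 2 GB and 4 GB (exclusive), while a signed one cannot. The following is a rough sketch of that range check, not code from zip_util.c. The helper names fits_in_cell_field and to_cell_field are hypothetical, and the sketch assumes unsigned int is 32 bits wide, modeled here with uint32_t.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if a 64-bit offset or size can be stored
 * in a 32-bit unsigned cell field without loss, 0 otherwise. */
static int fits_in_cell_field(int64_t value) {
    return value >= 0 && value <= (int64_t)UINT32_MAX;
}

/* Hypothetical helper: narrows a 64-bit value to a 32-bit unsigned field.
 * The caller is expected to have checked the range first; the assert
 * documents that precondition. */
static uint32_t to_cell_field(int64_t value) {
    assert(fits_in_cell_field(value));
    return (uint32_t)value;
}
```

For example, an offset of 3 GB (3LL << 30) exceeds INT32_MAX but passes fits_in_cell_field, whereas an offset of 4 GB (1LL << 32) does not fit and would need the 64-bit representation kept in jzentry.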