JDK-6498139 : Buffer allocations in SJSXP introduce large constant factors
  • Type: Enhancement
  • Component: xml
  • Sub-Component: javax.xml.stream
  • Affected Version: 1.4.0
  • Priority: P3
  • Status: Closed
  • Resolution: Fixed
  • OS: solaris_10
  • CPU: sparc
  • Submitted: 2006-11-28
  • Updated: 2012-04-25
  • Resolved: 2006-11-30
The Version table provides details about the release in which this issue/RFE will be addressed.

Unresolved: Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed: Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

Other             JDK 6             JDK 7
1.4.0 1.4 Fixed   6-pool Resolved   7 Fixed
Every time we encounter a new entity (the document itself is treated as an entity), the XMLEntityManager.startEntity method is called. This method in turn creates new instances of RewindableInputStream, Entity.ScannedEntity, and a ???Reader (where ??? is UTF8, ASCII, UCS, etc.).

The RewindableInputStream wraps the InputStream in a buffer that can be rewound and re-read. It allocates a byte array of size DEFAULT_XMLDECL_BUFFER_SIZE = 64. ScannedEntity creates a new char array of either 8192 or 1024 chars, depending on whether the entity is external or internal, respectively. The Reader (UTF8Reader is what I experimented with) creates a byte array, typically of the same size as the ScannedEntity's char array (8192 for the document entity). Thus, for every entity, we allocate three buffers, which can be relatively expensive, especially if the documents are small. One way to avoid these allocations is to use a buffer pool: take the buffers from the pool and return them when parsing of the entity is complete. The key issue is identifying when to release the buffers.
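The buffer pool idea described above can be illustrated with a minimal sketch. This is not the code that was committed for this issue; the class CharBufferPool, its methods acquire/release/allocationCount, and the maxPooled limit are all hypothetical names chosen for the example. It assumes that the caller knows when an entity has been fully parsed and releases the buffer at that point, which is exactly the hard part noted above.

```java
import java.util.ArrayDeque;

/**
 * Hypothetical pool of fixed-size char buffers (illustration only,
 * not the actual SJSXP implementation).
 */
final class CharBufferPool {
    private final int bufferSize;
    private final int maxPooled;
    private final ArrayDeque<char[]> free = new ArrayDeque<>();
    private int allocations = 0;

    CharBufferPool(int bufferSize, int maxPooled) {
        this.bufferSize = bufferSize;
        this.maxPooled = maxPooled;
    }

    /** Returns a pooled buffer if one is available, otherwise allocates a new one. */
    synchronized char[] acquire() {
        char[] buf = free.pollFirst();
        if (buf == null) {
            allocations++;
            buf = new char[bufferSize];
        }
        return buf;
    }

    /**
     * Returns a buffer to the pool. Buffers of the wrong size are dropped,
     * as are buffers beyond the pool limit. The caller must not use the
     * buffer after releasing it and must not release it twice.
     */
    synchronized void release(char[] buf) {
        if (buf != null && buf.length == bufferSize && free.size() < maxPooled) {
            free.addFirst(buf);
        }
    }

    /** Number of buffers actually allocated so far. */
    synchronized int allocationCount() {
        return allocations;
    }
}

final class BufferPoolDemo {
    public static void main(String[] args) {
        // 8192 chars matches the size used for an external (document) entity.
        CharBufferPool pool = new CharBufferPool(8192, 4);

        // Simulate parsing 1000 small documents, one entity each.
        for (int i = 0; i < 1000; i++) {
            char[] buf = pool.acquire();
            try {
                buf[0] = '<'; // stand-in for scanning the entity
            } finally {
                pool.release(buf); // entity fully parsed: give the buffer back
            }
        }
        System.out.println("allocations: " + pool.allocationCount());
    }
}
```

In this sketch the 1000 simulated parses share a single 8192-char buffer instead of allocating one each. A real parser would also need pools for the byte arrays and would have to handle nested entities, where several buffers are in use at once.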

Preliminary performance results indicate that 2-3X better performance is achievable on small messages (< 1KB) by implementing a better buffer allocation strategy.

EVALUATION

A new buffer allocation strategy has been implemented. Throughput for small messages (< 1KB) has doubled as a result of significantly reduced GC time.