The customer is running Solaris 8 on an eight-CPU system.
When they experience a Full GC using 1.4.2, their transaction server
throws DataValidationExceptions complaining that the integrity of the
data has changed after the collection is finished. This causes rollbacks
in transaction requests, so trades go unfilled and money is lost.
The server is only using parallelgc at that time for cleaning the
The only change they report in their trade environment is the switch
from 1.4.1_03 to 1.4.2_03. The transaction server is written in C++,
so they have native threads referencing Java objects. However, turning
on the CMS collection with the UseParGC option seems to hide the
problem: they never see the update exceptions after a Full GC using
these options.
The customer has run a couple of tests with the UseParNewGC collector
for several hours and did not experience any UpdateExceptions from their
transaction server after a Full GC occurred. The heap options are listed
below:
-server -showversion -Xms512m
-Xmx512m -XX:NewSize=500m -XX:MaxNewSize=500m -XX:InitialSurvivorRatio=4
-XX:TargetSurvivorRatio=100 -XX:+PrintCompilation -XX:+UseParNewGC
-XX:MaxPermSize=256MB -XX:PermSize=3m -XX:MinPermHeapExpansion=1m
-XX:+PrintHeapAtGC -verbose:gc -XX:+PrintGCTimeStamps -Xnoclassgc
Why would there be such a difference in behavior?
The application is more like 98% Java, and 2% C++.
The C++ code handles some of their ORB transport code (using
a C-API to a 3rd-party sockets vendor). When the Java code talks
to another process, it calls down to the C++ layer. This C++ code
establishes the connection to the outside, and creates a
"Receive" thread to receive messages from the newly created socket.
Or, if another process initiates contact, the same receive thread is
created for incoming messages.
When a new message is received from a remote process, very
minimal processing is done at the C++ layer before the JNI UpCall
takes place. The Java code invoked from JNI processes the ORB message
and figures out which handling thread (a pure Java thread) the
message should be dispatched to. The message is just put onto
an internal queue, and then the dispatch thread picks it up and
calls application code (like plug-in code) to actually do
Objects in Question
So, the C++ layer is pretty thin, and just interacts with the
older C-API to the 3rd-party vendor software (which itself is
really just a layer on top of sockets). The C++ threads that
are created are attached to the JVM so they can make calls to
the VM to create buffers, into which the incoming messages are
copied. That buffer is basically the only Java object that the
C++ thread creates, and it is passed up during the JNI up-call.
This buffer is copied into separate objects created by the ORB
code, so after the JNI call, the C++-created buffers are no
So, the objects that get modified (unexpectedly) after the few
Full GCs in 1.4.2 are not C++-created, nor are they stored or
referenced in the C++ code.
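
For reference, the message path described above (JNI up-call, copy of
the buffer into a separate object, internal queue, dispatch thread) can
be sketched roughly as follows. This is only an illustration under
stated assumptions: every class and method name here is hypothetical and
none of it is the customer's code. It also uses java.util.concurrent and
lambdas for brevity, which postdate 1.4.2, and a plain Java call stands
in for the C++ receive thread.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch of the described message flow; not the customer's code.
public class MessageFlowSketch {
    // Internal queue between the JNI up-call and the pure-Java dispatch thread.
    private final BlockingQueue<byte[]> queue = new LinkedBlockingQueue<>();
    // Stand-in for whatever the application (plug-in) code does with a message.
    private final List<byte[]> handled = new CopyOnWriteArrayList<>();

    // Stand-in for the Java method the C++ receive thread would invoke via JNI.
    // The incoming buffer is copied, so nothing downstream keeps a reference
    // to the object that the native thread created.
    public void onNativeMessage(byte[] nativeBuffer) {
        byte[] copy = Arrays.copyOf(nativeBuffer, nativeBuffer.length);
        queue.add(copy);
    }

    // One iteration of the dispatch thread: take a message, call handler code.
    public void dispatchOne() throws InterruptedException {
        byte[] message = queue.take();
        handled.add(message);
    }

    public List<byte[]> handled() {
        return handled;
    }

    public static void main(String[] args) throws Exception {
        MessageFlowSketch flow = new MessageFlowSketch();
        Thread dispatcher = new Thread(() -> {
            try {
                flow.dispatchOne();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        dispatcher.start();

        byte[] nativeBuffer = {1, 2, 3};
        flow.onNativeMessage(nativeBuffer);
        dispatcher.join();

        // Reusing the original buffer afterwards does not touch the queued copy.
        nativeBuffer[0] = 99;
        System.out.println(Arrays.toString(flow.handled().get(0)));
    }
}
```

The point of the sketch is that, in the flow as described, the objects
seen by the dispatch thread are copies made on the Java side, which is
consistent with the statement that the corrupted objects are neither
created nor referenced by the C++ code.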
The GC output and log information is available in the attachments.