United StatesChange Country, Oracle Worldwide Web Sites Communities I am a... I want to...
JDK-6326935 : Exception message's size is more than doubled everytime an exception is thrown

Details
Type:
Bug
Submit Date:
2005-09-21
Status:
Resolved
Updated Date:
2010-12-08
Project Name:
JDK
Resolved Date:
2005-10-28
Component:
hotspot
OS:
solaris_9
Sub-Component:
compiler
CPU:
sparc
Priority:
P2
Resolution:
Fixed
Affected Versions:
1.3.1_16
Fixed Versions:
1.3.1_17 (b01)

Related Reports
Backport:
Backport:
Backport:
Relates:
Relates:

Sub Tasks

Description
This is a customer escalation 1-11839641, case # 64729037.

This is likely a bug in the hotspot optimisation. The testcase generates repeated
NULL pointer exceptions on purpose. It is observed that the size of the exception message grows ~2x the size of the previous exception. JVM eventually run out of memory resulting in an OOM.
Interesting but true. 

Here are test results from different configurations on the Weblogic's JVM:

JVM version    options                               problem reproducible
--------------------------------------------------------------------------
1.3.1_16       -server                                           yes
1.3.1_16       -server, -XX:-OmitStackTraceInFastThrow           no
1.3.1_16       -client                                           no
1.3.1_16       -Xint                                             no

1.4.2_09       -server                                           yes
1.4.2_09       -server, -XX:-OmitStackTraceInFastThrow           no
1.4.2_09       -client                                           no
1.4.2_09       -Xint                                             no


A known workaround for the customer is to go with 1.3.1_16 with -XX:-OmitStackTraceInFastThrow, given a slight performance penalty.
(Note Weblogic 7.0 SP4 does not support 1.4.2)  

The original files for the testcase is in
/net/cores.central/cores/64729037/CONFIGURATION,
note the OOMTest2 is modified to use a single bean instead of N bean which N
is specified in the command line.

Original Weblogic logs which track the exception message sizes and GC output is in
/net/cores.central/cores/64729037/LOGS


Here are steps to reproduce the problem:

1. logon to khalsi.red.iplanet.com
   % setenv TESTHOME /home/lc3092

2. % cd $TESTHOME/bea/weblogic700/server/bin
   % sh startWLS.sh  (as root)
   Note you can change JAVA_HOME, JAVA_VM, or JAVA_OPTIONS settings inside 
   this script
 
3. wait for weblogic to start up, you will see something about server started
   in RUNNING mode:

<Sep 21, 2005 3:22:46 PM PDT> <Notice> <WebLogicServer> <000360> <Server started in RUNNING mode> 

4. % cd $TESTHOME/client
   % setenv CLASSPATH .:$TESTHOME/bea/weblogic700/server/lib/weblogic.jar
   % java OOMTest2 10 khalsi.red.iplanet.com


Here is the output observed from weblogic server:
(note the increasing exception message's length)

<...lots of stuff deleted...>

ejbCreate called
------- Server side ex.msg.len = 122, ex.class=class
java.lang.NullPointerException
setSessionContext called
ejbCreate called
------- Server side ex.msg.len = 369, ex.class=class
java.lang.NullPointerException
setSessionContext called
ejbCreate called
------- Server side ex.msg.len = 863, ex.class=class
java.lang.NullPointerException
setSessionContext called
ejbCreate called
------- Server side ex.msg.len = 1851, ex.class=class
java.lang.NullPointerException
setSessionContext called
ejbCreate called
------- Server side ex.msg.len = 3827, ex.class=class
java.lang.NullPointerException
setSessionContext called
ejbCreate called
------- Server side ex.msg.len = 7779, ex.class=class
java.lang.NullPointerException
setSessionContext called
ejbCreate called
------- Server side ex.msg.len = 15683, ex.class=class
java.lang.NullPointerException
setSessionContext called
ejbCreate called
------- Server side ex.msg.len = 31491, ex.class=class
java.lang.NullPointerException


Here's output from client:

<...lots of stuff deleted...>

java.lang.NullPointerException:
Start server side stack trace:
java.lang.NullPointerException
	<<no stack trace available>>
End  server side stack trace

	<<no stack trace available>>
End  server side stack trace

	<<no stack trace available>>
End  server side stack trace

	<<no stack trace available>>
End  server side stack trace

	<<no stack trace available>>
End  server side stack trace

	<<no stack trace available>>
End  server side stack trace

	<<no stack trace available>>
End  server side stack trace

	<<no stack trace available>>
End  server side stack trace

	<<no stack trace available>>
End  server side stack trace

	<<no stack trace available>>
<<----------------
------- Main  ex.msg.len = 157933, ex.class=class java.rmi.RemoteException
I created a tarball of the web server with the deployed EJB as
an alternative way to reproduce:

1. untar /net/cores.central/cores/64729037/testcase.tar into $TESTHOME

2. % cd $TESTHOME/bea/weblogic700/server/bin
   % sh startWLS.sh
   Note you can change JAVA_HOME, JAVA_VM, or JAVA_OPTIONS settings inside
   this script

3. wait for weblogic to start up, you will see something about server started
   in RUNNING mode:

<Sep 21, 2005 3:22:46 PM PDT> <Notice> <WebLogicServer> <000360> <Server started in RUNNING mode>

4. % cd $TESTHOME/client
   % setenv CLASSPATH .:$TESTHOME/bea/weblogic700/server/lib/weblogic.jar
   % java OOMTest2 10 <server>
   where server name is the server where you start weblogic,
   for eg. you logged on to titan.sfbay.sun.com and run startWLS.sh
   then run java OOMTest2 10 titan.sfbay.sun.com

                                    

Comments
EVALUATION

2005-09-22:
     Fui-Shien ran the following tests using the testbed setup :
       1) print com/verizon/enterprise/workflow/middleware/router/RouterBean testOOM
        without -XX:-OmitStackTraceInFastThrow
        output attached as problem.txt
        with -XX:-OmitStackTraceInFastThrow
        output attached as omitstacktrace.txt
       2) -XX:+TraceExceptions -XX:+PrintCompilation
         2 outputs attached: 
            traceexceptions.txt (full output)
            traceexceptionsext.txt (tail of previous)
      Don't see any evidence of power of 2 expansion of
      exception string in these:
------- Server side ex.msg.len = 369, ex.class=class java.lang.NullPointerException
Exception <a 'java/lang/NullPointerException'> thrown in interpreter method <{method} 'testOOM' '()V' in 'com/verizon/en
terprise/workflow/middleware/router/RouterBean'> for thread 0x2aefb8
...
421  !   com.verizon.enterprise.workflow.middleware.router.RouterBean::testOOM (107 bytes)
 11% !   com.verizon.enterprise.workflow.middleware.router.RouterBean::testOOM @ 5
 (107 bytes)
------- Server side ex.msg.len = 863, ex.class=class java.lang.NullPointerException
1953 [Exception (): a 'java/lang/NullPointerException' in <osr-adapter> at 0xfb412610]
Exception <a 'java/lang/NullPointerException'> thrown in interpreter method <{method} 'testOOM' '()V' in 'com/verizon/en
terprise/workflow/middleware/router/RouterBean_1kqmmn_EOImpl'> for thread 0x2aefb8

        Still looking... will also run above with-XX:-OmitStackTraceInFastThrow
CKP

2005-09-22:1130pmEDT:

So the jvm's exception data doesn't contain the expanding data. Its only at the java level. This is whats in the client message for the 2nd iteration:
------- Main  ex.msg.len = 469, ex.class=class java.rmi.RemoteException
{java.rmi.RemoteException: EJB Exception: ; nested exception is: 
        java.lang.NullPointerException

Start server side stack trace:
java.rmi.RemoteException: EJB Exception: ; nested exception is: 
        java.lang.NullPointerException
java.lang.NullPointerException
        <<no stack trace available>>
End  server side stack trace
; nested exception is: 
        java.lang.NullPointerException: 
Start server side stack trace:
java.lang.NullPointerException
        <<no stack trace available>>
End  server side stack trace
}

So it looks like some code that should reset 
the exception msg buffer is not being run.
 RMI is involved. Maybe we don't execute a finally clause?
I will look at the RMI code involving remote exceptions 
more tomorrow.
In the mean while does this trigger any lightbulbs 
amongst the compiler group ? Im still in the dark.

Chris


Fui-Shien Choong  points out the following:
src/share/vm/opto/parse3.cpp
   746  //------------------------------athrow-----------------------------------------
   747
   748  void Parse::do_athrow(int klass_index, int bci) {
   749    debug_only(int sp = _sp;)
   750
   751    // get exception node
   752    Node* ex_node = NULL;
   753    switch (klass_index) {
   754      case Deoptimization::Deopt_null_check:
   755      case Deoptimization::Deopt_div0_check:
   756        // Note: This is currently broken in the way that no new instance is
   757        //       created for each null pointer/arithmetic exception, and thus
   758        //       no stack trace is filled in. However, creating a new exception
   759        //       here is cause for some of the big slowdows in compilation due
   760        //       to the many runtime calls that need full debugging information
   761        //       and thus blow up register allocation time.
   762        //
   763        //       THIS NEEDS TO BE FIXED
   764        if (!OmitStackTraceInFastThrow) {
   765          // From (Tiger & 1.4.2_05) workaround for 4292742.
   766          // Do not discard backtraces if -XX:-OmitStackTraceInFastThrow is given.
   767          uncommon_trap(klass_index, bci);
   768          return;
   769        }
   770        { ciInstance* ex_obj =
   771            klass_index == Deoptimization::Deopt_null_check
   772              ? env()->NullPointerException_instance()
   773              : env()->ArithmeticException_instance();
   774          const TypeOopPtr* t = TypeOopPtr::make(ex_obj);
   775          ex_node = _gvn.transform(new(1)ConPNode(t));
   776        }
   777        break;
   778      case Deoptimization::Deopt_athrow:
   779        ex_node = peek(); // do not change expression stack
   780        break;
   781      default:
   782        ShouldNotReachHere();
   783    }
and we do indeed:
setSessionContext called
ejbCreate called
 11% !   com.verizon.enterprise.workflow.middleware.router.RouterBean::testOOM @ 5
 (107 bytes)
Uncommon trap occurred in com.verizon.enterprise.workflow.middleware.router.RouterBean::testOOM (@0xfb4adae8) unloaded_class_index = -8, thread = 22
Deoptimization: deoptimize frame (pc=0xfb4adae8, sp=0xe677f328)
 - frames stored in vframeArray(0x41ca30)
 - turned (nmethod*)0xfb4ad850 into a non_entrant nmethod
Deoptimization: unpacking vframeArray(0x41ca30)
      >> {method} 'testOOM' '()V' in 'com/verizon/enterprise/workflow/middleware/router/RouterBean'
Exception <a 'java/lang/NullPointerException'> thrown [/net/altair.east/eng/chrisphi/6326935/ws/hotspot/src/share/vm/prims/jvm.cpp, line 82] for thread 0x28f410

Steve Goldman: Any ideas on how to fix this?

Chris

PS
Following change does avoid the problem(with perfomace cost?):
*** parse3.cpp-OLD      Wed Sep 21 12:44:45 2005
--- parse3.cpp  Fri Sep 23 07:23:25 2005
***************
*** 761,767 ****
        //       and thus blow up register allocation time.
        //
        //       THIS NEEDS TO BE FIXED
!       if (!OmitStackTraceInFastThrow) {
          // From (Tiger & 1.4.2_05) workaround for 4292742.
          // Do not discard backtraces if -XX:-OmitStackTraceInFastThrow is given.
          uncommon_trap(klass_index, bci);
--- 761,769 ----
        //       and thus blow up register allocation time.
        //
        //       THIS NEEDS TO BE FIXED
!       // if (!OmitStackTraceInFastThrow) {
!       {
!       // ALWAYS
          // From (Tiger & 1.4.2_05) workaround for 4292742.
          // Do not discard backtraces if -XX:-OmitStackTraceInFastThrow is given.
          uncommon_trap(klass_index, bci);


CKP

Tom Rodriguez wrote:
I think the problem is that weblogic is mutating the detail message of exceptions using reflection and since C2 is throwing the same exception repeatedly the exception message grows. weblogic.rmi.utils.Utilities.getThrowableWithStackTrace digs out the detail message of exceptions and wraps some text around them.  Here's the decompiled source for the beginning of it.

    public static Throwable getThrowableWithStackTrace(Throwable throwable,
                                                       boolean bool) {
        String string = throwable.getMessage();
        if (string == null || bool || string.indexOf(marker) == -1) {
            String string_5_ = StackTraceUtils.throwable2StackTrace(throwable);
            String string_6_
                = (string == null ? startBanner + string_5_ + endBanner
                   : string + "\n" + startBanner + string_5_ + endBanner);
            try {
                Field field
                    = Throwable.class.getDeclaredField("detailMessage");
                field.setAccessible(true);
                field.set(throwable, string_6_);
                return throwable;


startBanner and endBanner are the "Start server side stack trace" and "End server side stack trace" strings you see in the output.  This seems like a pretty dubious piece of code.  Mutating the private fields of a system class is a bad idea to my thinking.  I don't know for sure that this code is being executed in the test case but it would explain the problem and it appears to be the only source of the banner strings in the message output.  As far as fixing it, we chould null out the detail message on every throw of the preallocated exception to make sure it doesn't grow without bounds.  Actually I think the preallocated exceptions should have a detail message explaining what they are instead of having null detail messages.  This might help with diagnosing problems related to them more quickly.

tom

I've added a testcase Bug6326935.java that when run as follows reproduces
the failure in 1.3.1,1.4.2,5u6:
before the fix:
java -server -showversion -Xcomp Bug6326935
java version "1.3.1_16"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.3.1_16-b06)
Java HotSpot(TM) Server VM (build 1.3.1_16-b06, compiled mode)

java.lang.reflect.InvocationTargetException: java.lang.OutOfMemoryError
        <<no stack trace available>>
- Main  ex.msg.len = 0, ex.class=class java.lang.reflect.InvocationTargetException

after the suggested fix:
java -server -showversion -Xcomp Bug6326935
java version "1.3.1_16"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.3.1_16-b06)
Java HotSpot(TM) Server VM (build 1.3.1_17ea-fix_04-esc1-11839641cr6326935-2005.09.26-chrisphi, compiled mode)

(no OOM).
                                     
2005-09-22
SUGGESTED FIX

--- src/share/vm/memory/javaClasses.cpp-        Sat Feb  9 03:54:13 2002
+++ src/share/vm/memory/javaClasses.cpp Mon Sep 26 16:34:20 2005
@@ -514,15 +514,18 @@
 
 oop java_lang_Throwable::message(oop throwable) {
   return throwable->obj_field(oopDesc::header_size() + detailMessage_offset);
 }
 
-
 oop java_lang_Throwable::message(Handle throwable) {
   return throwable->obj_field(oopDesc::header_size() + detailMessage_offset);
 }
 
+int java_lang_Throwable::detailMessage_offset_in_bytes() {
+  return (oopDesc::header_size() + detailMessage_offset) * wordSize;
+}
+
 void java_lang_Throwable::set_message(oop throwable, oop value) {
   throwable->obj_field_put(oopDesc::header_size() + detailMessage_offset, value);
 }

--- src/share/vm/memory/javaClasses.hpp-        Sat Feb  9 03:54:14 2002
+++ src/share/vm/memory/javaClasses.hpp Mon Sep 26 16:34:20 2005
@@ -215,10 +215,11 @@
   static oop backtrace(oop throwable);
   static void set_backtrace(oop throwable, oop value);
   // Message
   static oop message(oop throwable);
   static oop message(Handle throwable);
+  static int detailMessage_offset_in_bytes();
   static void set_message(oop throwable, oop value);
   // Print stack trace stored in exception by call-back to Java
   static void print_stack_trace(oop throwable, oop print_stream);
   static void print_stack_element(Handle stream, methodOop method, int bci);
   static void print_stack_usage(Handle stream);

--- src/share/vm/opto/parse3.cpp-       Wed Oct 20 12:09:10 2004
+++ src/share/vm/opto/parse3.cpp        Mon Sep 26 17:59:29 2005
@@ -769,12 +769,22 @@
       }
       { ciInstance* ex_obj =
           klass_index == Deoptimization::Deopt_null_check
             ? env()->NullPointerException_instance()
             : env()->ArithmeticException_instance();
+
         const TypeOopPtr* t = TypeOopPtr::make(ex_obj);
         ex_node = _gvn.transform(new(1)ConPNode(t));
+
+       int offset = java_lang_Throwable::detailMessage_offset_in_bytes();
+       uint alias_idx = C->get_alias_index( t->add_offset( offset ) );
+
+       Node *mem = memory(alias_idx);
+       Node *adr = basic_plus_adr(ex_node, ex_node, offset);
+       set_memory( _gvn.transform( new (4) StorePNode(control(), mem, adr, _null,
+                                                      Type::MEMORY ) ), alias_idx );
+       store_barrier(ex_node, _null);
       }
       break;
     case Deoptimization::Deopt_athrow:
       ex_node = peek(); // do not change expression stack
       break;
                                     
2005-09-27



Hardware and Software, Engineered to Work Together