JDK-6326935 : Exception message's size is more than doubled everytime an exception is thrown
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 1.3.1_16
  • Priority: P2
  • Status: Resolved
  • Resolution: Fixed
  • OS: solaris_9
  • CPU: sparc
  • Submitted: 2005-09-21
  • Updated: 2010-12-08
  • Resolved: 2005-10-28
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
Other Other JDK 6
1.3.1_17 b01Fixed 1.4.2_11Fixed 6Fixed
Related Reports
Relates :  
Relates :  
Description
This is a customer escalation 1-11839641, case # 64729037.

This is likely a bug in the hotspot optimisation. The testcase generates repeated
NULL pointer exceptions on purpose. It is observed that the size of the exception message grows ~2x the size of the previous exception. JVM eventually run out of memory resulting in an OOM.
Interesting but true. 

Here are test results from different configurations on the Weblogic's JVM:

JVM version    options                               problem reproducible
--------------------------------------------------------------------------
1.3.1_16       -server                                           yes
1.3.1_16       -server, -XX:-OmitStackTraceInFastThrow           no
1.3.1_16       -client                                           no
1.3.1_16       -Xint                                             no

1.4.2_09       -server                                           yes
1.4.2_09       -server, -XX:-OmitStackTraceInFastThrow           no
1.4.2_09       -client                                           no
1.4.2_09       -Xint                                             no


A known workaround for the customer is to go with 1.3.1_16 with -XX:-OmitStackTraceInFastThrow, given a slight performance penalty.
(Note Weblogic 7.0 SP4 does not support 1.4.2)  

The original files for the testcase is in
/net/cores.central/cores/64729037/CONFIGURATION,
note the OOMTest2 is modified to use a single bean instead of N bean which N
is specified in the command line.

Original Weblogic logs which track the exception message sizes and GC output is in
/net/cores.central/cores/64729037/LOGS


Here are steps to reproduce the problem:

1. logon to khalsi.red.iplanet.com
   % setenv TESTHOME /home/lc3092

2. % cd $TESTHOME/bea/weblogic700/server/bin
   % sh startWLS.sh  (as root)
   Note you can change JAVA_HOME, JAVA_VM, or JAVA_OPTIONS settings inside 
   this script
 
3. wait for weblogic to start up, you will see something about server started
   in RUNNING mode:

<Sep 21, 2005 3:22:46 PM PDT> <Notice> <WebLogicServer> <000360> <Server started in RUNNING mode> 

4. % cd $TESTHOME/client
   % setenv CLASSPATH .:$TESTHOME/bea/weblogic700/server/lib/weblogic.jar
   % java OOMTest2 10 khalsi.red.iplanet.com


Here is the output observed from weblogic server:
(note the increasing exception message's length)

<...lots of stuff deleted...>

ejbCreate called
------- Server side ex.msg.len = 122, ex.class=class
java.lang.NullPointerException
setSessionContext called
ejbCreate called
------- Server side ex.msg.len = 369, ex.class=class
java.lang.NullPointerException
setSessionContext called
ejbCreate called
------- Server side ex.msg.len = 863, ex.class=class
java.lang.NullPointerException
setSessionContext called
ejbCreate called
------- Server side ex.msg.len = 1851, ex.class=class
java.lang.NullPointerException
setSessionContext called
ejbCreate called
------- Server side ex.msg.len = 3827, ex.class=class
java.lang.NullPointerException
setSessionContext called
ejbCreate called
------- Server side ex.msg.len = 7779, ex.class=class
java.lang.NullPointerException
setSessionContext called
ejbCreate called
------- Server side ex.msg.len = 15683, ex.class=class
java.lang.NullPointerException
setSessionContext called
ejbCreate called
------- Server side ex.msg.len = 31491, ex.class=class
java.lang.NullPointerException


Here's output from client:

<...lots of stuff deleted...>

java.lang.NullPointerException:
Start server side stack trace:
java.lang.NullPointerException
	<<no stack trace available>>
End  server side stack trace

	<<no stack trace available>>
End  server side stack trace

	<<no stack trace available>>
End  server side stack trace

	<<no stack trace available>>
End  server side stack trace

	<<no stack trace available>>
End  server side stack trace

	<<no stack trace available>>
End  server side stack trace

	<<no stack trace available>>
End  server side stack trace

	<<no stack trace available>>
End  server side stack trace

	<<no stack trace available>>
End  server side stack trace

	<<no stack trace available>>
<<----------------
------- Main  ex.msg.len = 157933, ex.class=class java.rmi.RemoteException
I created a tarball of the web server with the deployed EJB as
an alternative way to reproduce:

1. untar /net/cores.central/cores/64729037/testcase.tar into $TESTHOME

2. % cd $TESTHOME/bea/weblogic700/server/bin
   % sh startWLS.sh
   Note you can change JAVA_HOME, JAVA_VM, or JAVA_OPTIONS settings inside
   this script

3. wait for weblogic to start up, you will see something about server started
   in RUNNING mode:

<Sep 21, 2005 3:22:46 PM PDT> <Notice> <WebLogicServer> <000360> <Server started in RUNNING mode>

4. % cd $TESTHOME/client
   % setenv CLASSPATH .:$TESTHOME/bea/weblogic700/server/lib/weblogic.jar
   % java OOMTest2 10 <server>
   where server name is the server where you start weblogic,
   for eg. you logged on to titan.sfbay.sun.com and run startWLS.sh
   then run java OOMTest2 10 titan.sfbay.sun.com

Comments
SUGGESTED FIX --- src/share/vm/memory/javaClasses.cpp- Sat Feb 9 03:54:13 2002 +++ src/share/vm/memory/javaClasses.cpp Mon Sep 26 16:34:20 2005 @@ -514,15 +514,18 @@ oop java_lang_Throwable::message(oop throwable) { return throwable->obj_field(oopDesc::header_size() + detailMessage_offset); } - oop java_lang_Throwable::message(Handle throwable) { return throwable->obj_field(oopDesc::header_size() + detailMessage_offset); } +int java_lang_Throwable::detailMessage_offset_in_bytes() { + return (oopDesc::header_size() + detailMessage_offset) * wordSize; +} + void java_lang_Throwable::set_message(oop throwable, oop value) { throwable->obj_field_put(oopDesc::header_size() + detailMessage_offset, value); } --- src/share/vm/memory/javaClasses.hpp- Sat Feb 9 03:54:14 2002 +++ src/share/vm/memory/javaClasses.hpp Mon Sep 26 16:34:20 2005 @@ -215,10 +215,11 @@ static oop backtrace(oop throwable); static void set_backtrace(oop throwable, oop value); // Message static oop message(oop throwable); static oop message(Handle throwable); + static int detailMessage_offset_in_bytes(); static void set_message(oop throwable, oop value); // Print stack trace stored in exception by call-back to Java static void print_stack_trace(oop throwable, oop print_stream); static void print_stack_element(Handle stream, methodOop method, int bci); static void print_stack_usage(Handle stream); --- src/share/vm/opto/parse3.cpp- Wed Oct 20 12:09:10 2004 +++ src/share/vm/opto/parse3.cpp Mon Sep 26 17:59:29 2005 @@ -769,12 +769,22 @@ } { ciInstance* ex_obj = klass_index == Deoptimization::Deopt_null_check ? env()->NullPointerException_instance() : env()->ArithmeticException_instance(); + const TypeOopPtr* t = TypeOopPtr::make(ex_obj); ex_node = _gvn.transform(new(1)ConPNode(t)); + + int offset = java_lang_Throwable::detailMessage_offset_in_bytes(); + uint alias_idx = C->get_alias_index( t->add_offset( offset ) ); + + Node *mem = memory(alias_idx); + Node *adr = basic_plus_adr(ex_node, ex_node, offset); + set_memory( _gvn.transform( new (4) StorePNode(control(), mem, adr, _null, + Type::MEMORY ) ), alias_idx ); + store_barrier(ex_node, _null); } break; case Deoptimization::Deopt_athrow: ex_node = peek(); // do not change expression stack break;
27-09-2005

EVALUATION 2005-09-22: Fui-Shien ran the following tests using the testbed setup : 1) print com/verizon/enterprise/workflow/middleware/router/RouterBean testOOM without -XX:-OmitStackTraceInFastThrow output attached as problem.txt with -XX:-OmitStackTraceInFastThrow output attached as omitstacktrace.txt 2) -XX:+TraceExceptions -XX:+PrintCompilation 2 outputs attached: traceexceptions.txt (full output) traceexceptionsext.txt (tail of previous) Don't see any evidence of power of 2 expansion of exception string in these: ------- Server side ex.msg.len = 369, ex.class=class java.lang.NullPointerException Exception <a 'java/lang/NullPointerException'> thrown in interpreter method <{method} 'testOOM' '()V' in 'com/verizon/en terprise/workflow/middleware/router/RouterBean'> for thread 0x2aefb8 ... 421 ! com.verizon.enterprise.workflow.middleware.router.RouterBean::testOOM (107 bytes) 11% ! com.verizon.enterprise.workflow.middleware.router.RouterBean::testOOM @ 5 (107 bytes) ------- Server side ex.msg.len = 863, ex.class=class java.lang.NullPointerException 1953 [Exception (): a 'java/lang/NullPointerException' in <osr-adapter> at 0xfb412610] Exception <a 'java/lang/NullPointerException'> thrown in interpreter method <{method} 'testOOM' '()V' in 'com/verizon/en terprise/workflow/middleware/router/RouterBean_1kqmmn_EOImpl'> for thread 0x2aefb8 Still looking... will also run above with-XX:-OmitStackTraceInFastThrow CKP 2005-09-22:1130pmEDT: So the jvm's exception data doesn't contain the expanding data. Its only at the java level. This is whats in the client message for the 2nd iteration: ------- Main ex.msg.len = 469, ex.class=class java.rmi.RemoteException {java.rmi.RemoteException: EJB Exception: ; nested exception is: java.lang.NullPointerException Start server side stack trace: java.rmi.RemoteException: EJB Exception: ; nested exception is: java.lang.NullPointerException java.lang.NullPointerException <<no stack trace available>> End server side stack trace ; nested exception is: java.lang.NullPointerException: Start server side stack trace: java.lang.NullPointerException <<no stack trace available>> End server side stack trace } So it looks like some code that should reset the exception msg buffer is not being run. RMI is involved. Maybe we don't execute a finally clause? I will look at the RMI code involving remote exceptions more tomorrow. In the mean while does this trigger any lightbulbs amongst the compiler group ? Im still in the dark. Chris Fui-Shien Choong points out the following: src/share/vm/opto/parse3.cpp 746 //------------------------------athrow----------------------------------------- 747 748 void Parse::do_athrow(int klass_index, int bci) { 749 debug_only(int sp = _sp;) 750 751 // get exception node 752 Node* ex_node = NULL; 753 switch (klass_index) { 754 case Deoptimization::Deopt_null_check: 755 case Deoptimization::Deopt_div0_check: 756 // Note: This is currently broken in the way that no new instance is 757 // created for each null pointer/arithmetic exception, and thus 758 // no stack trace is filled in. However, creating a new exception 759 // here is cause for some of the big slowdows in compilation due 760 // to the many runtime calls that need full debugging information 761 // and thus blow up register allocation time. 762 // 763 // THIS NEEDS TO BE FIXED 764 if (!OmitStackTraceInFastThrow) { 765 // From (Tiger & 1.4.2_05) workaround for 4292742. 766 // Do not discard backtraces if -XX:-OmitStackTraceInFastThrow is given. 767 uncommon_trap(klass_index, bci); 768 return; 769 } 770 { ciInstance* ex_obj = 771 klass_index == Deoptimization::Deopt_null_check 772 ? env()->NullPointerException_instance() 773 : env()->ArithmeticException_instance(); 774 const TypeOopPtr* t = TypeOopPtr::make(ex_obj); 775 ex_node = _gvn.transform(new(1)ConPNode(t)); 776 } 777 break; 778 case Deoptimization::Deopt_athrow: 779 ex_node = peek(); // do not change expression stack 780 break; 781 default: 782 ShouldNotReachHere(); 783 } and we do indeed: setSessionContext called ejbCreate called 11% ! com.verizon.enterprise.workflow.middleware.router.RouterBean::testOOM @ 5 (107 bytes) Uncommon trap occurred in com.verizon.enterprise.workflow.middleware.router.RouterBean::testOOM (@0xfb4adae8) unloaded_class_index = -8, thread = 22 Deoptimization: deoptimize frame (pc=0xfb4adae8, sp=0xe677f328) - frames stored in vframeArray(0x41ca30) - turned (nmethod*)0xfb4ad850 into a non_entrant nmethod Deoptimization: unpacking vframeArray(0x41ca30) >> {method} 'testOOM' '()V' in 'com/verizon/enterprise/workflow/middleware/router/RouterBean' Exception <a 'java/lang/NullPointerException'> thrown [/net/altair.east/eng/chrisphi/6326935/ws/hotspot/src/share/vm/prims/jvm.cpp, line 82] for thread 0x28f410 Steve Goldman: Any ideas on how to fix this? Chris PS Following change does avoid the problem(with perfomace cost?): *** parse3.cpp-OLD Wed Sep 21 12:44:45 2005 --- parse3.cpp Fri Sep 23 07:23:25 2005 *************** *** 761,767 **** // and thus blow up register allocation time. // // THIS NEEDS TO BE FIXED ! if (!OmitStackTraceInFastThrow) { // From (Tiger & 1.4.2_05) workaround for 4292742. // Do not discard backtraces if -XX:-OmitStackTraceInFastThrow is given. uncommon_trap(klass_index, bci); --- 761,769 ---- // and thus blow up register allocation time. // // THIS NEEDS TO BE FIXED ! // if (!OmitStackTraceInFastThrow) { ! { ! // ALWAYS // From (Tiger & 1.4.2_05) workaround for 4292742. // Do not discard backtraces if -XX:-OmitStackTraceInFastThrow is given. uncommon_trap(klass_index, bci); CKP Tom Rodriguez wrote: I think the problem is that weblogic is mutating the detail message of exceptions using reflection and since C2 is throwing the same exception repeatedly the exception message grows. weblogic.rmi.utils.Utilities.getThrowableWithStackTrace digs out the detail message of exceptions and wraps some text around them. Here's the decompiled source for the beginning of it. public static Throwable getThrowableWithStackTrace(Throwable throwable, boolean bool) { String string = throwable.getMessage(); if (string == null || bool || string.indexOf(marker) == -1) { String string_5_ = StackTraceUtils.throwable2StackTrace(throwable); String string_6_ = (string == null ? startBanner + string_5_ + endBanner : string + "\n" + startBanner + string_5_ + endBanner); try { Field field = Throwable.class.getDeclaredField("detailMessage"); field.setAccessible(true); field.set(throwable, string_6_); return throwable; startBanner and endBanner are the "Start server side stack trace" and "End server side stack trace" strings you see in the output. This seems like a pretty dubious piece of code. Mutating the private fields of a system class is a bad idea to my thinking. I don't know for sure that this code is being executed in the test case but it would explain the problem and it appears to be the only source of the banner strings in the message output. As far as fixing it, we chould null out the detail message on every throw of the preallocated exception to make sure it doesn't grow without bounds. Actually I think the preallocated exceptions should have a detail message explaining what they are instead of having null detail messages. This might help with diagnosing problems related to them more quickly. tom I've added a testcase Bug6326935.java that when run as follows reproduces the failure in 1.3.1,1.4.2,5u6: before the fix: java -server -showversion -Xcomp Bug6326935 java version "1.3.1_16" Java(TM) 2 Runtime Environment, Standard Edition (build 1.3.1_16-b06) Java HotSpot(TM) Server VM (build 1.3.1_16-b06, compiled mode) java.lang.reflect.InvocationTargetException: java.lang.OutOfMemoryError <<no stack trace available>> - Main ex.msg.len = 0, ex.class=class java.lang.reflect.InvocationTargetException after the suggested fix: java -server -showversion -Xcomp Bug6326935 java version "1.3.1_16" Java(TM) 2 Runtime Environment, Standard Edition (build 1.3.1_16-b06) Java HotSpot(TM) Server VM (build 1.3.1_17ea-fix_04-esc1-11839641cr6326935-2005.09.26-chrisphi, compiled mode) (no OOM).
22-09-2005