Bug ID: JDK-7012088 jump to 0 address because of lack of memory ordering in SignatureHandlerLibrary::add

Details
Type:
Bug
Submit Date:
2011-01-13
Status:
Closed
Updated Date:
2011-03-08
Project Name:
JDK
Resolved Date:
2011-03-08
Component:
hotspot
OS:
solaris_10
Sub-Component:
runtime
CPU:
sparc
Priority:
P3
Resolution:
Fixed
Affected Versions:
5.0u14
Fixed Versions:
hs21 (b02)

Related Reports
Backport:
Relates:

Sub Tasks

Description
A customer (CU) found that the JVM jumps to address 0 and terminates abnormally.

CONFIGURATION :
JDK : JDK5u14 server
OS : Solaris 10

PHENOMENA:
The following is the stack trace when the abort occurs.

#0  0xfe33092c in JVM_handle_solaris_signal
#1  <signal handler called>
#2  0x00000000 in ??
#3  0xf880c1d8 in %%% sun.misc.GC#maxObjectInspectionAge()J
#4  0xf8805904 in %%% sun.misc.GC$Daemon#run()V
#5  0xf8800220 in === BufferBlob[StubRoutines (1)] @ 0xf8800108 ===
#6  0xfe1a2e74 in void JavaCalls::call_helper(JavaValue*,methodHandle*,JavaCallArguments*,Thread*)
#7  0xfe37f424 in void JavaCalls::call_virtual(JavaValue*,Handle,KlassHandle,symbolHandle,symbolHandle,Thread*)
#8  0xfe39f9f0 in void thread_entry(JavaThread*,Thread*)
#9  0xfe39b2e4 in void JavaThread::run()
#10 0xfe7a6968 in void*_start(void*)
#11 0xfeac8c70 in _lwp_start (0x0, 0x0, 0x0, 0x0, 0x0, 0x0) from root/usr/lib/libc.so.1

(%%% marks a frame that is running in the interpreter.)

The JVM jumps to address 0 at the "call" instruction at 0xf880c224.
That instruction belongs to the following interpreter codelet:
  ...
method entry point (kind = native) [0xf880c070, 0xf880c4c0]  1104 bytes not safepoint safe
  ...

 0xf880c1d0:     call  0xfe33fcc8 <void InterpreterRuntime::prepare_native_call(JavaThread*,methodOopDesc*)>
 0xf880c1d4:     mov  %g2, %o0
 0xf880c1d8:     mov  %l7, %g2
 0xf880c1dc:     clr  [ %g2 + 0x120 ]
 0xf880c1e0:     clr  [ %g2 + 0x124 ]
 0xf880c1e4:     clr  [ %g2 + 0x12c ]
 0xf880c1e8:     ld  [ %g2 + 4 ], %g5
 0xf880c1ec:     tst  %g5
 0xf880c1f0:     be  %icc, 0xf880c200
 0xf880c1f4:     nop
 0xf880c1f8:     call  0xf8800140
 0xf880c1fc:     nop
 0xf880c200:     ld  [ %l2 + 0x4c ], %g3
 0xf880c204:     st  %l2, [ %sp + 8 ]
 0xf880c208:     mov  %l3, %o1
 0xf880c20c:     add  %fp, -24, %o2
 0xf880c210:     sub  %sp, %fp, %o3
 0xf880c214:     save  %sp, %o3, %sp
 0xf880c218:     ld  [ %fp + 8 ], %l2
 0xf880c21c:     mov  %i1, %l3
 0xf880c220:     mov  %i2, %l6
 0xf880c224:     call  %g3                 //  NOTE !!

At 0xf880c224, %g3 should hold the signature_handler of the methodOop,
but it actually holds 0.
The signature_handler should have been set by InterpreterRuntime::prepare_native_call,
which is called at 0xf880c1d0, but it was not.

The code of SignatureHandlerLibrary::add called from InterpreterRuntime::prepare_native_call is as follows.

    if (method->signature_handler() == NULL) {
    ...
      int handler_index = -1;
    ...
      if (UseFastSignatureHandlers && method->size_of_parameters() <= Fingerprinter::max_size_of_parameters) {
    ...
           MutexLocker mu(SignatureHandlerLibrary_lock);
    ...
           uint64_t fingerprint = Fingerprinter(method).fingerprint();
           handler_index = _fingerprints->find(fingerprint);
    ...
           address handler = set_handler(buffer);
    ...
           _fingerprints->append(fingerprint);
           _handlers->append(handler);               // *1
    ...
      }
    ...
      method->set_signature_handler(_handlers->at(handler_index));   // *2
    }

MutexLocker mu holds SignatureHandlerLibrary_lock through line *1, but not at line *2.
If one thread is executing line *2 while another thread executes line *1
and GrowableArray::grow is called, GrowableArray::_data may be replaced.
The thread executing _handlers->at(handler_index) at *2
may then read the wrong data.

The following code is extracted from GrowableArray::grow:

  for (int i = 0; i < _len; i++) newData[i] = _data[i];
  ...
  _data = newData;

The code above assigns newData to _data after copying the elements
of the old _data. Some memory ordering seems to be needed here.

When the crash occurs, the state at *2 appears to be:
  the new _data is visible, but the copied elements are not yet visible.
(This is similar to 6704010.)
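
The ordering problem described above can be illustrated with a small standalone sketch. This is not HotSpot code and not the fix that was applied: PublishedArray and all of its members are hypothetical names, and std::atomic stands in for whatever barrier primitives the VM would use. It assumes that writers are serialized by an external lock and that old blocks are kept alive rather than freed, so that a lock-free reader never touches released memory.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a growable array of handler addresses.
// Appends are assumed to be serialized by a lock held by the caller.
class PublishedArray {
 public:
  explicit PublishedArray(std::size_t capacity) : _capacity(capacity), _len(0) {
    assert(capacity > 0);
    _blocks.emplace_back(new void*[_capacity]());
    _data.store(_blocks.back().get(), std::memory_order_release);
  }

  // Writer side (caller holds the lock).
  void append(void* elem) {
    if (_len == _capacity) grow();
    _data.load(std::memory_order_relaxed)[_len++] = elem;
  }

  // Reader side (no lock). The acquire load pairs with the release store
  // in grow(), so a reader that sees the new block also sees the copies.
  // The caller must pass an index it already knows to be valid.
  void* at(std::size_t i) const {
    void** data = _data.load(std::memory_order_acquire);
    return data[i];
  }

  std::size_t capacity() const { return _capacity; }  // writer side only

 private:
  void grow() {
    std::size_t new_capacity = _capacity * 2;
    _blocks.emplace_back(new void*[new_capacity]());
    void** new_data = _blocks.back().get();
    void** old_data = _data.load(std::memory_order_relaxed);
    // Copy first ...
    for (std::size_t i = 0; i < _len; i++) new_data[i] = old_data[i];
    // ... then publish. Release ordering keeps the copies from becoming
    // visible after the pointer. The old block stays in _blocks, so a
    // reader still holding the old pointer does not read freed memory.
    _data.store(new_data, std::memory_order_release);
    _capacity = new_capacity;
  }

  std::size_t _capacity;
  std::size_t _len;
  std::atomic<void**> _data;
  std::vector<std::unique_ptr<void*[]>> _blocks;
};
```

With a plain pointer store in place of the release store, a weakly ordered CPU may make the new _data visible before the copied elements, which matches the state suspected at *2. Retaining old blocks costs memory; that trade-off is a choice made for this sketch only.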


Please also see the "Comments" section.

                                    

Comments
PUBLIC COMMENTS

This seems to be a basic race condition.

 method->set_signature_handler(_handlers->at(handler_index)); 

must be executed with the SignatureHandlerLibrary_lock held. Otherwise, as per the description, more than one thread can create and add a handler. Further, one thread can be accessing _handlers while another thread is growing it, and the crash occurs as described.

An assertion failure in this code exposed this very issue only recently, but it was not realized at the time that it also applies to non-debug code; see 6704010.
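
As a rough illustration of the locking discipline argued for in this comment, the following standalone sketch reads the handler table and writes the handler into the method while the same lock is held. It is not the HotSpot change itself: HandlerLibrary, Method, make_handler and handler_count are hypothetical names, std::mutex stands in for SignatureHandlerLibrary_lock, and std::vector stands in for GrowableArray.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <mutex>
#include <vector>

// Hypothetical stand-in for methodOopDesc: only the fields needed here.
struct Method {
  explicit Method(std::uint64_t fp) : fingerprint(fp), signature_handler(nullptr) {}
  std::uint64_t fingerprint;
  const void* signature_handler;
};

// Hypothetical stand-in for SignatureHandlerLibrary.
class HandlerLibrary {
 public:
  void add(Method& m) {
    // The lock covers the lookup, the append AND the final table read,
    // so no other thread can grow _handlers between them.
    std::lock_guard<std::mutex> guard(_lock);
    if (m.signature_handler != nullptr) return;

    std::size_t index = find(m.fingerprint);
    if (index == _fingerprints.size()) {
      _fingerprints.push_back(m.fingerprint);
      _handlers.push_back(make_handler(m.fingerprint));
    }
    m.signature_handler = _handlers[index];  // corresponds to line *2
  }

  std::size_t handler_count() {
    std::lock_guard<std::mutex> guard(_lock);
    return _handlers.size();
  }

 private:
  // Linear search; returns _fingerprints.size() when nothing matches.
  std::size_t find(std::uint64_t fp) const {
    for (std::size_t i = 0; i < _fingerprints.size(); i++) {
      if (_fingerprints[i] == fp) return i;
    }
    return _fingerprints.size();
  }

  // Placeholder for generating handler code: returns a stable address.
  // std::deque does not move existing elements on push_back.
  const void* make_handler(std::uint64_t fp) {
    _stubs.push_back(fp);
    return &_stubs.back();
  }

  std::mutex _lock;
  std::vector<std::uint64_t> _fingerprints;
  std::vector<const void*> _handlers;
  std::deque<std::uint64_t> _stubs;
};
```

In this sketch, two methods with the same fingerprint end up sharing one handler, and the table is never read while another thread may be resizing it. Taking the lock before the NULL check is a simplification; the original code performs that check before locking.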
                                     
2011-01-13
SUGGESTED FIX

I've been removing fixes for this all day.
                                     
2011-01-25
EVALUATION

Summary: Write method signature handler under lock to prevent race with growable array resizing
Reviewed-by: dsamersoff, dholmes
                                     
2011-02-03


