JDK-4871438 : methodOopDesc::set_fingerprint isn't thread safe

Details
Type:
Bug
Submit Date:
2003-05-29
Status:
Closed
Updated Date:
2003-12-01
Project Name:
JDK
Resolved Date:
2003-11-19
Component:
hotspot
OS:
generic
Sub-Component:
runtime
CPU:
generic
Priority:
P2
Resolution:
Fixed
Affected Versions:
5.0
Fixed Versions:
5.0 (b20)

Related Reports
Backport:
Relates:

Sub Tasks

Description
With tiger b07, vtest crashed with -client -Xcomp
Test machine: j2se-app.west
j2se-app# uname -a
SunOS j2se-app 5.9 Generic_112233-07 sun4u sparc SUNW,Sun-Fire
j2se-app# /usr/j2se_b07/bin/java -version
java version "1.5.0-beta"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0-beta-b07)
Java HotSpot(TM) Client VM (build 1.5.0-beta-b07, mixed mode)

core file is under /bt/VolanoTestrun.20029.-client-Xcomp

=>[1] _lwp_kill(0x0, 0x6, 0x0, 0xff33c000, 0x81010100, 0xff00), at 0xff31e444
  [2] raise(0x6, 0x0, 0xdc77eb48, 0x0, 0x1, 0xdc77e3c4), at 0xff2ccd88
  [3] abort(0x0, 0xdc77ebd8, 0x0, 0xfffffff8, 0x0, 0xdc77ec01), at 0xff2b5c70
  [4] os::abort(0x1, 0xfefb1c56, 0xdc77f460, 0xfefdd08d, 0xfefdd039, 0xff00), at 0xfef3409c
  [5] report_error(0xfefb1a20, 0x1, 0x1000, 0xfeffd8d4, 0x13d4, 0xfefd2e57), at 0xfee33a44
  [6] SignatureIterator::iterate_parameters(0xdc77f658, 0x1522, 0x0, 0xfefee000, 0x4000, 0x350b70), at 0xfecc88dc
  [7] Runtime1::prepare_interpreter_call(0x2f2c48, 0xf5536d10, 0x2f2c48, 0x2f2c48, 0x31e1b8, 0xfeccfb88), at 0xfed39ef4
  [8] 0xf986fbf8(0xed51b6b8, 0xf553b138, 0xf553c858, 0xed504eb0, 0xf5442ba8, 0xee230000), at 0xf986fbf7
  [9] 0xf9996de4(0xed51b6b8, 0xf553b138, 0xed504eb0, 0xee230000, 0x1, 0xfece2d50), at 0xf9996de3
  [10] 0xf995467c(0xed505648, 0x0, 0x1, 0xdc77f850, 0xdc77f848, 0x0), at 0xf995467b
  [11] 0xf9954400(0xed505660, 0x0, 0xdc77f964, 0xdc77fa04, 0x0, 0xfecdcd54), at 0xf99543ff
  [12] 0xf995291c(0xed5054c8, 0x0, 0xdc77fa0c, 0x1, 0xdc77fa04, 0xed511798), at 0xf995291b
  [13] 0xf994a30c(0xf553b138, 0x1, 0xee242968, 0xf98146c0, 0x0, 0x73), at 0xf994a30b
  [14] 0xf9976c54(0xed403d58, 0xffffffeb, 0xdc77fb2c, 0x0, 0x0, 0xdc77fa40), at 0xf9976c53
  [15] 0xf9805b10(0xed403d58, 0x0, 0x0, 0xf9815270, 0x0, 0xdc77fad0), at 0xf9805b0f
  [16] 0xf9912a08(0xed47d870, 0x0, 0x0, 0x39eca0, 0xfef3217c, 0xfecb6fc4), at 0xf9912a07
  [17] 0xf9800118(0xdc77fc2c, 0xdc77fe98, 0xa, 0xf5420118, 0xf980aae0, 0xdc77fdb8), at 0xf9800117
  [18] JavaCalls::call_helper(0xdc77fe90, 0xdc77fcf8, 0xdc77fdb0, 0x2f2c48, 0x2f2c48, 0xdc77fd08), at 0xfecc4330
  [19] JavaCalls::call_virtual(0xfefee000, 0x2f31e8, 0xdc77fda4, 0xdc77fda0, 0xdc77fdb0, 0x2f2c48), at 0xfecd3ce4
  [20] JavaCalls::call_virtual(0xdc77fe90, 0xdc77fe8c, 0xdc77fe84, 0xdc77fe7c, 0xdc77fe74, 0x2f2c48), at 0xfecd3ba0
  [21] thread_entry(0x2f2c48, 0x2f2c48, 0x2e5b48, 0x2f31e8, 0x31a638, 0xfecd38c4), at 0xfecd3b28
  [22] JavaThread::run(0x2f2c48, 0xffffffe2, 0xff00cdc8, 0xffff8000, 0x0, 0x0), at 0xfecd38ec
  [23] _start(0x2f2c48, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfecc2bb4

###@###.### 2003-05-29

The bug shows up more frequently in the build 25 CMS run.
In the atg test run with the -client flag, it happened 9 times in 8 days
frequency: 9/1762 = 0.5%
In the volanomark test run with the -client -Xcomp flags, it happened 14 times in 8 days
frequency: 14/172239 = 0.01%

###@###.### 2003-10-30

The failure happened more frequently in the CMS run, i.e., with the combination of
-client -Xcomp -XX:+UseConcMarkSweepGC
With b26, vtest failed 20 times in 72 hours (23610 iterations, 20/23610 = 0.1%). The flags used were: -client -Xcomp -XX:+UseConcMarkSweepGC

###@###.### 2003-11-03

                                    

Comments
CONVERTED DATA

BugTraq+ Release Management Values

COMMIT TO FIX:
tiger-beta

FIXED IN:
tiger-beta

INTEGRATED IN:
tiger-b20
tiger-beta

VERIFIED IN:
tiger-beta


                                     
2004-06-14
SUGGESTED FIX

%  diff -c $rt/src/share/vm/oops/constMethodOop.hpp constMethodOop.hpp
*** /net/altair/export/space3/fastlane/rt_baseline/src/share/vm/oops/constMethodOop.hpp Mon Oct 13 22:08:32 2003
--- constMethodOop.hpp  Tue Nov  4 15:58:48 2003
***************
*** 1,5 ****
  #ifdef USE_PRAGMA_IDENT_SRC
! #pragma ident "@(#)constMethodOop.hpp 1.1 03/10/02 13:52:13 JVM"
  #endif
  /*
   * Copyright 1993-2002 Sun Microsystems, Inc.  All rights reserved.
--- 1,5 ----
  #ifdef USE_PRAGMA_IDENT_SRC
! #pragma ident "%W% %E% %U% JVM"
  #endif
  /*
   * Copyright 1993-2002 Sun Microsystems, Inc.  All rights reserved.
***************
*** 139,145 ****
  
    uint64_t fingerprint() const                   { return _fingerprint; }
    uint64_t set_fingerprint(uint64_t fingerprint) {
!     return _fingerprint = fingerprint;
    }
  
    // name
--- 139,147 ----
  
    uint64_t fingerprint() const                   { return _fingerprint; }
    uint64_t set_fingerprint(uint64_t fingerprint) {
!     jlong fp = Atomic::cmpxchg((jlong)fingerprint, (jlong*) &_fingerprint, 0L);
!     assert (fp == 0 || fp == (jlong)fingerprint, "fingerprint cannot change");
!     return fingerprint;
    }
  
    // name
                                     
2004-06-11
PUBLIC COMMENTS

verified in tiger b29.
verified on j2se-app.west and j2se-bigapps.west ( ran successfully for over
a week )

###@###.### 2003-12-01
                                     
2003-12-01
EVALUATION

Has this occurred since then?
###@###.### 2003-08-14

It still happened as of build b15.

###@###.### 2003-08-19

This seems to be an impossible failure in the signature iterator.  I've added debug code which would tell us what happened if it occurs again, but it hasn't recurred, so I'm closing this as not reproducible for now.
###@###.### 2003-10-02

The bug is reproducible with tiger b23 after 3 days 9 hours ( I ran
4 bigapps in parallel ) 

###@###.### 2003-10-14

In the failing case the fingerprint as reported by the printing code I added to iterate_parameters is 0x00000000532ab338, but from looking at the core file, the fingerprint in the methodOop is 0x00000001532ab338, so the high word was still 0 at the time it was read.  methodOopDesc::set_fingerprint isn't thread safe since longs aren't guaranteed to be stored in a single atomic instruction.  One thread can see half of the store and pass the test below, returning half of the fingerprint with the other half 0.

 if ( mh->fingerprint() != CONST64(0) ) return mh->fingerprint();


###@###.### 2003-10-16
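
The torn read described above can be modelled deterministically. The following is a minimal single-threaded sketch, not VM code: SplitField, store_low_half, store_high_half and observe_between_stores are hypothetical names, and the store order (low word first) is an assumption chosen to match the value seen in the core file.

```cpp
#include <cstdint>

// Hypothetical model of a 64-bit field that the compiler accesses
// as two independent 32-bit words.
struct SplitField {
    uint32_t hi = 0;
    uint32_t lo = 0;
    uint64_t read() const { return (static_cast<uint64_t>(hi) << 32) | lo; }
};

// Each helper performs one of the two 32-bit stores that make up a 64-bit write.
inline void store_low_half(SplitField& f, uint64_t v) {
    f.lo = static_cast<uint32_t>(v);
}
inline void store_high_half(SplitField& f, uint64_t v) {
    f.hi = static_cast<uint32_t>(v >> 32);
}

// Simulates a reader that runs between the two stores and applies the same
// "nonzero means already computed" test quoted in the report.
inline uint64_t observe_between_stores(uint64_t v) {
    SplitField f;
    store_low_half(f, v);                           // writer: first half only
    uint64_t seen = (f.read() != 0) ? f.read() : 0; // reader: passes the test
    store_high_half(f, v);                          // writer: finishes too late
    return seen;
}
```

With the fingerprint from the core file, observe_between_stores(0x00000001532ab338) yields 0x00000000532ab338, which is the value the debug code printed.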

The fix uses the jlong version of Atomic::cmpxchg (compare and exchange) to
update the fingerprint field and checks that the previous value either matches
the fingerprint value given or is zero.  There's no Atomic::store(jlong) defined
for x86, so that's why I didn't use that.  I don't think I need an atomic read
operation, but maybe the _fingerprint field should be made volatile.

###@###.### 2003-11-04
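
The compare-and-exchange idea can be sketched in standard C++. This is an illustration only: it substitutes std::atomic for HotSpot's Atomic::cmpxchg, FingerprintHolder is a hypothetical stand-in for constMethodOopDesc, and std::atomic<uint64_t> is not guaranteed to be lock-free on every 32-bit target.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the object that owns the fingerprint field.
class FingerprintHolder {
public:
    uint64_t fingerprint() const {
        return _fingerprint.load(std::memory_order_acquire);
    }

    // Installs fp only if the field is still zero. If another thread got
    // there first, it must have installed the same value.
    uint64_t set_fingerprint(uint64_t fp) {
        uint64_t expected = 0;
        bool installed = _fingerprint.compare_exchange_strong(expected, fp);
        // On failure, 'expected' now holds the value that was already present.
        assert((installed || expected == fp) && "fingerprint cannot change");
        (void)installed;  // avoids an unused-variable warning under NDEBUG
        return fp;
    }

private:
    std::atomic<uint64_t> _fingerprint{0};
};
```

Because the whole 64-bit value is published in one atomic operation, a reader sees either zero or the complete fingerprint, never a mix of the two.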



4871438 methodOopDesc::set_fingerprint isn't thread safe

The C++ compiler doesn't generate atomic instructions for 64 bit sized
fields.  One such field is _fingerprint in constMethodOopDesc.  This field
is updated during execution when it is needed.  It doesn't use a lock
because once the value is set, any other setters will reset it to the same
value if there is a race.  Unfortunately, the setting instruction was not
atomic.  If the compiler had generated a ldd for the field, the V9 manual
says it is atomic.  Instead with -xarch=v8 (-client setting because client
still has to support v8) and -xarch=v8plus (-server setting), the C++
compiler generates two 32 bit loads to store the field.

The C++ compiler default is -xmemalign=4s (align to 4 bytes, signal unaligned).
The compiler team realizes that they could generate better code by changing the
default, but they are evaluating the effect on user libraries.  We could change the VM
to use -xmemalign=8s but there might be problems with our code that reads
class files.  We could try this out next release.  (Thanks to 
###@###.### for the explanation of this).

To fix, I use a known initial value and check both words against that
value to determine if the entire write has taken place before returning
the fingerprint.  Otherwise, it returns zero.  This doesn't rely on Atomic
operations (which don't work on Linux - another bug) or the alignment
of the field (which should always be aligned).


###@###.### 2003-11-14
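
The two-word check can be sketched as follows. This is not the integrated change: SplitFingerprint and the sentinel kUnsetHalf are hypothetical, the sketch assumes no valid fingerprint has a 32-bit half equal to the sentinel, and it uses relaxed std::atomic<uint32_t> accesses only to express plain word-sized loads and stores without a data race in standard C++.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical sentinel marking a 32-bit half that has not been written yet.
// Assumption: no valid fingerprint contains this value in either half.
constexpr uint32_t kUnsetHalf = 0x80000000u;

class SplitFingerprint {
public:
    // Returns the fingerprint only when both halves have been written,
    // otherwise zero, so a caller recomputes instead of using a torn value.
    uint64_t fingerprint() const {
        uint32_t hi = _hi.load(std::memory_order_relaxed);
        uint32_t lo = _lo.load(std::memory_order_relaxed);
        if (hi == kUnsetHalf || lo == kUnsetHalf) return 0;
        return (static_cast<uint64_t>(hi) << 32) | lo;
    }

    // The two word-sized stores, exposed separately so the intermediate
    // state can be observed.
    void set_low_half(uint64_t fp) {
        _lo.store(static_cast<uint32_t>(fp), std::memory_order_relaxed);
    }
    void set_high_half(uint64_t fp) {
        _hi.store(static_cast<uint32_t>(fp >> 32), std::memory_order_relaxed);
    }

    uint64_t set_fingerprint(uint64_t fp) {
        set_low_half(fp);
        set_high_half(fp);
        return fp;
    }

private:
    std::atomic<uint32_t> _hi{kUnsetHalf};
    std::atomic<uint32_t> _lo{kUnsetHalf};
};
```

Since every racing setter writes the same value, a half that no longer holds the sentinel already holds its final contents, so the order in which the two halves are stored or loaded does not matter.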
                                     
2003-11-14


