JDK-4871438 : methodOopDesc::set_fingerprint isn't thread safe
  • Type: Bug
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 5.0
  • Priority: P2
  • Status: Closed
  • Resolution: Fixed
  • OS: generic
  • CPU: generic
  • Submitted: 2003-05-29
  • Updated: 2003-12-01
  • Resolved: 2003-11-19
Other: 1.4.2_13 (Fixed)
Description
With tiger b07, vtest crashed with -client -Xcomp
Test machine: j2se-app.west
j2se-app# uname -a
SunOS j2se-app 5.9 Generic_112233-07 sun4u sparc SUNW,Sun-Fire
j2se-app# /usr/j2se_b07/bin/java -version
java version "1.5.0-beta"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0-beta-b07)
Java HotSpot(TM) Client VM (build 1.5.0-beta-b07, mixed mode)

core file is under /bt/VolanoTestrun.20029.-client-Xcomp

=>[1] _lwp_kill(0x0, 0x6, 0x0, 0xff33c000, 0x81010100, 0xff00), at 0xff31e444
  [2] raise(0x6, 0x0, 0xdc77eb48, 0x0, 0x1, 0xdc77e3c4), at 0xff2ccd88
  [3] abort(0x0, 0xdc77ebd8, 0x0, 0xfffffff8, 0x0, 0xdc77ec01), at 0xff2b5c70
  [4] os::abort(0x1, 0xfefb1c56, 0xdc77f460, 0xfefdd08d, 0xfefdd039, 0xff00), at 0xfef3409c
  [5] report_error(0xfefb1a20, 0x1, 0x1000, 0xfeffd8d4, 0x13d4, 0xfefd2e57), at 0xfee33a44
  [6] SignatureIterator::iterate_parameters(0xdc77f658, 0x1522, 0x0, 0xfefee000, 0x4000, 0x350b70), at 0xfecc88dc
  [7] Runtime1::prepare_interpreter_call(0x2f2c48, 0xf5536d10, 0x2f2c48, 0x2f2c48, 0x31e1b8, 0xfeccfb88), at 0xfed39ef4
  [8] 0xf986fbf8(0xed51b6b8, 0xf553b138, 0xf553c858, 0xed504eb0, 0xf5442ba8, 0xee230000), at 0xf986fbf7
  [9] 0xf9996de4(0xed51b6b8, 0xf553b138, 0xed504eb0, 0xee230000, 0x1, 0xfece2d50), at 0xf9996de3
  [10] 0xf995467c(0xed505648, 0x0, 0x1, 0xdc77f850, 0xdc77f848, 0x0), at 0xf995467b
  [11] 0xf9954400(0xed505660, 0x0, 0xdc77f964, 0xdc77fa04, 0x0, 0xfecdcd54), at 0xf99543ff
  [12] 0xf995291c(0xed5054c8, 0x0, 0xdc77fa0c, 0x1, 0xdc77fa04, 0xed511798), at 0xf995291b
  [13] 0xf994a30c(0xf553b138, 0x1, 0xee242968, 0xf98146c0, 0x0, 0x73), at 0xf994a30b
  [14] 0xf9976c54(0xed403d58, 0xffffffeb, 0xdc77fb2c, 0x0, 0x0, 0xdc77fa40), at 0xf9976c53
  [15] 0xf9805b10(0xed403d58, 0x0, 0x0, 0xf9815270, 0x0, 0xdc77fad0), at 0xf9805b0f
  [16] 0xf9912a08(0xed47d870, 0x0, 0x0, 0x39eca0, 0xfef3217c, 0xfecb6fc4), at 0xf9912a07
  [17] 0xf9800118(0xdc77fc2c, 0xdc77fe98, 0xa, 0xf5420118, 0xf980aae0, 0xdc77fdb8), at 0xf9800117
  [18] JavaCalls::call_helper(0xdc77fe90, 0xdc77fcf8, 0xdc77fdb0, 0x2f2c48, 0x2f2c48, 0xdc77fd08), at 0xfecc4330
  [19] JavaCalls::call_virtual(0xfefee000, 0x2f31e8, 0xdc77fda4, 0xdc77fda0, 0xdc77fdb0, 0x2f2c48), at 0xfecd3ce4
  [20] JavaCalls::call_virtual(0xdc77fe90, 0xdc77fe8c, 0xdc77fe84, 0xdc77fe7c, 0xdc77fe74, 0x2f2c48), at 0xfecd3ba0
  [21] thread_entry(0x2f2c48, 0x2f2c48, 0x2e5b48, 0x2f31e8, 0x31a638, 0xfecd38c4), at 0xfecd3b28
  [22] JavaThread::run(0x2f2c48, 0xffffffe2, 0xff00cdc8, 0xffff8000, 0x0, 0x0), at 0xfecd38ec
  [23] _start(0x2f2c48, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfecc2bb4

###@###.### 2003-05-29

The bug shows up more frequently in the build 25 CMS run.
In the atg test run with the -client flag, it happened 9 times in 8 days.
Frequency: 9/1762 = 0.5%
In the volanomark test run with the -client -Xcomp flags, it happened 14 times in 8 days.
Frequency: 14/172239 = 0.01%

###@###.### 2003-10-30

The failure happened more frequently in the CMS run, i.e., with the combination of
-client -Xcomp -XX:+UseConcMarkSweepGC
With b26 and these flags, vtest failed 20 times in 72 hours (23610 iterations, 20/23610 = 0.1%).

###@###.### 2003-11-03

Comments
CONVERTED DATA
BugTraq+ Release Management Values
COMMIT TO FIX: tiger-beta
FIXED IN: tiger-beta
INTEGRATED IN: tiger-b20 tiger-beta
VERIFIED IN: tiger-beta
14-06-2004

SUGGESTED FIX

% diff -c $rt/src/share/vm/oops/constMethodOop.hpp constMethodOop.hpp
*** /net/altair/export/space3/fastlane/rt_baseline/src/share/vm/oops/constMethodOop.hpp Mon Oct 13 22:08:32 2003
--- constMethodOop.hpp Tue Nov 4 15:58:48 2003
***************
*** 1,5 ****
  #ifdef USE_PRAGMA_IDENT_SRC
! #pragma ident "@(#)constMethodOop.hpp 1.1 03/10/02 13:52:13 JVM"
  #endif
  /*
   * Copyright 1993-2002 Sun Microsystems, Inc. All rights reserved.
--- 1,5 ----
  #ifdef USE_PRAGMA_IDENT_SRC
! #pragma ident "%W% %E% %U% JVM"
  #endif
  /*
   * Copyright 1993-2002 Sun Microsystems, Inc. All rights reserved.
***************
*** 139,145 ****
    uint64_t fingerprint() const { return _fingerprint; }

    uint64_t set_fingerprint(uint64_t fingerprint) {
!     return _fingerprint = fingerprint;
    }

    // name
--- 139,147 ----
    uint64_t fingerprint() const { return _fingerprint; }

    uint64_t set_fingerprint(uint64_t fingerprint) {
!     jlong fp = Atomic::cmpxchg((jlong)fingerprint, (jlong*) &_fingerprint, 0L);
!     assert (fp == 0 || fp == (jlong)fingerprint, "fingerprint cannot change");
!     return fingerprint;
    }

    // name
11-06-2004
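
The compare-and-exchange idea in the suggested fix above can be sketched in portable C++ as follows. This is only an illustration under stated assumptions: MethodSketch, set_fingerprint and get_fingerprint are hypothetical stand-ins, not the HotSpot classes, and std::atomic replaces the VM's own Atomic::cmpxchg. Whether a 64-bit std::atomic is lock-free on a given 32-bit target is platform dependent.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the method object; not the HotSpot class.
struct MethodSketch {
    std::atomic<std::uint64_t> fingerprint{0};  // 0 means "not yet computed"
};

// Installs fp only if the field still holds 0, then returns fp.
// Racing setters are expected to compute the same value, so a failed
// exchange is harmless as long as the stored value equals fp.
inline std::uint64_t set_fingerprint(MethodSketch& m, std::uint64_t fp) {
    std::uint64_t expected = 0;
    const bool installed = m.fingerprint.compare_exchange_strong(expected, fp);
    // On failure, `expected` now holds the value another thread installed.
    assert(installed || expected == fp);
    (void)installed;  // silences the unused warning when NDEBUG is defined
    return fp;
}

// Reads the whole 64-bit value in one atomic load, so no half-written
// value can be observed.
inline std::uint64_t get_fingerprint(const MethodSketch& m) {
    return m.fingerprint.load();
}
```

A caller would invoke set_fingerprint once the value has been computed and get_fingerprint everywhere it previously read the field directly. Unlike the suggested fix, this sketch also makes the read atomic, which the evaluation leaves as an open question.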

PUBLIC COMMENTS Verified in tiger b29 on j2se-app.west and j2se-bigapps.west (ran successfully for over a week). ###@###.### 2003-12-01
01-12-2003

EVALUATION

Has this occurred since then?
###@###.### 2003-08-14

It still happened as of build b15.
###@###.### 2003-08-19

This seems to be an impossible failure in the signature iterator. I've added debug code which would tell us what happened if it occurs again, but it hasn't, so I'm closing this as not reproducible for now.
###@###.### 2003-10-02

The bug is reproducible with tiger b23 after 3 days 9 hours (I ran 4 bigapps in parallel).
###@###.### 2003-10-14

In the failing case, the fingerprint reported by the printing code I added to iterate_parameters is 0x00000000532ab338, but the core file shows that the fingerprint in the methodOop is 0x00000001532ab338, so the high word was still 0 at the time it was read. methodOopDesc::set_fingerprint isn't thread safe since longs aren't guaranteed to be stored in a single atomic instruction. One thread can see half of the store and pass the test below, returning half of the fingerprint with the other half 0.

  if ( mh->fingerprint() != CONST64(0) )
    return mh->fingerprint();

###@###.### 2003-10-16

Using the jlong version of Atomic::cmpxchg (compare and exchange) to update the fingerprint field and check that the previous value is either zero or the fingerprint value given. There's no Atomic::store(jlong) defined for x86, which is why I didn't use that. I don't think I need an atomic read operation, but maybe the _fingerprint field should be made volatile.
###@###.### 2003-11-04

4871438 methodOopDesc::set_fingerprint isn't thread safe

The C++ compiler doesn't generate atomic instructions for 64-bit fields. One such field is _fingerprint in constMethodOopDesc. This field is updated lazily during execution, when it is first needed. It doesn't use a lock because once the value is set, any other setter that races will store the same value. Unfortunately, the setting instruction was not atomic. If the compiler had generated a ldd for the field, the V9 manual says it is atomic. Instead, with -xarch=v8 (the -client setting, because client still has to support v8) and -xarch=v8plus (the -server setting), the C++ compiler generates two 32-bit accesses for the field. The C++ compiler default is -xmemalign=4s (align to 4 bytes, signal unaligned). The compiler team realizes that they could generate better code and change the default, but they are evaluating the effect on user libraries. We could change the VM to use -xmemalign=8s, but there might be problems with our code that reads class files. We could try this out next release. (Thanks to ###@###.### for the explanation of this.)

To fix, I use a known initial value and check both words against that value to determine if the entire write has taken place before returning the fingerprint. Otherwise, it returns zero. This doesn't rely on Atomic operations (which don't work on Linux, another bug) or the alignment of the field (which should always be aligned).
###@###.### 2003-11-14
14-11-2003
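
The torn read and the "known initial value" check described in the evaluation can be replayed with plain integer arithmetic. The sketch below is a hedged illustration, not the code that was integrated: kUnsetWord is a hypothetical sentinel chosen for the example (it assumes no real fingerprint ever has a word equal to it), and all function names are invented here. The two fingerprint values are the ones quoted in the evaluation.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sentinel word; assumed never to appear in a real fingerprint.
constexpr std::uint32_t kUnsetWord = 0x80000000u;

constexpr std::uint32_t high_word(std::uint64_t v) {
    return static_cast<std::uint32_t>(v >> 32);
}
constexpr std::uint32_t low_word(std::uint64_t v) {
    return static_cast<std::uint32_t>(v & 0xffffffffu);
}

// Rebuilds the 64-bit value a reader ends up with after two 32-bit loads.
constexpr std::uint64_t combine(std::uint32_t high, std::uint32_t low) {
    return (static_cast<std::uint64_t>(high) << 32) | low;
}

// Returns the fingerprint only when neither word still holds the sentinel;
// otherwise reports "not set" as 0.
constexpr std::uint64_t checked_fingerprint(std::uint32_t high, std::uint32_t low) {
    return (high == kUnsetWord || low == kUnsetWord) ? 0 : combine(high, low);
}

// Replays the interleaving from the evaluation without any real threads.
inline void demo_torn_read() {
    const std::uint64_t fp = 0x00000001532ab338ULL;
    // Zero-initialized field: new low word combined with a stale high word of 0.
    assert(combine(0u, low_word(fp)) == 0x00000000532ab338ULL);
    // Sentinel-initialized field: the same interleaving is detected.
    assert(checked_fingerprint(kUnsetWord, low_word(fp)) == 0);
    // Once both words are written, the full value comes back.
    assert(checked_fingerprint(high_word(fp), low_word(fp)) == fp);
}
```

With a zero initial value, a half-written field is indistinguishable from a valid nonzero fingerprint, which is why the original test against CONST64(0) passed. Initializing both words to a value that cannot occur lets the reader tell a partial write from a complete one without any atomic instruction.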