JDK-8000805 : JMM issue: short loads are non-atomic
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 7u7
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • OS: linux,solaris
  • CPU: x86,sparc
  • Submitted: 2012-10-11
  • Updated: 2023-08-16
  • Resolved: 2012-10-22
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 7 JDK 8 Other
7u40Fixed 8Fixed hs24Fixed
Originally found with 

Steps to reproduce:
 $ git clone https://github.com/shipilev/java-concurrency-torture.git
 $ cd java-concurrency-torture.git
 $ mvn clean install
 $ java -jar target/concurrency-torture.jar -t ".*primitiveAtomicity.*ShortAtomicityTest"

Expected result: 
  Only [0, 0] and [-1, -1] states present.

Actual result:
  [0,0], [-1, -1] are present.
  One of [0, -1] or [-1, 0] states is present, which means the short value was torn away.
URL: http://hg.openjdk.java.net/hsx/hsx25/hotspot/rev/67f4c477c9ab User: amurillo Date: 2012-10-26 23:51:00 +0000

URL: http://hg.openjdk.java.net/hsx/hotspot-comp/hotspot/rev/67f4c477c9ab User: kvn Date: 2012-10-22 23:11:33 +0000

> (does it also happen if s.x is volatile?). It does. There is a sister test for that case: $ ~/Install.32/jdk7u10/bin/java -jar target/concurrency-torture.jar -t ".*primitiveAtomicity.*ShortAtomicityTest" Java Concurrency Torture Tests --------------------------------------------------------------------------------- Running in forked mode... Running net.shipilev.concurrent.torture.tests.primitiveAtomicity.ShortAtomicityTest Warmup ..... Observed state Occurrences Expectation Interpretation [0, 0] ( 2320030) REQUIRED Default value for the field. Observers are allowed to see th... [-1, -1] ( 7164201) REQUIRED The value set by the actor thread. Observer sees the complet... [-1, 0] ( 12359) ABSENT This case is not expected. Running net.shipilev.concurrent.torture.tests.primitiveAtomicity.VolatileShortAtomicityTest Warmup ..... Observed state Occurrences Expectation Interpretation [0, 0] ( 3895625) REQUIRED Default value for the field. Observers are allowed to see th... [-1, -1] ( 3948889) REQUIRED The value set by the actor thread. Observer sees the complet... [-1, 0] ( 44346) ABSENT This case is not expected.

Only one load from the field. Then it is JIT compilation problem.

javap shows: public void observe(TestSpecimen$Specimen, byte[]); Code: 0: aload_1 1: getfield #2 // Field TestSpecimen$Specimen.x:S 4: istore_3 5: aload_2 6: iconst_0 7: iload_3 8: iconst_0 9: ishr 10: sipush 255 13: iand 14: i2b 15: bastore 16: aload_2 17: iconst_1 18: iload_3 19: bipush 8 21: ishr 22: sipush 255 25: iand 26: i2b 27: bastore 28: return This was using javac from JDK7 and JDK8.

We need to verify that bytecode have only one load from s.x. C2 has IGVN optimizations which could lead to this situation but I looked and it is not used for such case. Two loads in assembler code are different: one unsigned and an other is signed. They correspond to LoadS and LoadUS ideal nodes in C2. But what surprise me that C1 has this problem also.

This issue strays into the greyest corners of the platform specification and the Java Memory Model. The basic issue is whether these two pieces of code should be considered equivalent: short t = s.x; result[0] = (byte) ((t >> 0) & 0xFF); result[1] = (byte) ((t >> 8) & 0xFF); and short t = s.x; result[0] = (byte) ((s.x >> 0) & 0xFF); result[1] = (byte) ((s.x >> 8) & 0xFF); As a programmer familiar with concurrent programming I immediately say they are not equivalent, because I know that if s.x were modified concurrently then result[0] and result[1] need not have consistent values in this second case. Consequently I would then argue that given short t = s.x; result[0] = (byte) ((t >> 0) & 0xFF); result[1] = (byte) ((t >> 8) & 0xFF); it would be invalid for javac to replace the use of t with two separate GETFIELD invocations for s.x; and that consequently it would also be invalid for the VM (interpreter, C1 or C2) to make a similar change. Hence the observed behaviour is a bug. Some might argue however that a problem can only occur if s.x is updated concurrently and so the observed situation would only be a bug if it occurs when s.x is volatile (does it also happen if s.x is volatile?). Otherwise it is difficult to demonstrate, via the JLS, that the two forms of the code are not allowed to be used interchangeably. One of the key problems faced when defining the new Java Memory Model for Java 5 was that there does not exist a compilation specification for the Java language - so there is nothing that formally states that javac, for example, can not replace use of t with a GETFIELD for s.x. Instead we had to assume/rely that the "obvious", 'simplest", "intuitive" translation would be used - and such a translation would, in my opinion, only issue a GETFIELD where an actual field access expression occurred in the language. If Specimen.x is a volatile field, and this same issues occurs, then the case for this being a bug is much stronger as it can be easily argued that the programmer is fully aware of potential races and so requires that the field only be read once. Indeed lock-free concurrent programming would be impossible if javac or the VM replaced local variable accesses with re-reads of the original field.

Doug Lea reports this issue is reproducible on at least intel-i7/linux, amd-617x/linux, intel-i7/solaris, sparc-t2/solaris, in both client and server mode, not -Xint. I had reproduced it on intel-i5/linux, in both client and server mode.

Further analysis reveals that Hotspot re-reads short field for two expressions using the local in which the field was loaded. Test source code in question: https://github.com/shipilev/java-concurrency-torture/blob/02ad97fcfb767c6da54b7a01baa15490ac8d3aad/src/main/java/net/shipilev/concurrent/torture/tests/primitiveAtomicity/ShortAtomicityTest.java#L43 The problematic compilation is at the method observe(): public static class Specimen { short x; } @Override public void observe(Specimen s, byte[] result) { short t = s.x; result[0] = (byte) ((t >> 0) & 0xFF); result[1] = (byte) ((t >> 8) & 0xFF); } Below is the excerpt from the dissassembly: // %r11 is the object address, reading the field once into %r10 0x00007fa9290a2320: movzwl 0xc(%r11),%r10d 0x00007fa9290a2325: mov 0x18(%rsp),%r9 // storing the first byte into array[0] 0x00007fa9290a232a: mov %r10b,(%r9) ;*bastore ; - net.shipilev.concurrent.torture.tests.primitiveAtomicity.ShortAtomicityTest::observe@15 (line 46) ; - net.shipilev.concurrent.torture.tests.primitiveAtomicity.ShortAtomicityTest::observe@6 (line 26) ; - net.shipilev.concurrent.torture.Runner$3::call@78 (line 186) // SECOND READ! 0x00007fa9290a232d: movswl 0xc(%r11),%r11d ;*getfield x ; - net.shipilev.concurrent.torture.tests.primitiveAtomicity.ShortAtomicityTest::observe@1 (line 44) ; - net.shipilev.concurrent.torture.tests.primitiveAtomicity.ShortAtomicityTest::observe@6 (line 26) ; - net.shipilev.concurrent.torture.Runner$3::call@78 (line 186) // shifting 0x00007fa9290a2332: shr $0x8,%r11d 0x00007fa9290a2336: mov 0x38(%rsp),%r10 // storing the second byte into array[1] 0x00007fa9290a233b: mov %r11b,0x11(%r10) ;*bastore ; - net.shipilev.concurrent.torture.tests.primitiveAtomicity.ShortAtomicityTest::observe@27 (line 47) ; - net.shipilev.concurrent.torture.tests.primitiveAtomicity.ShortAtomicityTest::observe@6 (line 26) ; - net.shipilev.concurrent.torture.Runner$3::call@78 (line 186) Under the racy write to s.x it is possible to read the first byte from the one previously stored short value, and the second part from the other, thus violating Java memory model.