United StatesChange Country, Oracle Worldwide Web Sites Communities I am a... I want to...
Bug ID: JDK-8000805 JMM issue: short loads are non-atomic
JDK-8000805 : JMM issue: short loads are non-atomic

Details
Type:
Bug
Submit Date:
2012-10-11
Status:
Resolved
Updated Date:
2013-04-25
Project Name:
JDK
Resolved Date:
2012-10-22
Component:
hotspot
OS:
solaris,linux
Sub-Component:
compiler
CPU:
x86,sparc
Priority:
P3
Resolution:
Fixed
Affected Versions:
7u7
Fixed Versions:
hs25 (b07)

Related Reports
Backport:
Backport:
Backport:
Backport:
Backport:

Sub Tasks

Description
Originally found with 
https://github.com/shipilev/java-concurrency-torture

Steps to reproduce:
 $ git clone https://github.com/shipilev/java-concurrency-torture.git
 $ cd java-concurrency-torture.git
 $ mvn clean install
 $ java -jar target/concurrency-torture.jar -t ".*primitiveAtomicity.*ShortAtomicityTest"

Expected result: 
  Only [0, 0] and [-1, -1] states present.

Actual result:
  [0,0], [-1, -1] are present.
  One of [0, -1] or [-1, 0] states is present, which means the short value was torn away.
                                    

Comments
Further analysis reveals that Hotspot re-reads short field for two expressions using the local in which the field was loaded.
Test source code in question: https://github.com/shipilev/java-concurrency-torture/blob/02ad97fcfb767c6da54b7a01baa15490ac8d3aad/src/main/java/net/shipilev/concurrent/torture/tests/primitiveAtomicity/ShortAtomicityTest.java#L43

The problematic compilation is at the method observe():

    public static class Specimen {
        short x;
    }

    @Override
    public void observe(Specimen s, byte[] result) {
        short t = s.x;
        result[0] = (byte) ((t >> 0) & 0xFF);
        result[1] = (byte) ((t >> 8) & 0xFF);
    }

Below is the excerpt from the dissassembly:

  // %r11 is the object address, reading the field once into %r10
  0x00007fa9290a2320: movzwl 0xc(%r11),%r10d
  0x00007fa9290a2325: mov    0x18(%rsp),%r9

  // storing the first byte into array[0]
  0x00007fa9290a232a: mov    %r10b,(%r9)        ;*bastore
                                                ; - net.shipilev.concurrent.torture.tests.primitiveAtomicity.ShortAtomicityTest::observe@15 (line 46)
                                                ; - net.shipilev.concurrent.torture.tests.primitiveAtomicity.ShortAtomicityTest::observe@6 (line 26)
                                                ; - net.shipilev.concurrent.torture.Runner$3::call@78 (line 186)

  // SECOND READ! 
  0x00007fa9290a232d: movswl 0xc(%r11),%r11d    ;*getfield x
                                                ; - net.shipilev.concurrent.torture.tests.primitiveAtomicity.ShortAtomicityTest::observe@1 (line 44)
                                                ; - net.shipilev.concurrent.torture.tests.primitiveAtomicity.ShortAtomicityTest::observe@6 (line 26)
                                                ; - net.shipilev.concurrent.torture.Runner$3::call@78 (line 186)

  // shifting
  0x00007fa9290a2332: shr    $0x8,%r11d
  0x00007fa9290a2336: mov    0x38(%rsp),%r10

  // storing the second byte into array[1]
  0x00007fa9290a233b: mov    %r11b,0x11(%r10)   ;*bastore
                                                ; - net.shipilev.concurrent.torture.tests.primitiveAtomicity.ShortAtomicityTest::observe@27 (line 47)
                                                ; - net.shipilev.concurrent.torture.tests.primitiveAtomicity.ShortAtomicityTest::observe@6 (line 26)
                                                ; - net.shipilev.concurrent.torture.Runner$3::call@78 (line 186)


Under the racy write to s.x it is possible to read the first byte from the one previously stored short value, and the second part from the other, thus violating Java memory model.
                                     
2012-10-11
Doug Lea reports this issue is reproducible on at least intel-i7/linux, amd-617x/linux, intel-i7/solaris, sparc-t2/solaris, in both client and server mode, not -Xint. 
I had reproduced it on intel-i5/linux, in both client and server mode.
                                     
2012-10-11
This issue strays into the greyest corners of the platform specification and the Java Memory Model. The basic issue is whether these two pieces of code should be considered equivalent:

        short t = s.x;
        result[0] = (byte) ((t >> 0) & 0xFF);
        result[1] = (byte) ((t >> 8) & 0xFF); 

and

        short t = s.x;
        result[0] = (byte) ((s.x >> 0) & 0xFF);
        result[1] = (byte) ((s.x >> 8) & 0xFF); 

As a programmer familiar with concurrent programming I immediately say they are not equivalent, because I know that if s.x were modified concurrently then result[0] and result[1] need not have consistent values in this second case. Consequently I would then argue that given

        short t = s.x;
        result[0] = (byte) ((t >> 0) & 0xFF);
        result[1] = (byte) ((t >> 8) & 0xFF); 

it would be invalid for javac to replace the use of t with two separate GETFIELD invocations for s.x; and that consequently it would also be invalid for the VM (interpreter, C1 or C2) to make a similar change. Hence the observed behaviour is a bug.

Some might argue however that a problem can only occur if s.x is updated concurrently and so the observed situation would only be a bug if it occurs when s.x is volatile (does it also happen if s.x is volatile?). Otherwise it is difficult to demonstrate, via the JLS, that the two forms of the code are not allowed to be used interchangeably.

One of the key problems faced when defining the new Java Memory Model for Java 5 was that there does not exist a compilation specification for the Java language - so there is nothing that formally states that javac, for example, can not replace use of t with a GETFIELD for s.x. Instead we had to assume/rely that the "obvious", 'simplest", "intuitive" translation would be used - and such a translation would, in my opinion, only issue a GETFIELD where an actual field access expression occurred in the language.

If Specimen.x is a volatile field, and this same issues occurs, then the case for this being a bug is much stronger as it can be easily argued that the programmer is fully aware of potential races and so requires that the field only be read once. Indeed lock-free concurrent programming would be impossible if javac or the VM replaced local variable accesses with re-reads of the original field.

                                     
2012-10-12
We need to verify that bytecode have only one load from s.x. C2 has IGVN optimizations which could lead to this situation but I looked and it is not used for such case. Two loads in assembler code are different: one unsigned and an other is signed. They correspond to LoadS and LoadUS ideal nodes in C2. But what surprise me that C1 has this problem also.
                                     
2012-10-12
javap shows:

 public void observe(TestSpecimen$Specimen, byte[]);
    Code:
       0: aload_1
       1: getfield      #2                  // Field TestSpecimen$Specimen.x:S
       4: istore_3
       5: aload_2
       6: iconst_0
       7: iload_3
       8: iconst_0
       9: ishr
      10: sipush        255
      13: iand
      14: i2b
      15: bastore
      16: aload_2
      17: iconst_1
      18: iload_3
      19: bipush        8
      21: ishr
      22: sipush        255
      25: iand
      26: i2b
      27: bastore
      28: return

This was using javac from JDK7 and JDK8.
                                     
2012-10-12
Only one load from the field. Then it is JIT compilation problem.
                                     
2012-10-12
> (does it also happen if s.x is volatile?).
It does. There is a sister test for that case:

$ ~/Install.32/jdk7u10/bin/java -jar target/concurrency-torture.jar -t ".*primitiveAtomicity.*ShortAtomicityTest"
Java Concurrency Torture Tests
---------------------------------------------------------------------------------
Running in forked mode...

Running net.shipilev.concurrent.torture.tests.primitiveAtomicity.ShortAtomicityTest
Warmup .....
                     Observed state  Occurrences          Expectation Interpretation      
                             [0, 0] (   2320030)             REQUIRED Default value for the field. Observers are allowed to see th...
                           [-1, -1] (   7164201)             REQUIRED The value set by the actor thread. Observer sees the complet...
                            [-1, 0] (     12359)               ABSENT This case is not expected.              

Running net.shipilev.concurrent.torture.tests.primitiveAtomicity.VolatileShortAtomicityTest
Warmup .....
                     Observed state  Occurrences          Expectation Interpretation      
                             [0, 0] (   3895625)             REQUIRED Default value for the field. Observers are allowed to see th...
                           [-1, -1] (   3948889)             REQUIRED The value set by the actor thread. Observer sees the complet...
                            [-1, 0] (     44346)               ABSENT This case is not expected.  
                                     
2012-10-12
URL:   http://hg.openjdk.java.net/hsx/hotspot-comp/hotspot/rev/67f4c477c9ab
User:  kvn
Date:  2012-10-22 23:11:33 +0000

                                     
2012-10-22
URL:   http://hg.openjdk.java.net/hsx/hsx25/hotspot/rev/67f4c477c9ab
User:  amurillo
Date:  2012-10-26 23:51:00 +0000

                                     
2012-10-26



Hardware and Software, Engineered to Work Together