United StatesChange Country, Oracle Worldwide Web Sites Communities I am a... I want to...
JDK-8000805 : JMM issue: short loads are non-atomic

Submit Date:
Updated Date:
Project Name:
Resolved Date:
Affected Versions:
Fixed Versions:
hs25 (b07)

Related Reports

Sub Tasks

Originally found with 

Steps to reproduce:
 $ git clone https://github.com/shipilev/java-concurrency-torture.git
 $ cd java-concurrency-torture.git
 $ mvn clean install
 $ java -jar target/concurrency-torture.jar -t ".*primitiveAtomicity.*ShortAtomicityTest"

Expected result: 
  Only [0, 0] and [-1, -1] states present.

Actual result:
  [0,0], [-1, -1] are present.
  One of [0, -1] or [-1, 0] states is present, which means the short value was torn away.

> (does it also happen if s.x is volatile?).
It does. There is a sister test for that case:

$ ~/Install.32/jdk7u10/bin/java -jar target/concurrency-torture.jar -t ".*primitiveAtomicity.*ShortAtomicityTest"
Java Concurrency Torture Tests
Running in forked mode...

Running net.shipilev.concurrent.torture.tests.primitiveAtomicity.ShortAtomicityTest
Warmup .....
                     Observed state  Occurrences          Expectation Interpretation      
                             [0, 0] (   2320030)             REQUIRED Default value for the field. Observers are allowed to see th...
                           [-1, -1] (   7164201)             REQUIRED The value set by the actor thread. Observer sees the complet...
                            [-1, 0] (     12359)               ABSENT This case is not expected.              

Running net.shipilev.concurrent.torture.tests.primitiveAtomicity.VolatileShortAtomicityTest
Warmup .....
                     Observed state  Occurrences          Expectation Interpretation      
                             [0, 0] (   3895625)             REQUIRED Default value for the field. Observers are allowed to see th...
                           [-1, -1] (   3948889)             REQUIRED The value set by the actor thread. Observer sees the complet...
                            [-1, 0] (     44346)               ABSENT This case is not expected.  
Only one load from the field. Then it is JIT compilation problem.
javap shows:

 public void observe(TestSpecimen$Specimen, byte[]);
       0: aload_1
       1: getfield      #2                  // Field TestSpecimen$Specimen.x:S
       4: istore_3
       5: aload_2
       6: iconst_0
       7: iload_3
       8: iconst_0
       9: ishr
      10: sipush        255
      13: iand
      14: i2b
      15: bastore
      16: aload_2
      17: iconst_1
      18: iload_3
      19: bipush        8
      21: ishr
      22: sipush        255
      25: iand
      26: i2b
      27: bastore
      28: return

This was using javac from JDK7 and JDK8.
We need to verify that bytecode have only one load from s.x. C2 has IGVN optimizations which could lead to this situation but I looked and it is not used for such case. Two loads in assembler code are different: one unsigned and an other is signed. They correspond to LoadS and LoadUS ideal nodes in C2. But what surprise me that C1 has this problem also.
This issue strays into the greyest corners of the platform specification and the Java Memory Model. The basic issue is whether these two pieces of code should be considered equivalent:

        short t = s.x;
        result[0] = (byte) ((t >> 0) & 0xFF);
        result[1] = (byte) ((t >> 8) & 0xFF); 


        short t = s.x;
        result[0] = (byte) ((s.x >> 0) & 0xFF);
        result[1] = (byte) ((s.x >> 8) & 0xFF); 

As a programmer familiar with concurrent programming I immediately say they are not equivalent, because I know that if s.x were modified concurrently then result[0] and result[1] need not have consistent values in this second case. Consequently I would then argue that given

        short t = s.x;
        result[0] = (byte) ((t >> 0) & 0xFF);
        result[1] = (byte) ((t >> 8) & 0xFF); 

it would be invalid for javac to replace the use of t with two separate GETFIELD invocations for s.x; and that consequently it would also be invalid for the VM (interpreter, C1 or C2) to make a similar change. Hence the observed behaviour is a bug.

Some might argue however that a problem can only occur if s.x is updated concurrently and so the observed situation would only be a bug if it occurs when s.x is volatile (does it also happen if s.x is volatile?). Otherwise it is difficult to demonstrate, via the JLS, that the two forms of the code are not allowed to be used interchangeably.

One of the key problems faced when defining the new Java Memory Model for Java 5 was that there does not exist a compilation specification for the Java language - so there is nothing that formally states that javac, for example, can not replace use of t with a GETFIELD for s.x. Instead we had to assume/rely that the "obvious", 'simplest", "intuitive" translation would be used - and such a translation would, in my opinion, only issue a GETFIELD where an actual field access expression occurred in the language.

If Specimen.x is a volatile field, and this same issues occurs, then the case for this being a bug is much stronger as it can be easily argued that the programmer is fully aware of potential races and so requires that the field only be read once. Indeed lock-free concurrent programming would be impossible if javac or the VM replaced local variable accesses with re-reads of the original field.

Doug Lea reports this issue is reproducible on at least intel-i7/linux, amd-617x/linux, intel-i7/solaris, sparc-t2/solaris, in both client and server mode, not -Xint. 
I had reproduced it on intel-i5/linux, in both client and server mode.
Further analysis reveals that Hotspot re-reads short field for two expressions using the local in which the field was loaded.
Test source code in question: https://github.com/shipilev/java-concurrency-torture/blob/02ad97fcfb767c6da54b7a01baa15490ac8d3aad/src/main/java/net/shipilev/concurrent/torture/tests/primitiveAtomicity/ShortAtomicityTest.java#L43

The problematic compilation is at the method observe():

    public static class Specimen {
        short x;

    public void observe(Specimen s, byte[] result) {
        short t = s.x;
        result[0] = (byte) ((t >> 0) & 0xFF);
        result[1] = (byte) ((t >> 8) & 0xFF);

Below is the excerpt from the dissassembly:

  // %r11 is the object address, reading the field once into %r10
  0x00007fa9290a2320: movzwl 0xc(%r11),%r10d
  0x00007fa9290a2325: mov    0x18(%rsp),%r9

  // storing the first byte into array[0]
  0x00007fa9290a232a: mov    %r10b,(%r9)        ;*bastore
                                                ; - net.shipilev.concurrent.torture.tests.primitiveAtomicity.ShortAtomicityTest::observe@15 (line 46)
                                                ; - net.shipilev.concurrent.torture.tests.primitiveAtomicity.ShortAtomicityTest::observe@6 (line 26)
                                                ; - net.shipilev.concurrent.torture.Runner$3::call@78 (line 186)

  0x00007fa9290a232d: movswl 0xc(%r11),%r11d    ;*getfield x
                                                ; - net.shipilev.concurrent.torture.tests.primitiveAtomicity.ShortAtomicityTest::observe@1 (line 44)
                                                ; - net.shipilev.concurrent.torture.tests.primitiveAtomicity.ShortAtomicityTest::observe@6 (line 26)
                                                ; - net.shipilev.concurrent.torture.Runner$3::call@78 (line 186)

  // shifting
  0x00007fa9290a2332: shr    $0x8,%r11d
  0x00007fa9290a2336: mov    0x38(%rsp),%r10

  // storing the second byte into array[1]
  0x00007fa9290a233b: mov    %r11b,0x11(%r10)   ;*bastore
                                                ; - net.shipilev.concurrent.torture.tests.primitiveAtomicity.ShortAtomicityTest::observe@27 (line 47)
                                                ; - net.shipilev.concurrent.torture.tests.primitiveAtomicity.ShortAtomicityTest::observe@6 (line 26)
                                                ; - net.shipilev.concurrent.torture.Runner$3::call@78 (line 186)

Under the racy write to s.x it is possible to read the first byte from the one previously stored short value, and the second part from the other, thus violating Java memory model.

Hardware and Software, Engineered to Work Together