JDK-6973329 : C2 with Zero based COOP produces code with broken anti-dependency on x86
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: hs17
  • Priority: P3
  • Status: Closed
  • Resolution: Fixed
  • OS: solaris_10
  • CPU: x86
  • Submitted: 2010-07-29
  • Updated: 2016-03-15
  • Resolved: 2011-03-08
Fixed versions
  • JDK 6: 6u21p (Fixed)
  • JDK 7: 7 (Fixed)
  • Other: hs19 (Fixed)
Description
For the following code:

  void test (A new_next) {
    A prev_next = a.next;
    a.next = new_next;
    if (prev_next == null) {
      a.n = a.get_n();
    }
  }

C2 with zero-based COOP produces the following incorrect assembler code (64-bit x86):

02c     movl    RBP, [RSI + #12 (8-bit)]        # compressed ptr ! Field Test.a
030     cmpl    R12, [R12 + RBP << 3 + #16] (compressed oop addressing) # compressed ptr (R12_heapbase==0)
035     NullCheck RBP
035
035   B2: #     B4 B3 <- B1  Freq: 0.999999
035     decode_heap_oop_not_null RSI,RBP
039     encode_heap_oop R11,RDX
040     movl    [R12 + RBP << 3 + #16] (compressed oop addressing), R11 # compressed ptr ! Field A.next
045     movq    R10, RSI        # ptr -> long
048     shrq    R10, #9
04c     movq    R11, 0xfffffd7ff4a1c000 # ptr
056     movb    [R11 + R10], R12        # short/char (R12_heapbase==0)
05a     cmpl    R12, [R12 + RBP << 3 + #16] (compressed oop addressing) # compressed ptr (R12_heapbase==0)
05f     je,s   B4  P=0.100000 C=-1.000000
05f

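The test at 05a checks the wrong value (new_next): it reloads A.next after the store at 040 instead of comparing the value read before the store (prev_next).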

% /java/re/jdk/6u21/latest/binaries/solaris-amd64/bin/java -d64 -XX:+UseCompressedOops -XX:+UnlockDiagnosticVMOptions -XX:+PrintCompressedOopsMode -Xbatch -XX:+PrintCompilation -Xcomp -XX:CompileOnly=Test Test

heap address: 0x00000003fc800000, zero based Compressed Oops

  1   b   Test::main (91 bytes)
  1   made not entrant  Test::main (91 bytes)
  2   b   Test::<init> (5 bytes)
  3   b   Test::test (35 bytes)
Wrong value: 1 expected: 2

I will include the test into the C2 regression tests.
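
For reference, a minimal sketch of a reproducer consistent with the method and the output shown above. Only test(), the fields a, next and n, and the method name get_n() come from this report. The class name AntiDepRepro, the body of get_n() (assumed to return n + 1) and the run() helper are assumptions made for illustration and may differ from the actual regression test.

```java
public class AntiDepRepro {
    // Hypothetical helper class: only the names next, n and get_n()
    // appear in the report; everything else here is assumed.
    static class A {
        A next;
        int n;

        // Assumption: get_n() returns n + 1, so one call turns 1 into 2.
        int get_n() {
            return n + 1;
        }
    }

    A a;

    // Method from the report: the load of a.next must observe the value
    // that was in the field BEFORE the store of new_next.
    void test(A new_next) {
        A prev_next = a.next;
        a.next = new_next;
        if (prev_next == null) {
            a.n = a.get_n();
        }
    }

    // Runs the scenario once and returns the resulting a.n.
    static int run() {
        AntiDepRepro t = new AntiDepRepro();
        t.a = new A();
        t.a.n = 1;
        t.test(new A()); // a.next was null, so a.n is expected to become 2
        return t.a.n;
    }

    public static void main(String[] args) {
        int n = run();
        if (n != 2) {
            System.out.println("Wrong value: " + n + " expected: 2");
            System.exit(1);
        }
    }
}
```

If the comparison is performed on the reloaded field (as at 05a), prev_next appears non-null, the branch is skipped and a.n stays 1, which matches the "Wrong value: 1 expected: 2" output. When running this sketch with the flags above, -XX:CompileOnly would need to name this class instead of Test.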

Comments
EVALUATION http://hg.openjdk.java.net/jdk7/hotspot-rt/hotspot/rev/6c9cc03d8726
14-08-2010

PUBLIC COMMENTS Main problem: RA ignores anti-dependence when placing a clone of a node which produces flags (or any rematerializable node). The code generating implicit_null_check may move such nodes above nodes modifying flags, which forces RA to clone them.

Solution: Recompile without subsuming loads if RA tries to clone a node with anti_dependence. Do not use nodes which produce flags in implicit null checks. Added a regression test based on a failure I saw in SPECjEnterprise2010.

Tested with JPRT, SPECjEnterprise2010.

Notes: I collected statistics on how many dependences are found per call to insert_anti_dependences(), and on how many recompilations without subsuming loads and total bailouts happened in the new RA code.

CTW rt.jar:
  1431559 made anti_dependence checks
  5497392 found anti_dependences (384%)
  128477 did not find anti_dependences (8%)

jvm2008:
  200020 made anti_dependence checks
  724530 found anti_dependences (362%)
  10339 did not find anti_dependences (5%)

Originally I thought about adding a new Node flag, has_anti_dependence, which I would set in insert_anti_dependences() and check in clone_node() in RA. But looking at these statistics I decided to use the existing flag needs_anti_dependence_check. After that, the number of recompilations without subsuming loads changed from 115 to 127 (in CTW rt.jar). I think that is acceptable.

Next I found that the number of recompilations could be significantly reduced if I exclude nodes which produce flags from implicit null checks.

Before:
  CTW rt.jar: RA: 127 recompile without subsume_loads, 0 bailout compilation
  jvm2008: RA: 140 recompile without subsume_loads, 0 bailout compilation

After:
  CTW rt.jar: RA: 39 recompile without subsume_loads, 0 bailout compilation
  jvm2008: RA: 3 recompile without subsume_loads, 0 bailout compilation
11-08-2010
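
The percentages in the PUBLIC COMMENTS statistics appear to be integer ratios relative to the number of checks made, which is why "found" can exceed 100% (a single check can find several anti-dependences). The sketch below recomputes them under that assumption. The class and method names are hypothetical, and truncating integer division is an assumed rounding mode that happens to reproduce the quoted figures.

```java
public class AntiDepStats {
    // Integer percentage of count relative to checks, truncated toward zero
    // (assumed rounding; not confirmed by the report).
    static long percent(long count, long checks) {
        return count * 100 / checks;
    }

    public static void main(String[] args) {
        // CTW rt.jar figures from the comment above.
        System.out.println("CTW rt.jar: found " + percent(5497392L, 1431559L)
                + "%, not found " + percent(128477L, 1431559L) + "%");
        // jvm2008 figures from the comment above.
        System.out.println("jvm2008: found " + percent(724530L, 200020L)
                + "%, not found " + percent(10339L, 200020L) + "%");
    }
}
```

Under this reading, each call to insert_anti_dependences() finds between three and four anti-dependences on average, and only a small share of calls find none.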

EVALUATION http://hg.openjdk.java.net/jdk7/hotspot-comp/hotspot/rev/6c9cc03d8726
11-08-2010

EVALUATION This case exposed a few problems in Hotspot:

1. RA will place a clone of a node which produces flags (or any rematerializable node) without an anti-dependency check.
2. The code generating implicit_null_check may move such nodes above nodes modifying flags, which forces RA to clone them.
3. Matcher does not check anti-dependency when folding loads into address expressions, relying on GCM to bail out if it can't place the nodes, which triggers recompilation without folding loads into addresses. But even when GCM can place the nodes, it may lead to the problem described in 1 if the nodes are placed above nodes modifying flags.

We could do simple anti-dependency checks in match_into_reg() to avoid the problem in simple cases, or do a full anti-dependency analysis and use its result in GCM without repeating it.
30-07-2010