JDK-8023472 : C2 optimization breaks with G1

Details
Type: Bug
Submit Date: 2013-08-21
Status: Closed
Updated Date: 2014-01-14
Project Name: JDK
Resolved Date: 2013-08-24
Component: hotspot
OS:
Sub-Component: compiler
CPU:
Priority: P2
Resolution: Fixed
Affected Versions: hs25
Fixed Versions: hs25 (b49)

Description
Punya Biswal reported on the OpenJDK hotspot-dev mailing list that a small Java program can trigger a crash in HotSpot.

This looks like a C2 optimization that goes wrong. For some reason it only seems to happen with G1, so it may be related to something special that G1 does (the extra write barriers?), but it is not a GC bug, since the program does not trigger any GCs at all.

The issue is reported on 7u25, but it exists in the latest JDK8 build too (b103).

It is likely a C2 issue, since running with -Xint or with tiered compilation enabled does not cause the crash. Note that because tiered compilation is the default in the later JDK8 builds, it needs to be explicitly turned off for the reproducer to crash.

Making this a P2. Motivation:

Impact: High - crash
Likelihood: Low - Unsure about this. It seems to be a special case. But with the reproducer it happens every time.
Workaround: High. Don't know yet if particular tuning can be done. Currently the only workaround is to run with C1.

ILW=HLH -> P2
                                    

Comments
The changeset includes the jtreg test test/compiler/gcbarriers/G1CrashTest.java.
                                     
2013-10-18
URL:   http://hg.openjdk.java.net/hsx/hsx25/hotspot/rev/b17d8f6d9ed7
User:  jcoomes
Date:  2013-09-06 21:27:38 +0000

                                     
2013-09-06
terminus:refworkload$ rwcompare  -r ref.server0 ref.server1
============================================================================
ref.server0
  Benchmark         Samples        Mean     Stdev
  scimark                20      303.83      1.24
    LU                   20      609.84      5.72
    FFT                  20       20.12      0.03
    Monte                20       99.27      0.13
    SOR                  20      565.12      1.33
    Sparse               20      224.80      0.55
  scimark_small          20      448.84      5.02
    LU                   20      841.25     24.13
    FFT                  20      344.15      0.49
    Monte                20       98.93      0.41
    SOR                  20      569.24      0.86
    Sparse               20      390.61      0.72
  specjvm98              20      279.28      1.71
    compress             20      283.49      4.72
    javac                20      189.25      1.45
    db                   20       74.47      0.58
    jack                 20      190.18      2.06
    mtrt                 20     1005.64     31.57
    jess                 20      305.45      6.00
    mpegaudio            20      568.14      2.82
============================================================================
ref.server1
  Benchmark         Samples        Mean     Stdev   %Diff    P   Significant
  scimark                20      303.63      1.23   -0.07 0.616            *
    LU                   20      608.89      5.44   -0.16 0.593            *
    FFT                  20       20.12      0.03    0.00 0.925            *
    Monte                20       99.26      0.12   -0.01 0.893            *
    SOR                  20      565.11      1.33   -0.00 0.981            *
    Sparse               20      224.79      0.56   -0.00 0.955            *
  scimark_small          20      448.46      4.54   -0.08 0.807            *
    LU                   20      838.92     22.93   -0.28 0.757            *
    FFT                  20      344.25      0.37    0.03 0.463            *
    Monte                20       98.89      0.48   -0.04 0.760            *
    SOR                  20      569.50      0.54    0.05 0.262            *
    Sparse               20      390.75      0.53    0.04 0.472            *
  specjvm98              20      279.43      1.03    0.06 0.732            *
    compress             20      284.47      4.66    0.35 0.512            *
    javac                20      188.70      1.23   -0.29 0.202            *
    db                   20       74.51      0.63    0.05 0.854            *
    jack                 20      188.29      2.93   -0.99 0.024            *
    mtrt                 20     1008.67     28.74    0.30 0.753            *
    jess                 20      308.44      2.25    0.98 0.047            *
    mpegaudio            20      568.12      2.36   -0.00 0.982            *
============================================================================

terminus:refworkload$ rwcompare  -r ref.jbb0 ref.jbb1
============================================================================
ref.jbb0
  Benchmark         Samples        Mean     Stdev
  specjbb2000            10   140589.66    530.02
    Last_Warehouse       10   140589.66    530.02
    First_Warehouse      10    36337.10    532.36
  specjbb2005            10    39942.54    241.82
    last                 10    39942.53    241.82
    interval_average     10     4438.00     26.85
    peak                 10    39974.66    254.64
    overall_average      10    33414.17    249.10
    last_warehouse       10        8.00      0.00
    peak_warehouse       10        7.50      0.53
    first                10    18559.16    159.48
============================================================================
ref.jbb1
  Benchmark         Samples        Mean     Stdev   %Diff    P   Significant
  specjbb2000            10   141176.78    220.27    0.42 0.007          Yes
    Last_Warehouse       10   141176.78    220.27    0.42 0.007          Yes
    First_Warehouse      10    36167.09    755.92   -0.47 0.569            *
  specjbb2005            10    39831.50    310.51   -0.28 0.385            *
    last                 10    39831.50    310.51   -0.28 0.385            *
    interval_average     10     4425.70     34.49   -0.28 0.386            *
    peak                 10    39926.88    302.00   -0.12 0.707            *
    overall_average      10    33430.61    211.66    0.05 0.875            *
    last_warehouse       10        8.00      0.00   -0.00 0.000          Yes
    peak_warehouse       10        7.50      0.53   -0.00 1.000            *
    first                10    18525.29    264.61   -0.18 0.734            *
============================================================================

                                     
2013-08-26
URL:   http://hg.openjdk.java.net/hsx/hotspot-comp/hotspot/rev/b17d8f6d9ed7
User:  kvn
Date:  2013-08-24 05:21:00 +0000

                                     
2013-08-24
The load of the previous value in the G1 pre-barrier code (method GraphKit::g1_write_barrier_pre()) does not have a control edge:

    if (do_load) {
      // load original value
      // alias_idx correct??
      pre_val = __ load(no_ctrl, adr, val_type, bt, alias_idx);
    }

Because of that, it can be scheduled very early. Usually, though, GCM (global code motion) puts it near its use, in the lowest-frequency block.
For some reason (I am still looking) that did not happen in this case: the load from set[firstRemoved] was scheduled above the check that firstRemoved != -1:

                if (firstRemoved != -1)
                    set[firstRemoved] = "dead";
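
To illustrate why the placement of this load matters, here is a minimal Java sketch of what the pre-barrier logically does around a guarded reference store. It is a conceptual model, not HotSpot code: the names PreBarrierSketch, markingActive, satbQueue, guardedStore, and hoistedStore are all hypothetical, and the real barrier is emitted as compiler IR rather than Java. The hoistedStore variant mimics the bad schedule at source level, with the previous-value load running before the firstRemoved != -1 check.

```java
class PreBarrierSketch {
    static Object[] set = new Object[8];

    // Hypothetical stand-in for "concurrent marking is in progress".
    static boolean markingActive = false;

    // Hypothetical stand-in for the queue that records overwritten references.
    static final java.util.Deque<Object> satbQueue = new java.util.ArrayDeque<>();

    // Intended order: the previous-value load sits under the index guard.
    static void guardedStore(int firstRemoved, Object value) {
        if (firstRemoved != -1) {
            if (markingActive) {
                Object prev = set[firstRemoved];   // previous-value load
                if (prev != null) {
                    satbQueue.add(prev);
                }
            }
            set[firstRemoved] = value;
        }
    }

    // Source-level analogue of the bad schedule: the load floats above the guard
    // and therefore executes even when firstRemoved == -1.
    static void hoistedStore(int firstRemoved, Object value) {
        Object prev = set[firstRemoved];           // runs unconditionally
        if (firstRemoved != -1) {
            if (markingActive && prev != null) {
                satbQueue.add(prev);
            }
            set[firstRemoved] = value;
        }
    }
}
```

In plain Java the array bounds check turns the hoisted load into an ArrayIndexOutOfBoundsException for index -1. The compiled method had no such check in front of the floating load, which is consistent with the reported SIGSEGV.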

                                     
2013-08-21
The previous-value load was scheduled high (commoned) because there are two stores to set[firstRemoved], one inside the loop and one after the loop:

            if (cur == null) {
                if (firstRemoved != -1)
                    set[firstRemoved] = "dead";
                else
                    set[index] = key;
                return;
            }
        } while (index != loopIndex);
        if (firstRemoved != -1)
            set[firstRemoved] = null;
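
The control flow described in this comment can be sketched as a small open-addressing insert. This is a hypothetical reconstruction from the fragments quoted in this report, not the attached G1CrashTest.java: the class name ProbeInsertSketch, the REMOVED marker, the table size, and the probing arithmetic are assumptions. Only the method names, the two guarded stores, and the loop condition come from the report.

```java
class ProbeInsertSketch {
    // Hypothetical tombstone marking a slot whose key was removed.
    static final Object REMOVED = new Object();
    static Object[] set = new Object[16];

    static void insertKey(Object key) {
        int hash = key.hashCode() & 0x7fffffff;
        int index = hash % set.length;
        Object cur = set[index];
        if (cur == null) {
            set[index] = key;
            return;
        }
        insertKeyRehash(key, index, hash, cur);
    }

    static void insertKeyRehash(Object key, int index, int hash, Object cur) {
        final int length = set.length;
        final int probe = 1 + (hash % (length - 2));   // assumed probe step
        final int loopIndex = index;
        int firstRemoved = -1;                          // -1 means "no tombstone seen"
        do {
            if (cur == REMOVED && firstRemoved == -1) {
                firstRemoved = index;
            }
            index -= probe;
            if (index < 0) {
                index += length;
            }
            cur = set[index];
            if (cur == null) {
                if (firstRemoved != -1)
                    set[firstRemoved] = "dead";         // first store, inside the loop
                else
                    set[index] = key;
                return;
            }
        } while (index != loopIndex);
        if (firstRemoved != -1)
            set[firstRemoved] = null;                   // second store, after the loop
    }
}
```

Both stores to set[firstRemoved] are guarded by the same firstRemoved != -1 test, so a single previous-value load that serves both has to sit above the loop exit, where firstRemoved can still be -1.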

                                     
2013-08-21
Original email from Punya:


We're running into a JVM segfault when using a third-party library
(Elasticsearch) alongside the G1 garbage collector. This issue has been
reported elsewhere (for example,
https://jira.terracotta.org/jira/browse/CDV-1651) and might already be on
the OpenJDK roadmap. It's usually associated with the use of GNU Trove.

We've been able to reduce the amount of code required to reproduce the
crash to a short program that reliably crashes on my computer
(a MacBook Pro) using only JDK core classes. I've also included the
hs_err_* diagnostic file; both are at
https://gist.github.com/punya/6287943 .

Does this reduced repro help understand what's going on?


A few points of interest:

* running with -XX:+UseG1GC -Dcount=100000 always crashes
* running with -XX:+UseG1GC -Xint -Dcount=100000 never crashes
* running with -Xint -Dcount=100000 never crashes


Using -ea and/or -server doesn't have any effect on these results. With regard to
JVM versions,

* it never crashes on JDK 6u51
* it crashes on JDK 7u25 and JDK 7u43 (prerelease)
* it never crashes on JDK 8 (prerelease)


I'm happy to provide a link to a core dump if that helps.
                                     
2013-08-21
Attached the test case and the hs_err file from github.
                                     
2013-08-21
The issue can be reproduced with a debug build. I had to increase the "count" to get it to reproduce. So the command line for JDK8 b103 debug builds is:

$ java -XX:-TieredCompilation -XX:+UseG1GC -Dcount=1000000

Adding -XX:+PrintGC shows that no GC happens.

Adding -XX:+PrintCompilation shows that we have compiled G1CrashTest::insertKey, which seems to be the method in which we crash.

    936    1             java.lang.String::hashCode (55 bytes)
    972    2             java.lang.String::indexOf (70 bytes)
   1057    3             java.lang.String::charAt (29 bytes)
   1064    4             java.lang.String::equals (81 bytes)
   1072    5             java.lang.Integer::parseInt (261 bytes)
   1112    6             java.lang.CharacterData::of (120 bytes)
   1117    7             java.lang.CharacterDataLatin1::getProperties (11 bytes)
   1119    8             java.lang.Character::digit (6 bytes)
   1129    9             java.lang.Character::digit (10 bytes)
   1146   10             java.lang.CharacterDataLatin1::digit (91 bytes)
   1153   11             java.lang.String::startsWith (72 bytes)
   1162   12             java.lang.Object::<init> (1 bytes)
   1173   13     n       java.lang.Object::hashCode (native)   
   1173   14             G1CrashTest::insertKey (42 bytes)
   1179   15 %           G1CrashTest::main @ 65 (102 bytes)
======================================================   1253 =====  16 ===      ==     ====== java.lang.String::======length==
 (6 bytes)
Unexpected Error
------------------------------------------------------------------------------
SIGSEGV (0xb) at pc=0x000000010b118318, pid=55878, tid=6403



From the hs_err file:

# Problematic frame:
# J  G1CrashTest.insertKey(Ljava/lang/Object;)V

                                     
2013-08-21
Problematic compile:
    574   14    b        G1CrashTest::insertKey (42 bytes)
                            @ 1   java.lang.Object::hashCode (0 bytes)   (intrinsic, virtual)
                            @ 38   G1CrashTest::insertKeyRehash (83 bytes)   inline (hot)

Crash occurs at the following instruction: 
 
  0x000000010ea4c058: mov    0x0(%r13),%r9d     ;*aastore
                                                ; - G1CrashTest::insertKeyRehash@52 (line 41)
                                                ; - G1CrashTest::insertKey@38 (line 26)

which corresponds to: 

    41                     set[firstRemoved] = "dead";                                           

  static void insertKeyRehash(java.lang.Object, int, int, java.lang.Object);
      ...
        45: getstatic     #7                  // Field set:[Ljava/lang/Object;
        48: iload         5
        50: ldc           #11                 // String dead
        52: aastore       

                                     
2013-08-21
Vladimir, can you paste more instructions around the crashing PC?
                                     
2013-08-21
The problem is that R10 is not sign-extended and holds a wrong value. It should not be negative, since it is an index into the array:

2ed     leaq    R13, [RSI + #16 + R10 << #2]    # ptr posidxscaleoff
2f2     movl    RCX, [RBX]      # compressed ptr
2f4     movl    R9, [R13]       # compressed ptr  <<<<<<<<<<< CRASH

R10=0x00000000ffffffff
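
As a worked example of the address arithmetic in the leaq above, the following Java sketch computes the byte offset for an index of -1 under both extensions. It assumes that #16 is the offset of the first array element and that the scale of 4 is the size of a compressed reference, as the "compressed ptr" annotations suggest. The class and method names are hypothetical.

```java
class IndexExtensionSketch {
    // Mirrors the shape of: leaq R13, [RSI + #16 + R10 << #2]
    // Returns the byte offset added to the array base in RSI.
    static long elementOffset(long index64) {
        return 16L + (index64 << 2);
    }

    public static void main(String[] args) {
        int firstRemoved = -1;

        // 32-bit -1 zero-extended to 64 bits, matching R10=0x00000000ffffffff
        long zeroExtended = Integer.toUnsignedLong(firstRemoved);
        // 32-bit -1 sign-extended to 64 bits
        long signExtended = firstRemoved;

        System.out.printf("zero-extended index 0x%016x -> offset 0x%x%n",
                zeroExtended, elementOffset(zeroExtended));
        System.out.printf("sign-extended index 0x%016x -> offset %d%n",
                signExtended, elementOffset(signExtended));
    }
}
```

With the zero-extended value the offset is 0x40000000c, roughly 16 GiB past the array base, so the load at 2f4 touches memory far outside the array. Even a sign-extended -1 would give offset 12, which lies before the first element, so the load is only valid once the firstRemoved != -1 check has passed.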
                                     
2013-08-21


