JDK-6419652 : Missed performance opportunity in bimorphic inlining cases
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 6
  • Priority: P4
  • Status: Resolved
  • Resolution: Fixed
  • OS: solaris_9
  • CPU: sparc
  • Submitted: 2006-04-28
  • Updated: 2013-11-01
  • Resolved: 2006-11-14
Versions:
  • JDK 6: 6u4 (Fixed)
  • JDK 7: 7 (Fixed)
  • Other: hs10 (Fixed)
Description
The best generated code for a bimorphic call site is the following:

  if class == Receiver1
      inlined Receiver1::method1()
  else if class == Receiver2
      inlined Receiver2::method2()
  else
      uncommon trap

The uncommon trap does not pollute the class information we
get from the class checks. It can also be moved out of a hot loop.
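
The shape of that guarded dispatch can be modelled outside the VM. The following is only a sketch under stated assumptions: Klass, CallResult, receiver1_method, receiver2_method, bimorphic_call and run_until_trap are hypothetical names invented for illustration, and the uncommon trap is modelled as a flag rather than a real deoptimization.

```cpp
#include <cassert>

// Hypothetical class ids standing in for the receiver's class pointer.
enum class Klass { Receiver1, Receiver2, Other };

struct CallResult {
  bool trapped;  // true when the uncommon-trap path was taken
  int  value;    // meaningful only when trapped == false
};

// Stand-ins for the two inlined method bodies (assumed for illustration).
inline int receiver1_method(int x) { return x + 1; }
inline int receiver2_method(int x) { return x * 2; }

// Two class checks with inlined bodies; the miss path is a trap, not a call.
inline CallResult bimorphic_call(Klass receiver, int x) {
  if (receiver == Klass::Receiver1) {
    return CallResult{false, receiver1_method(x)};
  } else if (receiver == Klass::Receiver2) {
    return CallResult{false, receiver2_method(x)};
  }
  return CallResult{true, 0};  // models the uncommon trap
}

// Sums results over a sequence of receivers and stops at the first trap,
// modelling compiled code being left once the trap fires.
inline int run_until_trap(const Klass* receivers, int count, bool* trapped) {
  assert(receivers != nullptr && trapped != nullptr && count >= 0);
  int sum = 0;
  *trapped = false;
  for (int i = 0; i < count; ++i) {
    const CallResult r = bimorphic_call(receivers[i], i);
    if (r.trapped) {
      *trapped = true;
      break;
    }
    sum += r.value;
  }
  return sum;
}
```

In this model every path that returns a value has passed one of the two class checks, which is the property the paragraph above relies on.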

But in most bimorphic cases we currently generate a virtual call instead of
an uncommon trap. This is caused by the check that verifies whether
class_check traps have already occurred at this bytecode:

  if ((profile.morphism() == 1 || next_hit_cg != NULL) &&
      !too_many_traps(jvms->method(), jvms->bci(), Deoptimization::Reason_class_check)) {
    miss_cg = CallGenerator::for_uncommon_trap(call_method,
                Deoptimization::Reason_class_check,
                Deoptimization::Action_maybe_recompile);
  } else {
    miss_cg = CallGenerator::for_virtual_call(call_method, vtable_index);
  }

In the bimorphic case we will most likely already have a class_check trap,
because the call site transitioned from the monomorphic case, for which
the method was first compiled.

We should also let that bytecode run in the interpreter for some time
after a class_check trap before recompiling, so that more profiling
information is collected for the call site (hence Action_reinterpret
instead of Action_maybe_recompile in the change below).

I made the following changes:

  if ((profile.morphism() == 1 &&
       !too_many_traps(jvms->method(), jvms->bci(), Deoptimization::Reason_class_check)) ||
      (next_hit_cg != NULL &&  // Bimorphic case
       // Check only total number of traps per method to allow
       // the transition from monomorphic to bimorphic case between
       // compilations without falling into virtual call.
       !too_many_traps(Deoptimization::Reason_class_check))) {
    // Generate uncommon trap for class check failure path
    // in case of monomorphic or bimorphic virtual call site.
    miss_cg = CallGenerator::for_uncommon_trap(call_method,
                Deoptimization::Reason_class_check,
                Deoptimization::Action_reinterpret);
  } else {
    // Generate virtual call for class check failure path
    // in case of polymorphic virtual call site.
    miss_cg = CallGenerator::for_virtual_call(call_method, vtable_index);
  }
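
To see how the two conditions differ, the decision can be reduced to a small stand-alone model. This is a sketch, not HotSpot code: MissPath, SiteState, old_miss_path and new_miss_path are hypothetical names, and the two boolean fields simply stand in for the results of the per-bytecode and per-method too_many_traps() checks quoted above.

```cpp
#include <cassert>

enum class MissPath { UncommonTrap, VirtualCall };

// Hypothetical summary of the inputs both conditions consult.
struct SiteState {
  int  morphism;                  // receiver types seen by the profile
  bool has_second_hit;            // models next_hit_cg != NULL
  bool too_many_traps_at_bci;     // models too_many_traps(method, bci, Reason_class_check)
  bool too_many_traps_in_method;  // models too_many_traps(Reason_class_check)
};

// Condition before the change: one per-bytecode trap check covers both cases.
inline MissPath old_miss_path(const SiteState& s) {
  assert(s.morphism >= 1);  // the profile saw at least one receiver
  if ((s.morphism == 1 || s.has_second_hit) && !s.too_many_traps_at_bci) {
    return MissPath::UncommonTrap;
  }
  return MissPath::VirtualCall;
}

// Condition after the change: bimorphic sites consult only the per-method total.
inline MissPath new_miss_path(const SiteState& s) {
  assert(s.morphism >= 1);
  const bool monomorphic_ok = (s.morphism == 1) && !s.too_many_traps_at_bci;
  const bool bimorphic_ok   = s.has_second_hit && !s.too_many_traps_in_method;
  return (monomorphic_ok || bimorphic_ok) ? MissPath::UncommonTrap
                                          : MissPath::VirtualCall;
}
```

For a site described by SiteState{2, true, true, false}, that is, a bimorphic site whose bytecode already recorded a class_check trap while the method as a whole is still under its limit, old_miss_path() selects the virtual call and new_miss_path() selects the uncommon trap.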

I got the following results on SP2500 (-server -XX:+AggressiveOpts):

jaberwocky% ~kbr/bin/common/rwcompare ref.bimorph ref.bimorph2
============================================================================
ref.bimorph: reference_server
  Benchmark         Samples        Mean     Stdev
  jetstream              10       34.72      0.32
  scimark                10       65.44      1.20
  specjbb2000            10    21979.97     89.39
  specjbb2005            10     7576.85     63.53
  specjvm98              10      118.94      0.71
  volano25               10    11916.20    134.85
  --------------------------------------------------------------------------
  Weighted Geomean              1493.24
============================================================================
ref.bimorph2: reference_server
  Benchmark         Samples        Mean     Stdev   %Diff    P   Significant
  jetstream              10       34.89      0.36    0.49 0.282            *
  scimark                10       67.54      1.77    3.22 0.007          Yes
  specjbb2000            10    22025.16     69.69    0.21 0.224            *
  specjbb2005            10     7585.57     44.26    0.11 0.727            *
  specjvm98              10      119.92      0.59    0.82 0.004          Yes
  volano25               10    11816.50    112.83   -0.84 0.090            *
  --------------------------------------------------------------------------
  Weighted Geomean              1501.30              0.54
============================================================================
jaberwocky% ~kbr/bin/common/rwcompare -r ref.bimorph ref.bimorph2
============================================================================
ref.bimorph: reference_server
  Benchmark         Samples        Mean     Stdev
  jetstream              10       34.72      0.32
    Copy                 10      447.80      2.62
    Parse                10      343.60      1.71
    Read                 10      116.60      3.47
    Write                10      384.00      3.56
  scimark                10       65.44      1.20
    Sparse               10       42.53      1.33
    LU                   10       76.90      1.31
    SOR                  10      166.42      5.22
    FFT                  10       14.71      0.25
    Monte                10       26.63      0.03
  specjbb2000            10    21979.97     89.39
    First_Warehouse      10    12375.97     91.24
    Last_Warehouse       10    21979.98     89.39
  specjbb2005            10     7576.85     63.53
    peak                 10     8006.33     61.25
    peak_warehouse       10        3.40      0.97
    last                 10     7576.86     63.53
    interval_average     10      841.80      7.02
    first                10     3141.21     47.72
    overall_average      10     7107.71     44.59
    last_warehouse       10        8.00      0.00
  specjvm98              10      118.94      0.71
    javac                10       70.03      1.02
    db                   10       33.18      0.57
    jess                 10      137.84      1.47
    jack                 10       95.00      2.18
    compress             10      155.14      0.90
    mtrt                 10      355.87      5.68
    mpegaudio            10      200.54      0.47
  volano25               10    11916.20    134.85
    connections          10      400.00      0.00
    time                 10       67.14      0.76
  --------------------------------------------------------------------------
  Weighted Geomean              1493.24
============================================================================
ref.bimorph2: reference_server
  Benchmark         Samples        Mean     Stdev   %Diff    P   Significant
  jetstream              10       34.89      0.36    0.49 0.282            *
    Copy                 10      445.70      2.06    0.47 0.062            *
    Parse                10      344.20      2.35   -0.17 0.523            *
    Read                 10      113.80      2.78    2.40 0.063            *
    Write                10      387.00      9.46   -0.78 0.367            *
  scimark                10       67.54      1.77    3.22 0.007          Yes
    Sparse               10       43.97      1.24    3.39 0.022            *
    LU                   10       80.37      2.41    4.52 0.001          Yes
    SOR                  10      171.70      6.73    3.17 0.067            *
    FFT                  10       15.07      0.38    2.44 0.025            *
    Monte                10       26.62      0.04   -0.02 0.796            *
  specjbb2000            10    22025.16     69.69    0.21 0.224            *
    First_Warehouse      10    12408.77    122.43    0.27 0.506            *
    Last_Warehouse       10    22025.16     69.69    0.21 0.224            *
  specjbb2005            10     7585.57     44.26    0.11 0.727            *
    peak                 10     8064.37     38.89    0.72 0.023            *
    peak_warehouse       10        3.30      0.67    2.94 0.792            *
    last                 10     7585.57     44.26    0.11 0.727            *
    interval_average     10      842.90      4.95    0.13 0.691            *
    first                10     3164.04     50.24    0.73 0.311            *
    overall_average      10     7131.31     37.48    0.33 0.217            *
    last_warehouse       10        8.00      0.00   -0.00 0.000          Yes
  specjvm98              10      119.92      0.59    0.82 0.004          Yes
    javac                10       70.53      0.86    0.72 0.249            *
    db                   10       34.40      0.44    3.67 0.000          Yes
    jess                 10      138.65      1.71    0.59 0.271            *
    jack                 10       95.18      1.43    0.19 0.831            *
    compress             10      156.74      1.16    1.03 0.003          Yes
    mtrt                 10      354.03      7.06   -0.52 0.529            *
    mpegaudio            10      200.79      0.45    0.12 0.246            *
  volano25               10    11816.50    112.83   -0.84 0.090            *
    connections          10      400.00      0.00    0.00 0.000          Yes
    time                 10       67.71      0.65   -0.84 0.091            *
  --------------------------------------------------------------------------
  Weighted Geomean              1501.30              0.54
============================================================================
jaberwocky%
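
The %Diff column for the top-level scores appears to be the relative change of the mean against the baseline run. The helper below is a sketch under that assumption; percent_diff is a hypothetical name, and the exact formula used by rwcompare is not stated here. Sub-benchmarks that look like timings (for example the jetstream components) seem to use the opposite sign, so this applies only to rows where a higher mean is better.

```cpp
#include <cassert>
#include <cmath>

// Relative change of candidate against baseline, in percent.
// Assumes a "higher is better" score; the formula is an assumption.
inline double percent_diff(double baseline, double candidate) {
  assert(baseline != 0.0);
  return (candidate - baseline) / baseline * 100.0;
}

// True when a recomputed value matches a reported one up to rounding
// of the printed means.
inline bool roughly_equal(double a, double b, double tolerance) {
  return std::fabs(a - b) < tolerance;
}
```

Applied to the weighted geomeans, percent_diff(1493.24, 1501.30) comes out within rounding of the reported 0.54.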

Comments
SUGGESTED FIX Additional fix.
Webrev: http://analemma.sfbay.sun.com/net/prt-archiver.sfbay/data/archived_workspaces/main/c2_baseline/2006/20060925131119.kvn.6419652/workspace/webrevs/webrev-2006.09.25/index.html
This corrects my original putback for 6419652, in which experimental changes were accidentally included. In that experiment I investigated inlining a second method when we have a polymorphic call site with a major receiver. It causes the second method, String::equals() (1% of calls), to be inlined into HashMap::get(), so the compiled code of HashMap::get() becomes big. As a result, HashMap::get() can no longer be inlined.
Solution: Inline and call a second method only for pure bimorphic cases.
25-09-2006
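
The distinction between a pure bimorphic site and a polymorphic site with a major receiver can be sketched as follows. CallProfile and is_pure_bimorphic are hypothetical names, and the representation (recorded per-type counts plus a count for unrecorded types) is an assumption made for illustration, not the layout of the real profile data.

```cpp
#include <cassert>
#include <vector>

// Hypothetical view of a call-site profile.
struct CallProfile {
  std::vector<long> receiver_counts;  // calls per recorded receiver type
  long other_count;                   // calls whose receiver type was not recorded
};

// Pure bimorphic: exactly two receiver types and nothing else was seen.
// Only such sites would get a second inlined or called method.
inline bool is_pure_bimorphic(const CallProfile& p) {
  assert(p.other_count >= 0);
  return p.receiver_counts.size() == 2 && p.other_count == 0;
}
```

A profile such as {{9800, 100}, 100} has a dominant receiver but also unrecorded types, so it is not treated as pure bimorphic in this model and the second method would be skipped.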

SUGGESTED FIX
Webrev: http://analemma.sfbay.sun.com/net/prt-archiver.sfbay/data/archived_workspaces/main/c2_baseline/2006/20060919195748.kvn.6419652/workspace/webrevs/webrev-2006.09.19/index.html
 - Check only the total number of traps per method to allow the transition from monomorphic to bimorphic cases.
 - Avoid scaling counters before the decision is made in call_generator().
 - Remove the duplicated trap count check in ciMethod::call_profile_at_bci().
 - Replace the TypeProfile ratio with the percentage of the major receiver.
 - Add the flag UseOnlyInlinedBimorphic, which prevents generating the call to a second method if it can't be inlined.
20-09-2006
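
Two of the items above are simple enough to model in isolation: the major-receiver percentage and the effect of UseOnlyInlinedBimorphic. The sketch below uses hypothetical function names, and the threshold passed to has_major_receiver is a caller-supplied assumption rather than a value taken from this report.

```cpp
#include <cassert>

// Share of calls at a site that went to the most frequent receiver, in percent.
inline double major_receiver_percent(long major_count, long site_count) {
  assert(site_count > 0 && major_count >= 0 && major_count <= site_count);
  return 100.0 * static_cast<double>(major_count) /
         static_cast<double>(site_count);
}

// True when the most frequent receiver reaches the given percentage.
inline bool has_major_receiver(long major_count, long site_count,
                               double threshold_percent) {
  return major_receiver_percent(major_count, site_count) >= threshold_percent;
}

// Models the flag as described above: with it set, a second method that
// cannot be inlined is dropped instead of being called directly.
inline bool keep_second_method_call(bool second_can_be_inlined,
                                    bool use_only_inlined_bimorphic) {
  return second_can_be_inlined || !use_only_inlined_bimorphic;
}
```

For example, has_major_receiver(99, 100, 90.0) holds for the 99%/1% split described in the later comment, while keep_second_method_call(false, true) is false.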

EVALUATION See description.
28-04-2006