JDK-8039999 : Investigate performance implications of castPP removal
  • Type: Enhancement
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 9
  • Priority: P4
  • Status: Closed
  • Resolution: Won't Fix
  • OS: generic
  • CPU: generic
  • Submitted: 2014-04-11
  • Updated: 2016-12-19
  • Resolved: 2015-11-04
Related Reports
Duplicate :  
Duplicate :  
Description
JDK-8034216 eagerly removes all CastPP nodes after the CCP phase. This can have performance implications, since NotNull type information might be lost. See earlier concerns:


Maybe we should investigate whether the following performance statement about jvm98 is still true, and solve the problem differently (in PhiNode::Ideal(), for example):

// I tried to leave the CastPP's in.  This makes the graph more accurate in
// some sense; we get to keep around the knowledge that an oop is not-null
// after some test.  Alas, the CastPP's interfere with GVN (some values are
// the regular oop, some are the CastPP of the oop, all merge at Phi's which
// cannot collapse, etc).  This cost us 10% on SpecJVM, even when I removed
// some of the more trivial cases in the optimizer.  Removing more useless
// Phi's started allowing Loads to illegally float above null checks.  I gave
// up on this approach.  CNC 10/20/2000 
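
The GVN interference described in that comment can be sketched with a toy value-numbering pass. This is an illustration only, not HotSpot code: the Node struct, the Op enum, and the functions example_graph(), strip_casts() and value_number() are all hypothetical names invented for this sketch. It assumes a graph in which the same field is loaded once from the raw oop and once from the CastPP of that oop.

```cpp
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Toy IR (hypothetical, not HotSpot's): one opcode and at most one input.
enum class Op { Parm, CastPP, LoadField };

struct Node {
  Op op;
  int input;  // index of the input node, or -1 if there is none
};

// Nodes are listed in topological order (inputs before users).
std::vector<Node> example_graph() {
  return {
      {Op::Parm, -1},      // 0: oop p
      {Op::LoadField, 0},  // 1: load p.f through the raw oop
      {Op::CastPP, 0},     // 2: p, known not-null after a test
      {Op::LoadField, 2},  // 3: load p.f through the CastPP
  };
}

// Walk back through CastPP nodes to the underlying oop.
int strip_casts(const std::vector<Node>& g, int idx) {
  while (g[idx].op == Op::CastPP) idx = g[idx].input;
  return idx;
}

// Hash-style value numbering: two nodes share a number when they have the
// same opcode and their inputs share a number. With look_through_casts set,
// a CastPP simply takes the number of the oop it casts.
std::vector<int> value_number(const std::vector<Node>& g,
                              bool look_through_casts) {
  std::map<std::pair<Op, int>, int> table;
  std::vector<int> vn(g.size(), -1);
  int next = 0;
  for (std::size_t i = 0; i < g.size(); ++i) {
    if (look_through_casts && g[i].op == Op::CastPP) {
      vn[i] = vn[strip_casts(g, static_cast<int>(i))];
      continue;
    }
    // Nodes without an input get a unique negative key component.
    int in_vn = g[i].input < 0 ? -1 - static_cast<int>(i) : vn[g[i].input];
    auto key = std::make_pair(g[i].op, in_vn);
    auto it = table.find(key);
    if (it == table.end()) it = table.emplace(key, next++).first;
    vn[i] = it->second;
  }
  return vn;
}
```

Calling value_number(example_graph(), false) gives the two loads different numbers, because one input is the oop and the other is its CastPP. With true, they receive the same number, at the cost of no longer distinguishing the not-null value from the raw one.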
Comments
CastPPs are now kept until the end of the optimization passes and only removed before matching. See JDK-8069191.
04-11-2015

On x86:
  JMH-JavadocJdk7          -1.37%
  SPECjvm2008-SciMark-G1   -2.24%
  SPECjvm2008-XML          -1.76%
On sparc several regressions, biggest ones:
  SPECjvm2008-MPEG         -8.18%
  SPECjvm2008-XML          -7.04%
  SPECjvm2008-Serial       -5.26%
  JMH-JavadocStartup       -5.28%
12-01-2015

Leaving all CastPP in required that in Compile::final_graph_reshaping_impl(), the following code be disabled because it removes CastPP nodes and control dependencies can be lost:

  case Op_CastPP:
    if (n->in(1)->is_DecodeN() && Matcher::gen_narrow_oop_implicit_null_checks()) {
      Node* in1 = n->in(1);
      const Type* t = n->bottom_type();
      Node* new_in1 = in1->clone();
      new_in1->as_DecodeN()->set_type(t);

      if (!Matcher::narrow_oop_use_complex_address()) {
        //
        // x86, ARM and friends can handle 2 adds in addressing mode
        // and Matcher can fold a DecodeN node into address by using
        // a narrow oop directly and do implicit NULL check in address:
        //
        // [R12 + narrow_oop_reg<<3 + offset]
        // NullCheck narrow_oop_reg
        //
        // On other platforms (Sparc) we have to keep new DecodeN node and
        // use it to do implicit NULL check in address:
        //
        // decode_not_null narrow_oop_reg, base_reg
        // [base_reg + offset]
        // NullCheck base_reg
        //
        // Pin the new DecodeN node to non-null path on these platform (Sparc)
        // to keep the information to which NULL check the new DecodeN node
        // corresponds to use it as value in implicit_null_check().
        //
        new_in1->set_req(0, n->in(0));
      }

      n->subsume_by(new_in1, this);
      if (in1->outcnt() == 0) {
        in1->disconnect_inputs(NULL, this);
      }
    }
    break;
12-01-2015
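
The two addressing forms named in the code comment above compute the same field address; they differ only in whether the decode is folded into the memory operand or kept as a separate DecodeN result. A minimal arithmetic sketch follows, assuming the heap base is the value held in R12 and the shift is 3, as the comment's example shows. The function names and the constant kShift are hypothetical and chosen for illustration only.

```cpp
#include <cstdint>

// Assumed for illustration: shift of 3, as in "[R12 + narrow_oop_reg<<3 + offset]".
constexpr unsigned kShift = 3;

// Decode a narrow oop that is known to be non-null: base + (narrow << shift).
std::uint64_t decode_not_null(std::uint64_t heap_base, std::uint32_t narrow) {
  return heap_base + (static_cast<std::uint64_t>(narrow) << kShift);
}

// Folded form: the decode happens inside a single address expression,
// so the narrow oop register itself is what the null check refers to.
std::uint64_t field_address_folded(std::uint64_t heap_base,
                                   std::uint32_t narrow,
                                   std::uint32_t offset) {
  return heap_base + (static_cast<std::uint64_t>(narrow) << kShift) + offset;
}

// Two-step form: decode into a base register first, then add the offset.
// Here the decoded base register is what the null check refers to.
std::uint64_t field_address_two_step(std::uint64_t heap_base,
                                     std::uint32_t narrow,
                                     std::uint32_t offset) {
  std::uint64_t base_reg = decode_not_null(heap_base, narrow);
  return base_reg + offset;
}
```

For any heap base, narrow oop and offset, both functions return the same address. What changes between platforms is which value (narrow oop or decoded base) the implicit null check is attached to, which is why the second form pins the new DecodeN to the non-null path.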