JDK-8065965 : 3% performance regression on SPECjvm2008-Crypto-G1 on x64 following 8054478
Type:Bug
Component:hotspot
Sub-Component:compiler
Affected Version:9
Priority:P5
Status:Resolved
Resolution:Cannot Reproduce
Submitted:2014-11-26
Updated:2016-06-17
Resolved:2015-04-14
A run through aurora.se shows that the change for 8054478 introduces a 3% performance regression on SPECjvm2008-Crypto-G1 on x64. A rerun of the same set of tests confirmed it.
Comments
Can't reproduce with current hs-comp for jdk9 or jdk8u. Closing.
14-04-2015
The improvement to the types that CastIINode::Value() provides when _carry_dependency is on causes the following optimization:
in ConvI2LNode::Ideal():
#ifdef _LP64
// Convert ConvI2L(AddI(x, y)) to AddL(ConvI2L(x), ConvI2L(y)) ,
// but only if x and y have subranges that cannot cause 32-bit overflow,
// under the assumption that x+y is in my own subrange this->type().
// This assumption is based on a constraint (i.e., type assertion)
// established in Parse::array_addressing or perhaps elsewhere.
// This constraint has been adjoined to the "natural" type of
// the incoming argument in(0). We know (because of runtime
// checks) - that the result value I2L(x+y) is in the joined range.
// Hence we can restrict the incoming terms (x, y) to values such
// that their sum also lands in that range.
// This optimization is useful only on 64-bit systems, where we hope
// the addition will end up subsumed in an addressing mode.
// It is necessary to do this when optimizing an unrolled array
// copy loop such as x[i++] = y[i++].
// On 32-bit systems, it's better to perform as much 32-bit math as
// possible before the I2L conversion, because 32-bit math is cheaper.
// There's no common reason to "leak" a constant offset through the I2L.
// Addressing arithmetic will not absorb it as part of a 64-bit AddL.
Node* z = in(1);
int op = z->Opcode();
if (op == Op_AddI || op == Op_SubI) {
to be applied more aggressively. On my workstation this causes a statistically significant performance regression, but it does not account for the entire regression.
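To illustrate why the transformation quoted above is only valid when the operands cannot overflow in 32 bits, here is a minimal standalone sketch. It is not HotSpot code: `IntRange`, `add_cannot_overflow`, `conv_of_add` and `add_of_conv` are hypothetical names, and the range check is a simplification of what ConvI2LNode::Ideal() does, since it ignores the constraint carried by this->type().

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a C2 integer type: a closed interval [lo, hi].
struct IntRange {
    int32_t lo;
    int32_t hi;
};

// Simplified safety check (an assumption, not the HotSpot logic): the split
// is safe if the sum of the two ranges, computed in 64 bits, still fits in
// a 32-bit int.
bool add_cannot_overflow(IntRange x, IntRange y) {
    assert(x.lo <= x.hi && y.lo <= y.hi);
    int64_t lo = static_cast<int64_t>(x.lo) + y.lo;
    int64_t hi = static_cast<int64_t>(x.hi) + y.hi;
    return lo >= INT32_MIN && hi <= INT32_MAX;
}

// Models ConvI2L(AddI(x, y)): a wrapping 32-bit add, then sign extension.
int64_t conv_of_add(int32_t x, int32_t y) {
    // Unsigned arithmetic wraps by definition; the conversion back to
    // int32_t is modular (guaranteed since C++20, and the behavior of
    // mainstream compilers before that).
    uint32_t wrapped = static_cast<uint32_t>(x) + static_cast<uint32_t>(y);
    return static_cast<int64_t>(static_cast<int32_t>(wrapped));
}

// Models AddL(ConvI2L(x), ConvI2L(y)): widen first, then add in 64 bits.
int64_t add_of_conv(int32_t x, int32_t y) {
    return static_cast<int64_t>(x) + static_cast<int64_t>(y);
}
```

When `add_cannot_overflow` holds for the operand ranges, both forms return the same value for every pair of inputs, so the 64-bit add can be used (and possibly folded into an addressing mode). With x = INT32_MAX and y = 1 the two forms differ, because the 32-bit add wraps before the conversion. Narrower input types, such as those a CastII can provide, let a check of this kind succeed more often.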
16-01-2015
ILW=Perf regression, SPECjvm2008-Crypto, none so far=LMH=P5