United StatesChange Country, Oracle Worldwide Web Sites Communities I am a... I want to...
JDK-4834191 : C2 fails to compile MD2 implementation

Details
Type:
Bug
Submit Date:
2003-03-18
Status:
Resolved
Updated Date:
2004-02-10
Project Name:
JDK
Resolved Date:
2004-02-10
Component:
hotspot
OS:
generic
Sub-Component:
compiler
CPU:
generic
Priority:
P3
Resolution:
Fixed
Affected Versions:
5.0
Fixed Versions:
5.0 (b38)

Related Reports
Relates:

Sub Tasks

Description
While performance testing the new code for 4834179, I noticed that C2 seems to bail out to the interpreter and performance is 10+ x worse than C1. Slightly simplified code fragment in comments section. When run with PrintOpto, C2 says "Warning: Bail out for infinite loop".

Happens on SPARC and Windows/x86, so I assume it is a generic problem.

                                    

Comments
SUGGESTED FIX

Webrev:                 http://analemma.sfbay.sun.com/net/prt-archiver.sfbay/export2/archived_workspaces/main/c2_baseline/2004/20040204062429.rasbold.c2_baseline2/workspace/webrevs/webrev-2004.02.04/index.html


###@###.### 2004-02-04
                                     
2004-02-04
EVALUATION

###@###.### 2003-03-18

Setup Test4834191.java in directory Bugs/4834191 and reproduced compile bailout.
Turning off loop unrolling allows compilation to complete (-XX:LoopUnrollLimit=0)

Note: This command line does not result in a bailout with jdk1.4.1
$JAVA_HOME/bin/java_g -server -Xbatch -XX:+PrintCompilation -XX:+PrintOptoBailouts Test4834191
----- -----

We are failing to compile implCompress method.  C2 bails out in
final_graph_reshaping() because the IfFalse branch of an IfNode has
been removed from the graph.

Several things are conspiring to happen here.

First, the first call to split-if makes enough progress such that we
do not attempt iteration spliting on the first round of
PhaseIdealLoop. This allows igvn to raise the bottom type on the loop
iteration phi node to int:0-8.

Second, when we do get to iteration splitting, the first loop in
implCompress is not maximally unrolled because our heuristic indicates
that the resulting code would be too large. (This is a bad choice by
our heuristic, because upon unrolling the loop's "body_size"
decreases.)

Third, we add pre and post loops for the first Java loop, clouding the
loops bounds of the "main" loop. Then the main loop is unrolled 16x
(four doublings), which is allowable because the body_size decreases
enough to fit it in our 4*LoopUnrollLimit restriction.

Fourth, because of the int range propagation for the phi node in igvn,
C2 is smart enough to figure out that iterations 9-16 are dead and can
be eliminated. But C2 isn't able to remove the If test guarding those
iterations, it can only remove the IfFalse arm that is dead.

-----

If we retain the current loop optimization structure, there are three
thing that come to mind to improve the situation.

1. Adjust the maximally unroll heuristic to account for an expected
reduction of nodes upon unrolling. That is, make the maximally unroll
limit something like 6*LoopUnrollLimit instead of 4*.

2. Annotate the loop head with the trip count discovered by
policy_maximally_unroll(). This trip count would be propagated when
the node is cloned, and should be adjusted as unrolling proceeds. We
would stop unrolling when the count gets to one.

3. Instead of bailing out at the sight of a partially used IfNode,
Mike P suggests the modification of the graph with a never taken node
to be handled properly by Code_Gen().

###@###.### 2003-04-02

Addendum: the test DOES fail with 1.4.1; one just needs to compile the program
with a javac from 1.4.1

###@###.### 2003-04-02

From the putback message:

When a counted loop has a known trip count, C2 may decline to
maxmimally unroll it due to heuristics that involve node and iteration
counts.  C2 usually then attempts to unroll it 2x at a time, with the
heuristics reconsidered after each doubling.

In odd cases such as the one in this bug, the compiler then may
attempt to double so many times that the unrolled loop encompasses
more iterations than original loop was known to have.  The min-trip
test in generated code protects such an unrolled section so that
normally the worst thing that happens is that the unrolled section can
not be reached. But in this bug's case, C2 figures out that some of
the unrolled section is dead, but fails to completely delete the dead
code and bails out of the compile.

The bug is fixed adding a trip_count field to the CountedLoopNode.
That field is set in policy_maximally_unroll(). The trip_count is then
adjusted down as a loop is peeled or unrolled, and is used as a leash
to keep the compiler from over-unrolling.

###@###.### 2004-02-04
                                     
2004-02-04
CONVERTED DATA

BugTraq+ Release Management Values

COMMIT TO FIX:
tiger-beta2

FIXED IN:
tiger-beta2

INTEGRATED IN:
tiger-b38
tiger-beta2


                                     
2004-06-14



Hardware and Software, Engineered to Work Together