Bug ID: JDK-4834191 C2 fails to compile MD2 implementation

Type: Bug
Component: hotspot
Sub-Component: compiler
Affected Version: 5.0

Priority: P3
Status: Resolved
Resolution: Fixed
OS: generic
CPU: generic

Submitted: 2003-03-18
Updated: 2004-02-10
Resolved: 2004-02-10

Other
5.0 b38Fixed

While performance testing the new code for 4834179, I noticed that C2 seems to bail out to the interpreter and performance is 10+ x worse than C1. Slightly simplified code fragment in comments section. When run with PrintOpto, C2 says "Warning: Bail out for infinite loop".

Happens on SPARC and Windows/x86, so I assume it is a generic problem.

CONVERTED DATA BugTraq+ Release Management Values COMMIT TO FIX: tiger-beta2 FIXED IN: tiger-beta2 INTEGRATED IN: tiger-b38 tiger-beta2

14-06-2004

SUGGESTED FIX Webrev: http://analemma.sfbay.sun.com/net/prt-archiver.sfbay/export2/archived_workspaces/main/c2_baseline/2004/20040204062429.rasbold.c2_baseline2/workspace/webrevs/webrev-2004.02.04/index.html ###@###.### 2004-02-04

04-02-2004

EVALUATION ###@###.### 2003-03-18 Setup Test4834191.java in directory Bugs/4834191 and reproduced compile bailout. Turning off loop unrolling allows compilation to complete (-XX:LoopUnrollLimit=0) Note: This command line does not result in a bailout with jdk1.4.1 $JAVA_HOME/bin/java_g -server -Xbatch -XX:+PrintCompilation -XX:+PrintOptoBailouts Test4834191 ----- ----- We are failing to compile implCompress method. C2 bails out in final_graph_reshaping() because the IfFalse branch of an IfNode has been removed from the graph. Several things are conspiring to happen here. First, the first call to split-if makes enough progress such that we do not attempt iteration spliting on the first round of PhaseIdealLoop. This allows igvn to raise the bottom type on the loop iteration phi node to int:0-8. Second, when we do get to iteration splitting, the first loop in implCompress is not maximally unrolled because our heuristic indicates that the resulting code would be too large. (This is a bad choice by our heuristic, because upon unrolling the loop's "body_size" decreases.) Third, we add pre and post loops for the first Java loop, clouding the loops bounds of the "main" loop. Then the main loop is unrolled 16x (four doublings), which is allowable because the body_size decreases enough to fit it in our 4*LoopUnrollLimit restriction. Fourth, because of the int range propagation for the phi node in igvn, C2 is smart enough to figure out that iterations 9-16 are dead and can be eliminated. But C2 isn't able to remove the If test guarding those iterations, it can only remove the IfFalse arm that is dead. ----- If we retain the current loop optimization structure, there are three thing that come to mind to improve the situation. 1. Adjust the maximally unroll heuristic to account for an expected reduction of nodes upon unrolling. That is, make the maximally unroll limit something like 6*LoopUnrollLimit instead of 4*. 2. Annotate the loop head with the trip count discovered by policy_maximally_unroll(). This trip count would be propagated when the node is cloned, and should be adjusted as unrolling proceeds. We would stop unrolling when the count gets to one. 3. Instead of bailing out at the sight of a partially used IfNode, Mike P suggests the modification of the graph with a never taken node to be handled properly by Code_Gen(). ###@###.### 2003-04-02 Addendum: the test DOES fail with 1.4.1; one just needs to compile the program with a javac from 1.4.1 ###@###.### 2003-04-02 From the putback message: When a counted loop has a known trip count, C2 may decline to maxmimally unroll it due to heuristics that involve node and iteration counts. C2 usually then attempts to unroll it 2x at a time, with the heuristics reconsidered after each doubling. In odd cases such as the one in this bug, the compiler then may attempt to double so many times that the unrolled loop encompasses more iterations than original loop was known to have. The min-trip test in generated code protects such an unrolled section so that normally the worst thing that happens is that the unrolled section can not be reached. But in this bug's case, C2 figures out that some of the unrolled section is dead, but fails to completely delete the dead code and bails out of the compile. The bug is fixed adding a trip_count field to the CountedLoopNode. That field is set in policy_maximally_unroll(). The trip_count is then adjusted down as a loop is peeled or unrolled, and is used as a leash to keep the compiler from over-unrolling. ###@###.### 2004-02-04

04-02-2004