United StatesChange Country, Oracle Worldwide Web Sites Communities I am a... I want to...
Bug ID: JDK-7024475 loop doesn't terminate when compiled
JDK-7024475 : loop doesn't terminate when compiled

Details
Type:
Bug
Submit Date:
2011-03-04
Status:
Closed
Updated Date:
2011-04-25
Project Name:
JDK
Resolved Date:
2011-04-25
Component:
hotspot
OS:
solaris_10
Sub-Component:
compiler
CPU:
x86
Priority:
P3
Resolution:
Fixed
Affected Versions:
hs21
Fixed Versions:
hs21 (b07)

Related Reports
Backport:
Relates:
Relates:

Sub Tasks

Description
The following test periodically hangs when run with -server.  If you run a debug might you might need to increase the time it was before it considers it to be hung.  I'm still trying to narrow which release it appeared it.  6u14 is currently the first release where I've seen it hang.

import java.net.*;
import org.jsuffixarrays.*;

public class test extends Thread {
    public static void main(String[] args) throws Exception {
        int hung = 0;
        for (int i = 0; i < 10000; i++) {
            URLClassLoader apploader = (URLClassLoader)test.class.getClassLoader();
            URLClassLoader ucl = new URLClassLoader(apploader.getURLs(), apploader.getParent());
            Class c = ucl.loadClass("test");
            Thread t = (Thread)c.newInstance();
            t.start();
            for (int s = 0; s < 10; s++) {
                Thread.sleep(500);
                if (!t.isAlive()) {
                    break;
                }
            }
            if (t.isAlive()) {
                hung++;
            }
            System.out.println("ok " + (i + 1 - hung) + " hung " + hung);
        }
    }
    public void run() {
        DivSufSortTest t = new DivSufSortTest();
        t.setupForConstraints();
        t.invariantsOnRandomLargeAlphabet();
        t.sameResultWithArraySlice();
        t.invariantsOnRandomSmallAlphabet();
    }
}
Hi Tom.

Apologies for not providing a broader context, it follows.

> Is it hanging, as in making no progress, or not terminating?  

It is hanging as in: it consumes 100% of a single core, never terminates (when executed without debugging), in the debugging mode it does not terminate until a JVM breakpoint is set inside the loop below, then it breaks out to the debugger and you can step-by-step execute the code or re-run it and, magically, it will terminate.

> Does it terminate if given a lot more time?  It's not GC bound?  By debugger do you mean gdb or a Java debugger?

No, it never terminates... at least not in a reasonable time. Normally this loop ends in 2-3 seconds and when hung it does not terminate after 30 minutes or so (the longest we waited to kill it on the build server).

To me It looks like it JITs into something that causes an endless loop. I also checked with jrockit and J9 and these never hung, so it's probably hotspot.

Talking about the debugger, that was a Java debugger, I didn't inspect JIT dumps of what this method compiles to yet, didn't have the time and wanted to check if it's a known issue first. 

I'll try to reproduce it with JIT code dumps, I'll let you know if I succeed.

Dawid

On Thu, Mar 3, 2011 at 9:37 PM, Tom Rodriguez <###@###.###> wrote:

On Mar 3, 2011, at 4:39 AM, Dawid Weiss wrote:

> Hi. We see an infrequent, but very annoying VM hangups in purely
> computational code, as in here:
>
> http://builds.carrot2.org/build/result/viewBuildResultsFailedTests.action?buildKey=JSA-JSAHEAD-JOB1&buildNumber=8
>
> The loop in question is like this:
>
>            for (c0 = ALPHABET_SIZE - 2, j = m; 0 < j; --c0)
>            {
>                for (c1 = ALPHABET_SIZE - 1; c0 < c1; j = i, --c1)
>                {
>                    i = bucket_B[(c0) * ALPHABET_SIZE + (c1)];
>                    if (1 < (j - i))
>                    {
>                        ssSort(PAb, i, j, buf, bufsize, 2, n, SA[i] == (m - 1));
>                    }
>                }
>            }
>
> Unfortunately I cannot provide an always-halting example, but the bug
> seems to be JVM-related because:
>
> 1) once the VM hangs in debugging mode, breaking out to the debugger
> and stepping through the code terminates normally (so a deopt. I
> assume),

Is it hanging, as in making no progress, or not terminating?  Does it terminate if given a lot more time?  It's not GC bound?  By debugger do you mean gdb or a Java debugger?

tom

>
> 2) the freeze is non-deterministic, while the test is (regardless of
> the 'randomness', it always starts from the same seed),
>
> 3) we could reproduce this occasionally under different VMs and
> different OSs (64-bit linux, 64-bit Windows).
>
> Has there been any bug in the JIT that might be causing this? Thanks
> for pointers if you recognize an evil code pattern above.
>
> Dawid
>
> P.S. The above code is a translation from another person's C snippet,
> so I can't easily explain why the loop is built this way or why c0 is
> passed between the inner and outer loop :).

                                    

Comments
EVALUATION

The seems to appear with the RPO parsing changes 6384206.  I don't think it's the changes themselves but RPO parsing certainly exposed other issues when they went in.
                                     
2011-03-07
EVALUATION

This appears to be a long standing issue with IdealLoopTree::policy_do_remove_empty_loop.  That codes assumes the counted loop has a zero trip guard and will incorrectly change value produced by the loop if it does not.  Most normally written loops will get a zero trip guard as a result of peeling.  In the case from the bug report the nested loop with complex bounds and an OSR entry point creates a really unusual loop shape.  The RPO parsing changes allowed us to optimize it slightly better too.  With help from Vladimir I was able to build a test case that failed for normal compiles.

public class Test {

    static int i;
    static int x1;
    static int[] bucket_B;

    static void test(Test test, int i, int c0, int j, int c1) {
        for (;;) {
            if (c1 > c0) {
                if (c0 > 253) {
                    throw new InternalError("c0 = " + c0);
                }
                int index = c0 * 256 + c1;
                if (index == -1) return;
                i = bucket_B[index];
                if (1 < j - i && test != null)
                    x1 = 0;
                j = i;
                c1--;
            } else {
                c0--;
                if (j <= 0)
                    break;
                c1 = 255;
            }
        }
    }

    public static void main(String args[]) {
        Test t = new Test();
        bucket_B = new int[256*256];
        for (int i = 1; i < 256*256; i++) {
            bucket_B[i] = 1;
        }
        for (int n = 0; n < 100000; n++) {
            test(t, 2, 85, 1, 134);
        }
    }
}

The fix is most likely to peel the loop before removing it to ensure it has a guard though in the common case this introduces a little redundancy.
                                     
2011-03-24
EVALUATION

Note that this test case fails with every 1.6.0 release.
                                     
2011-03-24
EVALUATION

http://hg.openjdk.java.net/jdk7/hotspot-comp/hotspot/rev/1927db75dd85
                                     
2011-03-27
EVALUATION

7024475: loop doesn't terminate when compiled
Reviewed-by: kvn

The code which replaces an empty loop with the final computed value
only works correctly if the the loop is guaranteed to execute once
since it replaces the conditionally executed loop body with a value.
Loop transformations like peeling generally result in zero trip guards
being generated but they aren't required to be there.  The fix is to
make sure we have a guard by peeling the loop first.  This results in
a small redundant test in some cases.  I added a check for obvious
ones, which occurred in scimark SOR.  Main and post loops are
constructed with zero trip guards so I assume one exists in those
cases.  Tested with new test case, original failing program,
refworkload and full CTW.

I also improved a few debugging features.  Node::find could get into
an infinite loop walking debug_orig, so it should be check for a
cycle.  I made some other efficiency improvements before I realized
that the problem was a cycle.  I went ahead and kept those changes.
IdealGraphPrinter should print out a few more debug fields and should
dump the block structure if it exists.  I changed the printing after
beautify loops so it only happens once instead of after each recursive
call.  I added an OSROnlyBCI flag to control which OSR bcis can be
compiled. I changed the assert in loop verification to allow verifying
when the igvn worklist isn't empty.
                                     
2011-03-28



Hardware and Software, Engineered to Work Together