JDK-8157521 : VM crashes on Windows instead of throwing StackOverflowError
  • Type: Bug
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 9
  • Priority: P3
  • Status: Resolved
  • Resolution: Duplicate
  • OS: windows
  • CPU: x86_64
  • Submitted: 2016-05-22
  • Updated: 2016-06-17
  • Resolved: 2016-06-17
JDK 9: Resolved
Related Reports
Duplicate :  
Relates :  
Description
A simple recursive program may crash the JVM with EXCEPTION_STACK_OVERFLOW.

public class RecursiveCall {
    static int depth;

    public static void main(String[] args) {
        try {
            recursive();
        } catch (StackOverflowError e) {
            System.out.println(depth);
        }
    }

    static void recursive() {
        depth++;
        recursive();
    }
}

I can always reproduce it within a few minutes when running the following batch file on Windows x64:

:loop
java -XX:CompileOnly=RecursiveCall RecursiveCall
goto loop

java version "9-ea"
Java(TM) SE Runtime Environment (build 9-ea+119)
Java HotSpot(TM) 64-Bit Server VM (build 9-ea+119, mixed mode)

The bug looks similar to JDK-8152598, but that one was closed as a duplicate of a non-public issue. I'll leave this one open to provide a publicly available description, initial analysis, and proposed solution (see comments), and to track the status of the fix.
Comments
Confirmed duplicate
17-06-2016

(Per previous comment) Assign to Nils for double-checking whether this is a duplicate of JDK-8067946.
15-06-2016

# J 2 C2 RecursiveCall.recursive()V (12 bytes) @ 0x0000021360400160 [0x0000021360400160+0x0000000000000000]

The crash happens at the stack banging instruction:

89 84 24 00 a0 ff ff    mov DWORD PTR [rsp-0x6000],eax

A similar crash in compiled code was investigated recently (JDK-8156538), and it was concluded to be a duplicate of JDK-8067946. [~neliasso] can you double-check?
23-05-2016
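As a cross-check of the instruction bytes quoted above, here is a minimal sketch that decodes the 32-bit displacement and relates it to the shadow zone size. The class name BangOffset and the 4 KiB page size are my assumptions, not taken from the crash log.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class BangOffset {
    // Assumption: 4 KiB pages, the usual page size on Windows x64.
    static final int PAGE_SIZE = 4096;

    // Decodes the little-endian 32-bit displacement that ends the instruction.
    static int displacement(byte[] insn) {
        return ByteBuffer.wrap(insn, insn.length - 4, 4)
                .order(ByteOrder.LITTLE_ENDIAN)
                .getInt();
    }

    public static void main(String[] args) {
        // 89 84 24 <disp32> encodes: mov DWORD PTR [rsp+disp32],eax
        byte[] insn = {(byte) 0x89, (byte) 0x84, 0x24,
                       0x00, (byte) 0xa0, (byte) 0xff, (byte) 0xff};
        int disp = displacement(insn);
        System.out.printf("disp = -0x%x (%d bytes, %d pages)%n",
                -disp, -disp, -disp / PAGE_SIZE);
    }
}
```

Under that page-size assumption this prints `disp = -0x6000 (24576 bytes, 6 pages)`, which is consistent with StackShadowPages=6 on Windows as mentioned in the initial analysis.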

Probably JDK-8067946 is not worded very well, but as currently formulated it does not look like a duplicate. In my case StackYellowPages DO have an effect, but the yellow pages are unguarded prematurely, in response to an EXCEPTION_STACK_OVERFLOW raised during execution of VM code. This bug might be a duplicate of JDK-8152598, but that one is linked to a non-public issue and was closed without any comments or analysis.
23-05-2016

Is it a duplicate of JDK-8067946?
23-05-2016

Though a simple fix is just to increase StackShadowPages, I would also recommend reviewing the stack usage of certain VM functions. The problem is that ALL of the functions in the call trace from my earlier comment allocate a RegisterMap structure on the stack. Each RegisterMap takes 4648 bytes, mainly because of the slots for AVX registers.
23-05-2016
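To put the RegisterMap size in perspective, here is a rough back-of-the-envelope sketch. It assumes 4 KiB pages and that the four SharedRuntime frames from the call trace in the initial analysis each hold one RegisterMap at the same time. The class name ShadowZoneBudget is hypothetical, and real frames also contain other locals that are not counted here.

```java
class ShadowZoneBudget {
    static final int PAGE_SIZE = 4096;          // assumption: 4 KiB pages
    static final int REGISTER_MAP_BYTES = 4648; // size reported in this thread
    static final int FRAMES_WITH_MAP = 4;       // the four functions in the call trace

    // Size of the shadow zone for a given StackShadowPages value.
    static int shadowZoneBytes(int pages) {
        return pages * PAGE_SIZE;
    }

    // Stack consumed by RegisterMap instances alone, if all four are live at once.
    static int registerMapBytes() {
        return FRAMES_WITH_MAP * REGISTER_MAP_BYTES;
    }

    public static void main(String[] args) {
        for (int pages : new int[] {6, 7, 10, 20}) {
            int zone = shadowZoneBytes(pages);
            System.out.printf("StackShadowPages=%d: zone=%d bytes, RegisterMaps=%d bytes, left=%d bytes%n",
                    pages, zone, registerMapBytes(), zone - registerMapBytes());
        }
    }
}
```

Under these assumptions the RegisterMap instances alone take 18592 of the 24576 bytes available with StackShadowPages=6, leaving 5984 bytes for everything else in those frames.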

The problem occurs when the thread stack is almost full, after RecursiveCall.recursive() has been compiled by C2 and the C1 version has been made non-entrant. The first EXCEPTION_STACK_OVERFLOW happens when the VM runtime hits the yellow zone while resolving the non-entrant call site. This is what the call trace looks like:

SharedRuntime::find_callee_info_helper() Line 1105
SharedRuntime::find_callee_method() Line 1240
SharedRuntime::reresolve_call_site() Line 1741
SharedRuntime::handle_wrong_method() Line 1462

The exception handler then disables the yellow zone, which appears to leave enough stack to finish the VM call. No Java-level exception is thrown, because the thread is not in Java. After execution returns to Java, stack banging passes through the disabled yellow zone unnoticed, and the next EXCEPTION_STACK_OVERFLOW happens in the red zone.

Normally the VM runtime should never hit the yellow zone, because it is not supposed to use more than StackShadowPages of stack. However, SharedRuntime::handle_wrong_method() DOES use more than StackShadowPages (which is rather small on Windows).

The suggested solution is to increase StackShadowPages on Windows and/or reduce VM stack usage. Currently StackShadowPages=6 on Windows, but 20 on other platforms. Probably 10 will be good enough. I've verified that the test no longer crashes with -XX:StackShadowPages=7.
22-05-2016
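The sequence of events described above can be illustrated with a toy model. This is only a sketch of the reasoning, not HotSpot code: the class GuardZoneModel, its methods, and the Outcome values are all hypothetical names introduced for illustration.

```java
class GuardZoneModel {
    enum Outcome { OK, STACK_OVERFLOW_ERROR, VM_CRASH }

    // The yellow zone starts out guarded.
    boolean yellowEnabled = true;

    // Models a stack access that lands in the yellow zone.
    Outcome touchYellow(boolean threadInJava) {
        if (!yellowEnabled) {
            return Outcome.OK; // unguarded pages behave like ordinary stack
        }
        yellowEnabled = false; // the exception handler unguards the yellow zone
        // A Java-level error is raised only if the thread was executing Java code.
        return threadInJava ? Outcome.STACK_OVERFLOW_ERROR : Outcome.OK;
    }

    // Models a stack access that lands in the red zone: not recoverable.
    Outcome touchRed() {
        return Outcome.VM_CRASH;
    }

    public static void main(String[] args) {
        // Expected case: the stack bang in Java code hits the guarded yellow zone.
        GuardZoneModel expected = new GuardZoneModel();
        System.out.println("Java hits yellow: " + expected.touchYellow(true));

        // Case from this report: VM code hits the yellow zone first.
        GuardZoneModel buggy = new GuardZoneModel();
        System.out.println("VM hits yellow:   " + buggy.touchYellow(false));
        System.out.println("Java hits yellow: " + buggy.touchYellow(true));
        System.out.println("Java hits red:    " + buggy.touchRed());
    }
}
```

In the second scenario the model never produces STACK_OVERFLOW_ERROR: the yellow zone is consumed silently by the VM call, so the first fault visible from Java code is the one in the red zone.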