JDK-8295867 : TestVerifyGraphEdges.java fails with exit code -1073741571 when using AlwaysIncrementalInline
Type:Bug
Component:hotspot
Sub-Component:compiler
Affected Version:20
Priority:P3
Status:Resolved
Resolution:Fixed
Submitted:2022-10-25
Updated:2022-12-14
Resolved:2022-11-11
Extra flags:
-XX:-TieredCompilation -XX:+AlwaysIncrementalInline
All the testing system spits out is this:
Failed. Unexpected exit from test [exit code: -1073741571]
A pull request was submitted for review.
URL: https://git.openjdk.org/jdk/pull/11065
Date: 2022-11-09 18:14:43 +0000
09-11-2022
Testing more, I think CDS is involved tangentially: after enhanced constants resolution with JDK-8293979, we are able to inline a bit more deeply, which is enough to go overboard with smaller compiler stacks.
09-11-2022
I am testing a fix.
09-11-2022
`java.lang.invoke.LambdaForm$Kind::<clinit>` is a really long, linear enum class initializer:
https://github.com/openjdk/jdk/blame/master/src/java.base/share/classes/java/lang/invoke/LambdaForm.java#L250
In addition, we inline all `<init>` methods for escape analysis (EA) when running with -Xcomp:
https://github.com/openjdk/jdk/blame/master/src/hotspot/share/opto/bytecodeInfo.cpp#L410
We even hit the inlined node limit (18000) during sequential inlining:
<inline_fail reason='NodeCountInliningCutoff'/>
I looked at the stack frame size of `verify_edges()`: it is 10 words including PC and SP (640 bytes in a 64-bit VM). With a recursion depth of 5000, that is 3.2 MB.
Anyway, we need to convert `verify_edges()` to a non-recursive method to avoid this deep recursion. I don't see any other solution here.
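As a rough illustration of what replacing the recursion with an explicit worklist means, here is a minimal sketch in Python rather than HotSpot C++. The `Node` class and the `visit_recursive`, `visit_iterative` and `linear_chain` helpers are hypothetical stand-ins, not the code in `Node::verify_edges()` or in the pull request, and the sketch only walks input edges without doing any actual edge checks.

```python
class Node:
    """Toy stand-in for an IR node: just a name and its input edges (hypothetical)."""
    def __init__(self, name):
        self.name = name
        self.inputs = []


def visit_recursive(node, visited):
    """Recursive walk: uses one call frame per node on the current path."""
    if node in visited:
        return
    visited.add(node)
    for inp in node.inputs:
        visit_recursive(inp, visited)


def visit_iterative(root):
    """Same reachability walk with an explicit worklist; call depth stays constant."""
    visited = set()
    worklist = [root]
    while worklist:
        node = worklist.pop()
        if node in visited:
            continue
        visited.add(node)
        worklist.extend(node.inputs)
    return visited


def linear_chain(length):
    """Build a chain of `length` nodes, each taking the previous one as input."""
    node = Node("n0")
    for i in range(1, length):
        nxt = Node(f"n{i}")
        nxt.inputs.append(node)
        node = nxt
    return node


# 6306 is the depth reported elsewhere in this thread; the worklist version
# handles it without deep call nesting.
root = linear_chain(6306)
print(len(visit_iterative(root)))  # 6306
```

The worklist lives on the heap, so the depth of the graph no longer determines how much thread stack the walk needs.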
09-11-2022
I reproduced the failure on a 64-bit VM with -XX:CompilerThreadStackSize=256
08-11-2022
The test passes when the compiler thread stack size is set to the same value as on 64-bit, -XX:CompilerThreadStackSize=1024. The maximum stack depth in the output of Aleksey's patch was:
Depth is too high: 6306
Aleksey is right: it is a stack overflow, because on 32-bit we have smaller stacks.
In general, it also proves that there is Java code which requires a big stack for recursive calls during compilation.
We need to avoid recursion in C2.
08-11-2022
Re-adjusting priority (again) since it adds noise to GHA testing:
ILW = stack overflow in debug-only code, affects tier1 (linux-x86) in GHA CI testing; one test, on multiple platforms; no workaround = MMH = P3
08-11-2022
This can be reproduced reliably with:
```
$ build/linux-x86-server-fastdebug/images/jdk/bin/java -Xshare:on -Xcomp -XX:+VerifyGraphEdges -XX:+PrintCompilation -XX:-TieredCompilation -XX:CICompilerCount=1
...
7809 1684 !b java.lang.invoke.DirectMethodHandle::makePreparedLambdaForm (798 bytes)
7812 1684 ! java.lang.invoke.DirectMethodHandle::makePreparedLambdaForm (798 bytes) made not entrant
7813 1685 b java.lang.invoke.LambdaForm$Kind::<clinit> (1261 bytes)
<end of compilation log>
...
Segmentation fault (core dumped)
```
I did the depth instrumentation in `Node::verify_edges` before; see my first comment here. It matches the compilation log: `java.lang.invoke.LambdaForm$Kind::<clinit>` seems to cause the stack overflow.
08-11-2022
[~manc] Let me have a look, the potential fix should not be hard (simulating recursion with a stack) but I would like to check first that the large size of the method under compilation is justifiable.
08-11-2022
This test keeps failing in GitHub pre-submit tests. I saw it in https://github.com/openjdk/jdk/pull/10974 and https://github.com/openjdk/jdk/pull/11032.
Should we add it to ProblemList.txt if it is hard to come up with a good fix?
08-11-2022
I can't reproduce it locally with a 64-bit VM.
[~shade] Since you can easily reproduce the issue with a 32-bit VM (based on your comment in JDK-8295936), can you tell us which method was being compiled when that happened?
-Xshare:on|off may affect the methods compiled with -Xcomp.
Unless there is indeed some bug in the CDS changes (or a bug in C2, where we had JDK-8284882), I can't imagine why we run out of stack (1 MB on x64).
The maximum number of nodes in a graph that I observed in my local test run, by printing visited.size() in verify_graph_edges(), was:
Compiling java.net.URLStreamHandler::parseURL (1124 bytes): visited='6333', nodes='7015' live='6968'
Note that this is the number of visited nodes, not the maximum stack depth during the recursive call.
Theoretically, we may fill a 1 MB stack if the code is huge and very linear, because we have MaxNodeLimit=80000.
My first suggestion is to add the -XX:+PrintCompilation -XX:-TieredCompilation -XX:CICompilerCount=1 flags to the test, so that we know which compilation crashed the VM.
Second, we may instrument Node::verify_edges() to check the recursion depth.
But the real fix would be to convert the recursive Node::verify_edges() method to a non-recursive one.
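To make the difference between visited nodes and recursion depth concrete, here is a small hedged sketch of depth instrumentation, written in Python as a toy model rather than as a patch to `Node::verify_edges()`. The `Node`, `walk`, `chain` and `fan` names and the `DEPTH_LIMIT` threshold are assumptions made for illustration only, and the graph sizes are kept small so that Python's own recursion limit is not hit.

```python
DEPTH_LIMIT = 5000  # hypothetical reporting threshold, not a HotSpot constant


class Node:
    """Toy stand-in for an IR node with input edges only (hypothetical)."""
    def __init__(self):
        self.inputs = []


def walk(node, visited, depth=1):
    """Recursive walk over inputs; returns the deepest recursion level reached."""
    if node in visited:
        return depth - 1
    visited.add(node)
    if depth > DEPTH_LIMIT:
        print(f"Depth is too high: {depth}")
    deepest = depth
    for inp in node.inputs:
        deepest = max(deepest, walk(inp, visited, depth + 1))
    return deepest


def chain(n):
    """Very linear graph: n nodes, each with the previous node as its only input."""
    node = Node()
    for _ in range(n - 1):
        nxt = Node()
        nxt.inputs.append(node)
        node = nxt
    return node


def fan(n):
    """Wide graph: one root with n leaf inputs."""
    root = Node()
    root.inputs = [Node() for _ in range(n)]
    return root


for name, root in (("chain", chain(200)), ("fan", fan(200))):
    visited = set()
    depth = walk(root, visited)
    print(name, "visited:", len(visited), "max depth:", depth)
# chain visited: 200 max depth: 200
# fan visited: 201 max depth: 2
```

Only the linear shape drives the recursion depth up to the node count, which is why a visited count by itself does not say how much stack the recursive walk used.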
28-10-2022
Fair enough :)
26-10-2022
I don't want to be the assignee for this. I think this calls for the compiler expert to take a look. :)
26-10-2022
Re-adjusting priority after latest findings:
ILW = stack overflow in debug-only code; one test, on multiple platforms; no workaround = LHH = P4
26-10-2022
[~shade] Since you have got further in the investigation, would you like to take over this bug? Feel free to re-assign it if so.
26-10-2022
Right, the issue I observed also seems to affect compilation of LambdaForm$Kind::<init>, which also explains the connection to JDK-8295936.
26-10-2022
Saw a similar thing with x86_32, JDK-8295936. There seems to be an unusually deep graph that stack-overflows the verification code.
This patch:
https://cr.openjdk.java.net/~shade/8295867/diag.patch
...run with:
$ build/linux-x86-server-fastdebug/images/jdk/bin/java -Xshare:on -Xcomp -XX:+VerifyGraphEdges
...produces:
Depth is too high: 5232
16826 jmpDir === 5685 [[ 5684 ]] !orig=14419
Depth is too high: 5233
5685 Region === 5685 16828 16827 [[ 5685 16826 8083 ]] !jvms: LambdaForm$Kind::<init> @ bci:8 (line 328) LambdaForm$Kind::<clinit> @ bci:135 (line 258)
Depth is too high: 5234
16828 jmpDir === 5686 [[ 5685 ]] !orig=14419
Depth is too high: 5235
5686 MachProj === 5687 [[ 16828 ]] #0/unmatched !jvms:
Segmentation fault (core dumped)
26-10-2022
ILW = stack overflow; windows only, one test only, with AlwaysIncrementalInline only; no workaround = MMH = P3
26-10-2022
The issue appears after integration of JDK-8293979, even if the connection between these two issues is not obvious.
25-10-2022
Googling the error code gives me this:
0xC00000FD is the error code for stack overflows
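For reference, the decimal exit code in the title and this hex value are the same 32-bit pattern; 0xC00000FD is the Windows NTSTATUS value named STATUS_STACK_OVERFLOW. A minimal sketch of the conversion in Python, where the `to_ntstatus` helper is a hypothetical name used only for this example:

```python
def to_ntstatus(exit_code):
    """Reinterpret a signed 32-bit exit code as an unsigned hex string."""
    return f"0x{exit_code & 0xFFFFFFFF:08X}"


print(to_ntstatus(-1073741571))  # 0xC00000FD
```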