JDK-8144484 : assert(no_dead_loop) failed: dead loop detected
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 9
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2015-12-02
  • Updated: 2017-03-13
  • Resolved: 2017-02-03
JDK 10: 10, Fixed
JDK 9: 9 b159, Fixed
Description
Running Nashorn + Octane with a fastdebug build and -Xcomp crashes the VM:

 99	ConI	===  0  [[ 21081  20723  20684  36796  28724  29144  8663  9607  28721  9609  9202  366  6213  14156  18493  7269  495  126  9823  5779  7267  10801  761  14023  10570  17746  7081  885  532  1007  17744  23287  11798  8907  10093  10095  5368  1320  23284  17360  4506  6669  14792  3643  11921  10367  17133  11187  13026  12387  14163  10365  1934  7765  3641  4061  8157  6211  11938  9200  2226  11931  4063  12107  11678  16410  5777  8917  11676  16227  13412  8661  13410  4938  14332  2836  5366  13903  14146  14614  15754  6671  15586  8924  15584  8784  4504  4940  15339  11185  14616  15182  13901  8159  23741  23738  24019  24016  24297  24294  24575  24572  24867  24864  25068  25897  25899  27006  26718  27008  27451  27448  27879  27876  28298  28295  29331  29856  30126  30123  30661  30664  30956  31440  31697  31918  32182  32185  32479  32476  33018  33501  33772  33775  34010  34274  34277  34570  34567  35111  35114  35406  35700  35906  36795  36804  36804 ]]  #int:1
 36795	AddI	=== _  36795  99  [[ 31943  33048  32192  19605  32207  36804  36804  36804  36796  36795  19813  31710  33302  32398  33474  32104  33694  31928  34020  34196  31726  35947  35710  35622  35416  35328  35044  35028  35121  35136  34577  34592  34284  34299  34035  33782  33797  33458  33511  33526  32951  32935  32988  33031  31450  31465  30878  30969  30594  30578  30671  30686  30148  29789  29773  29826  29869  29344  20267  19000  29360  20244  32777  20219  32486  29615  20182  36689  19086  19086  36447  29885  20086  30133  36202  30422  35932  35741  19178  19178  35447  30631  19952  30985  19925  19285  19257  35081  34871  31239  31397  31413  34489  19631 ]]  !jvms: JSType::addExact @ bci:2 1359484306::invokeStatic_I3_I @ bci:14 232635121::reinvoke @ bci:24 537235912::linkToTargetMethod @ bci:6 Script$Recompilation$2499$833928AAZAA$typescript_compiler::L:16271$TypeChecker$sourceIsRelatableToTarget @ bci:1982
 20219	ConvI2L	=== _  36795  [[ 19032  19000 ]]  #long:minint..maxint:www !orig=[18994] !jvms: Script$Recompilation$2499$833928AAZAA$typescript_compiler::L:16271$TypeChecker$sourceIsRelatableToTarget @ bci:985
# To suppress the following error report, specify this argument
# after -XX: or in .hotspotrc:  SuppressErrorAt=/phaseX.cpp:785
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (/oracle/8144212/hotspot/src/share/vm/opto/phaseX.cpp:785), pid=25672, tid=25692
#  assert(no_dead_loop) failed: dead loop detected
#
# JRE version: Java(TM) SE Runtime Environment (9.0) (build 1.9.0-internal-fastdebug-tohartma_2015_11_30_14_45-b00)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (1.9.0-internal-fastdebug-tohartma_2015_11_30_14_45-b00, compiled mode, tiered, compressed oops, g1 gc, linux-amd64)
# Core dump will be written. Default location: Core dumps may be processed with "/usr/share/apport/apport %p %s %c %P" (or dumping to /home/tohartma/programs/octane/core.25672)
#
# An error report file with more information is saved as:
# /home/tohartma/programs/octane/hs_err_pid25672.log
#
# Compiler replay data is saved as:
# /home/tohartma/programs/octane/replay_pid25672.log

Stack: [0x00007f57982fe000,0x00007f57983ff000],  sp=0x00007f57983f8ba0,  free space=1002k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x127fda2]  VMError::report_and_die(int, char const*, char const*, __va_list_tag*, Thread*, unsigned char*, void*, void*, char const*, int, unsigned long)+0x182
V  [libjvm.so+0x1280aea]  VMError::report_and_die(Thread*, char const*, int, char const*, char const*, __va_list_tag*)+0x4a
V  [libjvm.so+0x8b7ca4]  report_vm_error(char const*, int, char const*, char const*, ...)+0xd4
V  [libjvm.so+0x103e23e]  PhaseGVN::dead_loop_check(Node*)+0x26e
V  [libjvm.so+0x1044e5f]  PhaseIterGVN::transform_old(Node*)+0x8f
V  [libjvm.so+0x1040504]  PhaseIterGVN::optimize()+0x84
V  [libjvm.so+0x832764]  Compile::inline_incrementally(PhaseIterGVN&)+0x3f4
V  [libjvm.so+0x8361e1]  Compile::Optimize()+0x401
V  [libjvm.so+0x838066]  Compile::Compile(ciEnv*, C2Compiler*, ciMethod*, int, bool, bool, bool, DirectiveSet*)+0x1406
V  [libjvm.so+0x6d5a83]  C2Compiler::compile_method(ciEnv*, ciMethod*, int, DirectiveSet*)+0x113

To reproduce, run Nashorn with the Octane benchmark: "jjs -J-Xcomp run.js".
run.js can be modified to execute only the "typescript" benchmarks (see the attached version of run.js).
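
The node dump above shows node 36795 (AddI) listing itself as an input, which appears to be the kind of self-reference the assert reports. As a rough illustration only, the following sketch models data inputs as a plain map from node index to input indices and walks them to see whether a node can reach itself. DataInputs, graph_from_dump and reaches_itself are hypothetical names for this toy model; this is not the code of PhaseGVN::dead_loop_check(), and only the three nodes from the dump are modelled.

```cpp
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Toy model (assumption): node index -> indices of its data inputs.
// Control inputs are deliberately left out.
using DataInputs = std::unordered_map<int, std::vector<int>>;

// The three nodes from the dump, reduced to their data inputs.
DataInputs graph_from_dump() {
    return {
        {99,    {}},            // ConI #int:1, no data inputs
        {36795, {36795, 99}},   // AddI === _ 36795 99
        {20219, {36795}},       // ConvI2L === _ 36795
    };
}

// Returns true if `start` can reach itself by following data inputs.
bool reaches_itself(const DataInputs& g, int start) {
    auto first = g.find(start);
    if (first == g.end()) return false;

    std::vector<int> stack(first->second.begin(), first->second.end());
    std::unordered_set<int> seen;
    while (!stack.empty()) {
        int n = stack.back();
        stack.pop_back();
        if (n == start) return true;           // back at the start node
        if (!seen.insert(n).second) continue;  // already visited
        auto it = g.find(n);
        if (it == g.end()) continue;           // node not modelled
        for (int in : it->second) stack.push_back(in);
    }
    return false;
}
```

In this model, reaches_itself(graph_from_dump(), 36795) holds because the AddI is its own input, while the ConvI2L (20219) only consumes the AddI and does not reach itself.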
Comments
<webrev.02> - http://cr.openjdk.java.net/~rraghavan/8144484/webrev.02/ Confirmed no issues with testing (8144484 test, jprt -testset hotspot, RBT Pre-integration testing)
03-02-2017

8144484 fix proposal in PhiNode::Ideal() [src/share/vm/opto/cfgnode.cpp]
----------------------
// Return a node which is more "ideal" than the current node. Must preserve
// the CFG, but we can still strip out dead paths.
Node *PhiNode::Ideal(PhaseGVN *phase, bool can_reshape) {
  ...........
  ...........
  // Split phis through memory merges, so that the memory merges will go away.
  // Piggy-back this transformation on the search for a unique input....
  // It will be as if the merged memory is the unique value of the phi.
  // (Do not attempt this optimization unless parsing is complete.
  // It would make the parser's memory-merge logic sick.)
  // (MergeMemNode is not dead_loop_safe - need to check for dead loop.)
  if (progress == NULL && can_reshape && type() == Type::MEMORY) {
    ...........
    ...........
      // We know that at least one MergeMem->base_memory() == this
      // (saw_self == true). If all other inputs also references this phi
      // (directly or through data nodes) - it is dead loop.
      bool saw_safe_input = false;
      for (uint j = 1; j < req(); ++j) {
        Node *n = in(j);
        if (n->is_MergeMem() && n->as_MergeMem()->base_memory() == this) {
          continue; // skip known cases
        }
+       // TOP inputs should not be counted as safe inputs because if the
+       // Phi references itself through all other inputs then splitting the
+       // Phi through memory merges would create dead loop at later stage..
+       if (n == top) {
+         continue;
+       }
        if (!is_unsafe_data_reference(n)) {
          saw_safe_input = true; // found safe input
          break;
        }
      }
      if (!saw_safe_input)
        return top; // all inputs reference back to this phi - dead loop

      // Phi(...MergeMem(m0, m1:AT1, m2:AT2)...) into
      // MergeMem(Phi(...m0...), Phi:AT1(...m1...), Phi:AT2(...m2...))
      ...................
----------------------

Notes:

1. Confirmed that the 8144484 replay file is compatible and that the crash reproduces with the documented steps, using the original test build and the promoted builds from jdk9b147 through jdk9b154. The issue is also reproducible with a local test build from before the recent 8173195 commit. After the 8173195 commit, the assert(no_dead_loop) failure no longer occurs for the replay test. The actual 8144484 root cause does not appear to be fixed; it is only hidden because the graph is shaped differently after the 8173195 changes.

2. Regarding the dependency on the -Xmx73723M option to reproduce the crash: the Xmx setting affects UseCompressedOops. There is no crash with -Xmx32736M, but there is a failure with -Xmx32737M, so in all cases the crash happens with -XX:-UseCompressedOops. Depending on the test host, an explicit -XX:-UseSHA256Intrinsics may also be required to reproduce the issue.

3. The issue reproduces only on Windows-x64 machines; I could not yet get the failure on a Linux x64 machine. Comparing the PrintFlagsFinal output and making the settings identical did not help. The recent 25-Jan nightly failures were also reported on Windows-x64. I plan to figure out why later. For the 8144484 test case, there is no other crash if the failing assert(no_dead_loop) is commented out, so there is no proof yet that product builds are affected.

4. The crash occurs during the first call of igvn.optimize() in Compile::Optimize() (during PHASE_ITER_GVN1, just after PHASE_AFTER_PARSING). dead_loop_check() fails in PhaseIterGVN::transform_old() for a new MergeMemNode that PhiNode::Ideal() creates when called from PhaseIterGVN::transform_old() itself. The failure is due to a loop in which a data node references itself, directly or through another data node, excluding control nodes.

5. Extract of the comments in PhiNode::Ideal() at the root cause location:

  ...........
  // Split phis through memory merges, so that the memory merges will go away.
  // Piggy-back this transformation on the search for a unique input....
  // It will be as if the merged memory is the unique value of the phi.
  // (Do not attempt this optimization unless parsing is complete.
  // It would make the parser's memory-merge logic sick.)
  // (MergeMemNode is not dead_loop_safe - need to check for dead loop.)
  if (progress == NULL && can_reshape && type() == Type::MEMORY) {
  ................

The root cause appears to be that PhiNode::Ideal() does not perform strong enough dead loop checks before creating new MergeMemNodes, which are not dead_loop_safe. The following is the dump of the related input graph nodes in PhiNode::Ideal() before checking for a possible dead loop (before saw_safe_input is determined):

  2808 Region === 2808 1 2980 2804 [[ 2808 3002 2987 2985 ]] ..........
  2980 IfFalse === 2978 [[ 2808 ]] ..........
  2804 IfTrue === 2803 [[ 2808 ]] ..........
  2985 Phi === 2808 1 6044 3011 [[ 3020 3003 3026 3011 2860 2785 2769 2821 2833 2943 2903 ]] ..........
  6044 MergeMem === _ 1 3011 1 1 1 6045 [[ 6050 6048 2985 ]] ..........
  3011 MergeMem === _ 1 2985 1 1 1 2987 [[ 2987 2985 6045 6044 ]] ..........

During transformations, the Region/Phi nodes can have the TOP ConNode in their input list (node 1, the Con node, above). This possibility is not handled when checking for possible dead loops while splitting Phi nodes through MergeMem nodes in the PhaseIterGVN::transform_old() > PhiNode::Ideal() calls.

Without the fix, the related graph nodes dumped during the failing case:

  3011 MergeMem === _ 1 6058 1 1 1 2987 [[ 2987 6058 6045 6044 ]] ...
  6058 MergeMem === _ 1 3011 1 1 1 6059 [[ 2903 2943 2833 2821 2769 2785 2860 3011 3026 3003 3020 ]] ...

The dead loop causing the crash: [6058] > [3011] > [6058] > [3011]

6. The fix proposed above in PhiNode::Ideal() ignores TOP inputs before checking is_unsafe_data_reference(n), so that saw_safe_input is set to the correct value. TOP inputs must be ignored because, if there is an actual dead loop and the Phi references itself through all other inputs, splitting the Phi through memory merges would create a dead loop at a later stage: the Phi with the TOP input goes away, exposing the dead loop.

Confirmed the fix with the 8144484 test and JPRT (-testset hotspot). RBT is in progress. Will submit the fix webrev for review.
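
The effect of skipping TOP inputs can be illustrated on a toy model of the nodes from the dump in note 5. This is only a sketch under stated assumptions: ToyNode, DumpGraph, reaches and saw_safe_input are hypothetical stand-ins, reaches() is a plain reachability walk rather than HotSpot's is_unsafe_data_reference(), the base memory of each MergeMem is assumed from its input position in the dump, and only the edges among nodes 1, 2985, 3011 and 6044 are modelled (2987 and 6045 are left out).

```cpp
#include <unordered_set>
#include <vector>

// Toy stand-in for a C2 node (assumption); not HotSpot's Node class.
struct ToyNode {
    int idx;
    bool is_top = false;
    bool is_merge_mem = false;
    const ToyNode* base_memory = nullptr;     // only meaningful for MergeMem
    std::vector<const ToyNode*> data_inputs;  // data inputs only, no control
    explicit ToyNode(int i) : idx(i) {}
};

// True if `target` is reachable from `n` through data inputs.
// Simplified stand-in for is_unsafe_data_reference().
bool reaches(const ToyNode* n, const ToyNode* target,
             std::unordered_set<const ToyNode*>& seen) {
    if (n == nullptr || !seen.insert(n).second) return false;
    if (n == target) return true;
    for (const ToyNode* in : n->data_inputs) {
        if (reaches(in, target, seen)) return true;
    }
    return false;
}

// Mirrors the shape of the loop in the proposed fix; `skip_top` toggles
// the added "if (n == top) continue;" check.
bool saw_safe_input(const ToyNode& phi, bool skip_top) {
    for (const ToyNode* n : phi.data_inputs) {
        if (n->is_merge_mem && n->base_memory == &phi) continue;  // known case
        if (skip_top && n->is_top) continue;                      // the fix
        std::unordered_set<const ToyNode*> seen;
        if (!reaches(n, &phi, seen)) return true;                 // safe input
    }
    return false;
}

// Nodes 1 (TOP), 2985 (Phi), 3011 and 6044 (MergeMem) from the dump.
struct DumpGraph {
    ToyNode top{1}, phi{2985}, mm3011{3011}, mm6044{6044};
    DumpGraph() {
        top.is_top = true;
        mm3011.is_merge_mem = true;
        mm3011.base_memory = &phi;        // 3011 MergeMem === _ 1 2985 ...
        mm3011.data_inputs = {&top, &phi};
        mm6044.is_merge_mem = true;
        mm6044.base_memory = &mm3011;     // 6044 MergeMem === _ 1 3011 ...
        mm6044.data_inputs = {&top, &mm3011};
        phi.data_inputs = {&top, &mm6044, &mm3011};  // Phi === 2808 1 6044 3011
    }
    DumpGraph(const DumpGraph&) = delete;  // nodes hold pointers to siblings
};
```

With a DumpGraph g, saw_safe_input(g.phi, false) is true because TOP is counted as a safe input, while saw_safe_input(g.phi, true) is false, which corresponds to the path where the patched PhiNode::Ideal() returns top instead of splitting the Phi.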
03-02-2017

RFR thread - http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2017-January/025513.html
<webrev.01> - http://cr.openjdk.java.net/~rraghavan/8144484/webrev.01/
31-01-2017

[~rraghavan], could you please have a look?
14-12-2016

ILW = Assert in loop optimizations due to dead loop, very rare but easy to reproduce, no workaround = MMH = P3
14-12-2016

A similar crash appeared in the hotspot nightly:
V [jvm.dll+0x9e7f50] os::platform_print_native_stack+0x100;; ?platform_print_native_stack@os@@SA_NPEAVoutputStream@@PEBXPEADH@Z+0x100
V [jvm.dll+0xb9eb8c] VMError::report+0xb0c;; ?report@VMError@@CAXPEAVoutputStream@@_N@Z+0xb0c
V [jvm.dll+0xb9fabb] VMError::report_and_die+0x45b;; ?report_and_die@VMError@@SAXHPEBD0PEADPEAVThread@@PEAEPEAX40H_K@Z+0x45b
V [jvm.dll+0xba00ed] VMError::report_and_die+0x5d;; ?report_and_die@VMError@@SAXPEAVThread@@PEBDH11PEAD@Z+0x5d
V [jvm.dll+0x4e98f8] report_vm_error+0x78;; ?report_vm_error@@YAXPEBDH00ZZ+0x78
V [jvm.dll+0xa2dc0f] PhaseGVN::dead_loop_check+0x12f;; ?dead_loop_check@PhaseGVN@@QEAAXPEAVNode@@@Z+0x12f
V [jvm.dll+0xa31ceb] PhaseIterGVN::transform_old+0x2db;; ?transform_old@PhaseIterGVN@@EEAAPEAVNode@@PEAV2@@Z+0x2db
V [jvm.dll+0xa2f863] PhaseIterGVN::optimize+0x213;; ?optimize@PhaseIterGVN@@QEAAXXZ+0x213
V [jvm.dll+0x48dda0] Compile::Optimize+0x170;; ?Optimize@Compile@@AEAAXXZ+0x170
V [jvm.dll+0x48bd2c] Compile::Compile+0xd8c;; ??0Compile@@QEAA@PEAVciEnv@@PEAVC2Compiler@@PEAVciMethod@@H_N33PEAVDirectiveSet@@@Z+0xd8c
V [jvm.dll+0x3a8d40] C2Compiler::compile_method+0x130;; ?compile_method@C2Compiler@@UEAAXPEAVciEnv@@PEAVciMethod@@HPEAVDirectiveSet@@@Z+0x130
V [jvm.dll+0x4a30a7] CompileBroker::invoke_compiler_on_method+0x617;; ?invoke_compiler_on_method@CompileBroker@@CAXPEAVCompileTask@@@Z+0x617
V [jvm.dll+0x4a1bc2] CompileBroker::compiler_thread_loop+0x292;; ?compiler_thread_loop@CompileBroker@@SAXXZ+0x292
V [jvm.dll+0xb53261] JavaThread::thread_main_inner+0x211;; ?thread_main_inner@JavaThread@@QEAAXXZ+0x211
V [jvm.dll+0xb51f3a] JavaThread::run+0x25a;; ?run@JavaThread@@UEAAXXZ+0x25a
V [jvm.dll+0x9e6e4a] thread_native_entry+0x11a;; ?thread_native_entry@@YAIPEAVThread@@@Z+0x11a
13-12-2016

I did lots of runs with the latest hs-comp build and Nashorn+Octane. This does not reproduce anymore.
24-03-2016