JDK-8205940 : LoadNode::find_previous_arraycopy fails with "broken allocation" assert
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 11
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • OS: os_x
  • CPU: x86_64
  • Submitted: 2018-06-27
  • Updated: 2020-02-19
  • Resolved: 2018-06-29
JDK 11          JDK 12
11 b20 Fixed    12 Fixed
Description
We get a HotSpot assertion failure when running the HTTP client tests with the new TLS 1.3 implementation in jdk/jdk on Mach5. It looks like it might be related to C2 intrinsics (macOS only).

A Mach5 job from today showing the problem is at [1]. The three failures all have the same cause. In this run, the problem is seen on macOS releases Yosemite and High Sierra.

We cannot reliably reproduce it outside Mach5. To reproduce it (in Mach5), start with the current jdk/jdk forest and apply the patch attached to this report (which enables the use of TLS 1.3) to open. Then submit a Mach5 job with:

mach5 remote-build-and-test -b macosx-x64-debug --test open/test/jdk/java/net/httpclient --email xxx@oracle.com --test-repeat 50

# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (/scratch/mesos/slaves/c4ee7e63-1ded-4e8c-9581-ce26f27e3af4-S221801/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/844482a8-c3dc-4753-b437-e742c427a048/runs/20999157-2932-4ce1-817e-87734d38d1e3/workspace/open/src/hotspot/share/opto/memnode.cpp:521), pid=98920, tid=23555
#  assert(alloc != __null && (!ReduceBulkZeroing || alloc->initialization()->is_complete_with_arraycopy())) failed: broken allocation
#
# JRE version: Java(TM) SE Runtime Environment (11.0) (fastdebug build 11-internal+0-2018-06-27-1550285.michael.x.mcmahon.jdk)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 11-internal+0-2018-06-27-1550285.michael.x.mcmahon.jdk, mixed mode, tiered, compressed oops, g1 gc, bsd-amd64)
# No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
Comments
Thanks, Roland.
28-06-2018

Thanks. So yes, I would say bail out if alloc == NULL.
28-06-2018

I've tried to trace the Phi back to where it's created and it comes from:

#2 0x00007f21253596cf in PhaseIdealLoop::clone_loop_handle_data_uses (this=0x7f20f91a43e0, old=0x7f20dc1644a0, old_new=..., loop=0x7f20dc275f40, outer_loop=0x7f20dc275f40, split_if_set=@0x7f20f91a20a0: 0x0, split_bool_set=@0x7f20f91a21a0: 0x0, split_cex_set=@0x7f20f91a21d0: 0x7f20dc29aac0, worklist=..., new_counter=3346, mode=PhaseIdealLoop::CloneIncludesStripMined) at /oracle/jdk_jdk/open/src/hotspot/share/opto/loopopts.cpp:1738
#3 0x00007f212535b9ff in PhaseIdealLoop::clone_loop (this=0x7f20f91a43e0, loop=0x7f20dc275f40, old_new=..., dd=20, mode=PhaseIdealLoop::CloneIncludesStripMined, side_by_side_idom=0x7f20dc29f8a0) at /oracle/jdk_jdk/open/src/hotspot/share/opto/loopopts.cpp:2104
#4 0x00007f212533a99e in PhaseIdealLoop::create_slow_version_of_loop (this=0x7f20f91a43e0, loop=0x7f20dc275f40, old_new=..., opcode=150, mode=PhaseIdealLoop::CloneIncludesStripMined) at /oracle/jdk_jdk/open/src/hotspot/share/opto/loopUnswitch.cpp:264
#5 0x00007f2125339db3 in PhaseIdealLoop::do_unswitching (this=0x7f20f91a43e0, loop=0x7f20dc275f40, old_new=...) at /oracle/jdk_jdk/open/src/hotspot/share/opto/loopUnswitch.cpp:135
#6 0x00007f21253369af in IdealLoopTree::iteration_split (this=0x7f20dc275f40, phase=0x7f20f91a43e0, old_new=...) at /oracle/jdk_jdk/open/src/hotspot/share/opto/loopTransform.cpp:3185
#7 0x00007f21253369ea in IdealLoopTree::iteration_split (this=0x7f20dc276140, phase=0x7f20f91a43e0, old_new=...) at /oracle/jdk_jdk/open/src/hotspot/share/opto/loopTransform.cpp:3193
#8 0x00007f21253369ea in IdealLoopTree::iteration_split (this=0x7f20dc276340, phase=0x7f20f91a43e0, old_new=...) at /oracle/jdk_jdk/open/src/hotspot/share/opto/loopTransform.cpp:3193
#9 0x00007f212534978e in PhaseIdealLoop::build_and_optimize (this=0x7f20f91a43e0, do_split_ifs=false, skip_loop_opts=false, last_round=false) at /oracle/jdk_jdk/open/src/hotspot/share/opto/loopnode.cpp:2931
#10 0x00007f2124ceea76 in PhaseIdealLoop::PhaseIdealLoop (this=0x7f20f91a43e0, igvn=..., do_split_ifs=false, skip_loop_opts=false, last_round=false) at /oracle/jdk_jdk/open/src/hotspot/share/opto/loopnode.hpp:942

I think what happens is that there is an AllocateArray + ArrayCopy combination in a loop exit that was emitted for the clone intrinsic. Then the loop invariant ArrayCopy is moved out of the loop but the AllocateArray is not because it is loop dependent. During loop unswitching (see above), we create a phi node to merge the AllocateArray results from the exit of two different loop versions.
28-06-2018

The assert is likely too strong. I wouldn't test for ld_alloc but return if alloc == NULL. The ArrayCopy node is for a clone. The expectation is that there is an allocation right before the ArrayCopy. What causes a Phi to be between the ArrayCopy and the allocations?
28-06-2018
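
The difference between asserting and bailing out can be sketched in isolation. This is only an illustrative model and not the HotSpot code or the committed fix: `Allocation`, `Outcome`, `strict_check` and `relaxed_check` are hypothetical names, and the condition is copied from the shape of the failing assert in the crash log.

```cpp
#include <cassert>

// Hypothetical stand-in for the allocation found for the ArrayCopy destination.
struct Allocation {
    bool complete_with_arraycopy;
};

enum class Outcome { UseArrayCopy, BailOut, AssertFails };

// Same shape as the condition in the failing assert: a null alloc is a failure.
Outcome strict_check(const Allocation* alloc, bool reduce_bulk_zeroing) {
    bool ok = alloc != nullptr &&
              (!reduce_bulk_zeroing || alloc->complete_with_arraycopy);
    return ok ? Outcome::UseArrayCopy : Outcome::AssertFails;
}

// Relaxed variant suggested in the comments: give up quietly when no
// allocation can be found, and keep the remaining condition as before.
Outcome relaxed_check(const Allocation* alloc, bool reduce_bulk_zeroing) {
    if (alloc == nullptr) {
        return Outcome::BailOut;
    }
    return strict_check(alloc, reduce_bulk_zeroing);
}
```

In this model a missing allocation only costs an optimization opportunity in the relaxed variant, whereas the strict variant treats it as a broken graph.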

The LoadNode for which we are calling find_previous_arraycopy(..) is a polling page load:

433 ConL === 0 [[ 434 ]] #long:288
432 ThreadLocal === 0 [[ 434 2187 744 746 748 781 2134 2065 1988 1922 1860 1840 1858 3236 ]] !jvms: HmacCore::engineInit @ bci:137 Mac::init @ bci:13
2474 MemBarCPUOrder === 1414 1 1415 1 1 [[ 1419 1418 ]] !orig=1417 !jvms: SecretKeySpec::getEncoded @ bci:4 HmacCore::engineInit @ bci:32 Mac::chooseProvider @ bci:124 Mac::init @ bci:22
2456 IfFalse === 2455 [[ 3825 ]] #0 !orig=1622 !jvms: HmacCore::engineInit @ bci:88 Mac::chooseProvider @ bci:124 Mac::init @ bci:22
3824 IfFalse === 3817 [[ 3825 ]] #0 !orig=2456,1622 !jvms: HmacCore::engineInit @ bci:88 Mac::chooseProvider @ bci:124 Mac::init @ bci:22
434 AddP === _ 1 432 433 [[ 435 546 1615 1725 2235 3177 ]] !jvms: HmacCore::engineInit @ bci:137 Mac::init @ bci:13
1419 Proj === 2474 [[ 1627 1484 1472 1626 1447 3687 1438 2418 1571 1531 1612 1615 3630 3804 3686 ]] #2 Memory: @BotPTR *+bot, idx=Bot; !orig=[1492] !jvms: SecretKeySpec::getEncoded @ bci:4 HmacCore::engineInit @ bci:32 Mac::chooseProvider @ bci:124 Mac::init @ bci:22
3825 Region === 3825 3824 2456 [[ 3825 2461 3828 3830 3832 1615 ]]
1615 LoadP === 3825 1419 434 [[ 2461 ]] @rawptr:BotPTR, idx=Raw; #rawptr:BotPTR (does not depend only on test) !jvms: HmacCore::engineInit @ bci:137 Mac::chooseProvider @ bci:124 Mac::init @ bci:22
2461 SafePoint === 3825 1 1612 1 1 1615 10 1 1 10 11 27 628 3523 3396 3470 1 629 628 3470 1 1 1407 3828 1 [[ 2457 ]] SafePoint !orig=1611 !jvms: HmacCore::engineInit @ bci:137 Mac::chooseProvider @ bci:124 Mac::init @ bci:22
2457 OuterStripMinedLoopEnd === 2461 23 [[ 2458 3661 ]] P=0,984611, C=292270,000000

There is no allocation and therefore ld_alloc = NULL.

However, we still call LoadNode::find_previous_arraycopy() and we are unable to find the allocation for the ArrayCopy destination because there is a Phi in between:

1390 AllocateArray === 1370 1214 1371 8 1 ( 1385 171 1389 1380 10 1 1 10 11 27 628 699 1201 1271 1 629 628 1271 1 1 1 1 1 1 1365 ) [[ 1391 1392 1393 1400 1401 1402 ]] rawptr:NotNull ( int:>=0, java/lang/Object:NotNull *, bool, int ) SecretKeySpec::getEncoded @ bci:4 reexecute HmacCore::engineInit @ bci:32 Mac::chooseProvider @ bci:124 Mac::init @ bci:22 !jvms: SecretKeySpec::getEncoded @ bci:4 HmacCore::engineInit @ bci:32 Mac::chooseProvider @ bci:124 Mac::init @ bci:22
3214 AllocateArray === 3283 3201 3281 8 1 ( 3149 171 3074 3152 10 1 1 10 11 27 628 3243 3169 3203 1 629 628 3203 1 1 1 1 1 1 3154 ) [[ 3207 3210 3211 3212 3213 3280 ]] rawptr:NotNull ( int:>=0, java/lang/Object:NotNull *, bool, int ) SecretKeySpec::getEncoded @ bci:4 reexecute HmacCore::engineInit @ bci:32 Mac::chooseProvider @ bci:124 Mac::init @ bci:22 !orig=1390 !jvms: SecretKeySpec::getEncoded @ bci:4 HmacCore::engineInit @ bci:32 Mac::chooseProvider @ bci:124 Mac::init @ bci:22
1402 Proj === 1390 [[ 3519 ]] #5 !jvms: SecretKeySpec::getEncoded @ bci:4 HmacCore::engineInit @ bci:32 Mac::chooseProvider @ bci:124 Mac::init @ bci:22
3213 Proj === 3214 [[ 3519 ]] #5 !orig=1402 !jvms: SecretKeySpec::getEncoded @ bci:4 HmacCore::engineInit @ bci:32 Mac::chooseProvider @ bci:124 Mac::init @ bci:22
3519 Phi === 3355 3213 1402 [[ 1407 1403 ]] #rawptr:NotNull
1407 CheckCastPP === 1404 3519 [[ 1409 1409 1718 1718 1717 1526 1525 1447 2472 1682 1472 1484 1670 1658 1526 1516 1516 2461 3625 3634 3634 3681 3690 3690 3799 3808 3808 3839 3841 3841 3864 3866 3866 3915 3917 3917 ]] #byte[int:0..max-2]:NotNull:exact * !jvms: SecretKeySpec::getEncoded @ bci:4 HmacCore::engineInit @ bci:32 Mac::chooseProvider @ bci:124 Mac::init @ bci:22
1409 AddP === _ 1407 1407 215 [[ 1413 ]] !jvms: SecretKeySpec::getEncoded @ bci:4 HmacCore::engineInit @ bci:32 Mac::chooseProvider @ bci:124 Mac::init @ bci:22
1413 ArrayCopy === 1404 1 1386 8 1 ( 1408 _ 1409 _ 1412 _ _ _ _ ) [[ 1414 1416 ]] void ( java/lang/Object *, int, java/lang/Object *, int, int, int, int, BotPTR *+bot, BotPTR *+bot ) (clone) !jvms: SecretKeySpec::getEncoded @ bci:4 HmacCore::engineInit @ bci:32 Mac::chooseProvider @ bci:124 Mac::init @ bci:22
1414 Proj === 1413 [[ 2474 ]] #0 !jvms: SecretKeySpec::getEncoded @ bci:4 HmacCore::engineInit @ bci:32 Mac::chooseProvider @ bci:124 Mac::init @ bci:22
2474 MemBarCPUOrder === 1414 1 1415 1 1 [[ 1419 1418 ]] !orig=1417 !jvms: SecretKeySpec::getEncoded @ bci:4 HmacCore::engineInit @ bci:32 Mac::chooseProvider @ bci:124 Mac::init @ bci:22

The graph looks correct to me (lots of loop optimizations happened before). I would say the assert is too strong and we should also bail out if ld_alloc = NULL. [~roland], you've introduced that code with JDK-8076188. What do you think?
28-06-2018
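
To make the failing lookup concrete, here is a minimal sketch of why a Phi between the CheckCastPP and the allocation projections defeats the search. It is a toy model under stated assumptions: `Kind`, `Node` and `find_allocation` are hypothetical stand-ins rather than HotSpot classes or APIs, and the lookup rule (strip casts, then require a Proj of an AllocateArray) is a simplification of what the comment above describes.

```cpp
#include <cassert>

// Simplified stand-ins for C2 IR nodes; these are NOT the real HotSpot classes.
enum class Kind { AllocateArray, Proj, Phi, CheckCastPP };

struct Node {
    Kind kind;
    const Node* in1;  // primary input followed by the walk (nullptr if none)
    const Node* in2;  // second merged value, used by Phi only
};

// Hypothetical helper: look through casts, then accept only a Proj whose
// input is an AllocateArray. Anything else (such as a Phi) yields nullptr.
const Node* find_allocation(const Node* ptr) {
    while (ptr != nullptr && ptr->kind == Kind::CheckCastPP) {
        ptr = ptr->in1;
    }
    if (ptr == nullptr || ptr->kind != Kind::Proj) {
        return nullptr;
    }
    const Node* alloc = ptr->in1;
    if (alloc != nullptr && alloc->kind == Kind::AllocateArray) {
        return alloc;
    }
    return nullptr;
}

// Shape expected for a clone: AllocateArray -> Proj -> CheckCastPP.
const Node* allocation_for_direct_shape() {
    static const Node alloc{Kind::AllocateArray, nullptr, nullptr};
    static const Node proj{Kind::Proj, &alloc, nullptr};
    static const Node cast{Kind::CheckCastPP, &proj, nullptr};
    return find_allocation(&cast);
}

// Shape in the dump above: two AllocateArray/Proj pairs merged by a Phi.
const Node* allocation_for_merged_shape() {
    static const Node alloc_a{Kind::AllocateArray, nullptr, nullptr};  // like 1390
    static const Node alloc_b{Kind::AllocateArray, nullptr, nullptr};  // like 3214
    static const Node proj_a{Kind::Proj, &alloc_a, nullptr};           // like 1402
    static const Node proj_b{Kind::Proj, &alloc_b, nullptr};           // like 3213
    static const Node phi{Kind::Phi, &proj_b, &proj_a};                // like 3519
    static const Node cast{Kind::CheckCastPP, &phi, nullptr};          // like 1407
    return find_allocation(&cast);
}
```

In this model the direct shape yields an allocation while the merged shape yields a null pointer, which corresponds to the alloc == NULL case that trips the assert.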

I can reproduce this with the latest jdk/jdk and the attached patch + replay compilation. Investigating.
28-06-2018

This code was introduced by JDK-8076188 in JDK 9, but subsequent changes might have triggered or introduced the problem.
28-06-2018