JDK-8303513 : C2: LoadKlassNode::make fails with 'expecting TypeKlassPtr'
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 21,22
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2023-03-02
  • Updated: 2023-07-12
  • Resolved: 2023-06-15
Fix versions:
  • JDK 21: 21 (Fixed)
  • JDK 22: 22 b03 (Fixed)
Description
RunThese8M/RunThese30M fails with "assert(adr_type != 0LL) failed: expecting TypeKlassPtr" on a fastdebug build of Generational ZGC (see attached HotSpot error files, pid78296 for windows-x64 and pid1054514 for linux-x64). The failure is observed on a Generational ZGC build using -XX:+UseZGC, but it is not GC-specific (can also be reproduced using -XX:+UseParallelGC -XX:-UseCompressedOops).

# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (c:\sb\prod\1677586887\workspace\open\src\hotspot\share\opto\memnode.cpp:2289), pid=78296, tid=48652
#  assert(adr_type != 0LL) failed: expecting TypeKlassPtr

Current CompileTask:
C2:1229395 128231 %     4       javasoft.sqe.tests.api.java.util.Collections.ncopies.Stream::lambda$getStreamFactory$2 @ 11 (73 bytes)

Stack: [0x0000002fc8e00000,0x0000002fc8f00000]
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
(...)
V  [jvm.dll+0xbaebeb]  LoadKlassNode::make+0x5b  (memnode.cpp:2289)
V  [jvm.dll+0xdf4b83]  SubTypeCheckNode::load_klass+0x153  (subtypenode.cpp:212)
V  [jvm.dll+0xdf51eb]  SubTypeCheckNode::verify+0x32b  (subtypenode.cpp:183)
V  [jvm.dll+0xdf4973]  SubTypeCheckNode::Ideal+0x2a3  (subtypenode.cpp:113)
V  [jvm.dll+0xcac915]  PhaseIterGVN::transform_old+0xe5  (phaseX.cpp:1356)
V  [jvm.dll+0xca8cb3]  PhaseIterGVN::optimize+0x2b3  (phaseX.cpp:1206)
V  [jvm.dll+0x51c682]  Compile::Optimize+0x1b2  (compile.cpp:2218)
V  [jvm.dll+0x51a43b]  Compile::Compile+0x16bb  (compile.cpp:834)
(...)

FAILURE ANALYSIS

The failure is caused by an implicit assumption made by the verification code within SubTypeCheckNode::Ideal() [1]. This code wrongly assumes that if obj_or_subklass (the ObjOrSubKlass input of the SubTypeCheck node) is a klass or OOP pointer, then 'obj_or_subklass->bottom_type() != TOP'. This assumption does not hold if obj_or_subklass is a projection of the TOP constant node, which can happen within IGVN e.g. if 'obj_or_subklass->in(0)' is an unreachable call node that gets replaced with TOP, as can be seen in before-after-removing-call.png (attached).

The consequence is that 'adr', the node computing the klass address of obj_or_subklass [2], has bottom type TOP, which triggers the reported assertion failure in LoadKlassNode::make() [3]. The failure is not specific to ZGC. It has only been observed with this GC configuration because a specific intermediate Idealization step in the IGVN sequence that leads to the above situation is only performed if UseCompressedOops is disabled. This step replaces a LoadP node with the value stored by a dominating StoreP node that writes into the same address, as can be seen in before-after-load-idealization.png (attached).
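
The discrepancy described above can be sketched with a toy model. This is only an illustration under simplifying assumptions: ToyNode, ToyPhase, has_stale_type() and stale_after_call_removal() are hypothetical names, not HotSpot classes or functions. The sketch shows a projection whose bottom type is derived from its input on demand, next to a per-phase type table that still holds the type computed before the call was replaced with TOP.

```cpp
#include <cassert>
#include <unordered_map>

// Toy stand-ins for C2 concepts; none of these are real HotSpot classes.
enum class ToyType { Top, OopPtr };

struct ToyNode {
    int idx;                 // node index, e.g. 1805 for the projection
    const ToyNode* input;    // single input, which is enough for this sketch
    bool is_top_con;         // true only for the TOP constant node

    // Like a projection, derive the type from the input on every call.
    ToyType bottom_type() const {
        if (is_top_con) return ToyType::Top;
        if (input != nullptr && input->bottom_type() == ToyType::Top) return ToyType::Top;
        return ToyType::OopPtr;
    }
};

// Models the phase's type table, refreshed only when a node is revisited.
struct ToyPhase {
    std::unordered_map<int, ToyType> types;
    ToyType type(const ToyNode& n) const { return types.at(n.idx); }
};

// True when the cached type and the freshly derived type disagree.
bool has_stale_type(const ToyPhase& phase, const ToyNode& n) {
    return phase.type(n) != n.bottom_type();
}

// Replays the scenario: the call feeding the projection is replaced with TOP.
bool stale_after_call_removal() {
    ToyNode top{1, nullptr, true};
    ToyNode call{1800, nullptr, false};
    ToyNode proj{1805, &call, false};

    ToyPhase phase;
    phase.types[proj.idx] = proj.bottom_type();  // cached while the call is live
    assert(!has_stale_type(phase, proj));

    proj.input = &top;                           // unreachable call replaced with TOP
    return has_stale_type(phase, proj);          // cached OopPtr vs. derived Top
}
```

In this model the cached entry plays the role of phase->type(obj_or_subklass), and ToyNode::bottom_type() plays the role of obj_or_subklass->bottom_type().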

This Idealization is not performed if UseCompressedOops is enabled because, unlike LoadP, the corresponding LoadN node is not recorded for IGVN upon creation -- only its successor DecodeN node is [4,5]. This missed optimization opportunity should be addressed separately.
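
As a rough illustration of why the load is only revisited without compressed OOPs, the following toy model records node names in a worklist the way the paragraph above describes. ToyWorklist and toy_make_pointer_load() are hypothetical names chosen for this sketch; the real logic lives in the code referenced by [4,5].

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Toy IGVN worklist: only recorded nodes get revisited (and idealized) later.
struct ToyWorklist {
    std::vector<std::string> recorded;
    void record(const std::string& name) { recorded.push_back(name); }
    bool contains(const std::string& name) const {
        return std::find(recorded.begin(), recorded.end(), name) != recorded.end();
    }
};

// Returns the name of the node that produces the loaded pointer.
std::string toy_make_pointer_load(bool use_compressed_oops, ToyWorklist& igvn) {
    if (!use_compressed_oops) {
        igvn.record("LoadP");   // the load itself is recorded
        return "LoadP";
    }
    // A LoadN is created here too, but only its DecodeN successor is recorded.
    igvn.record("DecodeN");
    return "DecodeN";
}
```

With use_compressed_oops set to false the worklist contains "LoadP"; with it set to true the worklist contains "DecodeN" but not "LoadN", so in this model the load is never revisited.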

SUGGESTED SOLUTION

Skip verification of SubTypeCheck nodes [1] if 'obj_or_subklass->bottom_type() == TOP'. This is a low-risk fix affecting debug-only code. An alternative, more invasive solution would be to skip the entire SubTypeCheckNode::Ideal() call in this case.
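
The shape of such a guard can be sketched as follows. This is not the integrated patch: ToyNode, toy_load_klass() and toy_verify() are hypothetical stand-ins, and the node is passed directly as the address, whereas the real code first builds an address node from obj_or_subklass.

```cpp
#include <cassert>

// Toy stand-ins; these are not the real HotSpot types.
enum class ToyType { Top, OopPtr, KlassPtr };

struct ToyNode {
    ToyType bottom;
    ToyType bottom_type() const { return bottom; }
};

enum class VerifyResult { Skipped, Verified };

// Stand-in for the klass load: it requires a non-TOP address type.
void toy_load_klass(const ToyNode& adr) {
    assert(adr.bottom_type() != ToyType::Top && "expecting TypeKlassPtr");
}

// Debug-only verification with the proposed early exit.
VerifyResult toy_verify(const ToyNode& obj_or_subklass) {
    if (obj_or_subklass.bottom_type() == ToyType::Top) {
        return VerifyResult::Skipped;   // dead input: nothing meaningful to verify
    }
    toy_load_klass(obj_or_subklass);    // safe: the type is known to be non-TOP
    return VerifyResult::Verified;
}
```

Without the early exit, calling toy_verify() on a TOP-typed node would reach the assertion in toy_load_klass(), which mirrors the reported failure.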

[1] https://github.com/openjdk/jdk/blob/199b1bf5009120efd1fd37a1ddabc0c6fb84f62c/src/hotspot/share/opto/subtypenode.cpp#L113
[2] https://github.com/openjdk/jdk/blob/199b1bf5009120efd1fd37a1ddabc0c6fb84f62c/src/hotspot/share/opto/subtypenode.cpp#L211
[3] https://github.com/openjdk/jdk/blob/199b1bf5009120efd1fd37a1ddabc0c6fb84f62c/src/hotspot/share/opto/memnode.cpp#L2289
[4] https://github.com/openjdk/jdk/blob/bac02b6e9d9e1e93db27c7888188f29631e07f47/src/hotspot/share/opto/graphKit.cpp#L1561-L1566
[5] https://github.com/openjdk/jdk/blob/199b1bf5009120efd1fd37a1ddabc0c6fb84f62c/src/hotspot/share/opto/memnode.cpp#L956
Comments
The fix for this bug is integrated in jdk-22+3-94 and in jdk-21+28-2346.
17-06-2023

A pull request was submitted for review. URL: https://git.openjdk.org/jdk21/pull/21 Date: 2023-06-15 10:25:30 +0000
15-06-2023

Changeset: 83d92672 Author: Roberto Castañeda Lozano <rcastanedalo@openjdk.org> Date: 2023-06-15 10:08:28 +0000 URL: https://git.openjdk.org/jdk/commit/83d92672d4c2637fc37ddd873533c85a9b083904
15-06-2023

A pull request was submitted for review. URL: https://git.openjdk.org/jdk/pull/14463 Date: 2023-06-14 08:23:44 +0000
14-06-2023

Okay. Good analysis.
26-05-2023

[~kvn] Thanks for the pointer, I checked but the failure is still reproducible after JDK-8308583, see failure analysis in the updated description above.
26-05-2023

Is it possible that it relates to JDK-8308583? See Roland's fix for it.
25-05-2023

before-igvn.pdf shows the relevant subgraph before IGVN. The failing SubTypeCheck node is '1885 SubTypeCheck'. After some GVN iterations, '1787 Phi' is optimized away and '1805 Proj' becomes the ObjOrSubKlass input of 1885 (see on-failure.pdf). '1800 CallStaticJava' is also optimized away and replaced by the TOP node as input to 1805, causing the discrepancy between phase->type(1805 Proj) (non-TOP) and (1805 Proj)->bottom_type() (TOP) which ultimately leads to the assertion failure.
15-05-2023

Observations:
1. The failure can be reproduced on both the Generational ZGC repo and the integration PR in mainline.
2. Through the entire SubTypeCheckNode::Ideal(), in(ObjOrSubKlass) is a projection with the 'top' node (1 Con) as input (possibly the projection of a call node that has been optimized away).
3. Through the entire SubTypeCheckNode::Ideal(), phase->type(in(ObjOrSubKlass))->isa_oopptr() holds, but in(ObjOrSubKlass)->bottom_type() is TOP. The reason for this discrepancy is that phase->type(in(ObjOrSubKlass)) has not been updated yet, whereas in(ObjOrSubKlass)->bottom_type(), as a projection node, trivially propagates the TOP type from its input (the 'top' node) [1]. This causes adr->bottom_type() within SubTypeCheckNode::load_klass() to return TOP [2], which is unexpected by LoadKlassNode::make(..., adr, ...) [3] and triggers the assertion failure.
4. On entry of SubTypeCheckNode::Ideal(), it is common to observe phase->type(in(ObjOrSubKlass)) != in(ObjOrSubKlass)->Value(phase), this can be seen e.g. in compiler/arraycopy/stress/TestStressArrayCopy.java:
   - phase->type(in(ObjOrSubKlass)): java/lang/Object:NotNull *
   - in(ObjOrSubKlass)->Value(phase): java/lang/Object (java/util/stream/Sink,java/util/function/Consumer):NotNull *
5. The failure is independent of JDK-8299155 (can also be reproduced after reverting it).
6. The failure is dependent/triggered by JDK-8297933 (cannot be reproduced after reverting it).
7. At the point of failure, a very large number of classes is loaded (hundreds of thousands)
8. Removing the verification code does not lead to other C2 failures or obvious miscompilations, however it uncovers another issue which is likely unrelated (similarly to JDK-8234355, the JVM crashes when running the 'VM.class_hierarchy' diagnostic command due to the large number of loaded classes).
9. In generational ZGC, class unloading tends to be less frequent than in non-generational ZGC, since it is only triggered by major collections. This might explain why the issue is only reproducible with generational ZGC.

[1] https://github.com/openjdk/jdk/blob/dc4096ce136c867e0806070a2d7c8b4efef5294c/src/hotspot/share/opto/multnode.cpp#L117
[2] https://github.com/openjdk/jdk/blob/dc4096ce136c867e0806070a2d7c8b4efef5294c/src/hotspot/share/opto/addnode.cpp#L633-L634
[3] https://github.com/openjdk/jdk/blob/dc4096ce136c867e0806070a2d7c8b4efef5294c/src/hotspot/share/opto/memnode.cpp#L2288-L2289
08-05-2023

This issue is dependent on/triggered by JDK-8297933 (cannot be reproduced if JDK-8297933 is reverted).
05-05-2023

ILW = Assert during C2 compilation, intermittent with long running stress test - only reproduced twice so far with generational ZGC, no known workaround but disable compilation of affected method = HLM = P3
02-03-2023

Smells like JDK-8297933, where [~epeter] reported something similar with the first version of the patch. Actually, the code path that fails was just added recently by JDK-8299155.
02-03-2023