Bug ID: JDK-8253927 guarantee(object->mark() == markWord::INFLATING()) failed: invariant

Type: Bug
Component: hotspot
Sub-Component: runtime
Affected Version: 15,16,17

Priority: P2
Status: Closed
Resolution: Duplicate
CPU: ppc

Submitted: 2020-10-02
Updated: 2021-02-10
Resolved: 2021-02-10

Versions (Unresolved/Resolved/Fixed)

Other: tbd (Resolved)

Related Reports

Duplicate :	JDK-8261522 - [PPC64] AES intrinsics write beyond the destination array
Relates :	JDK-8152172 - PPC64: Support AES intrinsics

Description

The crash was observed several times when running the crypto.aes benchmark of SPECjvm2008 on a Power9 machine. The JVMs were built on 2020-09-28 and 2020-11-12.
It is unclear whether only PPC64 (or weak memory model platforms in general) are affected, or whether this is a general bug.

#  Internal Error (synchronizer.cpp:1373), pid=6583, tid=7553
#  guarantee(object->mark() == markWord::INFLATING()) failed: invariant
#
# JRE version: OpenJDK Runtime Environment (16.0.0.1) (build 16.0.0.1-internal+0-adhoc.openjdk.jdk)
# Java VM: OpenJDK 64-Bit Server VM (16.0.0.1-internal+0-adhoc.openjdk.jdk, mixed mode, tiered, compressed oops, g1 gc, linux-ppc64le)
# Problematic frame:
# V  [libjvm.so+0xe5279c]  ObjectSynchronizer::inflate(Thread*, oopDesc*, ObjectSynchronizer::InflateCause)+0x7dc

V  [libjvm.so+0xe5279c]  ObjectSynchronizer::inflate(Thread*, oopDesc*, ObjectSynchronizer::InflateCause)+0x7dc
V  [libjvm.so+0xe5283c]  ObjectSynchronizer::exit(oopDesc*, BasicLock*, Thread*)+0x4c
V  [libjvm.so+0xdad3b0]  SharedRuntime::complete_monitor_unlocking_C(oopDesc*, BasicLock*, JavaThread*)+0xc0
J 6247 c2 java.io.PrintStream.println(Ljava/lang/String;)V java.base@16.0.0.1-internal (44 bytes) @ 0x00007fff9dc7fc34 [0x00007fff9dc7f200+0x0000000000000a34]
J 6521 c1 spec.benchmarks.crypto.aes.Main.runEncryptDecrypt(Ljavax/crypto/SecretKey;Ljava/lang/String;Ljava/lang/String;)V (93 bytes) @ 0x00007fff962def94 [0x00007fff962de600+0x0000000000000994]
J 6618 c1 spec.benchmarks.crypto.aes.Main.harnessMain()V (97 bytes) @ 0x00007fff95e86648 [0x00007fff95e86380+0x00000000000002c8]
J 5512 c1 spec.harness.BenchmarkThread.runLoop(Lspec/harness/results/IterationResult;)Lspec/harness/results/LoopResult; (243 bytes) @ 0x00007fff966a5498 [0x00007fff966a4800+0x0000000000000c98]
j  spec.harness.BenchmarkThread.executeIteration()Z+74
j  spec.harness.BenchmarkThread.run()V+1
...

From gdb:
#6  0x00007fffb3f927a4 in ObjectSynchronizer::inflate (self=0x7ffef00efba0, object=0xff802c08, cause=ObjectSynchronizer::inflate_cause_vm_internal) at ./src/hotspot/share/runtime/synchronizer.cpp:1373
m = 0x7ffe5c004960
r30=0x00007ffe5c004960 points into unknown readable memory: 0x00007fff95660000 | 00 00 66 95 ff 7f 00 00
mark = {_value = 0x7ffe9d7ddba8
r8,r17=0x00007ffe9d7ddba8 is pointing into the stack for thread: 0x00007ffef00efba0
0x7ffe9d7ddba8: 0x00007fff95660000

#9  SharedRuntime::complete_monitor_unlocking_C (obj=0xff802c08, lock=0x7ffe9d7ddc70, thread=<optimized out>) at ./src/hotspot/share/runtime/sharedRuntime.cpp:2129
lock: {_displaced_header = {_value = 9

Comments

The problem seems to be the read-modify-write logic in aescrypt_encryptBlock / aescrypt_decryptBlock. The vector instructions may read bytes beyond the end of the array and write them back. This may result in lost concurrent mark word changes. Workarounds: -XX:-UseAES or -XX:ObjectAlignmentInBytes=16
10-02-2021
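The lost update described in the comment above can be illustrated with a small single-threaded simulation. This is only a sketch under assumed conditions, not the PPC64 stub code: the 32-byte "heap", the offsets DEST_OFFSET and MARK_OFFSET, the 0xAA cipher bytes, and the class name LostMarkWordUpdate are all hypothetical. Only the stack-lock value is taken from the gdb output in the description. The three steps are executed in the order in which two racing threads would have to interleave.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

public class LostMarkWordUpdate {
    static final int HEAP_SIZE = 32;      // hypothetical: two 16-byte vectors
    static final int DEST_OFFSET = 8;     // hypothetical: start of the destination block
    static final int BLOCK = 16;          // one AES block
    static final int MARK_OFFSET = 24;    // hypothetical: mark word of the next object
    static final long STACK_LOCK = 0x7ffe9d7ddba8L; // mark value seen in gdb
    static final long INFLATING = 0L;     // markWord::INFLATING() is 0

    /** Returns the neighbouring mark word after the simulated interleaving. */
    static long simulate() {
        ByteBuffer heap = ByteBuffer.allocate(HEAP_SIZE).order(ByteOrder.LITTLE_ENDIAN);
        heap.putLong(MARK_OFFSET, STACK_LOCK);

        // Step 1 (crypto thread): read the whole vector-covered range,
        // including the bytes beyond the destination block.
        byte[] snapshot = heap.array().clone();

        // Step 2 (unlocking thread): the mark word is changed to INFLATING.
        heap.putLong(MARK_OFFSET, INFLATING);

        // Step 3 (crypto thread): merge the result into the stale snapshot
        // and write the whole range back, including the stale mark word.
        byte[] cipher = new byte[BLOCK];
        Arrays.fill(cipher, (byte) 0xAA);
        System.arraycopy(cipher, 0, snapshot, DEST_OFFSET, BLOCK);
        System.arraycopy(snapshot, 0, heap.array(), 0, HEAP_SIZE);

        return heap.getLong(MARK_OFFSET);
    }

    public static void main(String[] args) {
        long mark = simulate();
        System.out.printf("mark word after write-back: 0x%x%n", mark);
        System.out.println("still INFLATING? " + (mark == INFLATING));
    }
}
```

In this model the mark word ends up holding the old stack-lock pointer again, which is the state the failing guarantee reports. If the write-back touched only the 16 destination bytes, the change to INFLATING would survive.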
It can be reproduced only very sporadically. It was no longer observed when disabling the _aescrypt_decryptBlock intrinsic. The issue was also not observed with BiasedLocking enabled. Possible workaround: -XX:-UseAES
10-02-2021
I'll update the information in the bug description to the new occurrence. Older one (hs_err_pid18455.log) for reference:

#  Internal Error (synchronizer.cpp:1953), pid=18455, tid=19363
#  guarantee(object->mark() == markWord::INFLATING()) failed: invariant

V  [libjvm.so+0xe463cc]  ObjectSynchronizer::inflate(Thread*, oopDesc*, ObjectSynchronizer::InflateCause)+0x76c
V  [libjvm.so+0xe467ac]  ObjectSynchronizer::exit(oopDesc*, BasicLock*, Thread*)+0x4c
V  [libjvm.so+0xda18a0]  SharedRuntime::complete_monitor_unlocking_C(oopDesc*, BasicLock*, JavaThread*)+0xc0
J 5765 c2 sun.nio.cs.StreamEncoder.flushBuffer()V java.base@16.0.0.1-internal (42 bytes) @ 0x00007fff95e1d84c [0x00007fff95e1d400+0x000000000000044c]
J 6249 c2 java.io.PrintStream.println(Ljava/lang/String;)V java.base@16.0.0.1-internal (44 bytes) @ 0x00007fff95f1f1a0 [0x00007fff95f1ea80+0x0000000000000720]
J 6534 c1 spec.benchmarks.crypto.aes.Main.runEncryptDecrypt(Ljavax/crypto/SecretKey;Ljava/lang/String;Ljava/lang/String;)V (93 bytes) @ 0x00007fff8e575794 [0x00007fff8e574e00+0x0000000000000994]
j  spec.benchmarks.crypto.aes.Main.harnessMain()V+69
...

object->_mark points to a stack lock:
0x00007ffe67dfdbc8 is pointing into the stack for thread: 0x00007ffeec10d5e0
content of stack slot 0x7ffe67dfdbc8: 0x0000000000000009
which belongs to the crashing thread:
Current thread (0x00007ffeec10d5e0): JavaThread "BenchmarkThread crypto.aes 5" [_thread_in_Java, id=19363, stack(0x00007ffe67c00000,0x00007ffe67e00000)]

Crash happened at
Time: Mon Sep 28 23:25:25 2020 CEST elapsed time: 466.850977 seconds (0d 0h 7m 46s)

Shortly after this GC Event:
466.837 GC heap after
{Heap after GC invocations=1476 (full 2):
 garbage-first heap total 1638400K, used 314352K [0x000000009c000000, 0x0000000100000000)
  region size 1024K, 1 young (1024K), 1 survivors (1024K)
 Metaspace used 16658K, capacity 16847K, committed 17152K, reserved 1064960K
  class space used 1402K, capacity 1472K, committed 1536K, reserved 1048576K
}
13-11-2020
Yes, this looks like something different. The mark word looks valid. Evacuation failures are unfortunately not logged in the event; there is a CR for that. However, the hs_err file does not indicate that there is (or has recently been) any particular memory pressure: the heap is not that well filled and the GCs are spaced widely apart.
15-10-2020
Thanks for the hints. I have added the last GC Event. The mark word was CASed to INFLATING, which is 0, before we reached this point, where we expect it to still be 0. Instead of 0, it contains a pointer to a stack slot (a stack lock, also shown in the issue description above). So there can't be a forwarding pointer any more. The last G1 GC cycle contained 1 young and 1 survivor region. How can I see if there was an evacuation failure? I guess there would be some extra text in the Event for that.
08-10-2020
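The reasoning in the comment above rests on the low bits of the mark word. The sketch below decodes them following the lock-state encoding of the JDK 16 markWord layout (low two bits: 00 stack-locked, 01 unlocked, 10 inflated monitor, 11 marked/forwarded; low three bits 101 for a biased header; the all-zero word is INFLATING). The class and method names are hypothetical, and the sample values are the ones quoted in this report.

```java
public class MarkWordSketch {
    static final long LOCK_MASK = 0b11;          // low two bits: lock state
    static final long BIASED_MASK = 0b111;       // low three bits incl. bias bit
    static final long BIASED_PATTERN = 0b101;    // biased header pattern

    /** Classifies a raw 64-bit mark word value by its low bits. */
    static String describe(long mark) {
        if (mark == 0L) {
            return "INFLATING";                  // transient value set by the inflating thread
        }
        if ((mark & BIASED_MASK) == BIASED_PATTERN) {
            return "biased";
        }
        switch ((int) (mark & LOCK_MASK)) {
            case 0:  return "stack-locked";      // pointer to a BasicLock on a thread stack
            case 1:  return "unlocked";          // neutral header (hash/age bits only)
            case 2:  return "inflated monitor";  // pointer to an ObjectMonitor
            default: return "marked";            // GC mark / forwarding pointer
        }
    }

    public static void main(String[] args) {
        // Mark word found by gdb at the failing guarantee.
        System.out.println(describe(0x7ffe9d7ddba8L));
        // Displaced header stored in the BasicLock (value 9).
        System.out.println(describe(9L));
        // Value the guarantee expected.
        System.out.println(describe(0L));
    }
}
```

Under this decoding, the observed value 0x7ffe9d7ddba8 classifies as stack-locked rather than marked, which matches the conclusion that no forwarding pointer is involved, and the displaced header 9 is a plain unlocked header.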
An email reply from Thomas to Martin's original email about this issue:

On 10/8/20 4:31 AM, Thomas Schatzl wrote:
>
> This reminds me a bit of JDK-8248438 because of the stack trace. This will be fixed by JDK-8254164.
>
> You can check whether it's the same issue using the following keys:
>
> Using g1 (the snippet does not show even that)
> - the markword contains a self-forwarded pointer
> AND/OR
> - the most recent gc has been a mixed gc with an optional evacuation phase and an evacuation failure (can be seen in a gc log)
>
> If you can manage to verify the first, this is a duplicate of JDK-8254164 (note that that bug is there since JDK 13). If you can only verify the latter, there is very high probability that this is the issue.
>
> Thanks,
> Thomas
08-10-2020