JDK-8312099 : SIGSEGV in RegisterNMethodOopClosure::do_oop(oopDesc**)+0x38
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: 20.0.1
  • Priority: P2
  • Status: Resolved
  • Resolution: Incomplete
  • OS: linux
  • Submitted: 2023-07-14
  • Updated: 2024-12-30
  • Resolved: 2024-02-05
JDK 22: 22 (Resolved)
Description
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007f96d50c1118, pid=2253658, tid=2253832
#
# JRE version: OpenJDK Runtime Environment (20.0.1+9) (build 20.0.1+9-29)
# Java VM: OpenJDK 64-Bit Server VM (20.0.1+9-29, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
# Problematic frame:
# V  [libjvm.so+0x762118]  RegisterNMethodOopClosure::do_oop(oopDesc**)+0x38
#
# No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# If you would like to submit a bug report, please visit:
#   https://bugreport.java.com/bugreport/crash.jsp
#

---------------  S U M M A R Y ------------

Command Line: -Des.networkaddress.cache.ttl=60 -Des.networkaddress.cache.negative.ttl=10 -Djava.security.manager=allow -XX:+AlwaysPreTouch -Xss1m -Djava.awt.headless=true -Dfile.encoding=UTF-8 -Djna.nosys=true -XX:-OmitStackTraceInFastThrow -Dio.netty.noUnsafe=true -Dio.netty.noKeySetOptimization=true -Dio.netty.recycler.maxCapacityPerThread=0 -Dlog4j.shutdownHookEnabled=false -Dlog4j2.disable.jmx=true -Dlog4j2.formatMsgNoLookups=true -Djava.locale.providers=SPI,COMPAT --add-opens=java.base/java.io=org.elasticsearch.preallocate -XX:+UseG1GC -Djava.io.tmpdir=/tmp/elasticsearch-3574794575791466772 -XX:+HeapDumpOnOutOfMemoryError -XX:+ExitOnOutOfMemoryError -XX:HeapDumpPath=/var/lib/elasticsearch -XX:ErrorFile=/var/log/elasticsearch/hs_err_pid%p.log -Xlog:gc*,gc+age=trace,safepoint:file=/var/log/elasticsearch/gc.log:utctime,level,pid,tags:filecount=32,filesize=64m -Xms31744m -Xmx31744m -XX:MaxDirectMemorySize=16642998272 -XX:InitiatingHeapOccupancyPercent=30 -XX:G1ReservePercent=25 -Des.distribution.type=deb --module-path=/usr/share/elasticsearch/lib --add-modules=jdk.net --add-modules=org.elasticsearch.preallocate -Djdk.module.main=org.elasticsearch.server org.elasticsearch.server/org.elasticsearch.bootstrap.Elasticsearch

Host: Intel(R) Xeon(R) Gold 6248R CPU @ 3.00GHz, 4 cores, 62G, Debian GNU/Linux 11 (bullseye)
Time: Tue Jul  4 09:58:20 2023 CEST elapsed time: 80.362941 seconds (0d 0h 1m 20s)

---------------  T H R E A D  ---------------

Current thread (0x00007f96481f0db0):  JavaThread "elasticsearch[***][system_write][T#1]" daemon [_thread_in_vm, id=2253832, stack(0x00007f9495183000,0x00007f9495284000)]

Stack: [0x00007f9495183000,0x00007f9495284000],  sp=0x00007f949527ca98,  free space=998k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x762118]  RegisterNMethodOopClosure::do_oop(oopDesc**)+0x38
V  [libjvm.so+0x758077]  G1CollectedHeap::register_nmethod(nmethod*)+0x37
V  [libjvm.so+0x5183b4]  Runtime1::patch_code(JavaThread*, Runtime1::StubID)+0x3e4
V  [libjvm.so+0x519967]  Runtime1::access_field_patching(JavaThread*)+0x17
v  ~RuntimeStub::access_field_patching Runtime1 stub 0x00007f96c054c258
J 26745 c1 org.elasticsearch.index.engine.InternalEngine.planIndexingAsNonPrimary(Lorg/elasticsearch/index/engine/Engine$Index;)Lorg/elasticsearch/index/engine/InternalEngine$IndexingStrategy; org.elasticsearch.server@8.8.2 (176 bytes) @ 0x00007f96bbe79738 [0x00007f96bbe78e00+0x0000000000000938]
J 26744 c1 org.elasticsearch.index.engine.InternalEngine.indexingStrategyForOperation(Lorg/elasticsearch/index/engine/Engine$Index;)Lorg/elasticsearch/index/engine/InternalEngine$IndexingStrategy; org.elasticsearch.server@8.8.2 (22 bytes) @ 0x00007f96bbe789d4 [0x00007f96bbe78840+0x0000000000000194]
J 26738 c1 org.elasticsearch.index.engine.InternalEngine.index(Lorg/elasticsearch/index/engine/Engine$Index;)Lorg/elasticsearch/index/engine/Engine$IndexResult; org.elasticsearch.server@8.8.2 (939 bytes) @ 0x00007f96bbe6aef4 [0x00007f96bbe6a620+0x00000000000008d4]
J 26736 c1 org.elasticsearch.index.shard.IndexShard.index(Lorg/elasticsearch/index/engine/Engine;Lorg/elasticsearch/index/engine/Engine$Index;)Lorg/elasticsearch/index/engine/Engine$IndexResult; org.elasticsearch.server@8.8.2 (251 bytes) @ 0x00007f96bbe679e4 [0x00007f96bbe670c0+0x0000000000000924]
J 23760 c1 org.elasticsearch.index.shard.IndexShard.applyIndexOperation(Lorg/elasticsearch/index/engine/Engine;JJJLorg/elasticsearch/index/VersionType;JJJZLorg/elasticsearch/index/engine/Engine$Operation$Origin;Lorg/elasticsearch/index/mapper/SourceToParse;)Lorg/elasticsearch/index/engine/Engine$IndexResult; org.elasticsearch.server@8.8.2 (145 bytes) @ 0x00007f96bb7d76c4 [0x00007f96bb7d7140+0x0000000000000584]
J 25529 c1 org.elasticsearch.index.shard.IndexShard.applyIndexOperationOnReplica(JJJJZLorg/elasticsearch/index/mapper/SourceToParse;)Lorg/elasticsearch/index/engine/Engine$IndexResult; org.elasticsearch.server@8.8.2 (27 bytes) @ 0x00007f96bbbe6f1c [0x00007f96bbbe6c60+0x00000000000002bc]
J 25528 c1 org.elasticsearch.action.bulk.TransportShardBulkAction.performOpOnReplica(Lorg/elasticsearch/action/DocWriteResponse;Lorg/elasticsearch/action/DocWriteRequest;Lorg/elasticsearch/index/shard/IndexShard;)Lorg/elasticsearch/index/engine/Engine$Result; org.elasticsearch.server@8.8.2 (217 bytes) @ 0x00007f96bbbe5ae4 [0x00007f96bbbe5420+0x00000000000006c4]
J 25547 c1 org.elasticsearch.action.bulk.TransportShardBulkAction.performOnReplica(Lorg/elasticsearch/action/bulk/BulkShardRequest;Lorg/elasticsearch/index/shard/IndexShard;)Lorg/elasticsearch/index/translog/Translog$Location; org.elasticsearch.server@8.8.2 (215 bytes) @ 0x00007f96bbbf797c [0x00007f96bbbf7100+0x000000000000087c]
j  org.elasticsearch.action.bulk.TransportShardBulkAction.lambda$dispatchedShardOperationOnReplica$4(Lorg/elasticsearch/action/bulk/BulkShardRequest;Lorg/elasticsearch/index/shard/IndexShard;)Lorg/elasticsearch/action/support/replication/TransportReplicationAction$ReplicaResult;+6 org.elasticsearch.server@8.8.2
j  org.elasticsearch.action.bulk.TransportShardBulkAction$$Lambda$8356+0x00000008023f2fc8.get()Ljava/lang/Object;+12 org.elasticsearch.server@8.8.2
J 19886 c1 org.elasticsearch.action.ActionListener.completeWith(Lorg/elasticsearch/action/ActionListener;Lorg/elasticsearch/common/CheckedSupplier;)V org.elasticsearch.server@8.8.2 (46 bytes) @ 0x00007f96b99f1e5c [0x00007f96b99f1d60+0x00000000000000fc]
j  org.elasticsearch.action.bulk.TransportShardBulkAction.dispatchedShardOperationOnReplica(Lorg/elasticsearch/action/bulk/BulkShardRequest;Lorg/elasticsearch/index/shard/IndexShard;Lorg/elasticsearch/action/ActionListener;)V+9 org.elasticsearch.server@8.8.2
j  org.elasticsearch.action.bulk.TransportShardBulkAction.dispatchedShardOperationOnReplica(Lorg/elasticsearch/action/support/replication/ReplicatedWriteRequest;Lorg/elasticsearch/index/shard/IndexShard;Lorg/elasticsearch/action/ActionListener;)V+7 org.elasticsearch.server@8.8.2
j  org.elasticsearch.action.support.replication.TransportWriteAction$2.doRun()V+16 org.elasticsearch.server@8.8.2
J 24015 c1 org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun()V org.elasticsearch.server@8.8.2 (43 bytes) @ 0x00007f96bb88ed94 [0x00007f96bb88ea60+0x0000000000000334]
J 26719 c2 org.elasticsearch.common.util.concurrent.AbstractRunnable.run()V org.elasticsearch.server@8.8.2 (32 bytes) @ 0x00007f96c1c22670 [0x00007f96c1c22620+0x0000000000000050]
j  java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V+92 java.base@20.0.1
j  java.util.concurrent.ThreadPoolExecutor$Worker.run()V+5 java.base@20.0.1
j  java.lang.Thread.runWith(Ljava/lang/Object;Ljava/lang/Runnable;)V+5 java.base@20.0.1
j  java.lang.Thread.run()V+19 java.base@20.0.1
v  ~StubRoutines::call_stub 0x00007f96c041bcc6
V  [libjvm.so+0x8a84c5]  JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, JavaThread*)+0x315
V  [libjvm.so+0x8a9e32]  JavaCalls::call_virtual(JavaValue*, Handle, Klass*, Symbol*, Symbol*, JavaThread*)+0x1d2
V  [libjvm.so+0x97bcbe]  thread_entry(JavaThread*, JavaThread*)+0x8e
V  [libjvm.so+0x8bfdf8]  JavaThread::thread_main_inner() [clone .part.0]+0xb8
V  [libjvm.so+0xe598e6]  Thread::call_run()+0xa6
V  [libjvm.so+0xc895c8]  thread_native_entry(Thread*)+0xd8
Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
v  ~RuntimeStub::access_field_patching Runtime1 stub 0x00007f96c054c258
J 26745 c1 org.elasticsearch.index.engine.InternalEngine.planIndexingAsNonPrimary(Lorg/elasticsearch/index/engine/Engine$Index;)Lorg/elasticsearch/index/engine/InternalEngine$IndexingStrategy; org.elasticsearch.server@8.8.2 (176 bytes) @ 0x00007f96bbe79738 [0x00007f96bbe78e00+0x0000000000000938]
J 26744 c1 org.elasticsearch.index.engine.InternalEngine.indexingStrategyForOperation(Lorg/elasticsearch/index/engine/Engine$Index;)Lorg/elasticsearch/index/engine/InternalEngine$IndexingStrategy; org.elasticsearch.server@8.8.2 (22 bytes) @ 0x00007f96bbe789d4 [0x00007f96bbe78840+0x0000000000000194]
J 26738 c1 org.elasticsearch.index.engine.InternalEngine.index(Lorg/elasticsearch/index/engine/Engine$Index;)Lorg/elasticsearch/index/engine/Engine$IndexResult; org.elasticsearch.server@8.8.2 (939 bytes) @ 0x00007f96bbe6aef4 [0x00007f96bbe6a620+0x00000000000008d4]
J 26736 c1 org.elasticsearch.index.shard.IndexShard.index(Lorg/elasticsearch/index/engine/Engine;Lorg/elasticsearch/index/engine/Engine$Index;)Lorg/elasticsearch/index/engine/Engine$IndexResult; org.elasticsearch.server@8.8.2 (251 bytes) @ 0x00007f96bbe679e4 [0x00007f96bbe670c0+0x0000000000000924]
J 23760 c1 org.elasticsearch.index.shard.IndexShard.applyIndexOperation(Lorg/elasticsearch/index/engine/Engine;JJJLorg/elasticsearch/index/VersionType;JJJZLorg/elasticsearch/index/engine/Engine$Operation$Origin;Lorg/elasticsearch/index/mapper/SourceToParse;)Lorg/elasticsearch/index/engine/Engine$IndexResult; org.elasticsearch.server@8.8.2 (145 bytes) @ 0x00007f96bb7d76c4 [0x00007f96bb7d7140+0x0000000000000584]
J 25529 c1 org.elasticsearch.index.shard.IndexShard.applyIndexOperationOnReplica(JJJJZLorg/elasticsearch/index/mapper/SourceToParse;)Lorg/elasticsearch/index/engine/Engine$IndexResult; org.elasticsearch.server@8.8.2 (27 bytes) @ 0x00007f96bbbe6f1c [0x00007f96bbbe6c60+0x00000000000002bc]
J 25528 c1 org.elasticsearch.action.bulk.TransportShardBulkAction.performOpOnReplica(Lorg/elasticsearch/action/DocWriteResponse;Lorg/elasticsearch/action/DocWriteRequest;Lorg/elasticsearch/index/shard/IndexShard;)Lorg/elasticsearch/index/engine/Engine$Result; org.elasticsearch.server@8.8.2 (217 bytes) @ 0x00007f96bbbe5ae4 [0x00007f96bbbe5420+0x00000000000006c4]
J 25547 c1 org.elasticsearch.action.bulk.TransportShardBulkAction.performOnReplica(Lorg/elasticsearch/action/bulk/BulkShardRequest;Lorg/elasticsearch/index/shard/IndexShard;)Lorg/elasticsearch/index/translog/Translog$Location; org.elasticsearch.server@8.8.2 (215 bytes) @ 0x00007f96bbbf797c [0x00007f96bbbf7100+0x000000000000087c]
j  org.elasticsearch.action.bulk.TransportShardBulkAction.lambda$dispatchedShardOperationOnReplica$4(Lorg/elasticsearch/action/bulk/BulkShardRequest;Lorg/elasticsearch/index/shard/IndexShard;)Lorg/elasticsearch/action/support/replication/TransportReplicationAction$ReplicaResult;+6 org.elasticsearch.server@8.8.2
j  org.elasticsearch.action.bulk.TransportShardBulkAction$$Lambda$8356+0x00000008023f2fc8.get()Ljava/lang/Object;+12 org.elasticsearch.server@8.8.2
J 19886 c1 org.elasticsearch.action.ActionListener.completeWith(Lorg/elasticsearch/action/ActionListener;Lorg/elasticsearch/common/CheckedSupplier;)V org.elasticsearch.server@8.8.2 (46 bytes) @ 0x00007f96b99f1e5c [0x00007f96b99f1d60+0x00000000000000fc]
j  org.elasticsearch.action.bulk.TransportShardBulkAction.dispatchedShardOperationOnReplica(Lorg/elasticsearch/action/bulk/BulkShardRequest;Lorg/elasticsearch/index/shard/IndexShard;Lorg/elasticsearch/action/ActionListener;)V+9 org.elasticsearch.server@8.8.2
j  org.elasticsearch.action.bulk.TransportShardBulkAction.dispatchedShardOperationOnReplica(Lorg/elasticsearch/action/support/replication/ReplicatedWriteRequest;Lorg/elasticsearch/index/shard/IndexShard;Lorg/elasticsearch/action/ActionListener;)V+7 org.elasticsearch.server@8.8.2
j  org.elasticsearch.action.support.replication.TransportWriteAction$2.doRun()V+16 org.elasticsearch.server@8.8.2
J 24015 c1 org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun()V org.elasticsearch.server@8.8.2 (43 bytes) @ 0x00007f96bb88ed94 [0x00007f96bb88ea60+0x0000000000000334]
J 26719 c2 org.elasticsearch.common.util.concurrent.AbstractRunnable.run()V org.elasticsearch.server@8.8.2 (32 bytes) @ 0x00007f96c1c22670 [0x00007f96c1c22620+0x0000000000000050]
j  java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V+92 java.base@20.0.1
j  java.util.concurrent.ThreadPoolExecutor$Worker.run()V+5 java.base@20.0.1
j  java.lang.Thread.runWith(Ljava/lang/Object;Ljava/lang/Runnable;)V+5 java.base@20.0.1
j  java.lang.Thread.run()V+19 java.base@20.0.1
v  ~StubRoutines::call_stub 0x00007f96c041bcc6

siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x00007f9ed004ac78

vm_info: OpenJDK 64-Bit Server VM (20.0.1+9-29) for linux-amd64 JRE (20.0.1+9-29), built on 2023-03-07T13:56:11Z by "mach5one" with gcc 11.2.0


Full hs_err log attached.
Comments
No reproducer, not enough information to make progress. Looks like a bad oop, which could come from many sources, often from problems in native application code.
05-02-2024

Here's the code in question:

00000000007620e0 <RegisterNMethodOopClosure::do_oop(oopDesc**)>:
  7620e0: 48 8b 06              mov (%rsi),%rax
  7620e3: 48 85 c0              test %rax,%rax
  7620e6: 74 40                 je 762128 <RegisterNMethodOopClosure::do_oop(oopDesc**)+0x48>
  7620e8: 48 8b 57 08           mov 0x8(%rdi),%rdx
  7620ec: 48 8b 77 10           mov 0x10(%rdi),%rsi
  7620f0: 8b 8a 48 02 00 00     mov 0x248(%rdx),%ecx
  7620f6: 48 8b ba 40 02 00 00  mov 0x240(%rdx),%rdi
  7620fd: 48 8b 92 28 02 00 00  mov 0x228(%rdx),%rdx
  762104: 48 d3 e7              shl %cl,%rdi
  762107: 48 8d 0d e6 12 bd 00  lea 0xbd12e6(%rip),%rcx # 13333f4 <HeapRegion::LogOfHRGrainBytes>
  76210e: 48 29 f8              sub %rdi,%rax
  762111: 8b 09                 mov (%rcx),%ecx
  762113: 48 d3 e8              shr %cl,%rax
  762116: 89 c0                 mov %eax,%eax
  762118: 48 8b 3c c2           mov (%rdx,%rax,8),%rdi
  76211c: e9 9f 9b 0e 00        jmpq 84bcc0 <HeapRegion::add_code_root_locked(nmethod*)>
  762121: 0f 1f 80 00 00 00 00  nopl 0x0(%rax)
  762128: c3                    retq
  762129: 0f 1f 80 00 00 00 00  nopl 0x0(%rax)

So the faulting move at 762118 seems to be this code from G1CollectedHeap::heap_region_containing() --> region_at() --> HeapRegionManager::at() --> G1HeapRegionTable::get_by_index():

  G1BiasedMappedArrayBase::_base[index]

  %rdx would be G1BiasedMappedArrayBase::_base
  %rax would be "uint const region_idx = addr_to_region(addr);"

If this was a debug build then probably some of the asserts would have fired before getting this far. So there is probably a bad oop, and there's a good chance it's the new oop the patching code is trying to add.
19-01-2024
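
The index computation described in the 19-01-2024 comment can be modelled in a few lines. This is only a sketch of the instruction sequence in the disassembly (subtract the biased bottom, shift by the region-size log, truncate to 32 bits), not HotSpot source. The function address and frame offset come from the report; the heap bottom, region size, and the two sample oops are hypothetical values chosen for illustration, and nothing here claims to reproduce the actual bad oop.

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1


def region_index(addr, bias, shift_by, log_grain_bytes):
    """Model of the sequence ending at 762116 in the disassembly."""
    bottom = (bias << shift_by) & MASK64          # shl %cl,%rdi
    delta = (addr - bottom) & MASK64              # sub %rdi,%rax (wraps mod 2^64)
    return (delta >> log_grain_bytes) & MASK32    # shr %cl,%rax ; mov %eax,%eax


# Values taken from the report: the problematic frame is do_oop+0x38,
# and the disassembly places do_oop at 0x7620e0.
FUNC_START = 0x7620E0
FRAME_OFFSET = 0x38
faulting_insn = FUNC_START + FRAME_OFFSET         # address of the indexed load

# Hypothetical values (NOT from the report): 8 MiB regions, heap bottom at 1 GiB.
LOG_GRAIN = 23
HEAP_BOTTOM = 0x40000000
BIAS = HEAP_BOTTOM >> LOG_GRAIN

good_oop = HEAP_BOTTOM + 5 * (1 << LOG_GRAIN) + 0x100   # inside region 5
bad_oop = 0x1000                                        # below the heap bottom

good_index = region_index(good_oop, BIAS, LOG_GRAIN, LOG_GRAIN)
bad_index = region_index(bad_oop, BIAS, LOG_GRAIN, LOG_GRAIN)

print(hex(faulting_insn), good_index, hex(bad_index))
```

With these made-up inputs, an address below the heap bottom wraps around and yields an index close to 2^32, the same order of magnitude as the rax value reported in the crash. That is consistent with the "bad oop" reading, but it does not establish where the bad value came from.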

Still hoping for a reproducer, [~chegar].
15-01-2024

There are some very old bugs (circa JDK8 and JDK11) that look somewhat similar, some of which were fixed and some resolved as not reproducible. I didn't find anything particularly recent though.

This failure was in JDK 20.0.1. Not sure whether it is before or after JDK-8290025 - "Remove the Sweeper" (JDK 20 b13), which seems potentially relevant. If before, maybe it's no longer an issue. If after, I think there might have been a bit of a bug tail on that change, so again might no longer be an issue.

The failing instruction is:
  mov rdi, [rdx+rax*8]
  rax: 0x00000000ffffefff
  rdx: 0x00007f96d0052c80

CodeHeap 'non-profiled nmethods' bounds [0x00007f96c09ac000, 0x00007f96c1ccc000, 0x00007f96c7ee4000]
CodeHeap 'profiled nmethods' bounds [0x00007f96b8ee4000, 0x00007f96bbf84000, 0x00007f96c041b000]
CodeHeap 'non-nmethods' bounds [0x00007f96c041b000, 0x00007f96c06fb000, 0x00007f96c09ac000]

So rdx doesn't appear to be in the code heap. Digging deeper into the hs_err file, it looks like rdx is in an anonymous mmapped region between two elasticsearch entries:

  RDX: 7f96d0052c80
  7f96c7ee4000-7f96d0000000 r--s 00000000 fe:00 285614 /usr/share/elasticsearch/jdk/lib/modules
  7f96d0000000-7f96d4000000 rw-p 00000000 00:00 0
  7f96d4000000-7f96d4002000 r--s 00000000 fe:01 14024907 /var/lib/elasticsearch/indices/9PNcCc4kQKuvGbe3OPl1Ow/0/index/_m.cfs
20-12-2023
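
The address arithmetic in the 20-12-2023 comment can be checked directly against the siginfo line in the description. The sketch below uses only values quoted in this report. It assumes the three CodeHeap "bounds" numbers are [low boundary, committed high, reserved high] and treats the first and last as the reserved range; that reading is an assumption, not something the report states.

```python
MASK64 = (1 << 64) - 1

# Register values and fault address as quoted in the report.
RDX = 0x00007F96D0052C80        # base of the indexed load
RAX = 0x00000000FFFFEFFF        # region index used by the load
SI_ADDR = 0x00007F9ED004AC78    # si_addr from the siginfo line

# CodeHeap reserved ranges (first and third "bounds" values; see lead-in).
CODE_HEAPS = {
    "non-profiled nmethods": (0x00007F96C09AC000, 0x00007F96C7EE4000),
    "profiled nmethods": (0x00007F96B8EE4000, 0x00007F96C041B000),
    "non-nmethods": (0x00007F96C041B000, 0x00007F96C09AC000),
}

# Anonymous rw-p mapping quoted from the hs_err memory map.
ANON_MAPPING = (0x7F96D0000000, 0x7F96D4000000)


def effective_address(base, index, scale=8):
    """Address read by `mov rdi, [base + index*scale]`."""
    return (base + index * scale) & MASK64


def in_range(addr, bounds):
    low, high = bounds
    return low <= addr < high


fault = effective_address(RDX, RAX)
rdx_code_heap = [name for name, b in CODE_HEAPS.items() if in_range(RDX, b)]

print(hex(fault), fault == SI_ADDR)
print("rdx in code heap:", rdx_code_heap)
print("rdx in anonymous mapping:", in_range(RDX, ANON_MAPPING))
print("fault address in that mapping:", in_range(fault, ANON_MAPPING))
```

The computed address equals si_addr, so the fault is the indexed load itself: the base in rdx lies inside the anonymous mapping, while the index pushes the access roughly 32 GiB past it into unmapped memory (SEGV_MAPERR).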

[~chegar] Is there a reproducer for this? Hopefully something smaller than running Elasticsearch and then hitting it with lots of queries.
18-12-2023

Thanks Kim. I'll check the jdk20u repository, but unless something really "bad" happened, then JDK-8290025 *should* be included in JDK 20.0.1.
20-07-2023