JDK-8231635 : SA Stackwalking code stuck in BasicTypeDataBase.findDynamicTypeForAddress()
  • Type: Bug
  • Component: hotspot
  • Sub-Component: svc-agent
  • Affected Version: 14
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2019-09-30
  • Updated: 2023-08-04
  • Resolved: 2019-11-13
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 14
14 b24Fixed
Related Reports
Duplicate :  
Duplicate :  
Duplicate :  
Relates :  
Relates :  
Relates :  
Relates :  
Relates :  
Relates :  
Sub Tasks
JDK-8232276 :  
Description
When running "jhsdb jmap" on on java process, on occasion it will get stuck, I believe in an infinite loop, resulting in a timeout if it happens during a test. For the most part our jmap tests never see this, I believe because all threads are idle (blocked). When you have an active process, you see this failure somewhat frequently. The new HeapDumpTestWithActiveProcess.java test uncovers it on about 2% of runs on linux-x64.

I believe this issue was introduced by JDK-6760477, which added stack traces to the jmap heap dump.

Note this issue is also covered by JDK-8230872, which I will be closing once I get appropriate CRs files. 

This failure is most commonly noted when HeapDumpTestWithActiveProcess.java times out. In the log you will see:

----------System.out:(230/19729)----------
Starting Jshell
[JShell]:/opt/mach5/mesos/work_dir/jib-master/install/2019-09-30-1717232.chris.plummer.jdk.clean/linux-x64-debug.jdk/jdk-14/fastdebug/bin/jshell
[JShell] |  Welcome to JShell -- Version 14-internal
[JShell] |  For an introduction type: /help intro
Jshell Started in 3926ms
Starting jmap against 19214
[2019-09-30T17:54:36.169781Z] Gathering output for process 19271
Timeout refired 960 times
...
java.lang.RuntimeException: Test ERROR java.lang.RuntimeException: Expected to get exit value of [0]
	at JShellHeapDumpTest.launch(JShellHeapDumpTest.java:79)
	at JShellHeapDumpTest.launch(JShellHeapDumpTest.java:94)
	at JShellHeapDumpTest.testHeapDump(JShellHeapDumpTest.java:115)
	at JShellHeapDumpTest.main(JShellHeapDumpTest.java:162)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:564)
	at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127)
	at java.base/java.lang.Thread.run(Thread.java:830)
Caused by: java.lang.RuntimeException: Expected to get exit value of [0]
	at jdk.test.lib.process.OutputAnalyzer.shouldHaveExitValue(OutputAnalyzer.java:455)
	at JShellHeapDumpTest.launch(JShellHeapDumpTest.java:77)
	... 9 more

Here's the backtrace you see if you look at the jmap process in the processes.html file after HeapDumpTestWithActiveProcess.java gets a timeout:

	at sun.jvm.hotspot.types.basic.BasicTypeDataBase.findDynamicTypeForAddress(jdk.hotspot.agent@14-internal/BasicTypeDataBase.java:328)
	at sun.jvm.hotspot.runtime.VirtualBaseConstructor.instantiateWrapperFor(jdk.hotspot.agent@14-internal/VirtualBaseConstructor.java:102)
	at sun.jvm.hotspot.utilities.GrowableArray.at(jdk.hotspot.agent@14-internal/GrowableArray.java:61)
	at sun.jvm.hotspot.code.CodeCache.findBlobUnsafe(jdk.hotspot.agent@14-internal/CodeCache.java:102)
	at sun.jvm.hotspot.code.CodeCache.findBlob(jdk.hotspot.agent@14-internal/CodeCache.java:85)
	at sun.jvm.hotspot.runtime.x86.X86Frame.sender(jdk.hotspot.agent@14-internal/X86Frame.java:282)
	at sun.jvm.hotspot.runtime.Frame.sender(jdk.hotspot.agent@14-internal/Frame.java:205)
	at sun.jvm.hotspot.runtime.amd64.AMD64CurrentFrameGuess.run(jdk.hotspot.agent@14-internal/AMD64CurrentFrameGuess.java:140)
	at sun.jvm.hotspot.runtime.linux_amd64.LinuxAMD64JavaThreadPDAccess.getCurrentFrameGuess(jdk.hotspot.agent@14-internal/LinuxAMD64JavaThreadPDAccess.java:95)
	at sun.jvm.hotspot.runtime.JavaThread.getCurrentFrameGuess(jdk.hotspot.agent@14-internal/JavaThread.java:265)
	at sun.jvm.hotspot.runtime.JavaThread.getLastJavaVFrameDbg(jdk.hotspot.agent@14-internal/JavaThread.java:227)
	at sun.jvm.hotspot.runtime.ThreadStackTrace.dumpStack(jdk.hotspot.agent@14-internal/ThreadStackTrace.java:54)
	at sun.jvm.hotspot.utilities.HeapHprofBinWriter.dumpStackTraces(jdk.hotspot.agent@14-internal/HeapHprofBinWriter.java:718)
	at sun.jvm.hotspot.utilities.HeapHprofBinWriter.write(jdk.hotspot.agent@14-internal/HeapHprofBinWriter.java:439)
	- locked <0x0000000415c585c8> (a sun.jvm.hotspot.utilities.HeapHprofBinWriter)
	at sun.jvm.hotspot.tools.JMap.writeHeapHprofBin(jdk.hotspot.agent@14-internal/JMap.java:182)
	at sun.jvm.hotspot.tools.JMap.run(jdk.hotspot.agent@14-internal/JMap.java:97)
	at sun.jvm.hotspot.tools.Tool.startInternal(jdk.hotspot.agent@14-internal/Tool.java:262)
	at sun.jvm.hotspot.tools.Tool.start(jdk.hotspot.agent@14-internal/Tool.java:225)
	at sun.jvm.hotspot.tools.Tool.execute(jdk.hotspot.agent@14-internal/Tool.java:118)
	at sun.jvm.hotspot.tools.JMap.main(jdk.hotspot.agent@14-internal/JMap.java:176)
	at sun.jvm.hotspot.SALauncher.runJMAP(jdk.hotspot.agent@14-internal/SALauncher.java:321)
	at sun.jvm.hotspot.SALauncher$$Lambda$4/0x0000000800ba5c40.accept(jdk.hotspot.agent@14-internal/Unknown Source)
	at sun.jvm.hotspot.SALauncher.main(jdk.hotspot.agent@14-internal/SALauncher.java:406)
Comments
URL: https://hg.openjdk.java.net/jdk/jdk/rev/5f458adec06b User: cjplummer Date: 2019-11-13 21:06:33 +0000
13-11-2019

The bug appears to be in AMD64CurrentFrameGuess.run(). Please look at the description section of this CR to see where this method appears in the stack trace. It gets stuck in the following loop (the inner "while" loop") for (long offset = 0; offset < regionInBytesToSearch; offset += vm.getAddressSize()) { try { Address curSP = sp.addOffsetTo(offset); Frame frame = new X86Frame(curSP, null, pc); RegisterMap map = thread.newRegisterMap(false); while (frame != null) { if (frame.isEntryFrame() && frame.entryFrameIsFirst()) { // We were able to traverse all the way to the // bottommost Java frame. // This sp looks good. Keep it. if (DEBUG) { System.out.println("CurrentFrameGuess: Choosing sp = " + curSP + ", pc = " + pc); } setValues(curSP, null, pc); return true; } frame = frame.sender(map); } } catch (Exception e) { if (DEBUG) { System.out.println("CurrentFrameGuess: Exception " + e + " at offset " + offset); } // Bad SP. Try another. } } I confirmed that that frame returned by frame.sender(map) is the same as the frame itself. The identical frame was detected with: Frame oldFrame = frame; frame = frame.sender(map); if (oldFrame.toString().equals(frame.toString())) { System.out.println("AMD64CurrentFrameGuess.run: oldFrame == frame: " + frame); } Which is printing out: AMD64CurrentFrameGuess.run: oldFrame == frame: sp: 0x0000004c31d1f190, unextendedSP: 0x0000004c31d1f190, fp: 0x000000043fca8b80, pc: 0x0000004c00000014 I don't know much about this code, but I think when this happens we can just abort the frame walking for the current offset ("break" out of the "while" loop), and the outer loop will try the next offset.
01-11-2019

Three tests fail due to this bug: serviceability/sa/ClhsdbJstackXcompStress.java serviceability/sa/TestJhsdbJstackMixed.java jdk/sun/tools/jhsdb/HeapDumpTestWithActiveProcess.java
29-10-2019