JDK-8313814 : Test serviceability/sa/ClhsdbSymbol.java timed out after passing
  • Type: Bug
  • Component: hotspot
  • Sub-Component: svc-agent
  • Affected Version: 22
  • Priority: P4
  • Status: Open
  • Resolution: Unresolved
  • OS: windows
  • Submitted: 2023-08-06
  • Updated: 2024-09-22
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
Other
tbdUnresolved
Related Reports
Relates :  
Relates :  
Relates :  
Description
hsdb> + quit

 LingeredApp stdout: [[0.039s][info][cds] trying to map c:\\ade\\mesos\\work_dir\\jib-master\\install\\jdk-22+10-658\\windows-x64-debug.jdk\\jdk-22\\fastdebug\\bin\\server\\classes.jsa
[0.039s][info][cds] Opened archive c:\\ade\\mesos\\work_dir\\jib-master\\install\\jdk-22+10-658\\windows-x64-debug.jdk\\jdk-22\\fastdebug\\bin\\server\\classes.jsa.
[0.039s][info][cds] Archive was created with UseCompressedOops = 1, UseCompressedClassPointers = 1
[0.039s][info][cds] full module graph: disabled because archive was created without full module graph
[0.039s][info][cds] Core region alignment: 65536
[0.039s][info][cds] Reserved archive_space_rs [0x0000000800000000 - 0x0000000800d10000] (13697024) bytes
[0.039s][info][cds] Reserved class_space_rs   [0x0000000801000000 - 0x0000000841000000] (1073741824) bytes
[0.039s][info][cds] Windows mmap workaround: releasing archive space.
[0.039s][info][cds] Mapped static  region #0 at base 0x0000000800000000 top 0x0000000800500000 (ReadWrite)
[0.039s][info][cds] Mapped static  region #1 at base 0x0000000800500000 top 0x0000000800d10000 (ReadOnly)
[0.040s][info][cds] optimized module handling: enabled
[0.040s][info][cds] full module graph: disabled
];
 LingeredApp stderr: []
 LingeredApp exitValue = 0
Test PASSED
----------System.err:(1/16)----------
STATUS:Passed.
----------rerun:(42/6336)*----------
...
result: Error. Program `c:\ade\mesos\work_dir\jib-master\install\jdk-22+10-658\windows-x64-debug.jdk\jdk-22\fastdebug\bin\java' timed out (timeout set to 480000ms, elapsed time including timeout handling was 720894ms).

Seems like a jtreg issue, but could be an issue with the processes created by the test.
Comments
Looks like this is the same issue as JDK-8285729 "serviceability/sa/ClhsdbThreadContext.java timed out". In both cases we are getting stuck for a long period of time in a native Windows API called by WindbgDebuggerLocal.lookupByName0(). We'll keep both issues open for easier bookkeeping for the two different tests that are failing.
27-07-2024

I just saw a very similar failure with serviceability/sa/ClhsdbThreadContext.java. It passed, but timed out. In the log you see: Starting clhsdb against 2772 [2024-05-07T07:50:07.538661800Z] Gathering output for process 7960 Timeout signalled after 480 seconds [2024-05-07T08:03:52.808095200Z] Waiting for completion for process 7960 [2024-05-07T08:03:52.809119200Z] Waiting for completion finished for process 7960 Output: Attaching to process 2772, please wait... hsdb> hsdb> + verbose true So it spent over 12 minutes waiting for the clhsdb command to complete, which it eventually did (and did so without error). The similarity to serviceability/sa/ClhsdbSymbol.java this CR was originally filed for is that I see that identical WindbgDebuggerLocal.lookupByName0() stack dump that was made when the timeout was triggered. One difference however is that all 6 stack dumps look the same, whereas for the previous serviceability/sa/ClhsdbSymbol.java failure only the first stack dump looked like this, and subsequent stack dumps said the process had exited.
07-05-2024

Early on in the test output we see: Starting clhsdb against 12436 [2024-05-03T11:55:12.387242200Z] Gathering output for process 6560 Timeout signalled after 480 seconds [2024-05-03T12:05:49.908662800Z] Waiting for completion for process 6560 [2024-05-03T12:05:49.908662800Z] Waiting for completion finished for process 6560 Output: Attaching to process 12436, please wait... hsdb> hsdb> + verbose true hsdb> + inspect 0x000002324a1a2f20 <snip inspect output> hsdb> + quit Note the timeout message. So it timed out during the inspect command, but the command eventually completed properly, as did the test, and the test passed. A thread dump (produced when the test timed out) for process 6560 shows: "jdk.hotspot.agent\/sun.jvm.hotspot.debugger.windbg.WindbgDebuggerLocal.lookupByName0(Native Method)", "jdk.hotspot.agent\/sun.jvm.hotspot.debugger.windbg.WindbgDebuggerLocal.lookupByName(WindbgDebuggerLocal.java:425)", "jdk.hotspot.agent\/sun.jvm.hotspot.debugger.windbg.WindbgDebuggerLocal.lookup(WindbgDebuggerLocal.java:219)", "jdk.hotspot.agent\/sun.jvm.hotspot.types.basic.BasicVtblAccess.getVtblForType(BasicVtblAccess.java:63)", "jdk.hotspot.agent\/sun.jvm.hotspot.types.basic.BasicTypeDataBase.vtblForType(BasicTypeDataBase.java:164)", "jdk.hotspot.agent\/sun.jvm.hotspot.types.basic.BasicTypeDataBase.addressTypeIsEqualToType(BasicTypeDataBase.java:180)", "jdk.hotspot.agent\/sun.jvm.hotspot.types.basic.BasicTypeDataBase.guessTypeForAddress(BasicTypeDataBase.java:273)", "jdk.hotspot.agent\/sun.jvm.hotspot.CommandProcessor$25.doit(CommandProcessor.java:1119)", "jdk.hotspot.agent\/sun.jvm.hotspot.CommandProcessor.executeCommand(CommandProcessor.java:2230)", "jdk.hotspot.agent\/sun.jvm.hotspot.CommandProcessor.executeCommand(CommandProcessor.java:2200)", "jdk.hotspot.agent\/sun.jvm.hotspot.CommandProcessor.run(CommandProcessor.java:2071)", "jdk.hotspot.agent\/sun.jvm.hotspot.CLHSDB.run(CLHSDB.java:112)", "jdk.hotspot.agent\/sun.jvm.hotspot.CLHSDB.main(CLHSDB.java:44)", "jdk.hotspot.agent\/sun.jvm.hotspot.SALauncher.runCLHSDB(SALauncher.java:281)", "jdk.hotspot.agent\/sun.jvm.hotspot.SALauncher.main(SALauncher.java:500)" This is an expected stack for the inspect command, but it is stuck (temporarily) in WindbgDebuggerLocal.lookupByName0(). The native stack is: 1 Id: 19a0.449c Suspend: 1 Teb: 0000007a`8e187000 Unfrozen # Child-SP RetAddr Call Site 00 0000007a`8e3f9ac0 00007ff8`61c7fcec DBGHELP!SymLoadModuleEx+0x4206 01 0000007a`8e3f9b10 00007ff8`61c7fc60 DBGHELP!SymLoadModuleEx+0x4e7c 02 0000007a`8e3f9b40 00007ff8`61cf0457 DBGHELP!SymLoadModuleEx+0x4df0 03 0000007a`8e3f9fb0 00007ff8`61ca0f21 DBGHELP!SymLoadModuleEx+0x755e7 04 0000007a`8e3f9fe0 00007ff8`61d8073e DBGHELP!SymLoadModuleEx+0x260b1 05 0000007a`8e3fa020 00007ff8`61c798b1 DBGHELP!SetCheckUserInterruptShared+0x6f7e 06 0000007a`8e3fb0b0 00007ff8`61daabc7 DBGHELP!SymFromName+0xad1 07 0000007a`8e3fb0f0 00007ff8`61d9dcc8 DBGHELP!SymFindFileInPathW+0x3b87 08 0000007a`8e3fb130 00007ff8`61d9cdd8 DBGHELP!SymGetFileLineOffsets64+0x4538 09 0000007a`8e3fb1d0 00007ff8`61da0d05 DBGHELP!SymGetFileLineOffsets64+0x3648 0a 0000007a`8e3fd1d0 00007ff8`586232ed DBGHELP!SymEnumSymbolsExW+0x55 0b 0000007a`8e3fd230 00007ff8`58623161 dbgeng+0x3032ed 0c 0000007a`8e3fd3c0 00007ff8`58361f45 dbgeng+0x303161 0d 0000007a`8e3fe0f0 00007ff8`58362332 dbgeng+0x41f45 0e 0000007a`8e3fe4f0 00007ff8`5857877a dbgeng+0x42332 0f 0000007a`8e3fe730 00007ff8`58578687 dbgeng+0x25877a *** WARNING: Unable to verify checksum for C:\ade\mesos\work_dir\jib-master\install\jdk-23+22-1763\windows-x64-debug.jdk\jdk-23\fastdebug\bin\saproc.dll 10 0000007a`8e3fe780 00007ff8`5f28339d dbgeng+0x258687 11 0000007a`8e3fea10 00000215`4745a76b saproc!Java_sun_jvm_hotspot_debugger_windbg_WindbgDebuggerLocal_lookupByName0( struct JNIEnv_ * env = 0x00000215`372e1fc0, class _jobject * obj = <Value unavailable error>, class _jstring * objName = 0x0000007a`8e3feaa0, class _jstring * sym = 0x0000007a`8e3fea70)+0xad 12 0000007a`8e3fea40 00000000`00000000 0x00000215`4745a76b So SA has made it into the Windows lib for debugger support, and it appears stuck there...for a while. This is the first iteration of an attempted 6 stack dumps. What's interesting is that starting with the 2nd attempt all you see in the output is: Cannot debug pid 6560, NTSTATUS 0xC000010A "An attempt was made to access an exiting process." So it seems that producing the first stack dump unstuck the process.
03-05-2024

JDK-8310860 is another "passed but timed out" failure, but the failure message is different, so I don't think it is timing out for the same reason: result: Error. Agent error: java.lang.Exception: Agent 19 timed out with a timeout of 960 seconds; check console log for any additional details
21-08-2023

We may already have multiple svc bugs filed with a similar "Test PASSED" issue, but at the moment I can't find any. If I do I'll link them.
07-08-2023