JDK-8334866 : Improve Speed of ElfDecoder source search
  • Type: Enhancement
  • Component: hotspot
  • Sub-Component: runtime
  • Priority: P4
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2024-06-24
  • Updated: 2025-11-05
  • Resolved: 2025-11-05
JDK 26
26 | master | Fixed
Description
ElfDecoder source search (DWARF scanning) is very slow. My preliminary perf analysis suggests we spend ~40% of the time in file reads.

Source search is used during call stack printing when we crash and write the hs-err file, and for NMT detail reports. Speed is especially important in the former case: the VM must finish writing the error log quickly so that the customer can restart the Java service. That is why we limit error log printing time (ErrorLogTimeout).

The task is to analyze performance in more depth and to check whether we can improve it (e.g. by caching somewhere).
Comments
Changeset: dddfcd03
Branch: master
Author: Kerem Kat <krk@openjdk.org>
Committer: Aleksey Shipilev <shade@openjdk.org>
Date: 2025-11-05 08:33:14 +0000
URL: https://git.openjdk.org/jdk/commit/dddfcd03aa30514d63eceff707d48bff35e93c56
05-11-2025

A pull request was submitted for review.
Branch: master
URL: https://git.openjdk.org/jdk/pull/27337
Date: 2025-09-17 10:14:09 +0000
17-09-2025

I have profiled ElfDecoder by introducing Decoder::get_source_info calls in JfrStackTrace::record_inner and enabling JFR, to increase method call counts. The output of get_source_info is ignored in the benchmark.

The top contender on the critical path is DwarfFile::DebugAranges::read_address_descriptors, as it scans the entire .debug_aranges section linearly until it finds the .debug_info offset corresponding to its offset_in_library argument.

To resolve this, the proposed change caches the .debug_aranges section in a sorted array and uses a binary search for lookups. In my tests, I observed a ~3000x speedup, from ~1ms to ~300ns, in resolving a single address to a debug_info offset. This completely removes the find_compilation_unit_offset function from the critical path for resolving line numbers for a given address. As an example, the cache for libjvm.so would contain ~50K entries and occupy ~1200 KB in a release build.

We could take alternative approaches too. Supporting the .gdb_index section would offer fast, mmap-able lookups, but it is a non-standard DWARF section, increases binary sizes, and is not enabled by default. Similarly, the newer .debug_names section in DWARF 5 would still require preprocessing.

After this bottleneck is resolved, the next area for optimization appears to be DwarfFile::LineNumberProgram, which emulates the DWARF virtual machine to calculate line numbers.
16-09-2025
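
For illustration, the sorted-array-plus-binary-search idea described in the comment above can be sketched as follows. This is not the code from the changeset: ArangeEntry, ArangesCache, and their members are hypothetical names, std::vector stands in for whatever container HotSpot actually uses, and the sketch assumes the address ranges do not overlap.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical cache entry: one address range taken from .debug_aranges,
// mapped to the .debug_info offset of the compilation unit that owns it.
struct ArangeEntry {
  uint64_t start;              // first address covered (library-relative)
  uint64_t end;                // one past the last address covered
  uint64_t debug_info_offset;  // offset of the owning unit in .debug_info
};

// Hypothetical cache: filled once per library, then queried many times.
class ArangesCache {
 public:
  // Record one address descriptor while scanning the section once.
  void add(uint64_t start, uint64_t length, uint64_t debug_info_offset) {
    assert(length > 0);
    _entries.push_back({start, start + length, debug_info_offset});
  }

  // Sort by start address so that lookups can use binary search.
  void finish() {
    std::sort(_entries.begin(), _entries.end(),
              [](const ArangeEntry& a, const ArangeEntry& b) {
                return a.start < b.start;
              });
  }

  // O(log n) lookup. Returns true and sets *offset if some cached range
  // contains addr. Assumes finish() was called and ranges do not overlap.
  bool find(uint64_t addr, uint64_t* offset) const {
    // First entry whose start is strictly greater than addr.
    auto it = std::upper_bound(
        _entries.begin(), _entries.end(), addr,
        [](uint64_t a, const ArangeEntry& e) { return a < e.start; });
    if (it == _entries.begin()) {
      return false;  // addr lies below every cached range
    }
    --it;  // last entry with start <= addr: the only possible match
    if (addr >= it->end) {
      return false;  // addr falls into a gap between ranges
    }
    *offset = it->debug_info_offset;
    return true;
  }

 private:
  std::vector<ArangeEntry> _entries;
};
```

The one-time cost is a full scan plus a sort per library; every later lookup touches only about log2(n) entries, instead of re-reading the whole section from the file.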