JDK-8245264 : Test runtime/cds/appcds/SignedJar.java fails
  • Type: Bug
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 15
  • Priority: P2
  • Status: Closed
  • Resolution: Fixed
  • Submitted: 2020-05-19
  • Updated: 2024-10-03
  • Resolved: 2020-05-28
JDK 15: 15 b25 (Fixed)
Sub Tasks
JDK-8245280 :  
Description
----------System.err:(34/2227)----------
 stdout: [[0.004s][info][class,load] opened: /opt/mach5/mesos/work_dir/jib-master/install/jdk-15+24-1145/linux-x64-debug.jdk/jdk-15/fastdebug/lib/modules
[0.050s][info][class,load] java.lang.Object source: shared objects file
[0.050s][info][class,load] java.io.Serializable source: shared objects file
[0.050s][info][class,load] java.lang.Comparable source: shared objects file
[0.050s][info][class,load] java.lang.CharSequence source: shared objects file
[0.050s][info][class,load] java.lang.constant.Constable source: shared objects file
[0.050s][info][class,load] java.lang.constant.ConstantDesc source: shared objects file
[0.051s][info][class,load] java.lang.String source: shared objects file
[0.051s][info][class,load] java.lang.reflect.AnnotatedElement source: shared objects file
[0.051s][info][class,load] java.lang.reflect.GenericDeclaration source: shared objects file
[0.051s][info][class,load] java.lang.reflect.Type source: shared objects file
[0.051s][info][class,load] java.lang.invoke.TypeDescriptor source: shared objects file
[0.051s][info][class,load] java.lang.invoke.TypeDescriptor$OfField source: shared objects file
[0.051s][info][class,load] java.lang.Class source: shared objects file
Error occurred during initialization of VM
java/lang/NoSuchMethodError: Method 'int java.lang.Object.hashCode()' name or signature does not match
];
 stderr: []
 exitValue = 1

The failure appeared with the changes for JDK-8242151 and JDK-8245151, both related to signing and jars, and the test uses a signed jar. Nothing obvious, though. I suspect an error of some kind hidden during jar creation.
Comments
URL: https://hg.openjdk.java.net/jdk/jdk/rev/8ada30d6eae7 User: minqi Date: 2020-05-27 23:04:44 +0000
28-05-2020

I am surprised that we never see the same problem with heapDump.cpp. If the heap is dumped while the symbol table is being resized, the SymbolTableDumper will produce an empty output. This will make the heap dump invalid, because we will only have the address of all the symbols, but will not have the characters for the symbols. http://hg.openjdk.java.net/jdk/jdk/file/ca1687338afe/src/hotspot/share/services/heapDumper.cpp#l1835
26-05-2020

If we don't care about the concurrent work on SymbolTable or StringTable, we can use _local_table as dynamicDump does and still collect the symbols. Another version is simpler:
--- a/src/hotspot/share/memory/metaspaceShared.cpp Thu May 21 15:56:27 2020 -0700
+++ b/src/hotspot/share/memory/metaspaceShared.cpp Fri May 22 18:19:46 2020 -0700
@@ -1150,12 +1150,14 @@
   bool allow_nested_vm_operations() const { return true; }
 }; // class VM_PopulateDumpSharedSpace
-class SortedSymbolClosure: public SymbolClosure {
+class SortedSymbolClosure: public UniqueMetaspaceClosure {
   GrowableArray<Symbol*> _symbols;
-  virtual void do_symbol(Symbol** sym) {
-    assert((*sym)->is_permanent(), "archived symbols must be permanent");
-    _symbols.append(*sym);
+  virtual bool do_unique_ref(Ref* ref, bool read_only) {
+    assert(ref->msotype() == MetaspaceObj::SymbolType, "must be");
+    _symbols.append((Symbol*)ref->obj());
+    return true;
   }
+
   static int compare_symbols_by_address(Symbol** a, Symbol** b) {
     if (a[0] < b[0]) {
       return -1;
@@ -1168,7 +1170,7 @@
 public:
   SortedSymbolClosure() {
-    SymbolTable::symbols_do(this);
+    SymbolTable::metaspace_pointers_do(this);
     _symbols.sort(compare_symbols_by_address);
   }
   GrowableArray<Symbol*>* get_sorted_symbols() {
23-05-2020

[~dholmes] I agree that this may not be the ideal fix, but it should work around the potential build problem. For a more permanent fix, we can avoid walking the symbol table during static dump time (dynamic dump does not walk the symbol table):
+ when creating extra symbols from SharedArchiveConfigFile, record each such symbol in a global array
+ at the beginning of dumping, use GatherKlassesAndSymbols (from dynamicArchive.cpp) to discover all symbols that are reachable from the classes that will be archived.
This will collect all the symbols that we need, without walking the symbol table. This has the extra advantage of not archiving temporary symbols that are not needed by the archive. I will file a separate RFE.
22-05-2020
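
The reachability-based collection proposed above can be illustrated with a small standalone sketch. It is not HotSpot code: ToySymbol, ToyMethodRef, ToyKlass and gather_symbols are hypothetical names, and the real GatherKlassesAndSymbols in dynamicArchive.cpp works differently in detail. The sketch only shows the idea of starting from the classes to be archived plus a recorded list of extra symbols, so that no table walk is needed and unreferenced temporary symbols are left out.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-ins for VM metadata; not HotSpot types.
struct ToySymbol { std::string text; };
struct ToyMethodRef { const ToySymbol* name; const ToySymbol* signature; };
struct ToyKlass { const ToySymbol* name; std::vector<ToyMethodRef> methods; };

// Collects every symbol reachable from the given classes, plus the
// explicitly recorded extra symbols. The result is unique and ordered by
// address, because std::set orders pointers with std::less.
std::vector<const ToySymbol*> gather_symbols(
    const std::vector<ToyKlass>& klasses,
    const std::vector<const ToySymbol*>& extra_symbols) {
  std::set<const ToySymbol*> seen;
  for (const ToySymbol* s : extra_symbols) {
    seen.insert(s);
  }
  for (const ToyKlass& k : klasses) {
    seen.insert(k.name);
    for (const ToyMethodRef& m : k.methods) {
      seen.insert(m.name);
      seen.insert(m.signature);
    }
  }
  return std::vector<const ToySymbol*>(seen.begin(), seen.end());
}
```

A symbol that exists in the pool but is referenced by no class and is not in the extra list never appears in the result, which corresponds to the "not archiving temporary symbols" advantage mentioned in the comment.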

[~iklam] indeed your patch did clear/resolve the build issue I was seeing, thanks
22-05-2020

Yes I understood what Ioi's patch does. I'm questioning if that is the best way to try and fix this.
22-05-2020

If the jar signing tool change caused the problem, the problem may have existed before without being triggered. Ioi's patch does not change the code that processes StringTable and SymbolTable. It only checks, at the very beginning of VM_PopulateDumpSharedSpace::doit (already at a safepoint), whether work on StringTable or SymbolTable is in progress. If it is, this VM operation is aborted and retried in a loop until the work has finished.
22-05-2020
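
The check-abort-retry shape described above can be sketched in isolation. This is an illustration under assumptions, not the actual patch: ToyTableState, try_dump_once and dump_with_retry are hypothetical names, and the let_others_run callback merely models leaving the safepoint so that the background table work can complete.

```cpp
#include <atomic>
#include <cassert>
#include <functional>

// Hypothetical stand-in for "concurrent table work is in progress".
struct ToyTableState {
  std::atomic<bool> work_in_progress{false};
};

// One attempt at the dump operation: bail out before touching anything
// if table maintenance is still underway.
bool try_dump_once(const ToyTableState& state,
                   const std::function<void()>& dump) {
  if (state.work_in_progress.load()) {
    return false;  // abort this attempt; the table may not be walkable
  }
  dump();
  return true;
}

// Retries until an attempt succeeds or max_attempts is exhausted.
// Returns the number of attempts used, or -1 if none succeeded.
int dump_with_retry(const ToyTableState& state,
                    const std::function<void()>& dump,
                    const std::function<void()>& let_others_run,
                    int max_attempts) {
  assert(max_attempts > 0);
  for (int attempt = 1; attempt <= max_attempts; ++attempt) {
    if (try_dump_once(state, dump)) {
      return attempt;
    }
    let_others_run();  // give the background task a chance to finish
  }
  return -1;
}
```

The important property is that the dump body never runs while the flag is set, so it cannot observe a half-processed table. The cost is that the operation may be attempted several times.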

If the issue is that the symbol table is not walkable because the ServiceThread has entered the safepoint whilst in the middle of processing the symbol table, then can we not change the symbol table processing code so that we don't enter a safepoint if we are dumping?
22-05-2020

[~lcable] could you try this and see if it can fix your build issue? http://cr.openjdk.java.net/~iklam/jdk15/tmp-8245264-prototype/ Thanks!
22-05-2020

Upgraded to P2 since this can cause (intermittent?) build failures on macos. I checked our CI and haven't seen such failures in the main pipeline, but it might have happened in other pipelines or personal builds.
22-05-2020

I have seen this same problem while building the JDK on my dev machine (MacOS 10.14.6) using 'make images' with a "std" 'configure' using the macosx-x86_64-server-release configuration. The problem is not resolved by 'make clean', nor by 'make dist-clean', nor by rm-ing build/*, reconfiguring and rebuilding. The error always re-appears, although no other build errors are indicated:
Compiling 1 files for CLASSLIST_JAR
Creating support/classlist.jar
ERROR: Failed to generate link optimization data. This is likely a problem with the newly built JVM/JDK.
Error occurred during initialization of VM
java/lang/NoSuchMethodError: Method 'int java.lang.Object.hashCode()' name or signature does not match
make[3]: *** [/Users/lpgc/src/jdk/build/macosx-x86_64-server-release/support/link_opt/classlist] Error 1
make[2]: *** [generate-link-opt-data] Error 2
22-05-2020

This has also been seen during a JDK build:
Compiling 1 files for CLASSLIST_JAR
Creating support/classlist.jar
ERROR: Failed to generate link optimization data. This is likely a problem with the newly built JVM/JDK.
Error occurred during initialization of VM
java/lang/NoSuchMethodError: Method 'int java.lang.Object.hashCode()' name or signature does not match
make[3]: *** [/Users/lpgc/src/jdk/build/macosx-x86_64-server-release/support/link_opt/classlist] Error 1
22-05-2020

Somewhat related bug that involves CDS and StringTable: JDK-8213574 Deadlock in string table expansion when dumping lots of CDS classes. Both StringTable and SymbolTable are based on ConcurrentHashTable.
21-05-2020

This problem is triggered by the SignedJar.java test because the cryptography code used in code signing causes more activity on the symbol table, leading to a table resize that happens to be unfinished when the CDS dumping code enters the safepoint. Once the safepoint is entered, the concurrent symbol table work (in a ServiceThread) is suspended. As a result, the symbol table is not walkable.
21-05-2020
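
The "not walkable" condition above can be reproduced in miniature with a scan that only runs when no resize is in progress. The sketch below is a toy model under assumptions: ToyTable and collect_ignoring_failure are hypothetical names, and a single atomic flag stands in for the table's real locking. It shows how a caller that ignores the scan's return value ends up with an empty collection and no error.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical toy table; not HotSpot's ConcurrentHashTable.
class ToyTable {
 public:
  void add(const std::string& s) { _entries.push_back(s); }

  // Models a background task that starts a resize and is then suspended
  // (for example by a safepoint) before it can finish.
  void begin_resize() { _resize_in_progress.store(true); }
  void end_resize() { _resize_in_progress.store(false); }

  // Visits every entry only if no resize is in progress.
  // Returns false, visiting nothing, when the table is busy.
  bool try_scan(const std::function<void(const std::string&)>& visit) {
    bool expected = false;
    if (!_resize_in_progress.compare_exchange_strong(expected, true)) {
      return false;
    }
    for (const std::string& e : _entries) {
      visit(e);
    }
    _resize_in_progress.store(false);
    return true;
  }

 private:
  std::atomic<bool> _resize_in_progress{false};
  std::vector<std::string> _entries;
};

// Collects entries but ignores the result of try_scan, so a busy table
// silently yields an empty list.
std::vector<std::string> collect_ignoring_failure(ToyTable& table) {
  std::vector<std::string> out;
  table.try_scan([&out](const std::string& s) { out.push_back(s); });
  return out;
}
```

With a resize pending, collect_ignoring_failure returns an empty vector. Once end_resize is called, the same call returns every entry. A caller that checked the boolean result could retry or fail loudly instead.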

According to Calvin's investigation, the bug happens because we failed to populate SortedSymbolClosure::_symbols. http://hg.openjdk.java.net/jdk/jdk/file/6d2c3c2fcb43/src/hotspot/share/memory/metaspaceShared.cpp#l1153 As a result, the methods in archived classes are no longer sorted properly, leading to runtime failures.
=============================
// Call function for all symbols in the symbol table.
void SymbolTable::symbols_do(SymbolClosure *cl) {
  // all symbols from shared table
  SharedSymbolIterator iter(cl);
  _shared_table.iterate(&iter);
  _dynamic_shared_table.iterate(&iter);

  // all symbols from the dynamic table
  SymbolsDo sd(cl);
  if (!_local_table->try_scan(Thread::current(), sd)) {
    log_info(symboltable)("symbols_do unavailable at this moment");
  }
}

virtual void do_symbol(Symbol** sym) {
  assert((*sym)->is_permanent(), "archived symbols must be permanent");
  _symbols.append(*sym);
}

Below is the try_scan:

inline bool ConcurrentHashTable<CONFIG, F>::
  try_scan(Thread* thread, SCAN_FUNC& scan_f)
{
  if (!try_resize_lock(thread)) {
    return false;
  }
  do_scan_locked(thread, scan_f);
  unlock_resize_lock(thread);
  return true;
}

For the signed jar case, it returns false because the _resize_lock_owner was held by another thread. So in try_resize_lock, it returns false:

inline bool ConcurrentHashTable<CONFIG, F>::
  try_resize_lock(Thread* locker)
{
  if (_resize_lock->try_lock()) {
    if (_resize_lock_owner != NULL) {
      assert(locker != _resize_lock_owner, "Already own lock");
      // We got mutex but internal state is locked.
      _resize_lock->unlock();
      return false; <<<< return false here
    }
  } else {
    return false;
  }
  _invisible_epoch = 0;
  _resize_lock_owner = locker;
  return true;
}

The _resize_lock_owner is a ServiceThread (Thread #12 below) for cleaning up the SymbolTable? I guess some GC activity occurred before the dumping started? Thread #8 below is the VM thread for archive dumping but it started later and thus saw the _resize_lock_owner is non-NULL in try_resize_lock.

Thread #12 [Service Thread] 46050 [core: 5] (Suspended : Breakpoint)
  ConcurrentHashTable<SymbolTableConfig, (MemoryType)10>::try_resize_lock at concurrentHashTable.inline.hpp:306 0x7ffff673d942
  ConcurrentHashTable<SymbolTableConfig, (MemoryType)10>::BulkDeleteTask::prepare at concurrentHashTableTasks.inline.hpp:130 0x7ffff673cfb6
  SymbolTable::clean_dead_entries() at symbolTable.cpp:721 0x7ffff673a0f1
  SymbolTable::do_concurrent_work() at symbolTable.cpp:767 0x7ffff673a35a
  ServiceThread::service_thread_entry() at serviceThread.cpp:157 0x7ffff668775a
  JavaThread::thread_main_inner() at thread.cpp:1,969 0x7ffff6792f42
  JavaThread::run() at thread.cpp:1,952 0x7ffff6792df3
  Thread::call_run() at thread.cpp:399 0x7ffff678f07e
  thread_native_entry() at os_linux.cpp:791 0x7ffff65a4bca
  start_thread() at 0x7ffff79b0dd5
  <...more frames...>

Thread #8 [VM Thread] 46046 [core: 0] (Suspended : Breakpoint)
  ConcurrentHashTable<SymbolTableConfig, (MemoryType)10>::try_resize_lock at concurrentHashTable.inline.hpp:306 0x7ffff673d942
  ConcurrentHashTable<SymbolTableConfig, (MemoryType)10>::try_scan<SymbolsDo> at concurrentHashTable.inline.hpp:1,083 0x7ffff673c1bd
  SymbolTable::symbols_do() at symbolTable.cpp:279 0x7ffff6738efe
  SortedSymbolClosure::SortedSymbolClosure() at metaspaceShared.cpp:1,171 0x7ffff64ff596
  ArchiveCompactor::copy_and_compact() at metaspaceShared.cpp:1,334 0x7ffff64ffe9d
  VM_PopulateDumpSharedSpace::doit() at metaspaceShared.cpp:1,605 0x7ffff64fac1a
  VM_Operation::evaluate() at vmOperations.cpp:67 0x7ffff6819fb2
  VMThread::evaluate_operation() at vmThread.cpp:374 0x7ffff6852d24
  VMThread::loop() at vmThread.cpp:512 0x7ffff68532eb
  VMThread::run() at vmThread.cpp:273 0x7ffff685289e
  <...more frames...>
21-05-2020
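
Why improperly sorted methods surface as a NoSuchMethodError can be illustrated with a toy lookup. The sketch assumes, as the comment above implies, that method lookup relies on the methods being ordered by the address of their name symbol. ToyMethod, find_method and relocate are hypothetical names, and the addresses are made up. If symbols are relocated in a way that does not preserve their relative order and the array is not re-sorted, a binary search can miss a method that is present.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a method entry keyed by its name symbol address.
struct ToyMethod {
  std::string name;
  std::uintptr_t name_addr;
};

// Binary search by name address. Only correct if methods is sorted by
// name_addr in ascending order.
const ToyMethod* find_method(const std::vector<ToyMethod>& methods,
                             std::uintptr_t addr) {
  std::size_t lo = 0;
  std::size_t hi = methods.size();
  while (lo < hi) {
    std::size_t mid = lo + (hi - lo) / 2;
    if (methods[mid].name_addr == addr) {
      return &methods[mid];
    }
    if (methods[mid].name_addr < addr) {
      lo = mid + 1;
    } else {
      hi = mid;
    }
  }
  return nullptr;
}

// Rewrites each name address through a relocation map without re-sorting,
// modelling symbols being copied to new locations in an archive.
void relocate(std::vector<ToyMethod>& methods,
              const std::map<std::uintptr_t, std::uintptr_t>& new_addr) {
  for (ToyMethod& m : methods) {
    m.name_addr = new_addr.at(m.name_addr);
  }
}
```

An order-preserving relocation (0x100 to 0x1000, 0x200 to 0x2000, 0x300 to 0x3000) keeps every method findable. An order-breaking one (0x100 to 0x3000, 0x200 to 0x1000, 0x300 to 0x2000) leaves the array unsorted, and the search for the first method's new address returns nullptr even though the entry exists.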

ILW = MHM = P3
19-05-2020

The problem relates to the shared archive file created by the test. The jar files by themselves seem fine. But using the jsa file leads to the failure. With logging enabled all we see is:
[0.394s][debug][cds,mirror ] [Ljava.lang.Class; has raw archived mirror
[0.394s][info ][class,load ] java.lang.Class source: shared objects file
[0.394s][debug][class,load ] klass: 0x000000080000bfd8 super: 0x0000000800007420 interfaces: 0x0000000800007cb8 0x000000080000d358 0x000000080000d9a8 0x000000080000c2f0 0x000000080000dea0 0x000000080000e4c0 loader: [loader data: 0x00007f86d41fdfb0 of 'bootstrap']
Error occurred during initialization of VM
java/lang/NoSuchMethodError: Method 'int java.lang.Object.hashCode()' name or signature does not match
I have to assume the archive is corrupted somehow.
19-05-2020

I've confirmed this is caused by JDK-8242151.
19-05-2020