JDK-8175249 : VMThread::run fails in VerifyBeforeExit : Universe::verify
  • Type: Bug
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 9,11
  • Priority: P4
  • Status: Resolved
  • Resolution: Fixed
  • OS: generic
  • CPU: generic
  • Submitted: 2017-02-20
  • Updated: 2019-06-20
  • Resolved: 2018-01-24
JDK 11
11 b01 Fixed
Description
#0  0xf7705430 in __kernel_vsyscall ()
#1  0x4479d871 in raise () from /lib/libc.so.6
#2  0x4479f14a in abort () from /lib/libc.so.6
#3  0xf6dc2e31 in os::abort(bool, void*, void const*) () at /scratch/workspace/9-2-build-linux-i586-phase2/jdk9/6099/hotspot/src/os/linux/vm/os_linux.cpp:1378
#4  0xf70f06e6 in VMError::report_and_die(int, char const*, char const*, char*, Thread*, unsigned char*, void*, void*, char const*, int, unsigned int) ()
    at /scratch/workspace/9-2-build-linux-i586-phase2/jdk9/6099/hotspot/src/share/vm/utilities/vmError.cpp:1359
#5  0xf70f114f in VMError::report_and_die(Thread*, unsigned int, unsigned char*, void*, void*, char const*, ...) ()
    at /scratch/workspace/9-2-build-linux-i586-phase2/jdk9/6099/hotspot/src/share/vm/utilities/vmError.cpp:1103
#6  0xf70f1199 in VMError::report_and_die(Thread*, unsigned int, unsigned char*, void*, void*) ()
    at /scratch/workspace/9-2-build-linux-i586-phase2/jdk9/6099/hotspot/src/share/vm/utilities/vmError.cpp:1109
#7  0xf6dd25ed in JVM_handle_linux_signal () at /scratch/workspace/9-2-build-linux-i586-phase2/jdk9/6099/hotspot/src/os_cpu/linux_x86/vm/os_linux_x86.cpp:610
#8  0xf6dbf62e in signalHandler(int, siginfo*, void*) () at /scratch/workspace/9-2-build-linux-i586-phase2/jdk9/6099/hotspot/src/os/linux/vm/os_linux.cpp:4229
#9  <signal handler called>
#10 oopDesc::verify() () at /scratch/workspace/9-2-build-linux-i586-phase2/jdk9/6099/hotspot/src/share/vm/oops/oop.cpp:88
#11 0xf66c4934 in Dictionary::verify() () at /scratch/workspace/9-2-build-linux-i586-phase2/jdk9/6099/hotspot/src/share/vm/classfile/dictionary.hpp:321
#12 0xf7005225 in SystemDictionary::verify() ()
    at /scratch/workspace/9-2-build-linux-i586-phase2/jdk9/6099/hotspot/src/share/vm/classfile/systemDictionary.cpp:2912
#13 0xf708d91e in Universe::verify(VerifyOption, char const*) ()
    at /scratch/workspace/9-2-build-linux-i586-phase2/jdk9/6099/hotspot/src/share/vm/memory/universe.cpp:1229
#14 0xf712fcdd in VMThread::run() () at /scratch/workspace/9-2-build-linux-i586-phase2/jdk9/6099/hotspot/src/share/vm/memory/universe.hpp:505
#15 0xf6dce754 in thread_native_entry(Thread*) () at /scratch/workspace/9-2-build-linux-i586-phase2/jdk9/6099/hotspot/src/os/linux/vm/os_linux.cpp:679
#16 0x44912bc9 in start_thread () from /lib/libpthread.so.0
#17 0x44855c9e in clone () from /lib/libc.so.6



The problem is observed on multicore machines.

How to reproduce:
Test JDK: fastdebug 9-ea+157, mixed mode, tiered, serial gc, linux-x86
Test base: langtools/test
Machine type: I got failures on 32-core and 72-core machines; I am not sure about other configurations.
$JAVA_HOME/bin/java -jar $JTREG/lib/jtreg.jar -verbose  -testjdk $JAVA_HOME -dir:$TESTBASE  -vmoption:-Xmx768m -retain:fail,error  -xml:verify -agentvm  -concurrency:6 -timeoutFactor:5 -vmoption:-XX:+UseSerialGC :tier1

Comments
ILW: Impact=H (can cause a crash due to a stale oop), Likelihood=L (only really crashes during verification), Workaround=H (no workaround). P3?
22-01-2018

For clarification of what _pd_set is, see http://hg.openjdk.java.net/jdk/hs/file/423faefc77df/src/hotspot/share/classfile/dictionary.hpp#l129

  // Contains the set of approved protection domains that can access
  // this dictionary entry.
  //
  // This protection domain set is a set of tuples:
  //
  // (InstanceKlass C, initiating class loader ICL, Protection Domain PD)
  //
  // [Note that C.protection_domain(), which is stored in the java.lang.Class
  // mirror of C, is NOT the same as PD]
  //
  // If such an entry (C, ICL, PD) exists in the table, it means that
  // it is okay for a class Foo to reference C, where
  //
  //    Foo.protection_domain() == PD, and
  //    Foo's defining class loader == ICL
  //
  // The usage of the PD set can be seen in SystemDictionary::validate_protection_domain()
  // It is essentially a cache to avoid repeated Java up-calls to
  // ClassLoader.checkPackageAccess().
  //
  ProtectionDomainEntry* volatile _pd_set;
18-01-2018
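
The caching idea described in that source comment can be sketched in standalone C++. This is only an illustrative toy model, not HotSpot code: `ApprovedDomainCache`, the integer ID types, and the `expensive_check` callback (standing in for the Java up-call to ClassLoader.checkPackageAccess()) are all hypothetical names introduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <set>
#include <tuple>
#include <utility>

// Toy stand-ins for VM types (assumptions, not HotSpot types).
using KlassId  = int;  // models InstanceKlass C
using LoaderId = int;  // models the initiating class loader ICL
using DomainId = int;  // models the protection domain PD

// Caches approved (C, ICL, PD) tuples so the expensive check runs
// at most once per approved tuple.
class ApprovedDomainCache {
public:
    using Check = std::function<bool(KlassId, LoaderId, DomainId)>;

    explicit ApprovedDomainCache(Check expensive_check)
        : check_(std::move(expensive_check)) {
        assert(check_ && "checker must be callable");
    }

    // Returns true if a class with domain pd, defined by loader icl,
    // may reference class c.
    bool validate(KlassId c, LoaderId icl, DomainId pd) {
        const auto key = std::make_tuple(c, icl, pd);
        if (approved_.count(key) != 0) {
            return true;                // cache hit: no up-call needed
        }
        ++upcalls_;                     // cache miss: do the expensive check
        if (!check_(c, icl, pd)) {
            return false;               // denials are not cached in this sketch
        }
        approved_.insert(key);
        return true;
    }

    std::size_t upcalls() const { return upcalls_; }

private:
    Check check_;
    std::set<std::tuple<KlassId, LoaderId, DomainId>> approved_;
    std::size_t upcalls_ = 0;
};
```

Calling `validate` twice with the same approved tuple performs the expensive check only once; a denied tuple is re-checked on every call.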

Okay, the pd-set is a cache of PD's used by loading classes, so that we can short-circuit the need to do the Java level access check each time a load request is made. The cache should not keep strong references to the cached PD's and so the cache logic must be prepared to find a "null" entry.
18-01-2018
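
A minimal sketch of the behaviour proposed in that comment, a cache that does not keep its entries alive and whose lookups tolerate cleared entries. It uses `std::weak_ptr` as a stand-in for a weak oop reference; `WeakDomainCache` and the toy `ProtectionDomain` struct are hypothetical and do not reflect how HotSpot implements or fixed the pd_set.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Toy protection domain (assumption: identity is all that matters here).
struct ProtectionDomain {
    std::string name;
};

// A cache that holds only weak references, so it never keeps a
// ProtectionDomain alive on its own.
class WeakDomainCache {
public:
    void add(const std::shared_ptr<ProtectionDomain>& pd) {
        entries_.push_back(pd);
    }

    // Lookup must be prepared to find entries whose referent is gone.
    bool contains(const ProtectionDomain* pd) const {
        assert(pd != nullptr);
        for (const auto& weak : entries_) {
            std::shared_ptr<ProtectionDomain> live = weak.lock();
            if (!live) {
                continue;               // referent was reclaimed: skip, do not dereference
            }
            if (live.get() == pd) {
                return true;
            }
        }
        return false;
    }

    // Drops cleared entries; returns how many were removed.
    std::size_t purge() {
        const std::size_t before = entries_.size();
        entries_.erase(
            std::remove_if(entries_.begin(), entries_.end(),
                           [](const std::weak_ptr<ProtectionDomain>& w) {
                               return w.expired();
                           }),
            entries_.end());
        return before - entries_.size();
    }

    std::size_t size() const { return entries_.size(); }

private:
    std::vector<std::weak_ptr<ProtectionDomain>> entries_;
};
```

The point of the sketch is that any walk over the cache checks each entry for liveness before touching it.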

When loading a class, the current class has a ProtectionDomain associated with the current ClassLoader. Why would that PD be associated with the class being loaded through a different loader (and potentially with a different PD)? In any case, unless there is a complex set of classloaders involved, a PD associated with the calling class's ClassLoader should remain reachable. ??
18-01-2018

This macro does it: _protection_domain is an injected field in java.lang.Class

  // Interface to java.lang.Class objects

  #define CLASS_INJECTED_FIELDS(macro)                                       \
    macro(java_lang_Class, klass,                  intptr_signature,  false) \
    macro(java_lang_Class, array_klass,            intptr_signature,  false) \
    macro(java_lang_Class, oop_size,               int_signature,     false) \
    macro(java_lang_Class, static_oop_field_count, int_signature,     false) \
    macro(java_lang_Class, protection_domain,      object_signature,  false) \
    macro(java_lang_Class, signers,                object_signature,  false)

So the stale protection domain reference is not from the class_loader->class->protection_domain, it's in the dictionary:

  class_loader -> dictionary -> class, pd_set

The pd_set list of protection domains are the ones used to load the class but not the one from the class itself. There doesn't seem to be a reference from class_loader to the protection domain used to load the class that we put in the set.
17-01-2018

There should be a strong reference chain ClassLoader -> Class -> ProtectionDomain. But I'm confused: Class.getProtectionDomain() calls JVM_GetProtectionDomain() which does:

  java_lang_Class::protection_domain(JNIHandles::resolve(cls));

where:

  oop java_lang_Class::protection_domain(oop java_class) {
    assert(_protection_domain_offset != 0, "must be set");
    return java_class->obj_field(_protection_domain_offset);
  }

but there is no "protection domain" field in java.lang.Class, and nowhere do I see _protection_domain_offset being initialized ??
17-01-2018

The system dictionary has entries like:

  class loader, klass, pd_set

where pd_set is a linked list pointing to protection_domain oops that are used during class loading.

There have been multiple changes to the structure of the system dictionary and the protection domain set, but the same applies from jdk7 and before. Going back to jdk7 (and before), the system dictionary entries are weak roots during full collections with class unloading. Nothing walks the pd_set to make sure that the oops it points to are either kept alive, if appropriate, or cleaned out of the pd_set. There can be unmarked oops in the pd_set. Verification is crashing on a stale oop.

@ioi I thought we believed that protection_domain oops in the pd_set are references kept live from one of the classes that loaded the target class, but this is not the case.
17-01-2018
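
The missing step described in that comment, walking the pd_set after marking and unlinking entries whose oop was not marked, can be illustrated with a toy model. This is a sketch under stated assumptions, not the change that resolved this bug: `ToyOop`, `PdEntry`, `clean_pd_set` and `count_entries` are hypothetical names, and the "oop" here is just a mark bit that stays readable until the walk has finished.

```cpp
#include <cassert>
#include <cstddef>

// Toy heap object: only the GC mark bit is modelled (assumption).
struct ToyOop {
    bool marked;
};

// Toy linked-list node, loosely modelled on a pd_set entry.
struct PdEntry {
    ToyOop*  pd;
    PdEntry* next;
};

// Walks the list after marking and unlinks every entry whose object was
// not marked, so a later verification pass never sees a stale pointer.
// Must run before the memory of unmarked objects is reclaimed.
// Returns the number of entries removed.
std::size_t clean_pd_set(PdEntry** head) {
    assert(head != nullptr);
    std::size_t removed = 0;
    PdEntry** link = head;
    while (*link != nullptr) {
        PdEntry* entry = *link;
        if (!entry->pd->marked) {
            *link = entry->next;    // unlink the dead entry
            delete entry;
            ++removed;
        } else {
            link = &entry->next;    // keep the live entry, advance
        }
    }
    return removed;
}

// Counts the entries currently linked into the list.
std::size_t count_entries(const PdEntry* head) {
    std::size_t n = 0;
    for (const PdEntry* e = head; e != nullptr; e = e->next) {
        ++n;
    }
    return n;
}
```

Without such a walk, the list keeps pointers to objects the collector has already discarded, which matches the situation the backtrace shows in oopDesc::verify().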

"make run-test-tier1" was the command line I was looking for. I see that the crash is in the agent but the classpaths and maybe even classes are missing from the hs_err command line. Better to start with what you were doing. Thanks.
17-01-2018

I believe this is the actual command-line in the hs_err file:

Command Line: -Xmx512m -XX:MaxRAMPercentage=1 -ea -esa --patch-module=java.base=/export/users/dh198349/jdk-master/build/linux-x64-debug/test-support/jtreg_open_test_langtools_tier1/patches/java.base -Djava.security.policy=file:/export/users/dh198349/jdk-master/build/linux-x64-debug/test-support/jtreg_open_test_langtools_tier1/jtreg.policy com.sun.javatest.regtest.agent.AgentServer -allowSetSecurityManager -port 35254

The crash occurred trying to run the jtreg agent VM, not running any specific test. I cannot determine where the testing actually failed. The overall test run was done via "make run-test-tier1".
17-01-2018

[~dholmes] can you send me the command line for this failure (not what's in hs_err which is missing stuff)?
17-01-2018

I've re-opened this as I've seen a very similar failure when testing JDK 11. This occurs running the jtreg agent.

#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00002acc3a628763, pid=13926, tid=13977
#
# JRE version: Java(TM) SE Runtime Environment (11.0) (fastdebug build 11-internal+0-2017-12-21-2109022.daholme.jdk-master)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 11-internal+0-2017-12-21-2109022.daholme.jdk-master, mixed mode, tiered, compressed oops, g1 gc, linux-amd64)
# Problematic frame:
# V [libjvm.so+0x148d763] oopDesc::verify()+0x33
#
# Core dump will be written. Default location: Core dumps may be processed with "/usr/share/apport/apport %p %s %c" (or dumping to /export/users/dh198349/jdk-master/build/linux-x64-debug/test-support/jtreg_open_test_langtools_tier1/scratch/11/core.13926)
#
# If you would like to submit a bug report, please visit:
# http://bugreport.java.com/bugreport/crash.jsp
#

--------------- S U M M A R Y ------------

Command Line: -Xmx512m -XX:MaxRAMPercentage=1 -ea -esa --patch-module=java.base=/export/users/dh198349/jdk-master/build/linux-x64-debug/test-support/jtreg_open_test_langtools_tier1/patches/java.base -Djava.security.policy=file:/export/users/dh198349/jdk-master/build/linux-x64-debug/test-support/jtreg_open_test_langtools_tier1/jtreg.policy com.sun.javatest.regtest.agent.AgentServer -allowSetSecurityManager -port 35254

Host: Intel(R) Xeon(R) CPU X5675 @ 3.07GHz, 24 cores, 141G, Ubuntu 12.04 LTS
Time: Sat Jan 13 22:29:27 2018 EST elapsed time: 1420 seconds (0d 0h 23m 40s)

--------------- T H R E A D ---------------

Current thread (0x00002acc3c384000): VMThread "VM Thread" [stack: 0x00002acd54a97000,0x00002acd54b97000] [id=13977]

Stack: [0x00002acd54a97000,0x00002acd54b97000], sp=0x00002acd54b95920, free space=1018k
Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.so+0x148d763] oopDesc::verify()+0x33
V [libjvm.so+0xc2d3f9] DictionaryEntry::verify()+0xd9
V [libjvm.so+0xe4c6b3] void BasicHashtable<(MemoryType)1>::verify_table<DictionaryEntry>(char const*)+0x63
V [libjvm.so+0xc2d9e8] Dictionary::verify()+0x108
V [libjvm.so+0x9de8d6] ClassLoaderDataGraph::verify_dictionary()+0x26
V [libjvm.so+0x173824c] SystemDictionary::verify()+0x3c
V [libjvm.so+0x17c6604] Universe::verify(VerifyOption, char const*)+0x254
V [libjvm.so+0x187279c] VMThread::run()+0x1fc
V [libjvm.so+0x14c095a] thread_native_entry(Thread*)+0xfa

siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x00000005d56dd5f0

Register to memory mapping:

RAX=0x00002acc3b1cba20: <offset 0x0000000002030a20> in /export/users/dh198349/jdk-master/build/linux-x64-debug/images/jdk/lib/server/libjvm.so at 0x00002acc3919b000
RBX=0x00000000ed7e9888 is an unallocated location in the heap
RCX=0x0000000000000003 is an unknown value
RDX=0x00002acc3c02ec60 is an unknown value
RSP=0x00002acd54b95920 is an unknown value
RBP=0x00002acd54b95960 is an unknown value
RSI=0x00000000e0000000 is an unknown value
RDI=0x00000000ed7e9888 is an unallocated location in the heap
R8 =0x0000000000000e2d is an unknown value
R9 =0x0000000000000e00 is an unknown value
R10=0x00000000e2d61640 is an oop
java.security.ProtectionDomain
{0x00000000e2d61640} - klass: 'java/security/ProtectionDomain'
 - ---- fields (total size 5 words):
 - private 'hasAllPerm' 'Z' @12 false
 - private final 'staticPermissions' 'Z' @13 false
 - private 'codesource' 'Ljava/security/CodeSource;' @16 a 'java/security/CodeSource'{0x00000000e2d62aa8} (e2d62aa8 e0306200)
 - private 'classloader' 'Ljava/lang/ClassLoader;' @20 a 'jdk/internal/loader/ClassLoaders$AppClassLoader'{0x00000000e0306200} (e0306200 e2d62b08)
 - private 'principals' '[Ljava/security/Principal;' @24 a 'java/security/Principal'[0] {0x00000000e2d62b08} (e2d62b08 e2d62b18)
 - private 'permissions' 'Ljava/security/PermissionCollection;' @28 a 'java/security/Permissions'{0x00000000e2d62b18} (e2d62b18 e2d62d88)
 - final 'key' 'Ljava/security/ProtectionDomain$Key;' @32 a 'java/security/ProtectionDomain$Key'{0x00000000e2d62d88} (e2d62d88 0)
R11=0x00002acc38f4f380: <offset 0x0000000000174380> in /lib/x86_64-linux-gnu/libc.so.6 at 0x00002acc38ddb000
R12=0x00000005d56dd5f0 is an unknown value
R13=0x00000005d56dd5f0 is an unknown value
R14=0x00002acc3c000900 is an unknown value
R15=0x00002acc3c6c8c80 is an unknown value

Registers:
RAX=0x00002acc3b1cba20, RBX=0x00000000ed7e9888, RCX=0x0000000000000003, RDX=0x00002acc3c02ec60
RSP=0x00002acd54b95920, RBP=0x00002acd54b95960, RSI=0x00000000e0000000, RDI=0x00000000ed7e9888
R8 =0x0000000000000e2d, R9 =0x0000000000000e00, R10=0x00000000e2d61640, R11=0x00002acc38f4f380
R12=0x00000005d56dd5f0, R13=0x00000005d56dd5f0, R14=0x00002acc3c000900, R15=0x00002acc3c6c8c80
RIP=0x00002acc3a628763, EFLAGS=0x0000000000010246, CSGSFS=0x0000000000000033, ERR=0x0000000000000004
TRAPNO=0x000000000000000e

Top of Stack: (sp=0x00002acd54b95920)
0x00002acd54b95920: 00000000e2d61640 0000000000000002
0x00002acd54b95930: 00002acc4eaee490 00002acdb84d8570
0x00002acd54b95940: 00002acc3b23e394 00002acd54b95970
0x00002acd54b95950: 00000000000021a0 00002acc3c6c8c80

Instructions: (pc=0x00002acc3a628763)
0x00002acc3a628743: 05 58 9e c3 00 48 85 ff 4c 8b 30 74 4b 48 8d 05
0x00002acc3a628753: a5 5e c1 00 48 89 fb 80 38 00 75 51 4c 8b 6f 08
0x00002acc3a628763: 49 8b 45 00 48 89 5d c0 4c 8d 65 c0 48 8d 1d 1e
0x00002acc3a628773: 5c c1 00 4c 8b b8 e8 03 00 00 80 3b 00 0f 85 aa

--------------- P R O C E S S ---------------

Threads class SMR info:
_java_thread_list=0x00002acc3c3b2c80, length=18, elements={
0x00002acc3c38e800, 0x00002acc3c395000, 0x00002acc3c3b6000, 0x00002acc3c3b8800,
0x00002acc3c3bb000, 0x00002acc3c3c2000, 0x00002acc3c3c4000, 0x00002acc3c3ce800,
0x00002acc3c3d1000, 0x00002acc3c3d3800, 0x00002acc3c3d6000, 0x00002acc3c3d8800,
0x00002acc3c3da800, 0x00002acc3c3dd000, 0x00002acc3c3df800, 0x00002acc3c444800,
0x00002acc3c599800, 0x00002acc3c5dc000
}
_java_thread_list_alloc_cnt=207,_java_thread_list_free_cnt=206,_java_thread_list_max=22, _nested_thread_list_max=1
_tlh_cnt=21617, _tlh_times=1669, avg_tlh_time=0.08, _tlh_time_max=42
_deleted_thread_cnt=93, _deleted_thread_times=7, avg_deleted_thread_time=0.08, _deleted_thread_time_max=7
_delete_lock_wait_cnt=0, _delete_lock_wait_max=0
_to_delete_list_cnt=0, _to_delete_list_max=1
15-01-2018

I reproduced this in 9 (it now times out, so I can no longer reproduce it). I could not reproduce this in 10, but the protection_domain code that the system dictionary points to has been rewritten in 10. Closing as CNR. It is likely a bug in the verification code, so it wouldn't affect customers.
01-11-2017

Steps to reproduce are at the end of the description.
18-08-2017

[~jcm], can you reproduce this issue? If so, can you provide us with a reproducer?
17-08-2017