JDK-8178870 : instrumentation.retransformClasses cause coredump
  • Type: Bug
  • Component: hotspot
  • Sub-Component: jvmti
  • Affected Version: 8,9
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • OS: generic
  • CPU: generic
  • Submitted: 2017-04-17
  • Updated: 2020-05-07
  • Resolved: 2017-10-06
Fixed in:
  • JDK 10: 10 b31
  • JDK 8: 8u212
  • Other: openjdk8u232
Description
FULL PRODUCT VERSION :
build 1.8.0-b112

ADDITIONAL OS VERSION INFORMATION :
Red Hat Enterprise Linux Server release 7.1(Maipo)
Linux 3.10.0-229.el7.x86_64  #1 SMP Thu Jan 29 18:37:38 EST 2015 x86_64 x86_64 GNU/Linux

EXTRA RELEVANT SYSTEM CONFIGURATION :
8G memory
two cores cpu
1G network
-Xmx1g -Xms1g -XX:MaxPermSize=512m

A DESCRIPTION OF THE PROBLEM :
I am implementing a Java agent (APM). The requirement is to enable and disable the monitoring function from the UI. The underlying implementation uses the Instrumentation.retransformClasses API to redefine the monitored classes at runtime.
ASM is used to transform the class bytecode.
Occasionally, a core dump or hs_err_pid file is generated when retransformClasses is invoked. I therefore wrote a program that invokes retransformClasses repeatedly on a large number of loaded classes; after a few hours, the core dump is reproduced.

The core dump occurs even when the classes are retransformed with the original bytecode, unchanged by ASM.

The issue occurs on Oracle JDK 8. It does not occur on Oracle JDK 6 or Oracle JDK 7.



STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
1. Implement a Java agent with a function that retransforms some of the loaded classes at runtime.
2. Retransform the classes without changing the bytecode: in the transformer's transform method, simply return a copy of the original bytecode.
3. Repeat step 2 automatically in a program. The crash appears after a few hours, sometimes half a day, sometimes one or two days.



EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
The retransformation runs without any errors.
ACTUAL -
Either a core dump or an hs_err_pid file is generated.

ERROR MESSAGES/STACK TRACES THAT OCCUR :
hs_err_pid information:

*** Error in `/home/bes/java/jdk1.8.0_92/bin/java': double free or corruption (out): 0x00007f4694193450 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x7d1fd)[0x7f46f42ef1fd]


The coredump information

Program terminated with signal 6, Aborted. 
#0 0x00007fe7e605a5d7 in raise () from /lib64/libc.so.6 
Missing separate debuginfos, use: debuginfo-install glibc-2.17-78.el7.x86_64 libgcc-4.8.3-9.el7.x86_64 
(gdb) bt 
#0 0x00007fe7e605a5d7 in raise () from /lib64/libc.so.6 
#1 0x00007fe7e605bcc8 in abort () from /lib64/libc.so.6 
#2 0x00007fe7e609ae07 in __libc_message () from /lib64/libc.so.6 
#3 0x00007fe7e60a21fd in _int_free () from /lib64/libc.so.6 
#4 0x00007fe7e5999209 in os::free(void*, unsigned short) () from /home/bes/java/jdk1.8.0/jre/lib/amd64/server/libjvm.so 
#5 0x00007fe7e56da5a6 in InstanceKlass::release_C_heap_structures() () from /home/bes/java/jdk1.8.0/jre/lib/amd64/server/libjvm.so 
#6 0x00007fe7e56e0a6a in InstanceKlass::deallocate_contents(ClassLoaderData*) () from /home/bes/java/jdk1.8.0/jre/lib/amd64/server/libjvm.so 
#7 0x00007fe7e552ab2f in ClassLoaderData::free_deallocate_list() () from /home/bes/java/jdk1.8.0/jre/lib/amd64/server/libjvm.so 
#8 0x00007fe7e552b51b in ClassLoaderDataGraph::do_unloading(BoolObjectClosure*) () from /home/bes/java/jdk1.8.0/jre/lib/amd64/server/libjvm.so 
#9 0x00007fe7e5aa8f1b in SystemDictionary::do_unloading(BoolObjectClosure*) () from /home/bes/java/jdk1.8.0/jre/lib/amd64/server/libjvm.so 
#10 0x00007fe7e59fe6b4 in PSParallelCompact::marking_phase(ParCompactionManager*, bool, ParallelOldTracer*) () from /home/bes/java/jdk1.8.0/jre/lib/amd64/server/libjvm.so 
#11 0x00007fe7e5a037c6 in PSParallelCompact::invoke_no_policy(bool) () from /home/bes/java/jdk1.8.0/jre/lib/amd64/server/libjvm.so 
#12 0x00007fe7e5a040b3 in PSParallelCompact::invoke(bool) () from /home/bes/java/jdk1.8.0/jre/lib/amd64/server/libjvm.so 
#13 0x00007fe7e55414c4 in CollectedHeap::collect_as_vm_thread(GCCause::Cause) () from /home/bes/java/jdk1.8.0/jre/lib/amd64/server/libjvm.so 
#14 0x00007fe7e5b322f1 in VM_CollectForMetadataAllocation::doit() () from /home/bes/java/jdk1.8.0/jre/lib/amd64/server/libjvm.so 
#15 0x00007fe7e5b3a1c5 in VM_Operation::evaluate() () from /home/bes/java/jdk1.8.0/jre/lib/amd64/server/libjvm.so 
#16 0x00007fe7e5b385ba in VMThread::evaluate_operation(VM_Operation*) () from /home/bes/java/jdk1.8.0/jre/lib/amd64/server/libjvm.so 
#17 0x00007fe7e5b3897d in VMThread::loop() () from /home/bes/java/jdk1.8.0/jre/lib/amd64/server/libjvm.so 
#18 0x00007fe7e5b38db0 in VMThread::run() () from /home/bes/java/jdk1.8.0/jre/lib/amd64/server/libjvm.so
#19 0x00007fe7e59a2058 in java_start(Thread*) () from /home/bes/java/jdk1.8.0/jre/lib/amd64/server/libjvm.so 
#20 0x00007fe7e6808df5 in start_thread () from /lib64/libpthread.so.0 
#21 0x00007fe7e611b1ad in clone () from /lib64/libc.so.6

REPRODUCIBILITY :
This bug can be reproduced occasionally.

---------- BEGIN SOURCE ----------
package agent.loader;

import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.IllegalClassFormatException;
import java.security.ProtectionDomain;

public class PreClassTransformer implements ClassFileTransformer {
    public PreClassTransformer() {
    	super();
    }

    /**
     * Transform the class at loading stage
     */
    public byte[] transform(ClassLoader loader, String className, Class<?> classBeingRedefined, ProtectionDomain protectionDomain, byte[] classfileBuffer) throws
            IllegalClassFormatException {
        // Return an unmodified copy of the original class bytes.
        byte[] cloneClassBytes = new byte[classfileBuffer.length];
        System.arraycopy(classfileBuffer, 0, cloneClassBytes, 0, classfileBuffer.length);
        return cloneClassBytes;
    }
}


package agent.loader;

import java.lang.instrument.Instrumentation;

public abstract class AbstractClassRetransformer {
    private String name;

    public AbstractClassRetransformer(String name) {
        this.name = name;
    }
    
    public void retransformClass(Instrumentation instrumentation, Class<?> retransformClass) {
        try {
            instrumentation.retransformClasses(retransformClass);
        } catch (Throwable ex) {
            // log the errors
        }
    }       
}

---------- END SOURCE ----------
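
The source above omits the agent entry point and the driver that repeats the retransformation (step 3). A minimal sketch of what such a driver might look like follows. The class name RetransformLoop, its method names, and the thread name are hypothetical and are not taken from the submitter's agent; only the standard java.lang.instrument API is used.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.ArrayList;
import java.util.List;

// Hypothetical driver: registers an identity transformer and retransforms
// every modifiable loaded class over and over.
public final class RetransformLoop {
    private RetransformLoop() {}

    /** Agent entry point; starts the loop on a daemon thread. */
    public static void premain(String agentArgs, Instrumentation inst) {
        ClassFileTransformer identity = new ClassFileTransformer() {
            @Override
            public byte[] transform(ClassLoader loader, String className,
                                    Class<?> classBeingRedefined,
                                    ProtectionDomain protectionDomain,
                                    byte[] classfileBuffer) {
                // Return an unmodified copy of the original bytes.
                return classfileBuffer.clone();
            }
        };
        // The second argument marks the transformer as retransformation-capable.
        inst.addTransformer(identity, true);
        Thread worker = new Thread(() -> run(inst, Integer.MAX_VALUE), "retransform-loop");
        worker.setDaemon(true);
        worker.start();
    }

    /** Returns the loaded classes that the JVM reports as modifiable. */
    public static List<Class<?>> modifiableClasses(Instrumentation inst) {
        List<Class<?>> result = new ArrayList<>();
        for (Class<?> c : inst.getAllLoadedClasses()) {
            if (inst.isModifiableClass(c)) {
                result.add(c);
            }
        }
        return result;
    }

    /** Retransforms each modifiable class once; returns the number of failed calls. */
    public static int retransformOnce(Instrumentation inst) {
        int failures = 0;
        for (Class<?> c : modifiableClasses(inst)) {
            try {
                inst.retransformClasses(c);
            } catch (Throwable t) {
                failures++;
                System.err.println("retransform failed for " + c.getName() + ": " + t);
            }
        }
        return failures;
    }

    /** Repeats the full pass for the given number of rounds. */
    public static int run(Instrumentation inst, int rounds) {
        int failures = 0;
        for (int i = 0; i < rounds; i++) {
            failures += retransformOnce(inst);
        }
        return failures;
    }
}
```

To load this as an agent, the jar manifest would need Premain-Class: RetransformLoop and Can-Retransform-Classes: true, and the JVM would be started with -javaagent pointing at that jar. Because the worker is a daemon thread, the loop only runs for as long as the application itself stays alive.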


Comments
[8u] The missing get_ik() function is added by JDK-8155951, which is also on the 8u backport list. Updated the webrev to reflect the right backport order: http://cr.openjdk.java.net/~zgu/JDK-8178870-8u/webrev.01/
23-08-2019

Fix Request (8u)
I would like to backport this patch to 8u, as it is on Oracle 8u backport list. The original patch does not apply cleanly.
1) Missing get_ik() function in jvmtiRedefineClasses.cpp in 8u code base. I imported get_ik() into 8u code base.
2) _scratch_classes was declared as Klass** in 8u, but InstanceKlass** in 10. Added casting in 8u code.
3) Missing test infrastructure in 8u, converted test to shell script.
8u webrev: http://cr.openjdk.java.net/~zgu/JDK-8178870-8u/webrev.00/
Code review: https://mail.openjdk.java.net/pipermail/jdk8u-dev/2019-August/010063.html (Reviewed)
20-08-2019

[~kevinw] You might want to backport this fix since it was submitted externally.
06-10-2017

It turns out to be another double-free of cached_class_file(). This time the original classfile bytes are installed in the scratch_class, the scratch_class fails to load, and it is put on the deallocate list. The bytes are freed when the entry is freed from the deallocate list, so they must have been freed by multiple classes on the deallocate list.
The fix is: if the class fails to load and doesn't create its own class_file_bytes (it only gets those from the to-be-redefined class), NULL out the field so that release_C_heap_structures() does not deallocate it. If the class fails to load and does create its own class file bytes, we leave the bytes on the InstanceKlass to be deallocated.
If the class succeeds in loading, there are three cases in RedefineClasses::doit(), which is run at a safepoint:
  • parent class file is NULL: transfer cached_class_file() to parent
  • different cached_class_file bytes than parent: deallocate scratch_class version
  • same class file bytes as parent: nothing
All cases NULL out scratch_class->cached_class_file.
I think there is no difference in behavior for CDS because there's a test for whether the cached_class_bytes are shared before deallocation.
05-10-2017
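
The failed-load part of the fix described above can be illustrated with a toy model. This is a sketch in Java under stated assumptions, not HotSpot code: Buffer, Klass, onLoadFailure, and freeDeallocateList are hypothetical stand-ins for the C-heap bytes, InstanceKlass, the load-failure path, and ClassLoaderData::free_deallocate_list(). A second free of the same buffer throws instead of corrupting memory.

```java
import java.util.List;

// Toy model of cached class file ownership; all names are hypothetical.
public final class CachedBytesModel {
    private CachedBytesModel() {}

    /** Stand-in for a C-heap buffer; freeing it twice models the crash. */
    public static final class Buffer {
        private boolean freed;

        public void free() {
            if (freed) {
                throw new IllegalStateException("double free");
            }
            freed = true;
        }

        public boolean isFreed() {
            return freed;
        }
    }

    /** Stand-in for an InstanceKlass that may hold cached class file bytes. */
    public static final class Klass {
        Buffer cachedClassFile;

        public Klass(Buffer cachedClassFile) {
            this.cachedClassFile = cachedClassFile;
        }

        /** Frees the cached bytes if this class still points at them. */
        public void releaseCHeapStructures() {
            if (cachedClassFile != null) {
                cachedClassFile.free();
                cachedClassFile = null;
            }
        }
    }

    /**
     * Models a scratch class that failed to load. With applyFix set, bytes
     * borrowed from the class being redefined are detached before the scratch
     * class is queued, so only bytes the scratch class created itself are freed.
     */
    public static void onLoadFailure(Klass scratch, Klass original,
                                     List<Klass> deallocateList, boolean applyFix) {
        if (applyFix && scratch.cachedClassFile == original.cachedClassFile) {
            scratch.cachedClassFile = null; // the original class keeps ownership
        }
        deallocateList.add(scratch);
    }

    /** Drains the list, as class unloading would during a GC. */
    public static void freeDeallocateList(List<Klass> deallocateList) {
        for (Klass k : deallocateList) {
            k.releaseCHeapStructures();
        }
        deallocateList.clear();
    }
}
```

In this model, two failed scratch classes that both borrow the original's buffer trigger the "double free" exception when the list is drained without the fix. With the fix the buffer is left untouched, and a scratch class that created its own bytes still has them freed exactly once.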

I can reproduce this in jdk 10. It's cached_class_file again.
(gdb) frame 8
#8 0x00007f05391748c9 in InstanceKlass::release_C_heap_structures (this=0x1008f1550) at /scratch/cphillim/hg/10verdict/open/src/hotspot/share/oops/instanceKlass.cpp:2227
2227 os::free(_cached_class_file);
(gdb) list
2222 assert(breakpoints() == 0x0, "should have cleared breakpoints");
2223 }
2224
2225 // deallocate the cached class file
2226 if (_cached_class_file != NULL && !MetaspaceShared::is_in_shared_space(_cached_class_file)) {
2227 os::free(_cached_class_file);
2228 _cached_class_file = NULL;
2229 }
2230 #endif
02-10-2017

Running on fastdebug build of jdk8, generated "fatal error: memory stomping error"
==
## nof_mallocs = 750165, nof_frees = 560731
## memory stomp:
GuardedMemory(0x00007f46d61e8280) base_addr=0x00007f46ee35e160 tag=0x0000000000000000 user_size=0 user_data=0x00007f46ee35e180
Header guard @0x00007f46ee35e160 is BROKEN
Trailer guard @0x00007f46ee35e180 is BROKEN
User data appears to have been freed
# To suppress the following error report, specify this argument
# after -XX: or in .hotspotrc: SuppressErrorAt=/os.cpp:547
#
# A fatal error has been detected by the Java Runtime Environment:
#
# Internal Error (/scratch/fairoz/jdk-8/jdk8u-dev/hotspot/src/share/vm/runtime/os.cpp:547), pid=60108, tid=0x00007f46d61e9700
# fatal error: memory stomping error
#
# JRE version: Java(TM) SE Runtime Environment (8.0) (build 1.8.0-internal-fastdebug-fmatte_2017_05_09_23_02-b00)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.71-b00-fastdebug mixed mode linux-amd64 compressed oops)
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /scratch/fairoz/JI/8178870/coredump/hs_err_pid60108.log
#
# If you would like to submit a bug report, please visit:
# http://bugreport.java.com/bugreport/crash.jsp
#
Current thread is 139942216767232
Dumping core ...
do.sh: line 19: 60108 Aborted
10-05-2017

The issue is reproducible on 8, whereas it passes on 9.
  • 8u71 - Fail
  • 8u121 - Fail
  • 8u131 - Fail
  • 8u152 - Fail
  • 9 ea b167 - Pass
02-05-2017

Received response from submitter
==
Actually, my test case is alike yours, the difference lies that the agent instruments a tomcat server. Several hundreds of classes are retransformed. My case is more complex and related with our products, so I need some time(3~5 days) to refine the case.
==
Requested for core file
==
Could you please share the core file through uploading to dropbox.
==
26-04-2017

It would be great if we could get a core file from them to see what line in release_C_heap_structures is double-freed, then maybe we can find this with inspection.
26-04-2017

Requesting submitter for crash logs and complete test case.
==
The provided test case is not complete and we are unable to execute it. I have created the attached test case, let me know if anything I am missing? If not could you please share crash logs along with the complete test case to reproduce?
==
19-04-2017

The test case provided with the report is not complete. I created the attached test case based on my understanding; it has been running since last night with no crash observed on 8u92. If it does crash on 8u92, I plan to run it on 8u31 and 9 ea b165 as well.
19-04-2017

Reopening to analyze more
18-04-2017

This failure mode: *** Error in `/home/bes/java/jdk1.8.0_92/bin/java': double free or corruption (out): 0x00007f4694193450 *** is not an OutOfMemoryError. I do not think this is a duplicate of JDK-8164921.
18-04-2017

This issue is a duplicate of JDK-8164921, which is fixed in 9 ea b143. Closing as a duplicate of JDK-8164921.
18-04-2017

This is the same issue as JDK-8164921. After executing the test case, I was able to see the core dump generated.
==
[Loaded Tester from __VM_RedefineClasses__]
[Loaded Tester from __VM_RedefineClasses__]
[Loaded Tester from __VM_RedefineClasses__]
[Loaded Tester from __VM_RedefineClasses__]
[Loaded Tester from __VM_RedefineClasses__]
[Loaded Tester from __VM_RedefineClasses__]
[Loaded Tester from __VM_RedefineClasses__]
[Loaded Tester from __VM_RedefineClasses__]
[Loaded Tester from __VM_RedefineClasses__]
[Loaded java.lang.reflect.InvocationTargetException from /net/scanas416.us.oracle.com/export/java_re/jdk/8u121/fcs/b13/binaries/linux-x64/jre/lib/rt.jar]
Exception in thread "main" [Loaded java.lang.Throwable$PrintStreamOrWriter from /net/scanas416.us.oracle.com/export/java_re/jdk/8u121/fcs/b13/binaries/linux-x64/jre/lib/rt.jar]
[Loaded java.lang.Throwable$WrappedPrintStream from /net/scanas416.us.oracle.com/export/java_re/jdk/8u121/fcs/b13/binaries/linux-x64/jre/lib/rt.jar]
[Loaded java.util.IdentityHashMap from /net/scanas416.us.oracle.com/export/java_re/jdk/8u121/fcs/b13/binaries/linux-x64/jre/lib/rt.jar]
[Loaded java.util.IdentityHashMap$KeySet from /net/scanas416.us.oracle.com/export/java_re/jdk/8u121/fcs/b13/binaries/linux-x64/jre/lib/rt.jar]
java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at sun.instrument.InstrumentationImpl.loadClassAndStartAgent(InstrumentationImpl.java:386)
    at sun.instrument.InstrumentationImpl.loadClassAndCallPremain(InstrumentationImpl.java:401)
Caused by: java.lang.OutOfMemoryError
    at sun.instrument.InstrumentationImpl.retransformClasses0(Native Method)
    at sun.instrument.InstrumentationImpl.retransformClasses(InstrumentationImpl.java:144)
    at LoggingAgent.premain(LoggingAgent.java:11)
    ... 6 more
FATAL ERROR in native method: processing of -javaagent failed
doit.sh: line 4: 48723 Aborted (core dumped) java -XX:+TraceClassLoading -XX:+TraceClassUnloading -XX:MetaspaceSize=12m -XX:MaxMetaspaceSize=12m -javaagent:MyAgent.jar Main
==
18-04-2017

The issue is related to retransformClasses; moving to hotspot -> JVMTI.
18-04-2017