JDK-6338927 : Redefine crash on lazy NetBeans instrumentation -- upto 4096 redefines on a single class
  • Type: Bug
  • Component: hotspot
  • Sub-Component: jvmti
  • Affected Version: 5.0,5.0u4
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • OS: solaris_9,windows_xp
  • CPU: x86,sparc
  • Submitted: 2005-10-19
  • Updated: 2011-02-21
  • Resolved: 2006-01-19
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
Other JDK 6
5.0u8Fixed 6 b67Fixed
Related Reports
Duplicate :  
Relates :  
Relates :  
Relates :  
Relates :  
Description
Crash of VM when using NetBeans profiler in RedefineClasses on repeated redefine of a single class on lazy instrumentation of a stress test (attached) which causes many redefines of the same class. hs_err attached.

In NetBeans make a project "bamm" add the attached "Main" and "c1" build and then profile / Analyze Performance / part of application / selecting "bamm.Main.main" as root.

Comments
EVALUATION An assertion was added about not observing obsolete methods in the previous version info that should only be saving weak references to EMCP methods. That assertion fired during testing and revealed that we were accidently saving weak refs to obsolete methods that could be seen during breakpoint processing. Once we stopped saving weak refs to obsolete methods, that exposed a bug where breakpoints could be set in previously EMCP methods that were made obsolete. This whole area of logic definitely had pre-rewrite flavor since it depended on us saving information about every generation of redefine. Additional logic needs to be added to flush previously EMCP method refs when a method is made obsolete.
04-01-2006

SUGGESTED FIX Yet one more regression with breakpoints showed up. After this regression was fixed the following test suites were run: JDI_REGRESSION, nsk.jdi, nsk.jvmti See the attached 6338927-webrev-cr5-full.tgz file for the full proposed fix after changes from the fifth (and hopefully final) round of code review. See the attached 6338927-webrev-cr5-delta.tgz file for just the deltas made due to the fourth round of code review.
04-01-2006

SUGGESTED FIX Additional testing revealed the need for yet one more tweak. See the attached 6338927-webrev-cr4-full.tgz file for the full proposed fix after changes from the fourth (and hopefully final) round of code review. See the attached 6338927-webrev-cr4-delta.tgz file for just the deltas made due to the fourth round of code review.
02-01-2006

SUGGESTED FIX See the attached 6338927-webrev-cr3-full.tgz file for the full proposed fix after changes from the third (and final) round of code review. See the attached 6338927-webrev-cr3-delta.tgz file for just the deltas made due to the third round of code review.
23-12-2005

EVALUATION check_methods_and_mark_as_obsolete() was correct in distinguishing between obsolete and EMCP methods. That portion of the initial proposed fix (cr0) and the initial reviewed fix (cr1) is wrong. I restored the original logic and added the following comment: // An EMCP method is _not_ obsolete. An obsolete method has a // different jmethodID than the current method. An EMCP method // has the same jmethodID as the current method. Having the // same jmethodID for all EMCP versions of a method allows for // a consistent view of the EMCP methods regardless of which // EMCP method you happen to have in hand. For example, a // breakpoint set in one EMCP method will work for all EMCP // versions of the method including the current one. This should keep me (or anyone else) from trying to make this optimization again. The proper solution of this portion of the perm-gen leak is for the various adjust_method_* routines to adjust both obsolete and EMCP methods in the constant pool cache, class i-table and class v-table. Testing revealed a random breakpoint failure due to a logic error in JvmtiBreakpoint::each_method_version_do(). The previous version logic still had a "previous instanceKlass" flavor to it and not a PreviousVersionInfo flavor. PreviousVersionInfo only saves EMCP method refs while the original implementation saved all method refs. The errant loop logic was causing a breakpoint to be set (or cleared) in the wrong method. Oops.
23-12-2005

SUGGESTED FIX See the attached 6338927-webrev-cr2-full.tgz file for the full proposed fix after changes from the second round of code review. See the attached 6338927-webrev-cr2-delta.tgz file for just the deltas made due to the second round of code review.
23-12-2005

SUGGESTED FIX See the attached 6338927-webrev-cr1-full.tgz file for the full proposed fix after changes from the initial round of code review. See the attached 6338927-webrev-cr1-delta.tgz file for just the deltas made due to the initial round of code review.
20-12-2005

SUGGESTED FIX See the attached 6338927-webrev-cr0.tgz file for the proposed fix.
15-12-2005

EVALUATION The ConstantPoolCacheKlass and ConstantPoolKlass objects are pinned in memory because of obsolete methods that were pinned in memory. check_methods_and_mark_as_obsolete() is not marking EMCP methods as obsolete and obsolete methods were being left in the oop map cache. I also noticed a problem where the EMCP (and obsolete) method weak refs array was always being allocated. Since we create a bitmap of the EMCP methods we can also count up the number of EMCP methods at the same time and allocate just the size weak refs array that we need. No more, no less. This saves space in the C/C++ heap and not in the perm-gen, but as long as I'm in here I should fix that too. I tried to use the TraceJVMTI option and found that it failed assertions due to the forced GC's that I was doing. I tracked down that problem and fixed it. This is _not_ the same problem as 6205548; I took a quick look at that problem and didn't see anything obvious. I tweaked alot of the RC_TRACE() code while I was hunting this bug. I've sifted the changes down to what I think would be helpful in future bug hunts. These changes should be preserved at the same time.
14-12-2005

EVALUATION This test program shows that we have pinned down ~28MB in ConstantPoolCacheKlass and ~21MB in ConstantPoolKlass before the VM crashes. I've done some initial debugging and the reason for the pinning of these objects is not clear. I temporarily modified the test program to include bail-out logic when we reach a specific point in the call graph. This allowed the test class to "finish" with the perm gen just about full. I added a System.gc() call to main() along with a very long sleep so that I could gcore the VM after we were done with the bammo class. All of the previously pinned objects appeared to still be pinned. I understand why the crash occurs: we have run out of perm gen. I don't yet understand why ConstantPoolCacheKlass and ConstantPoolKlass objects are pinned in memory. The next experiment is to add more tracing to Dictionary::do_unloading() to find out why things are not being GC'ed.
17-11-2005