JDK-8046246 : the constantPoolCacheOopDesc::adjust_method_entries() used in RedefineClasses does not scale
  • Type: Bug
  • Component: hotspot
  • Sub-Component: jvmti
  • Affected Version: 7,8u60,9
  • Priority: P4
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2014-06-06
  • Updated: 2015-10-08
  • Resolved: 2015-02-25
JDK 8: 8u60 (Fixed)
JDK 9: 9 b55 (Fixed)
Description
The issue was reported by HP.

Please see the email from support engineer Kevin Brown:

Hi Serguei,

I have an issue from our source licensee HP, and while it's not reproducible on Solaris/Linux I'm trying to advise HP on how to further debug it on HPUX/PaRISC. 

Vladimir indicated that the method in question, "adjust_cpool_cache_and_vtable()", is a JVMTI method, and I hear you are an expert in this area.  Could you review the data below and let me know if you have any advice?

Thanks,
Kevin

-------- Original Message --------
Subject: 	Re: HP's high CPU utilization investigation...
Date: 	Thu, 08 May 2014 17:52:59 -0700
From: 	Vladimir Kozlov <vladimir.kozlov@oracle.com>
To: 	Kevin Brown <Kevin.L.Brown@oracle.com>


Hi Kevin,

The bug I mentioned to you:

https://bugs.openjdk.java.net/browse/JDK-8041984

but it looks like it is not related to your problem.

> Question: By design, is re-compilation(i.e. compilation while reloading
> instrumented classes) so very CPU intensive (~100%) Or is it expected
> for large no. of classes ?

When a class is redefined we throw away (deoptimize) all related compiled 
methods. If the new methods are hot they will be recompiled after being 
executed in the interpreter.

The method adjust_cpool_cache_and_vtable() is not JIT compiler related. 
This method is in jvmti code. Ask serviceability group about it.

That code was significantly changed since jdk7.

Regards,
Vladimir

On 5/8/14 5:20 PM, Kevin Brown wrote:
> Hi Vladimir,
>
> Thanks for the quick chat and willingness to look at this.  I was hoping
> you might have some insight or could at least better direct me on
> investigating a high CPU usage issue that HP is reporting.    Their
> customer reported the issue initially with Weblogic on HPUX, and HP's
> testcase without Weblogic (attached) successfully reproduces the problem.
>
> *But* it is only reproducible on HPUX.  When I run the test on Solaris
> or Linux, it does not show the same high CPU usage (it never gets above
> 12-13%).  I cannot formally file a bug since we don't see it on our
> platforms, but HP is hoping we can provide guidance.
>
> Do you have any suggestions on further debugging, VM options to tune
> with, or areas to look into?
>
> Here is my full analysis thus far...
>
> Problem statement:
> 100% CPU utilization is seen when the number of classes that need to be
> compiled is on the order of a thousand or more. Below are the numbers
> that could be of interest to you when tested on HPUX.
>
> o   High CPU time: 8-10 mins
> o   No. of loaded classes : ~1600
> o   No. of redefined classes:  443
> o   Calls to evaluate_operation(): 658
>
> What HP did on HP-UX:
> HP could produce the above numbers through dynamically instrumenting an
> in-house application that they generally use for testing.  (the attached
> testcase)
>
> JLE testing on Solaris and Linux:
> I (Kevin) was unable to reproduce the results on Solaris or Linux.  I
> tested with JDK 7u51 on Linux x64, connecting JConsole to the badapp.jar
> file.  I do not see similar results; after 30 minutes of testing, the
> CPU usage never goes over 12.9%.
>
> I suggested that HP try setting -XX:CompileThreshold to a lower value,
> but they claim this only delays the high CPU usage, since compilation
> eventually happens and CPU usage goes to 100%.
>
> JVM call hierarchy:
> Here are the functions called from the JVM;
> adjust_cpool_cache_and_vtable() is the busiest method, consuming the
> most CPU.
>
> VMThread::run()
> ...
> VMThread::loop()
> ...
> evaluate_operations()
> ...
> evaluate()
> ...
> doit()
> ...
> redefine_single_Class()
> ...
> classes_do()
> ...
> adjust_cpool_cache_and_vtable()
> ...
>
> Question: By design, is re-compilation(i.e. compilation while reloading
> instrumented classes) so very CPU intensive (~100%) Or is it expected
> for large no. of classes ?
>
> Thanks,
> Kevin

As we can see, HP reported a performance problem in VM_RedefineClasses::adjust_cpool_cache_and_vtable().
In addition, the performance profiling information (see the attached file HP-high_cpu_caliper-report.zip received from HP)
shows that much of the overhead is in constantPoolCacheOopDesc::adjust_method_entries().

Look at the following fragment:
Function Summary
---------------------------------------------------------------------------------------------------------
% Total IP   Cumulative   IP        Secs in   Call     Msecs
Samples      % of Total   Samples   Func      Count    per Call   Function  File
---------------------------------------------------------------------------------------------------------
19.52        19.52        65739     20.59     62723    0.33       libjvm.so::VM_RedefineClasses::adjust_cpool_cache_and_vtable(klassOopDesc*,oopDesc*,Thread*)  jvmtiRedefineClasses.cpp
15.21        34.73        51230     16.05     27027    0.59       libjvm.so::RelocIterator::initialize(nmethod*,unsigned char*,unsigned char*)  relocInfo.cpp
14.52        49.25        48907     15.32     44635    0.34       libjvm.so::constantPoolCacheOopDesc::adjust_method_entries(methodOopDesc**,methodOopDesc**,int,bool*)  cpCacheOop.cpp


Comments
Kevin, sorry for the late reply. Yes, I plan to backport this fix to 8u60 soon. And it could be backported to the 7u as well.
13-03-2015

Serguei, Is this fix something that could be backported to JDK 7 and/or 8? Thanks, Kevin
03-03-2015

This is the suggested fix:

Open hotspot webrev: http://cr.openjdk.java.net/~sspitsyn/webrevs/2015/hotspot/8046246-JVMTI-redefscale.2/
Open jdk (new unit test) webrev: http://cr.openjdk.java.net/~sspitsyn/webrevs/2015/jdk/8046246-JVMTI-manymethods.1/

The new algorithm has complexity O(M) instead of the original O(M^2), where M is the number of methods in the class.

The new test (see the webrev above) was used to measure the CPU time consumed by ConstantPoolCache::adjust_method_entries() with both the original and the new approach. The performance numbers are:

---------------------------------------------------------------------------------------------
Methods:   1,000                 10,000                      20,000
---------------------------------------------------------------------------------------------
Orig:      600,000 nsec (1x)     60,500,000 nsec (~100x)     243,000,000 nsec (~400x)
New:       16,000 nsec (1x)      178,000 nsec (~10x)         355,000 nsec (~20x)
---------------------------------------------------------------------------------------------
24-02-2015
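As a rough cross-check of the O(M) versus O(M^2) claim, the growth exponent k in "time ~ M^k" can be estimated from any two rows of the measurements above. The helper below is only an illustrative sketch; the name growth_exponent is hypothetical and is not part of the fix or of the unit test.

```cpp
#include <cassert>
#include <cmath>

// Estimates k in "time ~ size^k" from two (size, time) measurements:
//   k = log(t2 / t1) / log(n2 / n1)
// Hypothetical helper for illustration only.
double growth_exponent(double n1, double t1, double n2, double t2) {
    assert(n1 > 0 && n2 > 0 && t1 > 0 && t2 > 0 && n1 != n2);
    return std::log(t2 / t1) / std::log(n2 / n1);
}
```

With the reported numbers for 1,000 and 10,000 methods, the original code gives an exponent of roughly 2.0 (600,000 nsec to 60,500,000 nsec) and the new code roughly 1.05 (16,000 nsec to 178,000 nsec), which is consistent with quadratic and linear behavior respectively.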

The fix is out for a public review.
19-02-2015

There are two more places in klassVtable.cpp where the algorithm used is O(methods_number^2):

    void klassVtable::adjust_method_entries(Method** old_methods, Method** new_methods,
                                            int methods_length, bool * trace_name_printed) {
      // search the vtable for uses of either obsolete or EMCP methods
      for (int j = 0; j < methods_length; j++) {
        Method* old_method = old_methods[j];
        Method* new_method = new_methods[j];

        // In the vast majority of cases we could get the vtable index
        // by using:  old_method->vtable_index()
        // However, there are rare cases, eg. sun.awt.X11.XDecoratedPeer.getX()
        // in sun.awt.X11.XFramePeer where methods occur more than once in the
        // vtable, so, alas, we must do an exhaustive search.
        for (int index = 0; index < length(); index++) {
          if (unchecked_method_at(index) == old_method) {
            put_method_at(new_method, index);
            // For default methods, need to update the _default_methods array
            // which can only have one method entry for a given signature
            bool updated_default = false;
            if (old_method->is_default_method()) {
              updated_default = adjust_default_method(index, old_method, new_method);
            }
            . . .
          }
        }
      }
    }
    . . .
    void klassItable::adjust_method_entries(Method** old_methods, Method** new_methods,
                                            int methods_length, bool * trace_name_printed) {
      // search the itable for uses of either obsolete or EMCP methods
      for (int j = 0; j < methods_length; j++) {
        Method* old_method = old_methods[j];
        Method* new_method = new_methods[j];
        itableMethodEntry* ime = method_entry(0);

        // The itable can describe more than one interface and the same
        // method signature can be specified by more than one interface.
        // This means we have to do an exhaustive search to find all the
        // old_method references.
        for (int i = 0; i < _size_method_table; i++) {
          if (ime->method() == old_method) {
            ime->initialize(new_method);
            . . .
          }
          ime++;
        }
      }
    }
02-02-2015

This is the current implementation:

    // RedefineClasses() API support:
    // If any entry of this ConstantPoolCache points to any of
    // old_methods, replace it with the corresponding new_method.
    void ConstantPoolCache::adjust_method_entries(Method** old_methods, Method** new_methods,
                                                  int methods_length, bool * trace_name_printed) {

      if (methods_length == 0) {
        // nothing to do if there are no methods
        return;
      }

      // get shorthand for the interesting class
      Klass* old_holder = old_methods[0]->method_holder();

      for (int i = 0; i < length(); i++) {
        if (!entry_at(i)->is_interesting_method_entry(old_holder)) {
          // skip uninteresting methods
          continue;
        }

        // The ConstantPoolCache contains entries for several different
        // things, but we only care about methods. In fact, we only care
        // about methods in the same class as the one that contains the
        // old_methods. At this point, we have an interesting entry.

        for (int j = 0; j < methods_length; j++) {
          Method* old_method = old_methods[j];
          Method* new_method = new_methods[j];

          if (entry_at(i)->adjust_method_entry(old_method, new_method, trace_name_printed)) {
            // current old_method matched this entry and we updated it so
            // break out and get to the next interesting entry if there is one
            break;
          }
        }
      }
    }
06-06-2014
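The nested loops in the implementation above compare each interesting cache entry against every old method, so the work grows with entries times methods. The following standalone sketch contrasts that shape with a single pass that uses a prebuilt old-to-new lookup. It is not HotSpot code and is not necessarily how the actual fix works: the Method struct and the adjust_quadratic and adjust_linear functions are hypothetical stand-ins, and the hash map is an assumption chosen only to illustrate one way to get linear behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a HotSpot Method; only pointer identity matters.
struct Method { int id; };

// Same shape as the nested loops above: for every entry, scan all old methods.
// Cost is O(entries * methods). Returns the number of entries rewritten.
std::size_t adjust_quadratic(std::vector<const Method*>& entries,
                             const std::vector<const Method*>& old_methods,
                             const std::vector<const Method*>& new_methods) {
    assert(old_methods.size() == new_methods.size());
    std::size_t updated = 0;
    for (auto& entry : entries) {
        for (std::size_t j = 0; j < old_methods.size(); ++j) {
            if (entry == old_methods[j]) {
                entry = new_methods[j];
                ++updated;
                break;  // an entry refers to at most one method
            }
        }
    }
    return updated;
}

// Illustrative alternative: build an old -> new lookup once, then visit each
// entry once. Expected cost is O(entries + methods).
std::size_t adjust_linear(std::vector<const Method*>& entries,
                          const std::vector<const Method*>& old_methods,
                          const std::vector<const Method*>& new_methods) {
    assert(old_methods.size() == new_methods.size());
    std::unordered_map<const Method*, const Method*> replacement;
    replacement.reserve(old_methods.size());
    for (std::size_t j = 0; j < old_methods.size(); ++j) {
        replacement.emplace(old_methods[j], new_methods[j]);
    }
    std::size_t updated = 0;
    for (auto& entry : entries) {
        auto it = replacement.find(entry);
        if (it != replacement.end()) {
            entry = it->second;
            ++updated;
        }
    }
    return updated;
}
```

Both functions leave the entries in the same final state. The single-pass version also rewrites an old method that appears in several slots, which is the situation the vtable and itable comments describe as requiring an exhaustive search.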