JDK-7133038 : G1: Some small profile based optimizations
  • Type: Enhancement
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: 7u4
  • Priority: P4
  • Status: Closed
  • Resolution: Fixed
  • OS: generic
  • CPU: generic
  • Submitted: 2012-01-24
  • Updated: 2013-09-18
  • Resolved: 2012-03-24
Fixed Versions
  • JDK 7: 7u4 (Fixed)
  • JDK 8: 8 (Fixed)
  • Other: hs23 (Fixed)
Description
While looking over some collect/analyze profiles measuring data cache (DC) misses, branches, and branch mispredicts, some "high" metric items were identified in the following routines:

HeapRegion::oops_on_card_seq_iterate_careful()
* High DC misses when attempting to read the klass of the current object in both loops.
* High number of branch mispredicts in the body of the second loop.

instanceKlass::oop_oop_iterate_[*]_nv()
* High number of DC misses while iterating over and de-referencing the reference fields in an object.

G1BlockOffsetArray::forward_to_block_containing_addr_slow()
* High number of DC misses while dereferencing objects during BOT walking.

FilterOutOfRegionClosure::do_oop_nv()
* High number of branches and branch mispredicts.

G1ParCopyHelper::copy_to_survivor_space()
* High number of mispredicts when calculating the object size (coming from size_given_klass).

Proposed changes:

HeapRegion::oops_on_card_seq_iterate_careful()
* High DC misses when attempting to read the klass of the current object in both loops.
  -> Add a prefetch for the next object after we obtain the size of the current
     object. Adding such a prefetch to the second loop looks like the better candidate. I don't
     think that there is enough of a code window between the prefetch in iteration n and the use
     in iteration n+1.

* High number of branch mispredicts in the body of the second loop.
  -> The body of the second loop is made up of a 3-way if-statement. The body of two of the
     clauses is the same. If we make the conditional statement "less" branchy then we should
     be able to reduce this.
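
The two proposals for this routine can be sketched together as follows. This is only an illustration under stated assumptions, not the HotSpot code: the header encoding, `walk_objects`, `ScanCounts`, and the `limit` parameter are hypothetical names invented here, and the counters stand in for the real per-object work. The prefetch is issued as soon as the current object's size is known, and the two clauses with identical bodies are folded into one condition.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical header encoding, invented for this sketch (not HotSpot's layout):
// low 32 bits = object size in words (header included), bit 32 = "is array".
constexpr std::uint64_t kArrayBit = std::uint64_t{1} << 32;

inline std::size_t obj_size(std::uint64_t header) {
  return static_cast<std::size_t>(header & 0xffffffffu);
}

inline bool obj_is_array(std::uint64_t header) {
  return (header & kArrayBit) != 0;
}

// Read prefetch hint; compiles to nothing on compilers without the builtin.
inline void prefetch_for_read(const void* addr) {
#if defined(__GNUC__) || defined(__clang__)
  __builtin_prefetch(addr, 0, 3);
#else
  (void)addr;
#endif
}

struct ScanCounts {
  std::size_t full_scans = 0;     // objects that would be scanned in full
  std::size_t bounded_scans = 0;  // arrays that would be scanned only up to limit
};

// Walks the objects tiling heap[0, n_words). limit is a word index; arrays
// that extend past it get the bounded treatment.
ScanCounts walk_objects(const std::uint64_t* heap, std::size_t n_words,
                        std::size_t limit) {
  ScanCounts counts;
  std::size_t cur = 0;
  while (cur < n_words) {
    const std::uint64_t header = heap[cur];
    const std::size_t size = obj_size(header);
    if (size == 0) break;  // malformed header; avoid looping forever
    const std::size_t next = cur + size;

    // Size is known, so the next header can be requested while the current
    // object is still being processed.
    if (next < n_words) prefetch_for_read(&heap[next]);

    // A three-way form would test "not an array" and "array ending at or
    // before limit" separately, with the same body in both clauses.
    if (!obj_is_array(header) || next <= limit) {
      ++counts.full_scans;
    } else {
      ++counts.bounded_scans;
    }
    cur = next;
  }
  return counts;
}
```

Whether the prefetch pays off depends on how much work sits between the hint and the next header read, which is the code-window concern raised above.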

instanceKlass::oop_oop_iterate_[*]_nv()
* High number of DC misses while iterating over and de-referencing the oop maps associated with reference fields in an object.
  -> Simple. Prefetch the next oop map entry.
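
A minimal sketch of that idea, assuming a simplified map entry made of a field offset and a count. `MapEntrySketch` and `count_non_null_refs` are hypothetical names for illustration and do not reproduce the HotSpot data structures.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical map entry: a run of `count` reference fields starting at
// field index `offset`. Invented for this sketch.
struct MapEntrySketch {
  std::uint32_t offset;
  std::uint32_t count;
};

// Read prefetch hint; compiles to nothing on compilers without the builtin.
inline void prefetch_for_read(const void* addr) {
#if defined(__GNUC__) || defined(__clang__)
  __builtin_prefetch(addr, 0, 3);
#else
  (void)addr;
#endif
}

// Visits every reference field described by the map entries and counts the
// non-null ones; the counting stands in for the real closure work.
std::size_t count_non_null_refs(const void* const* fields,
                                const MapEntrySketch* maps,
                                std::size_t n_maps) {
  std::size_t non_null = 0;
  for (std::size_t i = 0; i < n_maps; ++i) {
    // Request the next map entry before working through the current one.
    if (i + 1 < n_maps) prefetch_for_read(&maps[i + 1]);

    const MapEntrySketch& m = maps[i];
    for (std::uint32_t j = 0; j < m.count; ++j) {
      if (fields[m.offset + j] != nullptr) ++non_null;
    }
  }
  return non_null;
}
```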

G1BlockOffsetArray::forward_to_block_containing_addr_slow()
* High number of DC misses while dereferencing objects during BOT walking.
  -> Adding prefetching to these loops is a little bit more tricky. We can't add a prefetch after we obtain the size of the current block: there is not enough of a code window between the prefetch and the subsequent use. Instead, if we use a fixed prefetch amount and issue the prefetch before reading the block size, then we might get enough of a code window.
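
The fixed-distance variant might look like the sketch below. It assumes a toy layout in which the first word of each block holds the block size in words; `kPrefetchWords` and `block_start_containing` are hypothetical, and the distance of 16 words is an arbitrary placeholder rather than a tuned value.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical fixed prefetch distance in words; a real value would need tuning.
constexpr std::size_t kPrefetchWords = 16;

// Read prefetch hint; compiles to nothing on compilers without the builtin.
inline void prefetch_for_read(const void* addr) {
#if defined(__GNUC__) || defined(__clang__)
  __builtin_prefetch(addr, 0, 3);
#else
  (void)addr;
#endif
}

// Walks forward from a known block start until reaching the block that
// contains word index target. Returns that block's start index, or n_words
// if the walk runs off the end. Assumes start <= target and that start is
// a block boundary.
std::size_t block_start_containing(const std::uint64_t* heap,
                                   std::size_t n_words,
                                   std::size_t start,
                                   std::size_t target) {
  std::size_t cur = start;
  while (cur < n_words) {
    // Issued before the size read, at a fixed distance ahead, so the hint
    // does not depend on the value it is meant to hide the latency of.
    if (cur + kPrefetchWords < n_words) {
      prefetch_for_read(&heap[cur + kPrefetchWords]);
    }

    const std::size_t size = static_cast<std::size_t>(heap[cur]);
    if (size == 0) break;  // malformed block; stop rather than spin
    const std::size_t next = cur + size;
    if (target < next) return cur;
    cur = next;
  }
  return n_words;
}
```

A fixed distance will sometimes land in the middle of a block rather than on the next block's header, so the hint is only a heuristic.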

FilterOutOfRegionClosure::do_oop_nv()
* High number of branches and branch mispredicts.
  -> Most of these come from the concurrent refinement pathway and result from calling the virtual do_oop() routine in the closure(s) applied by the FilterOutOfRegionClosure. Using specialization so that the non-virtual _nv version of do_oop() of these closures is called should help.
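
One way to picture the specialization is a filter templated on the concrete closure type, so the inner call binds statically to the `_nv` method. The class names below are hypothetical stand-ins and the sketch uses plain C++ templates; it does not claim to show the mechanism HotSpot itself uses for this.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical base class with a virtual entry point, standing in for a
// generic closure interface.
struct OopClosureSketch {
  virtual ~OopClosureSketch() = default;
  virtual void do_oop(void** p) = 0;
};

// Concrete closure exposing both the virtual entry point and a non-virtual
// _nv variant that does the actual work.
class CountingClosureSketch final : public OopClosureSketch {
 public:
  void do_oop(void** p) override { do_oop_nv(p); }
  void do_oop_nv(void** p) {
    if (*p != nullptr) ++count_;
  }
  std::size_t count() const { return count_; }

 private:
  std::size_t count_ = 0;
};

// Filter that forwards only references pointing outside [bottom, end).
// Because ClosureT is a concrete type, cl_->do_oop_nv() is resolved at
// compile time and can be inlined; no virtual dispatch is involved.
template <typename ClosureT>
class FilterOutOfRangeSketch {
 public:
  FilterOutOfRangeSketch(const void* bottom, const void* end, ClosureT* cl)
      : bottom_(reinterpret_cast<std::uintptr_t>(bottom)),
        end_(reinterpret_cast<std::uintptr_t>(end)),
        cl_(cl) {}

  void do_oop_nv(void** p) {
    if (*p == nullptr) return;
    const std::uintptr_t obj = reinterpret_cast<std::uintptr_t>(*p);
    if (obj < bottom_ || obj >= end_) {
      cl_->do_oop_nv(p);
    }
  }

 private:
  std::uintptr_t bottom_;
  std::uintptr_t end_;
  ClosureT* cl_;
};
```

The trade-off is one filter instantiation per closure type, in exchange for removing the indirect call from the hot path.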

G1ParCopyHelper::copy_to_survivor_space()
* High number of mispredicts when calculating the object size (coming from size_given_klass).
  -> It was thought that refactoring and flattening the if-statement in the routine might give some positive results. After performing such a refactoring and generating the assembly, I don't see any difference in the branches in the generated code.

Comments
EVALUATION http://hg.openjdk.java.net/lambda/lambda/hotspot/rev/b4ebad3520bb
22-03-2012

EVALUATION http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/b4ebad3520bb
27-01-2012

EVALUATION See description.
24-01-2012