JDK-6784849 : par compact - can fail when to_space is non-empty
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: hs14
  • Priority: P3
  • Status: Closed
  • Resolution: Fixed
  • OS: generic
  • CPU: generic
  • Submitted: 2008-12-13
  • Updated: 2011-03-08
  • Resolved: 2011-03-08
Fixed in:
  • JDK 6: 6u14
  • JDK 7: 7
  • Other: hs14
Description
The following failure occurred during nightly testing:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (/opt/jprt/temp/P1/B/201608.jcoomes/source/src/share/vm/gc_implementation/parallelScavenge/psParallelCompact.cpp:729), pid=16381, tid=7
#  Error: assert(result <= addr,"object cannot move to the right")
#
# Java VM: OpenJDK Client VM (14.0-b08-2008-12-11-201608.jcoomes.gc-splitspace-fastdebug mixed mode solaris-x86 )
# If you would like to submit a bug report, please visit:
#   http://java.sun.com/webapps/bugreport/crash.jsp
#
---------------  T H R E A D  ---------------

Current thread (0x08137c00):  VMThread [stack: 0xb6635000,0xb6675000] [id=7]

Stack: [0xb6635000,0xb6675000]
VM_Operation (0xad03ab84): ParallelGCFailedAllocation, mode: safepoint, requested by thread 0x08651000


---------------  P R O C E S S  ---------------

Java Threads: ( => current thread )
  0x08660000 
[error occurred during error reporting (printing all threads), id 0xe0000000]

VM state:at safepoint (normal execution)

VM Mutex/Monitor currently owned by a thread:  ([mutex/lock_event])
[0x0807b1b8] Threads_lock - owner thread: 0x08137c00
[0x0807b848] Heap_lock - owner thread: 0x08651000

Heap
 PSYoungGen      total 22656K, used 7403K [0xf5d40000, 0xf8420000, 0xfac00000)
  eden space 11840K, 4% used [0xf5d40000,0xf679b978,0xf68d0000)
    lgrp 1 space 2504K, 0% used [0xf5d40000,0xf5d40018,0xf5fb2000)
    lgrp 2 space 5504K, 0% used [0xf5fb2000,0xf5fb2000,0xf6512000)
    lgrp 3 space 2112K, 0% used [0xf6512000,0xf6512000,0xf6722000)
    lgrp 4 space 1720K, 28% used [0xf6722000,0xf679b978,0xf68d0000)
  from space 10816K, 63% used [0xf7570000,0xf7c31480,0xf8000000)
  to   space 12928K, 17% used [0xf68d0000,0xf6afbf60,0xf7570000)
 ParOldGen       total 967936K, used 967935K [0xbac00000, 0xf5d40000, 0xf5d40000)
  object space 967936K, 99% used [0xbac00000,0xf5d3fe30,0xf5d40000)
 PSPermGen       total 12288K, used 4902K [0xb6c00000, 0xb7800000, 0xbac00000)
  object space 12288K, 39% used [0xb6c00000,0xb70c99f8,0xb7800000)

Failing test:

nsk/stress/jck12a/jck12a006

Comments
EVALUATION http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/234c22e54b98
18-12-2008

SUGGESTED FIX Diffs for the second part: update decrement_destination_counts() to use new_top() from the source space when determining whether to enqueue regions. Also remove code that is obsoleted by the change.

diff --git a/src/share/vm/gc_implementation/parallelScavenge/psParallelCompact.cpp b/src/share/vm/gc_implementation/parallelScavenge/psParallelCompact.cpp
--- a/src/share/vm/gc_implementation/parallelScavenge/psParallelCompact.cpp
+++ b/src/share/vm/gc_implementation/parallelScavenge/psParallelCompact.cpp
@@ -1828,14 +1828,6 @@
                                           new_top_addr);
       assert(done, "space must fit into old gen");
 
-      // XXX - this is necessary because decrement_destination_counts() tests
-      // source_region() to determine if a region will be filled.  Probably
-      // better to pass src_space->new_top() into decrement_destination_counts
-      // and test that instead.
-      //
-      // Clear the source_region field for each region in the space.
-      clear_source_region(space->bottom(), _space_info[id].new_top());
-
       // Reset the new_top value for the space.
       _space_info[id].set_new_top(space->bottom());
     } else if (live > 0) {
@@ -1854,7 +1846,6 @@
       dst_space_id = SpaceId(id);
       dst_space_end = space->end();
       new_top_addr = _space_info[id].new_top_addr();
-      HeapWord* const clear_end = _space_info[id].new_top();
       NOT_PRODUCT(summary_phase_msg(dst_space_id,
                                     space->bottom(), dst_space_end,
                                     SpaceId(id), next_src_addr, space->top());)
@@ -1865,13 +1856,6 @@
                                           new_top_addr);
       assert(done, "space must fit when compacted into itself");
       assert(*new_top_addr <= space->top(), "usage should not grow");
-
-      // XXX - this should go away.  See comments above.
-      //
-      // Clear the source_region field in regions at the end of the space that
-      // will not be filled.
-      HeapWord* const clear_beg = _summary_data.region_align_up(*new_top_addr);
-      clear_source_region(clear_beg, clear_end);
     }
   }
 
@@ -3051,19 +3035,34 @@
 }
 
 void PSParallelCompact::decrement_destination_counts(ParCompactionManager* cm,
+                                                     SpaceId src_space_id,
                                                      size_t beg_region,
                                                      HeapWord* end_addr)
 {
   ParallelCompactData& sd = summary_data();
+
+#ifdef ASSERT
+  MutableSpace* const src_space = _space_info[src_space_id].space();
+  HeapWord* const beg_addr = sd.region_to_addr(beg_region);
+  assert(src_space->contains(beg_addr) || beg_addr == src_space->end(),
+         "src_space_id does not match beg_addr");
+  assert(src_space->contains(end_addr) || end_addr == src_space->end(),
+         "src_space_id does not match end_addr");
+#endif // #ifdef ASSERT
+
   RegionData* const beg = sd.region(beg_region);
-  HeapWord* const end_addr_aligned_up = sd.region_align_up(end_addr);
-  RegionData* const end = sd.addr_to_region_ptr(end_addr_aligned_up);
-  size_t cur_idx = beg_region;
-  for (RegionData* cur = beg; cur < end; ++cur, ++cur_idx) {
+  RegionData* const end = sd.addr_to_region_ptr(sd.region_align_up(end_addr));
+
+  // Regions up to new_top() are enqueued if they become available.
+  HeapWord* const new_top = _space_info[src_space_id].new_top();
+  RegionData* const enqueue_end =
+    sd.addr_to_region_ptr(sd.region_align_up(new_top));
+
+  for (RegionData* cur = beg; cur < end; ++cur) {
     assert(cur->data_size() > 0, "region must have live data");
     cur->decrement_destination_count();
-    if (cur_idx <= cur->source_region() && cur->available() && cur->claim()) {
-      cm->save_for_processing(cur_idx);
+    if (cur < enqueue_end && cur->available() && cur->claim()) {
+      cm->save_for_processing(sd.region(cur));
     }
   }
 }
@@ -3178,7 +3177,8 @@
   HeapWord* const old_src_addr = closure.source();
   closure.copy_partial_obj();
   if (closure.is_full()) {
-    decrement_destination_counts(cm, src_region_idx, closure.source());
+    decrement_destination_counts(cm, src_space_id, src_region_idx,
+                                 closure.source());
     region_ptr->set_deferred_obj_addr(NULL);
     region_ptr->set_completed();
     return;
@@ -3187,7 +3187,7 @@
   HeapWord* const end_addr = sd.region_align_down(closure.source());
   if (sd.region_align_down(old_src_addr) != end_addr) {
     // The partial object was copied from more than one source region.
-    decrement_destination_counts(cm, src_region_idx, end_addr);
+    decrement_destination_counts(cm, src_space_id, src_region_idx, end_addr);
 
     // Move to the next source region, possibly switching spaces as well.  All
     // args except end_addr may be modified.
@@ -3227,19 +3227,21 @@
       region_ptr->set_deferred_obj_addr(closure.destination());
       status = closure.copy_until_full(); // copies from closure.source()
 
-      decrement_destination_counts(cm, src_region_idx, closure.source());
+      decrement_destination_counts(cm, src_space_id, src_region_idx,
+                                   closure.source());
       region_ptr->set_completed();
       return;
     }
 
     if (status == ParMarkBitMap::full) {
-      decrement_destination_counts(cm, src_region_idx, closure.source());
+      decrement_destination_counts(cm, src_space_id, src_region_idx,
+                                   closure.source());
       region_ptr->set_deferred_obj_addr(NULL);
       region_ptr->set_completed();
       return;
     }
 
-    decrement_destination_counts(cm, src_region_idx, end_addr);
+    decrement_destination_counts(cm, src_space_id, src_region_idx, end_addr);
 
     // Move to the next source region, possibly switching spaces as well.  All
     // args except end_addr may be modified.
diff --git a/src/share/vm/gc_implementation/parallelScavenge/psParallelCompact.hpp b/src/share/vm/gc_implementation/parallelScavenge/psParallelCompact.hpp
--- a/src/share/vm/gc_implementation/parallelScavenge/psParallelCompact.hpp
+++ b/src/share/vm/gc_implementation/parallelScavenge/psParallelCompact.hpp
@@ -1154,8 +1154,10 @@
                                        HeapWord* end_addr);
 
   // Decrement the destination count for each non-empty source region in the
-  // range [beg_region, region(region_align_up(end_addr))).
+  // range [beg_region, region(region_align_up(end_addr))).  If the destination
+  // count for a region goes to 0 and it needs to be filled, enqueue it.
   static void decrement_destination_counts(ParCompactionManager* cm,
+                                           SpaceId src_space_id,
                                            size_t beg_region,
                                            HeapWord* end_addr);
18-12-2008

EVALUATION Further testing with the fix for 6786188 (which periodically adds live objects in to_space) revealed an additional problem in the method PSParallelCompact::decrement_destination_counts(). That method decrements the destination_count field for a range of source regions and possibly enqueues them if the count reaches zero. The test that decides whether to enqueue a region, call it cur_region, checks whether its index is less than or equal to cur_region->source_region() (i.e., whether the data that will fill cur_region is at the same or higher addresses, "to the right"). When the heap is very full and to_space is at lower addresses than from_space, this test evaluates false for regions in to_space that are being copied to from_space, so they are never enqueued even though they need to be filled. Comments in PSParallelCompact::summary_phase() mention the problem and the fix:

// XXX - this is necessary because decrement_destination_counts() tests
// source_region() to determine if a region will be filled.  Probably
// better to pass src_space->new_top() into decrement_destination_counts
// and test that instead.
//
// Clear the source_region field for each region in the space.
clear_source_region(space->bottom(), _space_info[id].new_top());
18-12-2008
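The following is a minimal standalone sketch (not HotSpot code) of the logic change described in the evaluation above: the decision "will this region be filled?" moves from inspecting source_region() to comparing against the source space's new_top(). The Region struct and all index values are hypothetical; only the shape of the two tests mirrors the fix.

#include <cstddef>
#include <cstdio>

// Hypothetical stand-in for HotSpot's per-region metadata; only the field
// the old enqueue test reads is modeled.
struct Region {
  size_t source_region;  // index of the region supplying this region's fill data
};

// Pre-fix test: enqueue only if the fill data is at the same or higher
// region index ("to the right").
bool enqueue_old(size_t cur_idx, const Region& r) {
  return cur_idx <= r.source_region;
}

// Post-fix test: enqueue any region below the source space's new_top(),
// i.e. any region that will be filled, wherever its fill data comes from.
bool enqueue_new(size_t cur_idx, size_t new_top_region) {
  return cur_idx < new_top_region;
}

int main() {
  // Made-up configuration matching the evaluation's description: region 5
  // will be filled (it lies below new_top), but its fill data comes from a
  // lower-addressed region, so the old test wrongly says "skip".
  const Region r = { /* source_region = */ 3 };
  const size_t cur_idx = 5;
  const size_t new_top_region = 8;  // regions [0, 8) will be filled

  std::printf("old test enqueues: %s\n", enqueue_old(cur_idx, r) ? "yes" : "no");
  std::printf("new test enqueues: %s\n",
              enqueue_new(cur_idx, new_top_region) ? "yes" : "no");
  return 0;
}

This prints "no" then "yes": under the old test, a region that needs filling is stranded and never enqueued, which is why the second part of the fix passes src_space_id into decrement_destination_counts() and tests against new_top().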

SUGGESTED FIX Diffs for the first part: change the assertions.

diff --git a/src/share/vm/gc_implementation/parallelScavenge/psParallelCompact.cpp b/src/share/vm/gc_implementation/parallelScavenge/psParallelCompact.cpp
--- a/src/share/vm/gc_implementation/parallelScavenge/psParallelCompact.cpp
+++ b/src/share/vm/gc_implementation/parallelScavenge/psParallelCompact.cpp
@@ -726,7 +726,7 @@
   size_t live_to_left = bitmap->live_words_in_range(search_start, oop(addr));
   result += partial_obj_size + live_to_left;
 
-  assert(result <= addr, "object cannot move to the right");
+  DEBUG_ONLY(PSParallelCompact::check_new_location(addr, result);)
   return result;
 }
 
@@ -3318,7 +3318,7 @@
 ParMarkBitMap::IterationStatus MoveAndUpdateClosure::copy_until_full()
 {
   if (source() != destination()) {
-    assert(source() > destination(), "must copy to the left");
+    DEBUG_ONLY(PSParallelCompact::check_new_location(source(), destination());)
     Copy::aligned_conjoint_words(source(), destination(), words_remaining());
   }
   update_state(words_remaining());
@@ -3339,7 +3339,7 @@
   // This test is necessary; if omitted, the pointer updates to a partial object
   // that crosses the dense prefix boundary could be overwritten.
   if (source() != destination()) {
-    assert(source() > destination(), "must copy to the left");
+    DEBUG_ONLY(PSParallelCompact::check_new_location(source(), destination());)
     Copy::aligned_conjoint_words(source(), destination(), words);
   }
   update_state(words);
@@ -3364,7 +3364,7 @@
   }
 
   if (destination() != source()) {
-    assert(destination() < source(), "must copy to the left");
+    DEBUG_ONLY(PSParallelCompact::check_new_location(source(), destination());)
     Copy::aligned_conjoint_words(source(), destination(), words);
   }
diff --git a/src/share/vm/gc_implementation/parallelScavenge/psParallelCompact.hpp b/src/share/vm/gc_implementation/parallelScavenge/psParallelCompact.hpp
--- a/src/share/vm/gc_implementation/parallelScavenge/psParallelCompact.hpp
+++ b/src/share/vm/gc_implementation/parallelScavenge/psParallelCompact.hpp
@@ -1230,6 +1230,8 @@
 #endif  // #ifndef PRODUCT
 
 #ifdef  ASSERT
+  // Sanity check the new location of a word in the heap.
+  static inline void check_new_location(HeapWord* old_addr, HeapWord* new_addr);
   // Verify that all the regions have been emptied.
   static void verify_complete(SpaceId space_id);
 #endif  // #ifdef ASSERT
@@ -1397,6 +1399,15 @@
   }
 }
 
+#ifdef ASSERT
+inline void
+PSParallelCompact::check_new_location(HeapWord* old_addr, HeapWord* new_addr)
+{
+  assert(old_addr >= new_addr || space_id(old_addr) != space_id(new_addr),
+         "must move left or to a different space");
+}
+#endif // ASSERT
+
 class MoveAndUpdateClosure: public ParMarkBitMapClosure {
  public:
   inline MoveAndUpdateClosure(ParMarkBitMap* bitmap, ParCompactionManager* cm,
17-12-2008

SUGGESTED FIX The fix has two parts. First, change the assertions to allow for objects being copied to different spaces. Second, update decrement_destination_counts() to use the new_top() value of the source space to determine whether to enqueue regions whose destination_count goes to 0.
13-12-2008

EVALUATION The heap has objects in both from_space and to_space, and to_space is at lower addresses than from_space. Data copied from to_space to from_space will trigger this assert, although it is not really an error. This bug has existed since par compact was first integrated in jdk 5 update 6, but was only recently exposed by 6765745.
13-12-2008
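The scenario can be illustrated with a small standalone model (not HotSpot code). The space boundaries are copied from the heap dump above; the object addresses and helper names are hypothetical, and new_check() only mirrors the shape of the check_new_location() added in the first-part diff.

#include <cassert>
#include <cstdint>

typedef uintptr_t Addr;

struct Space { Addr bottom; Addr end; };

// Boundaries taken from the heap dump above: to_space sits at lower
// addresses than from_space.
const Space to_space   = { 0xf68d0000, 0xf7570000 };
const Space from_space = { 0xf7570000, 0xf8000000 };

int space_id(Addr a) {
  return (a >= to_space.bottom && a < to_space.end) ? 0 : 1;
}

// Pre-fix assertion: an object may only move to the same or a lower address.
bool old_check(Addr old_addr, Addr new_addr) {
  return new_addr <= old_addr;
}

// Post-fix check: moving to a higher address is acceptable as long as the
// object also changes spaces.
bool new_check(Addr old_addr, Addr new_addr) {
  return old_addr >= new_addr || space_id(old_addr) != space_id(new_addr);
}

int main() {
  const Addr src = 0xf6900000;  // an object in to_space
  const Addr dst = 0xf7600000;  // its destination in from_space, to the right

  assert(!old_check(src, dst)); // the old assert would fire here, spuriously
  assert(new_check(src, dst));  // the relaxed check accepts the copy
  return 0;
}

The rightward copy is legitimate because the object leaves to_space entirely; only same-space copies must still move left, which is exactly what the relaxed check enforces.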