JDK-8028710 : G1 does not retire allocation buffers after reference processing work
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: hs25
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2013-11-20
  • Updated: 2015-01-21
  • Resolved: 2014-04-16
Fixed in: 8u40, 9 b12
When -XX:+ParallelRefProcEnabled is enabled on a debug build, G1 crashes with

Internal Error at g1CollectedHeap.hpp:1823, pid=30589, tid=139755213526784
assert(_priority_buffer[pr]->is_retired()) failed: alloc buffers should all retire at this point.

Do you want to debug the problem?

To debug, run 'gdb /proc/30589/exe 30589'; then switch to thread 139755213526784 (0x00007f1b4bdb9700)
Enter 'yes' to launch gdb automatically (PATH must include gdb)
Otherwise, press RETURN to abort...

during reference processing.

The problem is that the allocation buffers are not retired after soft reference processing. The impact on product builds is unknown, but heap "corruption" is possible because the remainders of the allocation buffers are not filled with dummy objects (I think).
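To illustrate why unretired buffers could matter, here is a toy model in Python. It is not HotSpot code: `AllocBuffer`, `walk`, and the word-array heap are hypothetical names invented for this sketch, and it assumes that retiring a buffer means filling its unused tail with a dummy object so that a linear heap walk can step over it.

```python
class AllocBuffer:
    """Toy allocation buffer over a shared word array (hypothetical model)."""

    def __init__(self, heap, start, size):
        self.heap = heap
        self.top = start          # next free word
        self.end = start + size   # one past the last word of the buffer
        self.retired = False

    def _write_object(self, name, words):
        # The first word is a header recording the object's size in words;
        # the remaining words are marked as object body.
        self.heap[self.top] = (name, words)
        for i in range(self.top + 1, self.top + words):
            self.heap[i] = "body"
        self.top += words

    def allocate(self, name, words):
        if self.retired or words < 1 or self.top + words > self.end:
            return False
        self._write_object(name, words)
        return True

    def retire(self):
        # Fill the unused remainder with a dummy object, then mark retired.
        remaining = self.end - self.top
        if remaining > 0:
            self._write_object("filler", remaining)
        self.retired = True


def walk(heap):
    """Walk the heap object by object, using each header's size to advance."""
    names = []
    i = 0
    while i < len(heap):
        header = heap[i]
        if not isinstance(header, tuple):
            raise ValueError(f"unparsable heap at word {i}")
        name, words = header
        names.append(name)
        i += words
    return names


if __name__ == "__main__":
    heap = [None] * 8
    buf = AllocBuffer(heap, 0, 8)
    buf.allocate("a", 3)
    try:
        walk(heap)
    except ValueError as e:
        print("before retire:", e)
    buf.retire()
    print("after retire:", walk(heap))
```

In this model the walk fails while the buffer's tail is unfilled and succeeds once `retire()` has written the filler. Whether the real G1 code can ever reach the unfilled state on a product build is exactly what this report leaves open.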

I think this issue has been present since G1 started supporting reference processing during STW pauses (r2720, bug#6484982).

Patch attached.

Preliminary ILW: impact H (assertion failure), likelihood L (needs debug build), workaround M (disable the given switch). This would be P3, but it only occurs with a debug build, and there have been successful runs without problems on large applications with product builds, so P4 for now.
noreg-other justification: proper retirement of the allocation buffers is checked by the code at all times now.

Release team: Approved for deferral.

SQE is OK to defer. Hanging 8-SQE-OK.

8-defer-request justification:
- Attempts to reproduce any issue on a product build, choosing settings that stress this code (e.g. with full verification), did not yield any crash; there is no suspicion that any of the existing failures is due to this issue. It seems that either this is an overzealous assert or the problem cannot be triggered because other code prevents triggering it.
- While there is a fix for this particular assert (attached), the above testing showed that finding a situation and writing a reproducer that shows the fix for the issue will take considerable time (or is simply impossible since other code hides this issue).
- The crash/assertion occurs only on a debug build with -XX:+ParallelRefProcEnabled. ParallelRefProcEnabled is disabled by default, so you need to explicitly enable it in a debug build.
- The workaround (if there is an issue) is simple: do not use a debug build and do not enable ParallelRefProcEnabled. Note that enabling ParallelRefProcEnabled decreases performance on most benchmarks (see JDK-7068229 for some numbers), so there is low incentive to enable it.

ILW is HLM, as Thomas states, and after discussions with SQE we decided to follow the guidelines and make it P3 for now.