JDK-7157141 : crash in 64 bit with corrupted oops
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: hs23
  • Priority: P2
  • Status: Closed
  • Resolution: Fixed
  • OS: solaris_10
  • CPU: x86
  • Submitted: 2012-03-27
  • Updated: 2013-07-18
  • Resolved: 2012-04-11
JDK 7         JDK 8      Other
7u40 Fixed    8 Fixed    hs23.2 Fixed
Description
This is a copy of 7155505, since that bug ID was going to be used for other tracking purposes. Various GC-related crashes are occurring with what look like stale oops, in 64-bit only. Turning on -XX:+VerifyBeforeGC shows failures of this sort:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (instanceKlass.cpp:2408), pid=5760, tid=1107667264
#  guarantee(false) failed: boom
#
# JRE version: 7.0_04-b16
# Java VM: Java HotSpot(TM) 64-Bit Server VM (23.0-b17 mixed mode linux-amd64 compressed oops)
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /var/opt/sun/cacao2/instances/oem-ec/hs_err_pid5760.log
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.sun.com/bugreport/crash.jsp
#

Running with -XX:+VerifyRememberedSets indicates that it's missing card marks.

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (cardTableRS.cpp:356), pid=8838, tid=1085045056
#  guarantee(obj == NULL || (HeapWord*)obj >= _boundary) failed: pointer 0x00000006aba68610 at 0x00000007122d9c9c on clean card crosses boundary0x0000000712150000
#
# JRE version: 7.0_02-b13
# Java VM: Java HotSpot(TM) 64-Bit Server VM (22.0-b10-never-hsx22 mixed mode linux-amd64 compressed oops)
# Core dump written. Default location: /var/opt/sun/cacao2/instances/oem-ec/core or core.8838
#
# An error report file with more information is saved as:
# /var/opt/sun/cacao2/instances/oem-ec/hs_err_pid8838.log
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.sun.com/bugreport/crash.jsp
#

It happens with different collectors but seems to be sensitive to GC heap size settings.  Changing the max heap value will cause the problem to appear or disappear.  Changing other flags like -XX:-UseCompressedOops or turning off TieredCompilation may cause it to disappear too.

Comments
not verified: bug description doesn't contain any information about reproducing
05-06-2013

EVALUATION http://hg.openjdk.java.net/hsx/hsx23.2/hotspot/rev/3b1b50b3ad62
22-05-2012

EVALUATION http://hg.openjdk.java.net/hsx/hotspot-comp/hotspot/rev/fd09f2d8283e
03-04-2012

WORK AROUND There's no definite workaround but changing heap values is likely to make the issue disappear. This has only been seen once on Red Hat 5.6.
27-03-2012

EVALUATION What's occurring is that the computed byte_map_base and the polling page are ending up at the same address. In rare cases C2 generates some code that uses the card table but accidentally matches the immP_poll/loadConP_poll pattern:

  operand immP_poll() %{
    predicate(n->get_ptr() != 0 && n->get_ptr() == (intptr_t)os::get_polling_page());
    match(ConP);

    // formats are generated automatically for constants and base registers
    format %{ %}
    interface(CONST_INTER);
  %}

  instruct loadConP_poll(rRegP dst, immP_poll src) %{
    match(Set dst src);
    format %{ "movq $dst, $src\t!ptr" %}
    ins_encode %{
      AddressLiteral polling_page(os::get_polling_page(), relocInfo::poll_type);
      __ lea($dst$$Register, polling_page);
    %}
    ins_pipe(ialu_reg_fat);
  %}

This puts a poll_Relocation on the card table value and improperly relocates it, resulting in a corrupted value of the card table base. Card marks performed with this value land on the wrong cards, which is consistent with the missing card marks and stale oops described above.
27-03-2012
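
The failure mode in the evaluation above can be illustrated with a toy model. This is a minimal sketch, not HotSpot code: ToyCardTable, matches_imm_poll, the simulated addresses, and the size of the corruption (16 cards) are all hypothetical. It assumes 512-byte cards and a biased card table base, so that base + (addr >> 9) is the address of the card for addr. It shows that a pointer constant equal to the polling page address satisfies the same test as the immP_poll predicate, and that a card mark made through a shifted base dirties the wrong card.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical toy model; addresses are plain integers, nothing is dereferenced.
constexpr int kCardShift = 9;        // assumption: 512-byte cards
constexpr uint8_t kClean = 0xff;
constexpr uint8_t kDirty = 0x00;

// Mirrors the shape of the immP_poll predicate: a non-null pointer constant
// whose value happens to equal the polling page address.
bool matches_imm_poll(uint64_t constant, uint64_t polling_page) {
    return constant != 0 && constant == polling_page;
}

struct ToyCardTable {
    uint64_t heap_start;             // simulated heap low bound
    uint64_t map_start;              // simulated address of the first card byte
    std::vector<uint8_t> cards;

    ToyCardTable(uint64_t heap_start_, uint64_t heap_bytes, uint64_t map_start_)
        : heap_start(heap_start_), map_start(map_start_),
          cards(heap_bytes >> kCardShift, kClean) {}

    // Biased base: base + (addr >> shift) is the card's simulated address.
    uint64_t byte_map_base() const {
        return map_start - (heap_start >> kCardShift);
    }

    // Dirty the card for a store at addr, using base as the constant that
    // compiled code would have embedded. Returns false if the computed card
    // address falls outside the table.
    bool mark(uint64_t base, uint64_t addr) {
        uint64_t card_addr = base + (addr >> kCardShift);
        if (card_addr < map_start || card_addr >= map_start + cards.size()) {
            return false;
        }
        cards[card_addr - map_start] = kDirty;
        return true;
    }

    // True if the card that really covers addr is dirty.
    bool is_dirty(uint64_t addr) const {
        return cards[(addr - heap_start) >> kCardShift] == kDirty;
    }
};
```

In this model, marking through byte_map_base() dirties the card that covers the stored-to address. Marking through byte_map_base() + 16 dirties a card 16 entries away and leaves the correct card clean, which is the kind of state a remembered set verifier would report as a pointer on a clean card.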