Bug ID: JDK-7157141 crash in 64 bit with corrupted oops

Details
Type: Bug
Submit Date: 2012-03-27
Status: Closed
Updated Date: 2013-07-18
Project Name: JDK
Resolved Date: 2012-04-11
Component: hotspot
OS: solaris_10
Sub-Component: compiler
CPU: x86
Priority: P2
Resolution: Fixed
Affected Versions: hs23
Fixed Versions: hs24 (b07)


Description
This is a copy of 7155505, since that bug ID was going to be used for other tracking purposes.  Various GC-related crashes are occurring with what look like stale oops, in 64-bit VMs only.  Turning on -XX:+VerifyBeforeGC shows failures of this sort:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (instanceKlass.cpp:2408), pid=5760, tid=1107667264
#  guarantee(false) failed: boom
#
# JRE version: 7.0_04-b16
# Java VM: Java HotSpot(TM) 64-Bit Server VM (23.0-b17 mixed mode linux-amd64 compressed oops)
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /var/opt/sun/cacao2/instances/oem-ec/hs_err_pid5760.log
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.sun.com/bugreport/crash.jsp
#

Running with -XX:+VerifyRememberedSets indicates that card marks are missing:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (cardTableRS.cpp:356), pid=8838, tid=1085045056
#  guarantee(obj == NULL || (HeapWord*)obj >= _boundary) failed: pointer 0x00000006aba68610 at 0x00000007122d9c9c on clean card crosses boundary0x0000000712150000
#
# JRE version: 7.0_02-b13
# Java VM: Java HotSpot(TM) 64-Bit Server VM (22.0-b10-never-hsx22 mixed mode linux-amd64 compressed oops)
# Core dump written. Default location: /var/opt/sun/cacao2/instances/oem-ec/core or core.8838
#
# An error report file with more information is saved as:
# /var/opt/sun/cacao2/instances/oem-ec/hs_err_pid8838.log
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.sun.com/bugreport/crash.jsp
#
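
To make the failed guarantee above concrete, here is a minimal sketch of the condition it tests, using the two addresses printed in the log. The function name clean_card_ref_ok is hypothetical and this is not HotSpot code; it only restates the expression in the guarantee: a reference found on a clean card must be NULL or lie at or above the boundary.

```cpp
#include <cstdint>

// Hypothetical stand-in for the per-field check quoted in the guarantee;
// 0 plays the role of NULL.
constexpr bool clean_card_ref_ok(std::uint64_t obj, std::uint64_t boundary) {
    return obj == 0 || obj >= boundary;
}

// Values copied from the hs_err output above.
constexpr std::uint64_t kPointer  = 0x00000006aba68610ULL; // "pointer"
constexpr std::uint64_t kBoundary = 0x0000000712150000ULL; // "boundary"

// The pointer is below the boundary, so the check fails, as in the log.
static_assert(!clean_card_ref_ok(kPointer, kBoundary),
              "pointer on clean card crosses boundary");
// A NULL reference or one at/above the boundary would have passed.
static_assert(clean_card_ref_ok(0, kBoundary), "NULL is always fine");
static_assert(clean_card_ref_ok(kBoundary, kBoundary), "at boundary is fine");
```

In other words, the field at 0x00000007122d9c9c holds a reference that should have dirtied its card, but the card is still clean.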

The crash happens with different collectors but seems to be sensitive to GC heap size settings: changing the max heap value can make the problem appear or disappear.  Changing other flags, such as -XX:-UseCompressedOops or turning off TieredCompilation, may make it disappear too.

                                    

Comments
WORK AROUND

There's no definite workaround but changing heap values is likely to make the issue disappear.  This has only been seen once on Red Hat 5.6.
                                     
2012-03-27
EVALUATION

What's occurring is that the computed byte_map_base and the polling page are ending up at the same address.  In rare cases C2 generates some code that uses the card table base but accidentally matches the immP_poll/loadConP_poll pattern:

operand immP_poll() %{
  predicate(n->get_ptr() != 0 && n->get_ptr() == (intptr_t)os::get_polling_page());
  match(ConP);

  // formats are generated automatically for constants and base registers
  format %{ %}
  interface(CONST_INTER);
%}

instruct loadConP_poll(rRegP dst, immP_poll src) %{
  match(Set dst src);
  format %{ "movq    $dst, $src\t!ptr" %}
  ins_encode %{
    AddressLiteral polling_page(os::get_polling_page(), relocInfo::poll_type);
    __ lea($dst$$Register, polling_page);
  %}
  ins_pipe(ialu_reg_fat);
%}

This puts a poll_Relocation on the card table value and improperly relocates it, resulting in a corrupted value of the card table base.  Card marks performed with this value go to the wrong place, so the cards that should be dirtied stay clean and the GC later encounters stale oops.
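
The collision described above can be sketched with plain integer arithmetic. This is an illustration under stated assumptions, not HotSpot code: the function names and the byte_map, heap start and polling page addresses are hypothetical, chosen so that the two constants coincide. The 512-byte card size (shift of 9) and the biased-base formula follow the usual card table scheme; the field address is the one from the VerifyRememberedSets log.

```cpp
#include <cstdint>

constexpr int kCardShift = 9; // 512-byte cards

// Biased base: base + (addr >> shift) is the card byte covering addr.
constexpr std::uint64_t biased_base(std::uint64_t byte_map,
                                    std::uint64_t heap_start) {
    return byte_map - (heap_start >> kCardShift);
}

// Address of the card byte written by a card mark for a given field.
constexpr std::uint64_t card_addr(std::uint64_t base, std::uint64_t field) {
    return base + (field >> kCardShift);
}

// Value-only test in the spirit of the immP_poll predicate: it cannot tell
// what a pointer constant is used for, only what it is equal to.
constexpr bool matches_poll(std::uint64_t c, std::uint64_t polling_page) {
    return c != 0 && c == polling_page;
}

// Hypothetical layout (assumption): the values are picked so they collide.
constexpr std::uint64_t kByteMap     = 0x00007f0004000000ULL;
constexpr std::uint64_t kHeapStart   = 0x0000000700000000ULL;
constexpr std::uint64_t kPollingPage = 0x00007f0000800000ULL;
constexpr std::uint64_t kBase        = biased_base(kByteMap, kHeapStart);

static_assert(kBase == 0x00007f0000800000ULL, "biased base for this layout");
static_assert(matches_poll(kBase, kPollingPage),
              "card table base is mistaken for the polling page");
// With an intact base, the mark for the field from the log lands here:
static_assert(card_addr(kBase, 0x00000007122d9c9cULL) == 0x00007f00040916ceULL,
              "correct card byte");
// Any change to the base moves every card mark by the same amount:
static_assert(card_addr(kBase + 0x1000, 0x00000007122d9c9cULL) !=
              card_addr(kBase, 0x00000007122d9c9cULL),
              "altered base marks the wrong card");
```

Because the biased base depends on both the heap start and the card table's location, this also suggests why changing the heap size makes the problem appear or disappear: a different layout yields a different base, which then no longer equals the polling page address.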
                                     
2012-03-27
EVALUATION

http://hg.openjdk.java.net/hsx/hotspot-comp/hotspot/rev/fd09f2d8283e
                                     
2012-04-03
EVALUATION

http://hg.openjdk.java.net/hsx/hsx23.2/hotspot/rev/3b1b50b3ad62
                                     
2012-05-22
not verified: bug description doesn't contain any information about reproducing
                                     
2013-06-05


