JDK-5043395 : 1.4.2_04 Server VM - C2 crash in PhaseCFG::ScheduleLate on Solaris

Details
Type: Bug
Submit Date: 2004-05-06
Status: Resolved
Updated Date: 2005-02-23
Project Name: JDK
Resolved Date: 2004-06-15
Component: hotspot
OS: solaris_9,solaris_8,generic
Sub-Component: compiler
CPU: sparc,generic
Priority: P2
Resolution: Fixed
Affected Versions: 1.4.2,1.4.2_02,1.4.2_03,1.4.2_04,1.4.2_05
Fixed Versions: 1.4.2_06 (06)

Description
BEA is seeing the following crash in J2SE 1.4.2_04 when running
with -server, and only on Solaris 9.


(dbx) where
current thread: t@10
=>[1] _libc_read(0x0, 0xff3435e4, 0x400, 0xff392528, 0xff340430,
0x0), at 0xff31e5c8
  [2] __filbuf(0xff34027c, 0xff3439e4, 0xff33c000, 0x0, 0x400,
0x0), at 0xff30ea9c
  [3] fgets(0xff343a4c, 0xff33fc40, 0xff34027c, 0xff33c000,
0xff3439e4, 0x3ff), at 0xff3112e4
  [4] os::message_box(0xe13803d8, 0xfe5c3484, 0xb, 0xfe1cd0f4,
0x3191, 0x0), at 0xfe499f30
  [5] os::handle_unexpected_exception(0xea9e8, 0xb, 0xfe1cd0f4,
0xe1380918, 0xb, 0x0), at 0xfe496688
  [6] JVM_handle_solaris_signal(0xfe1cd0f4, 0xe1380918,
0xe1380660, 0x3400, 0x35ec, 0x0), at 0xfe1d90ac
  [7] __sighndlr(0xb, 0xe1380918, 0xe1380660, 0xfe1d875c,
0xe1381e14, 0xe1381e04), at 0xff37b840
  [8] sigacthandler(0xb, 0xe1381d70, 0x0, 0x0, 0x0, 0xff38e000),
at 0xff3784e0
  ---- called from signal handler with signal 11 (SIGSEGV) ------
  [9] PhaseCFG::ScheduleLate(0x1, 0x0, 0xe1380d28, 0x1d48b70,
0x1169058, 0x1ce7b34), at 0xfe1cd0f4
  [10] PhaseCFG::GlobalCodeMotion(0xe1380ac8, 0x1d48b70, 0x3400,
0xe1380ec0, 0x366c, 0x0), at 0xfe1ce01c
  [11] Compile::Code_Gen(0xe1381270, 0xfe5335c4, 0xe1381184,
0xfe570000, 0x0, 0x0), at 0xfe1d2bc0
  [12] Compile::Compile(0xfe5333f9, 0xb237bc, 0x210d5a4, 0x60d9a8,
0xffffffff, 0x1), at 0xfe2008e8
  [13] C2Compiler::compile_method(0x2bb78, 0xe1381a8c, 0x0,
0x244b290, 0xffffffff, 0x0), at 0xfe1fd08c
  [14] CompileBroker::invoke_compiler_on_method(0xada, 0x0,
0xffffffff, 0xfe5aee50, 0xfe5bbbe4, 0xea9e8), at 0xfe1fc850
  [15] CompileBroker::compiler_thread_loop(0xfe533c01, 0xfe5af218,
0xea9e8, 0xeaf98, 0x306d10, 0xfe269254), at 0xfe2ac1f8
  [16] JavaThread::run(0xea9e8, 0xb, 0x40, 0x0, 0xa, 0xff38e000),
at 0xfe26927c
  [17] _start(0xea9e8, 0xff38f688, 0x1, 0x1, 0xff38e000, 0x0), at
0xfe26575c
 
------------------------------------------------------------------------
 
  Unexpected Signal : 11 occurred at PC=0xFE1CD0F4
Function=[Unknown. Nearest: JVM_FillInStackTrace+0x4CE8]
Library=/export/home/posys/81sp3L6/jdk142_04/jre/lib/sparc/server/libjvm.so



                                    

Comments
WORK AROUND


###@###.### 2004-05-14

Exclude the following method from compilation:
oracle/jdbc/driver/OraclePreparedStatement executeBatch
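
One way to apply this workaround, offered as a sketch rather than as the
procedure the reporter used: HotSpot reads compiler directives from a file
named .hotspot_compiler in the working directory the VM is started from.
Assuming the application can be launched from a directory where that file
can be created, a single line in it should be enough:

```text
exclude oracle/jdbc/driver/OraclePreparedStatement executeBatch
```

The excluded method then runs in the interpreter only, so some slowdown in
batch execution is to be expected while the workaround is in place.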
                                     
2004-05-14
SUGGESTED FIX


###@###.### 2004-05-25

John suggested the following fix:

--- /net/jano/export/hotspot/users2/kvn/142_05/src/share/vm/opto/parse1.cpp	Tue May 25 16:05:05 2004
***************
*** 1659,1670 ****
    for (uint i = TypeFunc::Parms; i < monoff; i++) {
      ensure_phi(i);
    }
!   if (monoff < map()->req() && is_osr_parse()) {
!     // However, OSR methods do not have strictly structured control flow.  
!     // (Bug 4426707)
!     for (uint m = 0; m < nof_monitors; m++) {
!       ensure_phi(map()->jvms()->monitor_obj_offset(m));
!     }
    }
  }
  
--- 1659,1670 ----
    for (uint i = TypeFunc::Parms; i < monoff; i++) {
      ensure_phi(i);
    }
!   // Even monitors need Phis, though they are well-structured.
!   // This is true for OSR methods, and also for the rare cases where
!   // a monitor object is the subject of a replace_in_map operation.
!   // See bugs 4426707 and 5043395.
!   for (uint m = 0; m < nof_monitors; m++) {
!     ensure_phi(map()->jvms()->monitor_obj_offset(m));
    }
  }

The following changes bail out of the compilation when an incorrect graph
is detected during GlobalCodeMotion (for the 1.4.2 VM).

--- /net/jano/export/hotspot/users2/kvn/142_05/src/share/vm/opto/gcm.cpp	Tue May 25 15:44:02 2004
***************
*** 820,825 ****
--- 820,833 ----
      // in the dominator tree.  Thus the Node will dominate all its uses.
      Block *least = LCA;           // Least execution frequency
      
+     if( _bbs[self->_idx]->_dom_depth > least->_dom_depth ) {
+       if( PrintOptoBailouts ) {
+         tty->print_cr("unschedulable graph detected during ScheduleLate");
+       }
+       C->set_result(Compile::Comp_no_retry); // Bailout without retry
+       _root = NULL;
+       return;
+     }
      // Must clone guys stay next to use; no hoisting allowed.
      // Also cannot hoist guys that alter memory or are otherwise not
      // allocatable (hoisting can make a value live longer, leading to
***************
*** 866,877 ****
          // the earliest legal location.  Capture the least execution frequency.
          while( LCA != early ) {
            LCA = LCA->_idom;         // Follow up the dominator tree
  
            // Don't hoist machine instructions to the root basic block
            if (mach && LCA == rootBlock)
              break;
  
-           assert( LCA, "" );        // unscheduable graph assertion
            uint start_lat = node_latency.at_grow(LCA->_nodes[0]->_idx);
            uint end_idx   = LCA->end_idx();
            uint end_lat   = node_latency.at_grow(LCA->_nodes[end_idx]->_idx);
--- 874,892 ----
          // the earliest legal location.  Capture the least execution frequency.
          while( LCA != early ) {
            LCA = LCA->_idom;         // Follow up the dominator tree
+           if( LCA == NULL ) {
+             if( PrintOptoBailouts ) {
+               tty->print_cr("LCA == NULL detected during ScheduleLate");
+             }
+             C->set_result(Compile::Comp_no_retry); // Bailout without retry
+             _root = NULL;
+             return;
+           }
  
            // Don't hoist machine instructions to the root basic block
            if (mach && LCA == rootBlock)
              break;
  
            uint start_lat = node_latency.at_grow(LCA->_nodes[0]->_idx);
            uint end_idx   = LCA->end_idx();
            uint end_lat   = node_latency.at_grow(LCA->_nodes[end_idx]->_idx);
***************
*** 959,964 ****
--- 974,982 ----
    roots.push(C->top());
    while( roots.size() ) {       // While worklist is not empty
      if( !roots.pop()->schedule_early(visited,roots,_bbs) ) {
+       if( PrintOptoBailouts ) {
+         tty->print_cr("unschedulable graph detected during schedule_early");
+       }
        // Bailout without retry
        C->set_result( Compile::Comp_no_retry );
        _root = NULL;
***************
*** 983,988 ****
--- 1001,1009 ----
    // loop nesting depth that is lowest in the dominator tree.  
    visited.Clear();
    ScheduleLate(visited, stack, node_latency);
+   if ( _root == NULL ) {
+     return;   // Bailout without retry
+   }
    unique = C->unique();
  
    // Detect implicit-null-check opportunities.  Basically, find NULL checks 


The following fix is a backport from 1.5 to 1.4.2.
NOT_PRODUCT( verify_graph_edges(); ) is called at the beginning
of the Optimize() and Code_Gen() methods, and it should be called
only after the has_root() check.

--- /net/jano/export/hotspot/users2/kvn/142_05/src/share/vm/opto/compile.cpp	Tue May 25 15:44:58 2004
***************
*** 456,465 ****
      Finish_Warm();
    }
  
-   NOT_PRODUCT( verify_graph_edges(); )
    // Now optimize
    if (has_root())  Optimize();
!   NOT_PRODUCT( verify_graph_edges(); )
    // Now generate code
    if (has_root())  Code_Gen();
  
--- 456,464 ----
      Finish_Warm();
    }
  
    // Now optimize
    if (has_root())  Optimize();
! 
    // Now generate code
    if (has_root())  Code_Gen();
  
***************
*** 533,539 ****
      kit.gen_stub(stub_function, stub_name, is_fancy_jump, pass_tls, return_pc);
    }
  
-   NOT_PRODUCT( verify_graph_edges(); )
    Code_Gen();
  
  
--- 532,537 ----
***************
*** 601,607 ****
      _stub_name = "i2c";
      NOT_PRODUCT( print_opto_verbose_signature(j_sig, _stub_name); )
      Generate_Interpreter_To_Compiled_Graph(j_sig);
-     NOT_PRODUCT( verify_graph_edges(); )
      Code_Gen();
      if (!has_root())  return;
      env()->register_i2c_adapter(method,
--- 599,604 ----
***************
*** 613,619 ****
      _stub_name = "c2i";
      NOT_PRODUCT( print_opto_verbose_signature(j_sig, _stub_name); )
      Generate_Compiled_To_Interpreter_Graph(j_sig, method->interpreter_entry());
-     NOT_PRODUCT( verify_graph_edges(); )
      Code_Gen();
      if (!has_root())  return;
      env()->register_c2i_adapter(method,
--- 610,615 ----
***************
*** 686,692 ****
      set_initial_gvn(NULL);
    } // End of gvn, etc.
  
-   NOT_PRODUCT( verify_graph_edges(); )
    Code_Gen();
    if (!has_root())  return;
  
--- 682,687 ----
***************
*** 1430,1441 ****
  
      cfg.Estimate_Block_Frequency();
      cfg.GlobalCodeMotion(m,unique(),proj_list);
-     NOT_PRODUCT( verify_graph_edges(); )
- 
      if (cfg._root == NULL) {
        set_root(NULL);
        return;
      }
      debug_only( cfg.verify(); )
    }
    NOT_PRODUCT( verify_graph_edges(); )
--- 1425,1436 ----
  
      cfg.Estimate_Block_Frequency();
      cfg.GlobalCodeMotion(m,unique(),proj_list);
      if (cfg._root == NULL) {
        set_root(NULL);
        return;
      }
+     NOT_PRODUCT( verify_graph_edges(); )
+ 
      debug_only( cfg.verify(); )
    }
    NOT_PRODUCT( verify_graph_edges(); )

==========================================================================

###@###.### 2004-06-02

Call the new function is_monitor_box(), defined in callnode.hpp,
instead of is_monitor_use() in the ensure_phi() method.
Add asserts to make sure ensure_phi() is called only for parser maps.

--- /net/jano/export/hotspot/users2/kvn/142_05/src/share/vm/opto/parse1.cpp
***************
*** 1512,1517 ****
--- 1512,1518 ----
      }
  
      // Update all the non-control inputs to map:
+     assert(TypeFunc::Parms == newin->jvms()->locoff(), "parser map should contain only youngest jvms");
      for (uint j = 1; j < newin->req(); j++) {
        Node* m = map()->in(j);   // Current state of target.
        Node* n = newin->in(j);   // Incoming change to target state.
***************
*** 1656,1670 ****
    uint monoff = map()->jvms()->monoff();
    uint nof_monitors = map()->jvms()->nof_monitors();
  
    for (uint i = TypeFunc::Parms; i < monoff; i++) {
      ensure_phi(i);
    }
!   if (monoff < map()->req() && is_osr_parse()) {
!     // However, OSR methods do not have strictly structured control flow.  
!     // (Bug 4426707)
!     for (uint m = 0; m < nof_monitors; m++) {
!       ensure_phi(map()->jvms()->monitor_obj_offset(m));
!     }
    }
  }
  
--- 1657,1672 ----
    uint monoff = map()->jvms()->monoff();
    uint nof_monitors = map()->jvms()->nof_monitors();
  
+   assert(TypeFunc::Parms == map()->jvms()->locoff(), "parser map should contain only youngest jvms");
    for (uint i = TypeFunc::Parms; i < monoff; i++) {
      ensure_phi(i);
    }
!   // Even monitors need Phis, though they are well-structured.
!   // This is true for OSR methods, and also for the rare cases where
!   // a monitor object is the subject of a replace_in_map operation.
!   // See bugs 4426707 and 5043395.
!   for (uint m = 0; m < nof_monitors; m++) {
!     ensure_phi(map()->jvms()->monitor_obj_offset(m));
    }
  }
  
***************
*** 1732,1738 ****
    } else if (jvms->is_stk(idx)) {
      t = block()->stack_type_at(idx - jvms->stkoff());
    } else if (jvms->is_mon(idx)) {
!     assert(!jvms->is_monitor_use(idx), "no phis for boxes");
      t = TypeInstPtr::BOTTOM; // this is sufficient for a lock object
    } else if ((uint)idx < TypeFunc::Parms) {
      t = o->bottom_type();  // Type::RETURN_ADDRESS or such-like.
--- 1734,1740 ----
    } else if (jvms->is_stk(idx)) {
      t = block()->stack_type_at(idx - jvms->stkoff());
    } else if (jvms->is_mon(idx)) {
!     assert(!jvms->is_monitor_box(idx), "no phis for boxes");
      t = TypeInstPtr::BOTTOM; // this is sufficient for a lock object
    } else if ((uint)idx < TypeFunc::Parms) {
      t = o->bottom_type();  // Type::RETURN_ADDRESS or such-like.


--- /net/jano/export/hotspot/users2/kvn/142_05/src/share/vm/opto/callnode.hpp
242a243,246
>   bool is_monitor_box(uint off)    const {
>     assert(is_mon(off), "should be called only for monitor edge");
>     return (0 == bitfield(off - monoff(), 0, logMonitorEdges));
>   }
244c248
< 						   && 0 == bitfield(off - monoff(), 0, logMonitorEdges))
---
> 						   && is_monitor_box(off))
                                     
2004-09-17
EVALUATION


###@###.### 2004-05-14

The call to the uncommon trap has an input Phi node that sits in a
deeper block (in the dominator tree) than the call's own block.
During schedule_early() we map the call to the Phi's block while
looking for the earliest block among the call's inputs.
Later, during ScheduleLate(), we take the Phi's block as the earliest
block and the call's original block (it still holds the call's uses)
as the LCA (lowest common ancestor). We then walk up the dominator
tree from this LCA toward the earliest block. Because the LCA is
higher in the dominator tree than the earliest block, the walk can
never reach it, and we end up in the root block. The root block has no
immediate dominator (it is NULL), so we SEGV when we try to use it.
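
The walk described above can be illustrated with a small standalone sketch.
This is not the HotSpot code: Block and reaches_early are hypothetical names,
and the struct keeps only an immediate-dominator pointer and a dominator
depth. It shows why the walk runs off the root when the "earliest" block is
deeper than the LCA, and how a depth check and a NULL check turn the crash
into a clean failure result.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for a C2 basic block: only the two
// dominator-tree fields this sketch needs.
struct Block {
    const Block* idom;       // immediate dominator; nullptr for the root block
    unsigned     dom_depth;  // depth in the dominator tree (root is 0)
};

// Walk up the dominator tree from `lca` looking for `early`.
// Returns true if `early` is reached, false if it cannot be reached,
// which corresponds to the unschedulable case in this report.
bool reaches_early(const Block* lca, const Block* early) {
    assert(lca != nullptr && early != nullptr);

    // Cheap pre-check: a block deeper than `lca` cannot dominate it.
    if (early->dom_depth > lca->dom_depth) {
        return false;
    }
    while (lca != early) {
        lca = lca->idom;        // follow up the dominator tree
        if (lca == nullptr) {   // walked past the root: report failure
            return false;       // instead of dereferencing NULL
        }
    }
    return true;
}
```

With a chain root -> call_block -> phi_block, reaches_early(&phi_block,
&call_block) succeeds, while reaches_early(&call_block, &phi_block) fails
because phi_block is deeper than call_block. The second call mirrors the
situation in this bug; without either check the loop would read the idom
of a NULL block.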

Now I have to figure out why we have this wrong input Phi node.


###@###.### 2004-05-25

I reproduced the problem with a small test.
The debug version of the VM gives the following assert:

# Java VM: Java HotSpot(TM) Server VM (1.4.1-internal-debug mixed mode)
#
# assert(!nocreate, "Cannot build a phi for a block already parsed.")
#
# Error ID: /net/jano/export/hotspot/users2/kvn/142_05/src/share/vm/opto/parse1.cpp, 1727 [ Patched ]

The problem is in Parse::ensure_phis_everywhere(), where we do not
generate Phis for monitors, assuming that they are fine.
But even monitors need Phis, though they are well-structured.
This is true for OSR methods (bug 4426707), and also for the rare cases where
a monitor object is the subject of a replace_in_map operation (this bug).

Also, we need to bail out of the compilation instead of crashing when we
get an incorrect graph.
                                     
2004-09-17
CONVERTED DATA

BugTraq+ Release Management Values

COMMIT TO FIX:
1.4.2_06
generic
tiger-rc

FIXED IN:
1.4.2_06
tiger-rc

INTEGRATED IN:
1.4.2_06
tiger-b56
tiger-rc


                                     
2004-09-17


