United StatesChange Country, Oracle Worldwide Web Sites Communities I am a... I want to...
JDK-5034475 : 64-bit hotspot: getfield doesn't sign-extend int (Regression since 1.4.0)

Details
Type:
Bug
Submit Date:
2004-04-20
Status:
Resolved
Updated Date:
2004-06-25
Project Name:
JDK
Resolved Date:
2004-05-03
Component:
hotspot
OS:
solaris_8
Sub-Component:
compiler
CPU:
sparc
Priority:
P2
Resolution:
Fixed
Affected Versions:
1.4.2_04
Fixed Versions:
1.4.2_06 (06)

Related Reports
Backport:

Sub Tasks

Description
This bug has two titles:

64-bit hotspot: getfield doesn't sign-extend int (Regression since 1.4.0)

and:

Tableswitch instruction crashes JVM: 64-bit, compiled method returning the switch parameter

The "Tableswitch" title describes the likely impact (the impact we've seen in 
the real world) but the first title describes the real problem (ie. maybe 
tableswitch is innocent).


Problem:
-------
The tableswitch codelet is entered with the switch value (int)
in o0.

In sparcv9 mode, with a hotspot-compiled method setting the switch paramter,
0x00000000ffffffff is in this register.  When the index into the
bytecode table is made, the result is an index way too large resulting
in a SEGV and crash.

An interpreted method will have put 0xffffffffffffffff in o0.  This satisfies
the arithmetic for finding the relevant place in the tableswitch bytecode table.

Impact:
------
This has been seen in a production application at a customer site,
and has now been reproduced with the test app below.

1.4.0 FCS and 1.4.0_04 are OK.  
1.4.1 and 1.4.2 in FCS and update versions appear bad, tested up to 1.4.2_04.
1.5 beta so far has refused to compile the getInt method in the testcase.

To reproduce:
------------

Save source below as Test.java and compile with:
	javac Test.java

Run with:
	java -d64 -XX:+PrintCompilation Test prime
	
As long as the getInt method got comiled, the VM exits with:

cheesypoof(5.9)$ java -d64 -XX:+PrintCompilation Test prime
Priming...
  1       Test::<init> (70 bytes)
  2       Test::doTest (119 bytes)
  1%      Test::<init> @ 25 (70 bytes)
  3       Test::getInt (5 bytes)
doing test...
i= -1

Unexpected Signal : 11 occurred at PC=0xFFFFFFFF3941572C
Function=[Unknown.]
Library=(N/A)

NOTE: We are unable to locate the function name symbol for the error
      just occurred. Please refer to release documentation for possible
      reason and solutions.


Current Java thread:

Dynamic libraries:
0x100000000     /net/cafebabe.uk/export/apps/products/java/jdk/prodn/j2sdk1.4.2_04/bin/sparcv9/java
0xffffffff7f300000      /usr/lib/64/libthread.so.1
0xffffffff7f500000      /usr/lib/64/libdl.so.1
0xffffffff7ef00000      /usr/lib/64/libc.so.1
0xffffffff7f100000      /usr/platform/SUNW,Sun-Blade-1000/lib/sparcv9/libc_psr.so.1
0xffffffff7e400000      /net/cafebabe.uk/export/apps/products/java/jdk/prodn/j2sdk1.4.2_04/jre/lib/sparcv9/server/libjvm.so
0xffffffff7e200000      /usr/lib/64/libCrun.so.1
0xffffffff7e000000      /usr/lib/64/libsocket.so.1
0xffffffff7de00000      /usr/lib/64/libnsl.so.1
0xffffffff7dc00000      /usr/lib/64/libm.so.1
0xffffffff7d900000      /usr/lib/64/libsched.so.1
0xffffffff7ed00000      /usr/lib/64/libw.so.1
0xffffffff7d600000      /usr/lib/64/libmp.so.2
0xffffffff7d400000      /usr/lib/64/librt.so.1
0xffffffff7d100000      /usr/lib/64/libaio.so.1
0xffffffff7cf00000      /usr/lib/64/libmd5.so.1
0xffffffff7cd00000      /usr/platform/SUNW,Sun-Blade-1000/lib/sparcv9/libmd5_psr.so.1
0xffffffff7c900000      /net/cafebabe.uk/export/apps/products/java/jdk/prodn/j2sdk1.4.2_04/jre/lib/sparcv9/native_threads/libhpi.so
0xffffffff7c100000      /net/cafebabe.uk/export/apps/products/java/jdk/prodn/j2sdk1.4.2_04/jre/lib/sparcv9/libverify.so
0xffffffff7be00000      /net/cafebabe.uk/export/apps/products/java/jdk/prodn/j2sdk1.4.2_04/jre/lib/sparcv9/libjava.so
0xffffffff7bc00000      /net/cafebabe.uk/export/apps/products/java/jdk/prodn/j2sdk1.4.2_04/jre/lib/sparcv9/libzip.so
0xffffffff2f800000      /usr/lib/locale/en_GB.ISO8859-1/sparcv9/en_GB.ISO8859-1.so.2

Heap at VM Abort:
Heap
 def new generation   total 2112K, used 155K [0xffffffff2fc00000, 0xffffffff2fe20000, 0xffffffff31150000)
  eden space 2048K,   7% used [0xffffffff2fc00000, 0xffffffff2fc26ea0, 0xffffffff2fe00000)
  from space 64K,   0% used [0xffffffff2fe00000, 0xffffffff2fe00000, 0xffffffff2fe10000)
  to   space 64K,   0% used [0xffffffff2fe10000, 0xffffffff2fe10000, 0xffffffff2fe20000)
 tenured generation   total 1408K, used 0K [0xffffffff31150000, 0xffffffff312b0000, 0xffffffff33c00000)
   the space 1408K,   0% used [0xffffffff31150000, 0xffffffff31150000, 0xffffffff31150200, 0xffffffff312b0000)
 compacting perm gen  total 16384K, used 1515K [0xffffffff33c00000, 0xffffffff34c00000, 0xffffffff37c00000)
   the space 16384K,   9% used [0xffffffff33c00000, 0xffffffff33d7af68, 0xffffffff33d7b000, 0xffffffff34c00000)

Local Time = Tue Apr 20 10:47:16 2004
Elapsed Time = 2
#
# HotSpot Virtual Machine Error : 11
# Error ID : 4F530E43505002EF 01
# Please report this error at
# http://java.sun.com/cgi-bin/bugreport.cgi
#
# Java VM: Java HotSpot(TM) 64-Bit Server VM (1.4.2_04-b05 mixed mode)
#
# An error report file has been saved as hs_err_pid6072.log.
# Please refer to the file for further information.
#
Abort(coredump)
cheesypoof(5.9)$ 


If the getInt method did not get hotspot-compiled, the app will not crash
the VM: this often happens the first time you run that version of java
with the testcase.  Running it a second time is generally faster due to caching,
so will usually compile getInt and show the crash...


That test application code in full:

=================================================================
//
// Crash the VM in a TableSwitch interpreter codelet
// java -d64 -XX:+PrintCompilation Test prime
//
// To NOT crash, run without -d64
// or without the "prime" flag to avoid hotspotting the getInt method
//
//


class Test {
	
	private int someInt = -1;

	private int getInt() {

		return someInt;

	}

	private void doTest(int delay) {
		int i = 0xffffffff;

		if (delay==0)
			return;

		i = getInt();
		System.out.println("i= " + i);

		switch (getInt()) {
			case -1:
			System.out.println("was -1");
			break;
			
			case 0:
			case 1:
			case 2:
			case 3:
			case 4:
			System.out.println("was NOT -1");
			break;
	
			default:
			System.out.println("was NOT -1");			
			break;

		} // end switch
		
		System.out.println("done switch");



	}

	Test(boolean prime) {
	int i=0;
	
		if (prime) {
			System.out.println("Priming...");
			for (int j=0; j<50000;j++) {
				i = getInt();
				doTest(0);
			}
			
			delay(2000);
		}
		
		System.out.println("doing test...");
		doTest(5000);
	}



	public static void main(String args[]) {

	Test t;
	
		if (args.length>0) {
			t = new Test(true);
		} else {
			t = new Test(false);
		}

	}

  private void delay(int time) {
    try {  Thread.sleep(time);
                } catch (InterruptedException e) {
                  System.out.println("Interrupted!");
                        e.printStackTrace();
                }
  }

}

=================================================================


                                    

Comments
EVALUATION

On sparcv9, a 32 bit int that resides in a register is only 32 "straight": the upper 32 bits of the register are undefined.

Since we are using the int to add to an address for indexing into the table, we must sign extend the int to be 64 bits long in TemplateTable::tableswitch().

###@###.### 2004-04-21

There is something to be said for the idea that the 64bit interpreter is
sloppy with the tosca for integers. It mostly attempts to keep the tosca
as 64bits but doesn't in a few places. If it had done everything as
64bits then the need for br/brx could have been eliminated as br would
do the equivalent of brx on 64bit and a whole class of bugs eliminated.
Since it doesn't the bandaid fix here is to specifically widen the
tosca to 64bits in tableswitch. An examination of similar bytecodes
suggests that this was the only one that was broken.

I added two attachments that reproduced the bug using 1.5 before it
was corrected.

###@###.### 2004-04-23
                                     
2004-04-23
PUBLIC COMMENTS

This bug has two titles:

64-bit hotspot: getfield doesn't sign-extend int (Regression since 1.4.0)

and:

Tableswitch instruction crashes JVM: 64-bit, compiled method returning the switch parameter

The "Tableswitch" title describes the likely impact (the impact we've seen in 
the real world) but the first title describes the real problem (ie. maybe 
tableswitch is innocent).
                                     
2004-07-08
SUGGESTED FIX

Add something like this at the top of tableswitch() in templateTable_sparc.cpp:

#ifdef _LP64
  // Sign extend int to 64 bits: we'll be adding
  // to an address below.
  __ sra( Otos_i, 0, Otos_i );
#endif

###@###.### 2004-04-21

*** 1,7 ****
  #ifdef USE_PRAGMA_IDENT_SRC
! #pragma ident "@(#)templateTable_sparc.cpp    1.228 04/01/05 16:38:35 JVM"
  #endif
  /*
   * Copyright 2004 Sun Microsystems, Inc.  All rights reserved.
   * SUN PROPRIETARY/CONFIDENTIAL.  Use is subject to license terms.
   */
--- 1,7 ----
  #ifdef USE_PRAGMA_IDENT_SRC
! #pragma ident "@(#)templateTable_sparc.cpp    1.229 04/04/23 10:43:24 JVM"
  #endif
  /*
   * Copyright 2004 Sun Microsystems, Inc.  All rights reserved.
   * SUN PROPRIETARY/CONFIDENTIAL.  Use is subject to license terms.
   */

*** 1577,1586 ****
--- 1577,1591 ----
    __ add(Lbcp, BytesPerInt, O1);
    __ and3(O1, -BytesPerInt, O1);
    // load lo, hi
    __ ld(O1, 1 * BytesPerInt, O2);     // Low Byte
    __ ld(O1, 2 * BytesPerInt, O3);     // High Byte
+ #ifdef _LP64
+   // Sign extend the 32 bits
+   __ sra ( Otos_i, 0, Otos_i );
+ #endif /* _LP64 */
+ 
    // check against lo & hi
    __ cmp( Otos_i, O2);
    __ br( Assembler::less, false, Assembler::pn, default_case);
    __ delayed()->cmp( Otos_i, O3 );
    __ br( Assembler::greater, false, Assembler::pn, default_case);
                                     
2004-07-08
WORK AROUND

Create a .hotspot_compiler file and exclude the problem method by specifying:

exclude path/to/class method

However, a real-world application may be vulnerable in MANY places.


Run without -d64 to revert to 32-bit mode.  Again, may not be practical.


Recode your switch statement: the value has to go directly from the compiled method into the switch to cause a problem.  If the returned int is stored in an intervening int, there is no problem.  ie.

                int x = getInt();
                switch (x) {

Rather than:
		switch(getInt()) {


Recoding to avoid JVM crashes is again not a practical thing to ask customers.
                                     
2004-07-08
CONVERTED DATA

BugTraq+ Release Management Values

COMMIT TO FIX:
1.4.2_06
generic
tiger-beta2

FIXED IN:
1.4.2_06
tiger-beta2

INTEGRATED IN:
1.4.2_06
tiger-beta2


                                     
2004-07-08



Hardware and Software, Engineered to Work Together