JDK-6242664 : String.offsetByCodePoints doesn't work for Strings returned by String.substring

Details
Type:
Bug
Submit Date:
2005-03-18
Status:
Resolved
Updated Date:
2010-08-03
Project Name:
JDK
Resolved Date:
2005-09-12
Component:
core-libs
OS:
linux
Sub-Component:
java.lang
CPU:
x86
Priority:
P3
Resolution:
Fixed
Affected Versions:
5.0
Fixed Versions:

Related Reports
Backport:

Sub Tasks

Description
FULL PRODUCT VERSION :
java version "1.5.0_01"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_01-b08)
Java HotSpot(TM) 64-Bit Server VM (build 1.5.0_01-b08, mixed mode)

ADDITIONAL OS VERSION INFORMATION :
Linux hylas 2.6.9-gentoo-r14 #1 Mon Jan 31 13:57:09 EST 2005 x86_64 AMD Athlon(tm) 64 Processor 3200+ AuthenticAMD GNU/Linux


EXTRA RELEVANT SYSTEM CONFIGURATION :
Tested in Netbeans 4.0

A DESCRIPTION OF THE PROBLEM :
If you get a String back from String.substring() and call .offsetByCodePoints(0,1) on it, the returned index appears to be relative to the source string (the one you called .substring() on) rather than to the substring itself.

This is incorrect, since the specification of String.substring() says that it returns a new String.


STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Run the sample code included.

EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
Since I'm running with a basic US locale, I expect the sample code to print

i=1  j=1  k=1

Certainly, there should be no difference between the values printed for j and k.

ACTUAL -
The sample code prints:

i=1  j=4  k=1



ERROR MESSAGES/STACK TRACES THAT OCCUR :
No error codes or exceptions are generated.

REPRODUCIBILITY :
This bug can be reproduced always.

---------- BEGIN SOURCE ----------
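package foo;

public class BugTestClass {

    public static void main(String[] args) {

        String myString = "abcdef";
        // Offset on the original string: expected 1
        int i = myString.offsetByCodePoints(0,1);
        // "def", obtained through substring()
        String sub = myString.substring(3);
        // Expected 1; the affected build prints 4
        int j = sub.offsetByCodePoints(0,1);
        // Same contents wrapped in a new String: prints 1
        int k = new String(sub).offsetByCodePoints(0,1);
        System.out.println("i=" + i + "  j=" + j + "  k=" + k);
    }
}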

---------- END SOURCE ----------
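
The expectation stated above (j and k must never differ) can also be written as a self-checking comparison. The following is a minimal sketch, not part of the original report: the class name SubstringOffsetCheck and the helper agreesWithCopy are hypothetical, and only public String methods are used. On a runtime without the defect both checks should report agreement; on the affected 5.0 build the substring result would be expected to differ from the copy.

```java
public class SubstringOffsetCheck {

    // Hypothetical helper: compares offsetByCodePoints on a substring with the
    // same call on an independent copy built from the substring's characters.
    static boolean agreesWithCopy(String source, int beginIndex) {
        String sub = source.substring(beginIndex);
        String copy = new String(sub.toCharArray());
        int codePoints = copy.codePointCount(0, copy.length());
        // Walk forward from index 0 by every valid number of code points.
        for (int n = 0; n <= codePoints; n++) {
            if (sub.offsetByCodePoints(0, n) != copy.offsetByCodePoints(0, n)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // The string from the report, plus one containing a surrogate pair.
        System.out.println(agreesWithCopy("abcdef", 3));
        System.out.println(agreesWithCopy("ab\uD83D\uDE00cd", 2));
    }
}
```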

CUSTOMER SUBMITTED WORKAROUND :
The workaround is to create a new String as in the example for k above, or like so:

String source = "abcdef";
String sub = new String( source.substring(3) );

Now sub.offsetByCodePoints will work as expected.
###@###.### 2005-03-18 09:18:16 GMT

                                    

Comments
EVALUATION

String.offsetByCodePoints needs to take the "offset" field of the String class into account.
                                     
2005-08-18
WORK AROUND

Use Character.offsetByCodePoints(string, index, codePointOffset).
###@###.### 2005-06-03 08:09:51 GMT
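
As an illustration of this workaround, here is a minimal sketch that calls the public static method Character.offsetByCodePoints(CharSequence, int, int) on a substring instead of the instance method. The class name CharacterOffsetWorkaround and the helper offsetInSubstring are hypothetical names introduced only for this example.

```java
public class CharacterOffsetWorkaround {

    // Hypothetical helper: takes a substring and asks Character, rather than
    // the String instance, to advance by the given number of code points.
    static int offsetInSubstring(String source, int beginIndex,
                                 int index, int codePointOffset) {
        String sub = source.substring(beginIndex);
        return Character.offsetByCodePoints(sub, index, codePointOffset);
    }

    public static void main(String[] args) {
        // The case from the report: "def", one code point forward from index 0.
        System.out.println(offsetInSubstring("abcdef", 3, 0, 1));
        // A substring that starts with a surrogate pair (two chars, one code point).
        System.out.println(offsetInSubstring("abc\uD83D\uDE00d", 3, 0, 1));
    }
}
```

The returned index is relative to the CharSequence passed in, so the first call yields 1 and the second yields 2.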
                                     
2005-06-03
SUGGESTED FIX

1) Change the 3rd parameter of the invocation of Character.offsetByCodePointsImpl from "offset+index" to "index" in String.offsetByCodePoints.
2) Make Character.offsetByCodePointsImpl aware of its "start" parameter. Currently, indexing into the array parameter "a" assumes that the array starts at zero rather than at "start". (Note the initialization of "x" and the use of "a[x++]", neither of which includes "start".)
3) Add tests of the code point methods that involve String.substring, so that they operate on strings whose "offset" fields are nonzero.

###@###.### 2005-04-05 13:23:00 GMT
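
The idea behind points 1) and 2) can be sketched outside the JDK. The code below is an illustrative stand-in, not the JDK source and not necessarily the fix that was applied: the class OffsetAwareImplSketch and its method are hypothetical, and the parameter names a, start and count are assumed from the description above. The helper receives an index relative to the slice, adds "start" on every array access, and returns a relative index.

```java
public class OffsetAwareImplSketch {

    // Hypothetical stand-in for an offset-aware helper. The slice occupies
    // a[start] .. a[start + count - 1]; index and the result are relative to it.
    static int offsetByCodePointsImpl(char[] a, int start, int count,
                                      int index, int codePointOffset) {
        int x = index;
        if (codePointOffset >= 0) {
            int i;
            for (i = 0; x < count && i < codePointOffset; i++) {
                // Step over one char, or two if it begins a surrogate pair.
                if (Character.isHighSurrogate(a[start + x++])
                        && x < count && Character.isLowSurrogate(a[start + x])) {
                    x++;
                }
            }
            if (i < codePointOffset) {
                throw new IndexOutOfBoundsException();
            }
        } else {
            int i;
            for (i = codePointOffset; x > 0 && i < 0; i++) {
                // Step back one char, or two if it ends a surrogate pair.
                if (Character.isLowSurrogate(a[start + --x])
                        && x > 0 && Character.isHighSurrogate(a[start + x - 1])) {
                    x--;
                }
            }
            if (i < 0) {
                throw new IndexOutOfBoundsException();
            }
        }
        return x;
    }

    public static void main(String[] args) {
        // Models "abcdef".substring(3) as a slice with start = 3, count = 3.
        char[] shared = "abcdef".toCharArray();
        System.out.println(offsetByCodePointsImpl(shared, 3, 3, 0, 1));
    }
}
```

With the slice modeled this way, the call in main corresponds to j in the report and yields 1 rather than 4.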
                                     
2005-04-05


