JDK-4080617 : API: String.trim() method not working for all Unicode characters with SPACE prop
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.lang
  • Affected Version: 1.1.2,1.1.4,1.1.6
  • Priority: P4
  • Status: Closed
  • Resolution: Not an Issue
  • OS: solaris_2.6,windows_95
  • CPU: x86,sparc
  • Submitted: 1997-09-20
  • Updated: 2004-06-22
  • Resolved: 2001-10-28
Description

Name: tb29552			Date: 09/19/97


The trim() method does not remove all types of "white space"; it ignores all the non-ASCII
spaces such as ideographic space (\u3000), em-space (\u2001), en-space (\u2000), etc.

Basically all Unicode characters with the SPACE property should be removed by trim(). There
are about 20 of them.

This is an annoying problem when you deal with Asian text, or text coming from desktop
publishing packages, where many different types of spaces are used.

Thanks


company - ILE , email - ###@###.###
======================================================================
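The behavior described in the report can be reproduced with a short check. This is a minimal sketch, not code from the original report; the class and method names (TrimReport, survivesTrim) are made up for illustration. It relies only on the documented behavior of String.trim(), which strips leading and trailing characters whose code is at or below U+0020.

```java
final class TrimReport {
    // Hypothetical helper: true when String.trim() leaves the string unchanged.
    static boolean survivesTrim(String s) {
        return s.trim().equals(s);
    }

    public static void main(String[] args) {
        String ascii = " text ";                  // ASCII spaces (U+0020)
        String ideographic = "\u3000text\u3000";  // ideographic space (U+3000)
        String wide = "\u2001text\u2001";         // U+2001, one of the wide spaces named in the report

        System.out.println(survivesTrim(ascii));        // false: ASCII spaces are removed
        System.out.println(survivesTrim(ideographic));  // true: U+3000 is left in place
        System.out.println(survivesTrim(wide));         // true: U+2001 is left in place
    }
}
```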

Comments
WORK AROUND

Name: tb29552			Date: 09/19/97

Rely on the character property not on the character value in trim().
======================================================================
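One way an application might apply this workaround itself is sketched below. It assumes that "the character property" means the Unicode space categories tested by Character.isSpaceChar(); the report does not say which API the submitter had in mind. The names SpaceTrim, trimSpaceChars and isTrimmable are hypothetical and are not part of java.lang.String.

```java
final class SpaceTrim {
    // Hypothetical helper: trims by character property instead of by value alone.
    static String trimSpaceChars(String s) {
        int start = 0;
        int end = s.length();
        while (start < end && isTrimmable(s.charAt(start))) {
            start++;
        }
        while (end > start && isTrimmable(s.charAt(end - 1))) {
            end--;
        }
        return s.substring(start, end);
    }

    // Assumption: keep everything trim() already removes (code <= U+0020)
    // and add the characters Character.isSpaceChar() reports as spaces.
    static boolean isTrimmable(char c) {
        return c <= ' ' || Character.isSpaceChar(c);
    }

    public static void main(String[] args) {
        String s = "\u3000 text\u2001\t";
        System.out.println("[" + trimSpaceChars(s) + "]"); // prints "[text]"
    }
}
```

Keeping the `c <= ' '` test means the helper never trims less than trim() does, so it could be swapped in where the stricter behavior is wanted without changing how ASCII input is handled.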
22-09-2004

PUBLIC COMMENTS ...
22-09-2004

EVALUATION

This is probably one of the cases where, whatever we choose, a significant number of applications won't be happy with our solution and will implement their own anyway. If we redefine the set of characters to be considered whitespace characters for trim(), we should take our cues from java.lang.Character.isWhitespace().
norbert.lindenberg@Eng 1998-09-09

The API spec should be revisited. The spec says that this method may be used to trim whitespace, which is a reference to Character.isSpace(), which is deprecated and replaced by isWhitespace(). Therefore, trim() should trim the whitespace characters defined by isWhitespace(); that, however, doesn't match the first description in the API spec. I'd consider this an API spec bug rather than an RFE.
masayoshi.okutsu@Eng 2000-02-09

While the current behavior may not be ideal for internationalized applications, it is to specification. The recent clarification to the spec for this method serves to make this more explicit. Furthermore, changing the behavior in the manner suggested could easily break existing applications. Perhaps a separate RFE should be filed for providing a new call (trimWhitespace() ?) to provide the desired functionality.
###@###.### 2001-10-28
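The trimWhitespace() call floated in the evaluation was only a suggestion. The sketch below shows what an application-level version based on Character.isWhitespace() might look like; the class name WhitespaceTrim and the static-method form are assumptions made here for illustration. It also shows why the choice of predicate matters: the no-break space U+00A0 is a space character according to Character.isSpaceChar() but is not whitespace according to Character.isWhitespace().

```java
final class WhitespaceTrim {
    // Hypothetical application-level helper built on Character.isWhitespace().
    static String trimWhitespace(String s) {
        int start = 0;
        int end = s.length();
        while (start < end && Character.isWhitespace(s.charAt(start))) {
            start++;
        }
        while (end > start && Character.isWhitespace(s.charAt(end - 1))) {
            end--;
        }
        return s.substring(start, end);
    }

    public static void main(String[] args) {
        // U+3000 and tab are both whitespace, so both ends are trimmed.
        System.out.println("[" + trimWhitespace("\u3000text\t") + "]"); // prints "[text]"

        // U+00A0 is excluded by isWhitespace(), so the length stays 5.
        System.out.println(trimWhitespace("\u00A0text").length());      // prints "5"

        System.out.println(Character.isSpaceChar('\u00A0'));  // true
        System.out.println(Character.isWhitespace('\u00A0')); // false
    }
}
```

Unlike trim(), this sketch does not remove control characters such as U+0000 that are at or below U+0020 but are not whitespace, which illustrates the compatibility concern raised in the last comment. Later JDK releases (Java 11 onward) added String.strip(), which is specified in terms of Character.isWhitespace().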
28-10-2001