Bug ID: JDK-6792400 Avoid loading of Normalizer resources for simple uses

Details
Type:
Bug
Submit Date:
2009-01-11
Status:
Resolved
Updated Date:
2010-07-29
Project Name:
JDK
Resolved Date:
2009-02-13
Component:
core-libs
OS:
generic
Sub-Component:
java.text
CPU:
generic
Priority:
P3
Resolution:
Fixed
Affected Versions:
6
Fixed Versions:
6u14 (b02)

Related Reports
Backport:

Sub Tasks

Description
Calling Normalizer.normalize() on a simple ASCII string should not cause unorm.icu to be loaded and parsed. This file is over 100k and its initialization is not cheap. Loading it also requires opening the resource.jar file and reading its directory.

In many situations Normalizer is used to "play it safe" even though it has no effect on the result of the program. E.g. Normalizer is used to get canonical names of certificates, but many names are ASCII and do not need to be normalized!
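
The claim that ASCII names need no normalization can be checked with the public java.text.Normalizer API. The following is only an illustrative sketch: the class name, the helper method and the sample certificate names are made up for this example and are not taken from the JDK sources.

```java
import java.text.Normalizer;

class AsciiNormalizationDemo {
    /** Returns true if normalizing s leaves it unchanged in all four Unicode forms. */
    static boolean unchangedByAllForms(String s) {
        for (Normalizer.Form form : Normalizer.Form.values()) {
            if (!Normalizer.normalize(s, form).equals(s)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Hypothetical certificate names, used only for illustration.
        String asciiName = "CN=Example Corp, O=Example, C=US";
        String nonAsciiName = "CN=Caf\u00e9"; // U+00E9 decomposes under NFD/NFKD

        System.out.println(unchangedByAllForms(asciiName));    // prints "true"
        System.out.println(unchangedByAllForms(nonAsciiName)); // prints "false"
    }
}
```

Every character in the range U+0000..U+007F is stable under NFC, NFD, NFKC and NFKD, which is why the ASCII name comes back unchanged.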

Unfortunately, some of these uses of Normalizer happen at application startup, e.g. webstart code performs verification of certificates and triggers full initialization of Normalizer.

Perhaps a simple check could determine whether a normalization request is trivial before full normalization is performed for the first time. See the suggested fix for a possible solution.

                                    

Comments
SUGGESTED FIX

diff -r e811a72bfbf4 addon/sun/text/normalizer/NormalizerBase.java
--- a/addon/sun/text/normalizer/NormalizerBase.java	Sun Jan 11 20:32:53 2009 +0300
+++ b/addon/sun/text/normalizer/NormalizerBase.java	Sun Jan 11 21:19:47 2009 +0300
@@ -699,11 +699,26 @@ public final class NormalizerBase implem
      * @param options The normalization options, ORed together (0 for no options).
      * @return String The decomposed string 
      * @stable ICU 2.6
-     */         
+     */
+    private static boolean stillLazy = true; 
+
     public static String decompose(String str, boolean compat, int options) {
         
         int[] trailCC = new int[1];
         int destSize=0;
+
+        if (stillLazy) {
+            char c[] = str.toCharArray();
+            for(int i=0;i<c.length;i++) {
+                if (c[i] > 127) {
+                    stillLazy = false;
+                    break;
+                }
+            }
+            if (stillLazy)
+                return str;
+        }
+
         UnicodeSet nx = NormalizerImpl.getNX(options);
         char[] dest;
                                     
2009-01-11
EVALUATION

Changed NormalizerBase.java to handle ASCII-only text specially, regardless of whether initialization is complete. Performance tests show that processing ASCII-only text is now 20 times faster, while the overhead for non-ASCII text is negligible.
                                     
2009-02-02
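
The approach described in the evaluation (an ASCII check applied on every call, independent of initialization state) might look roughly like the sketch below. This is not the actual change made to NormalizerBase.java: it is a hypothetical wrapper around the public java.text.Normalizer API, and the class and method names are assumptions made for illustration.

```java
import java.text.Normalizer;

class AsciiFastPath {
    /** Returns true if every char in s is in the ASCII range (0..127). */
    static boolean isAscii(CharSequence s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 127) {
                return false;
            }
        }
        return true;
    }

    /**
     * Normalizes s, returning it unchanged when it is pure ASCII so that
     * the full normalizer is never touched for such input.
     */
    static String normalize(String s, Normalizer.Form form) {
        if (isAscii(s)) {
            return s; // ASCII text is already normalized in every form
        }
        return Normalizer.normalize(s, form);
    }

    public static void main(String[] args) {
        String ascii = "CN=Example";
        // The fast path returns the very same String instance.
        System.out.println(normalize(ascii, Normalizer.Form.NFD) == ascii); // prints "true"
        // Non-ASCII input still goes through full normalization.
        System.out.println(normalize("\u00e9", Normalizer.Form.NFD).length()); // prints "2"
    }
}
```

Unlike the static stillLazy flag in the suggested fix, this variant keeps no shared mutable state, so it needs no synchronization and the ASCII shortcut stays available after the first non-ASCII string has been seen.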


