JDK-6636319 : Encoders should implement isLegalReplacement(byte[] repl)
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.nio.charsets
  • Affected Version: 7
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • OS: generic
  • CPU: generic
  • Submitted: 2007-12-01
  • Updated: 2025-07-23
  • Resolved: 2009-03-27
JDK 7
7 b53 : Fixed
Description
CharsetEncoder provides an implementation of isLegalReplacement,
but it is not very efficient.

     * <p> The default implementation of this method is not very efficient; it
     * should generally be overridden to improve performance.  </p>
     *
      */
    public boolean isLegalReplacement(byte[] repl) {

The intent was likely that JDK-provided Encoders would implement this,
but it has not happened.

This is particularly important because creation of an Encoder causes an
entire decoding operation to be performed, which makes creation of Encoders
much more expensive than creation of Decoders.
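
The construction cost described above can be observed with a small sketch. It assumes that the protected CharsetEncoder constructors validate the default replacement by calling isLegalReplacement. The class names ConstructionCostDemo and CountingEncoder are hypothetical and exist only for this illustration.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; not part of the JDK or of the fix for this bug.
class ConstructionCostDemo {
    // Static because the check runs inside the superclass constructor,
    // before any instance fields of the subclass are initialized.
    static final AtomicInteger CHECKS = new AtomicInteger();

    static final class CountingEncoder extends CharsetEncoder {
        CountingEncoder() {
            // Three-argument constructor: the replacement defaults to { (byte) '?' }.
            super(StandardCharsets.ISO_8859_1, 1.0f, 1.0f);
        }

        @Override
        public boolean isLegalReplacement(byte[] repl) {
            CHECKS.incrementAndGet();
            // Inherited check: decodes repl with a decoder for this charset.
            return super.isLegalReplacement(repl);
        }

        @Override
        protected CoderResult encodeLoop(CharBuffer in, ByteBuffer out) {
            // Encoding itself is irrelevant to this demo; consume nothing.
            return CoderResult.UNDERFLOW;
        }
    }

    /** Returns how many replacement checks a single construction triggered. */
    static int checksPerConstruction() {
        CHECKS.set(0);
        new CountingEncoder();
        return CHECKS.get();
    }

    public static void main(String[] args) {
        System.out.println("isLegalReplacement calls: " + checksPerConstruction());
    }
}
```

A non-zero count means every new encoder pays for the inherited check unless the subclass overrides it.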

For a number of Encoders, the implementation is a trivial { return true; }
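
As a minimal sketch of such a trivial override, consider an ISO-8859-1 style encoder, where every byte value decodes to a character. Latin1Encoder and TrivialReplacementDemo are hypothetical names for illustration, not the JDK's own classes, and surrogate handling is deliberately simplified.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

// Hypothetical encoder for illustration; not the JDK's own implementation.
class TrivialReplacementDemo {
    static final class Latin1Encoder extends CharsetEncoder {
        Latin1Encoder() {
            super(StandardCharsets.ISO_8859_1, 1.0f, 1.0f);
        }

        // Every byte value maps to a character in ISO-8859-1, so no
        // decoding pass is needed to validate a replacement.
        @Override
        public boolean isLegalReplacement(byte[] repl) {
            return true;
        }

        @Override
        protected CoderResult encodeLoop(CharBuffer in, ByteBuffer out) {
            while (in.hasRemaining()) {
                if (!out.hasRemaining())
                    return CoderResult.OVERFLOW;
                char c = in.get(in.position());  // peek without consuming
                if (c > 0xFF)                    // simplification: surrogates treated as unmappable
                    return CoderResult.unmappableForLength(1);
                in.get();                        // consume the character
                out.put((byte) c);
            }
            return CoderResult.UNDERFLOW;
        }
    }

    /** Encodes s with the sketch encoder; unmappable input is reported as an unchecked exception. */
    static byte[] encode(String s) {
        try {
            ByteBuffer bb = new Latin1Encoder().encode(CharBuffer.wrap(s));
            byte[] bytes = new byte[bb.remaining()];
            bb.get(bytes);
            return bytes;
        } catch (CharacterCodingException e) {
            throw new IllegalArgumentException(e);
        }
    }
}
```

The override is only valid because the charset has no illegal byte sequences; a charset such as US-ASCII or UTF-8 would need a real check.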

Comments
EVALUATION And here are results for the NewEncoderMicroBenchmark microbenchmark on solaris-sparc, showing almost an order of magnitude improvement:

$ for f in -server -client; do echo $f; mergeBench dolphin concurrent jr $f -dsa -da -Dcharset=UTF-8 NewEncoderMicroBenchmark; done
-server
Merged results for dolphin vs. concurrent running jr -server -dsa -da -Dcharset=UTF-8 NewEncoderMicroBenchmark
==> javac -Xlint:all NewEncoderMicroBenchmark.java
==> java -server -dsa -da -Dcharset=UTF-8 NewEncoderMicroBenchmark
Method             Millis  Ratio  vs. dolphin
new UTF-8 encoder    1261  1.000        0.172
-client
Merged results for dolphin vs. concurrent running jr -client -dsa -da -Dcharset=UTF-8 NewEncoderMicroBenchmark
==> javac -Xlint:all NewEncoderMicroBenchmark.java
==> java -client -dsa -da -Dcharset=UTF-8 NewEncoderMicroBenchmark
Method             Millis  Ratio  vs. dolphin
new UTF-8 encoder    2420  1.000        0.116
01-12-2007

EVALUATION Here's a microbenchmark testing the time to create an encoder for a given Charset:

/*
 * Copyright (c) 2007 Sun Microsystems, Inc.  All Rights Reserved.
 * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
 *
 * This code is free software; you can redistribute it and/or modify it
 * under the terms of the GNU General Public License version 2 only, as
 * published by the Free Software Foundation.
 *
 * This code is distributed in the hope that it will be useful, but WITHOUT
 * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
 * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
 * version 2 for more details (a copy is included in the LICENSE file that
 * accompanied this code).
 *
 * You should have received a copy of the GNU General Public License version
 * 2 along with this work; if not, write to the Free Software Foundation,
 * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
 *
 * Please contact Sun Microsystems, Inc., 4150 Network Circle, Santa Clara,
 * CA 95054 USA or visit www.sun.com if you need additional information or
 * have any questions.
 */

/*
 * This is not a regression test, but a micro-benchmark.
 *
 * I have run this as follows:
 *
 * for f in -client -server; do mergeBench dolphin . jr -dsa -da $f NewEncoderMicroBenchmark.java; done
 *
 * @author Martin Buchholz
 */

import java.util.*;
import java.nio.charset.*;
import java.util.concurrent.*;
import java.util.regex.Pattern;

public class NewEncoderMicroBenchmark {
    abstract static class Job {
        private final String name;
        public Job(String name) { this.name = name; }
        public String name() { return name; }
        public abstract void work() throws Throwable;
    }

    private static void collectAllGarbage() {
        final java.util.concurrent.CountDownLatch drained
            = new java.util.concurrent.CountDownLatch(1);
        try {
            System.gc();        // enqueue finalizable objects
            new Object() { protected void finalize() {
                drained.countDown(); }};
            System.gc();        // enqueue detector
            drained.await();    // wait for finalizer queue to drain
            System.gc();        // cleanup finalized objects
        } catch (InterruptedException e) { throw new Error(e); }
    }

    /**
     * Runs each job for long enough that all the runtime compilers
     * have had plenty of time to warm up, i.e. get around to
     * compiling everything worth compiling.
     * Returns array of average times per job per run.
     */
    private static long[] time0(Job ... jobs) throws Throwable {
        final long warmupNanos = 10L * 1000L * 1000L * 1000L;
        long[] nanoss = new long[jobs.length];
        for (int i = 0; i < jobs.length; i++) {
            collectAllGarbage();
            long t0 = System.nanoTime();
            long t;
            int j = 0;
            do { jobs[i].work(); j++; }
            while ((t = System.nanoTime() - t0) < warmupNanos);
            nanoss[i] = t/j;
        }
        return nanoss;
    }

    private static void time(Job ... jobs) throws Throwable {

        long[] warmup = time0(jobs); // Warm up run
        long[] nanoss = time0(jobs); // Real timing run
        long[] milliss = new long[jobs.length];
        double[] ratios = new double[jobs.length];

        final String nameHeader   = "Method";
        final String millisHeader = "Millis";
        final String ratioHeader  = "Ratio";

        int nameWidth   = nameHeader.length();
        int millisWidth = millisHeader.length();
        int ratioWidth  = ratioHeader.length();

        for (int i = 0; i < jobs.length; i++) {
            nameWidth = Math.max(nameWidth, jobs[i].name().length());

            milliss[i] = nanoss[i]/(1000L * 1000L);
            millisWidth = Math.max(millisWidth,
                                   String.format("%d", milliss[i]).length());

            ratios[i] = (double) nanoss[i] / (double) nanoss[0];
            ratioWidth = Math.max(ratioWidth,
                                  String.format("%.3f", ratios[i]).length());
        }

        String format = String.format("%%-%ds %%%dd %%%d.3f%%n",
                                      nameWidth, millisWidth, ratioWidth);
        String headerFormat = String.format("%%-%ds %%%ds %%%ds%%n",
                                            nameWidth, millisWidth, ratioWidth);
        System.out.printf(headerFormat, "Method", "Millis", "Ratio");

        // Print out absolute and relative times, calibrated against first job
        for (int i = 0; i < jobs.length; i++)
            System.out.printf(format, jobs[i].name(), milliss[i], ratios[i]);
    }

    private static Job[] filter(Pattern filter, Job[] jobs) {
        if (filter == null) return jobs;
        Job[] newJobs = new Job[jobs.length];
        int n = 0;
        for (Job job : jobs)
            if (filter.matcher(job.name()).find())
                newJobs[n++] = job;
        // Arrays.copyOf not available in JDK 5
        Job[] ret = new Job[n];
        System.arraycopy(newJobs, 0, ret, 0, n);
        return ret;
    }

    /**
     * Usage: [-Diterations=N] [-Dcharset=CHARSET] [-Dfilter=REGEXP]
     */
    public static void main(String[] args) throws Throwable {
        final int iterations = Integer.getInteger("iterations", 10000000);
        final String regex = System.getProperty("filter");
        final Pattern filter = (regex == null) ? null : Pattern.compile(regex);
        final String csn = System.getProperty("charset", "UTF-8");
        final Charset charset = Charset.forName(csn);

        Job[] jobs = {
            //----------------------------------------------------------------
            new Job("new " + csn + " encoder") {
                public void work() throws Throwable {
                    for (int i = 0; i < iterations; i++)
                        charset.newEncoder();
                }}
        };

        time(filter(filter, jobs));
    }
}
01-12-2007

EVALUATION Yes. It is easy to test the correctness of the implementations by confirming that they agree with an actual decoding operation on all possible short byte sequences.
01-12-2007
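
The correctness check suggested in the last evaluation could be sketched roughly as follows: enumerate all byte sequences up to a small length and compare a candidate fast check against a strict decoder. ReplacementAgreementCheck, decodes, countDisagreements and asciiLegal are hypothetical names chosen for this illustration, and the US-ASCII rule is only an example candidate, not the code that was committed.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.function.Predicate;

// Hypothetical test helper; names are illustrative only.
class ReplacementAgreementCheck {
    /** Reference answer: does a strict decoder accept these bytes? */
    static boolean decodes(Charset cs, byte[] bytes) {
        CharsetDecoder dec = cs.newDecoder()
            .onMalformedInput(CodingErrorAction.REPORT)
            .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            dec.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    /** Counts byte sequences of length 1..maxLen on which fast disagrees with decoding. */
    static int countDisagreements(Charset cs, Predicate<byte[]> fast, int maxLen) {
        int bad = 0;
        for (int len = 1; len <= maxLen; len++) {
            byte[] seq = new byte[len];
            long total = 1L << (8 * len);
            for (long n = 0; n < total; n++) {
                // Spread the counter n across the bytes of the sequence.
                for (int i = 0; i < len; i++)
                    seq[i] = (byte) (n >>> (8 * i));
                if (fast.test(seq) != decodes(cs, seq))
                    bad++;
            }
        }
        return bad;
    }

    /** Candidate fast check for US-ASCII: every byte must lie in 0x00..0x7F. */
    static boolean asciiLegal(byte[] repl) {
        for (byte b : repl)
            if (b < 0) return false;
        return true;
    }

    public static void main(String[] args) {
        int bad = countDisagreements(StandardCharsets.US_ASCII,
                                     ReplacementAgreementCheck::asciiLegal, 2);
        System.out.println("disagreements: " + bad);
    }
}
```

Keeping maxLen small (2 or 3) bounds the search to at most a few million sequences, which matches the "short byte sequences" wording in the evaluation.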