Name: jl125535 Date: 02/25/2003
A DESCRIPTION OF THE PROBLEM :
The Java language has some bit manipulation operations for
ints and longs (shifts, logical operations), but other
important ones are missing, e.g. rotate left/right,
count number of bits, find the first set bit, etc. Writing
these by hand is tedious, and avoids the opportunity for
incredibly fast implementations (see below).
The new methods could be static methods of java.lang.Math
(or maybe java.lang.Integer/Long), in the same way that the
primitive floating point operations are supplanted by
static methods like Math.sin(). For example:
public static int rollLeft(int val, int distance);
public static int rollRight(int val, int distance);
public static int countBits(int val);
public static int firstSetBit(int val);
public static long rollLeft(long val, int distance);
public static long rollRight(long val, int distance);
public static int countBits(long val);
public static int firstSetBit(long val);
etc.
SPEED ISSUES
It is *not* a solution to use java.util.BitSet, as this is
far too heavyweight for (say) counting the number of 1's in
a single int. Many situations in which these kind of bit
operations are used call for an extremely fast
implementation.
A major benefit of adding static methods operating on
ints/longs instead, is that on many CPUs there are
specialized instructions corresponding directly to these
methods. A clever JIT could 'inline' them using these
instructions, avoiding even the overhead of a JNI method
call. (I think some JIT's use this technique for floating
point methods in java.lang.Math?)
VECTOR INSTRUCTIONS
Taking this idea further, it may be possible to leverage
the existence of MMX/AltiVec-style vector instructions by
adding java.lang.Math methods for saturated addition, etc.
These could operate on ints/longs or maybe on arrays of
bytes (depending on the instructions they're simulating).
They might seem a little specialized for the Java
libraries, but would be justified by their extraordinary
speed. An analogous case is System.arraycopy(), which is
basically there to leverage the speed of the underlying CPU.
REPRODUCIBILITY :
This bug can be reproduced always.
CUSTOMER WORKAROUND :
Write your own bit manipulation methods. In some cases
(e.g. bit counting, vector operations) they could be orders
of magnitude slower than a native processor instruction.
(Review ID: 181745)
======================================================================