Bug ID: JDK-6921969 optimize 64 long multiply for case with high bits zero

Details
Type: Enhancement
Submit Date: 2010-02-01
Status: Resolved
Updated Date: 2010-04-02
Project Name: JDK
Resolved Date: 2010-02-09
Component: hotspot
OS: solaris_9
Sub-Component: compiler
CPU: sparc
Priority: P3
Resolution: Fixed
Affected Versions: hs17
Fixed Versions: hs17 (b09)

Description
Hi Tom, Christian, and others,

Here's a patch I'd like to contribute:
http://cr.openjdk.java.net/~rasbold/69XXXXX/webrev.00/

With it, C2 generates shorter long multiplication sequences on x86_32
when the high 32 bits are known to be zero.

In particular, this applies to the loop in BigInteger.mulAdd():

   private final static long LONG_MASK = 0xffffffffL;

   static int mulAdd(int[] out, int[] in, int offset, int len, int k) {
       long kLong = k & LONG_MASK;
       long carry = 0;

       offset = out.length-offset - 1;
       for (int j=len-1; j >= 0; j--) {
           long product = (in[j] & LONG_MASK) * kLong +
                          (out[offset] & LONG_MASK) + carry;
           out[offset--] = (int)product;
           carry = product >>> 32;
       }
       return (int)carry;
   }
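
To illustrate why the zero high bits matter, here is a minimal sketch, not the actual C2 change. It assumes a 32-bit target where a 64-bit multiply has to be assembled from 32-bit halves; the class and method names (LongMulSketch, mulGeneral, mulHighZero) are hypothetical and do not appear in the patch. When both operands are masked with LONG_MASK, the cross terms vanish and a single 32x32->64 multiply gives the full result.

```java
public class LongMulSketch {
    private static final long LONG_MASK = 0xffffffffL;

    // General case: low 64 bits of a 64x64 product, built from 32-bit halves.
    // This needs three multiplies: lo*lo plus the two cross terms.
    static long mulGeneral(long a, long b) {
        long aLo = a & LONG_MASK, aHi = a >>> 32;
        long bLo = b & LONG_MASK, bHi = b >>> 32;
        // Only the low 32 bits of the cross sum survive the shift below.
        long cross = aHi * bLo + aLo * bHi;
        return aLo * bLo + (cross << 32);
    }

    // Special case: valid only when the high 32 bits of both operands are zero.
    // The cross terms are then zero, so one multiply is enough.
    static long mulHighZero(long a, long b) {
        return (a & LONG_MASK) * (b & LONG_MASK);
    }

    public static void main(String[] args) {
        long a = 0xdeadbeefL;   // high 32 bits are zero
        long b = 0xffffffffL;   // high 32 bits are zero
        System.out.println(mulGeneral(a, b) == a * b);
        System.out.println(mulHighZero(a, b) == a * b);
    }
}
```

In mulAdd() both factors of the multiply, (in[j] & LONG_MASK) and kLong, have this form, which is what lets the compiler pick the shorter sequence.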

In my measurements, one of our internal microbenchmarks that uses
BigInteger.mulAdd sped up by about 12%. Also, SPECjvm2008's crypto.rsa
and crypto.signverify improved by about 7% and 2.3%, respectively.

                                    

Comments
EVALUATION

ChangeSet=http://hg.openjdk.java.net/jdk7/hotspot-comp/hotspot/rev/e8443c7be117,ChangeRequest=6921969
                                     
2010-02-04
EVALUATION

See description.
                                     
2010-02-04