JDK-8157485 : Improve 64-bit integer multiplication with Montgomery multiplication
  • Type: Enhancement
  • Component: security-libs
  • Affected Version: 9
  • Priority: P4
  • Status: Open
  • Resolution: Unresolved
  • Submitted: 2016-05-20
  • Updated: 2022-05-11
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
Other
tbdUnresolved
Related Reports
Relates :  
Relates :  
Description
Per the comments made on core-libs-dev [1], investigate improving the additions to java.lang.Math in the fix for JDK-5100935 using algorithms from [2].

[1] http://mail.openjdk.java.net/pipermail/core-libs-dev/2016-May/041263.html
[2] http://cetinkoc.net/docs/j37.pdf

Note: The link to the reference paper cited in [1] was http://koclab.cs.ucsb.edu/docs/koc/j37.pdf which is now not found.
Comments
TD/DR: Our modPow is basically decent, but not great. We could do better, if we wanted to, but at the cost of importing some complex and difficult-to-verify software. The state of the art for Montgomery Multiplication, which substantially dominates all of modPow() and therefore a big slab of public-key crypto, is at https://github.com/openssl/openssl/blob/master/crypto/bn/asm/x86_64-mont5.pl. The OpenSSL implementation of RSA is 40% - 60% faster than OpenJDK on x86. Firstly, OpenSSL Montgomery Multiplication uses hand-carved assembly language which is unrolled, very carefully written, and takes care to fill all of the holes caused by instruction latencies with useful work. Secondly, Intel has added some instructions, https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/ia-large-integer-arithmetic-paper.pdf. These instructions are intended to support this particular algorithm. We don't use them. We could simply import the Montgomery Multiplication from OpenSSL into OpenJDK. We already have some Apache-licensed crypto in OpenJDK, so it wouldn't require much additional legal input. However, the OpenSSL code is complex and looks like it would be difficult to maintain if anything came up, so we'd have to keep up with bug fixes from them. As far as I know it's impossible to write a really fast implementation of bignum modPow in any high-level language. The Montgomery multiplication algorithm isn't terribly complicated, but compilers and big out-of-order machines can't get very close to hand-carved assembly language.
13-07-2021

See https://www.microsoft.com/en-us/research/wp-content/uploads/1996/01/j37acmon.pdf
09-07-2021