The existing implementation of `sun.security.provider.SHA3::implCompress` is written in Java. When C2 compiles it to native code, the result does not vectorize well and retains Java-isms such as array-bounds checks. That hurts performance for an algorithm that may be used to validate large amounts of data.

Armv8.2 optionally provides "ARMv8.2-SHA, SHA2-512 and SHA3 functionality". To speed things up, we could create an intrinsic that implements the algorithm using the new SIMD instructions:

- EOR3 Three-way Exclusive OR (page C7-1479)
- RAX1 Rotate and Exclusive OR (page C7-1892)
- XAR Exclusive OR and Rotate (page C7-2303)
- BCAX Bit Clear and Exclusive OR (page C7-1418)

This would also help eliminate the Java-isms. It is similar to the approach taken by the intrinsics for SHA1/SHA256/SHA512.

Reference implementation of the core SHA-3 transform using ARMv8.2 Crypto Extensions:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/arch/arm64/crypto/sha3-ce-core.S?h=v5.4.52

Initial implementation:
http://cr.openjdk.java.net/~fyang/8252204/webrev.00/

Using a cycle-accurate AArch64 simulator, we ran test/micro/org/openjdk/bench/java/security/MessageDigests.java to measure the performance gain. We observed a 20% to 40% improvement, depending on the SHA3 digest length and the size of the message.
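To make the role of the four instructions concrete, here is a minimal scalar sketch in Java of what each one computes on a single 64-bit lane, and how they could map onto the theta, rho and chi steps of Keccak-f[1600]. It is an illustration only, not the proposed intrinsic: the class name `Sha3SimdOps` and the helpers `thetaD`, `thetaRhoLane` and `chiRow` are hypothetical, and the state is assumed to be a `long[25]` with lane (x, y) stored at index `x + 5*y`.

```java
// Hypothetical illustration: scalar equivalents of the Armv8.2 SHA3 instructions.
class Sha3SimdOps {
    // EOR3: three-way exclusive OR.
    static long eor3(long n, long m, long a) { return n ^ m ^ a; }

    // RAX1: XOR the first operand with the second rotated left by one bit.
    static long rax1(long n, long m) { return n ^ Long.rotateLeft(m, 1); }

    // XAR: XOR the two operands, then rotate the result right by imm bits.
    static long xar(long n, long m, int imm) { return Long.rotateRight(n ^ m, imm); }

    // BCAX: XOR the first operand with (second AND NOT third).
    static long bcax(long n, long m, long a) { return n ^ (m & ~a); }

    // Theta, part 1: column parities via EOR3, then D[x] = C[x-1] ^ rol(C[x+1], 1) via RAX1.
    static long[] thetaD(long[] a) {
        long[] c = new long[5];
        for (int x = 0; x < 5; x++) {
            c[x] = eor3(eor3(a[x], a[x + 5], a[x + 10]), a[x + 15], a[x + 20]);
        }
        long[] d = new long[5];
        for (int x = 0; x < 5; x++) {
            d[x] = rax1(c[(x + 4) % 5], c[(x + 1) % 5]);
        }
        return d;
    }

    // Theta, part 2, fused with rho: rol(lane ^ d, r) expressed as a right rotation for XAR.
    static long thetaRhoLane(long lane, long d, int rotLeft) {
        return xar(lane, d, (64 - rotLeft) & 63);
    }

    // Chi on one row of five lanes: out[x] = row[x] ^ (~row[x+1] & row[x+2]) via BCAX.
    static long[] chiRow(long[] row) {
        long[] out = new long[5];
        for (int x = 0; x < 5; x++) {
            out[x] = bcax(row[x], row[(x + 2) % 5], row[(x + 1) % 5]);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Long.toHexString(xar(0xF0L, 0x0FL, 4)));
    }
}
```

In the SIMD form each instruction would operate on two 64-bit lanes per vector register, so the same operations cover the state with fewer instructions and without per-element bounds checks.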