Please review the following webrev, which adds intrinsic support to allow some of the com/sun/crypto/provider methods to use AES instructions when a processor supports such instructions.

Modern x86 processors have AES instructions to accelerate AES encryption and decryption, but HotSpot does not have a way to generate such instructions. There is a way to hook in a native crypto library using PKCS11, and there are a few native libraries that support hardware AES instructions. However, these native PKCS11 libraries

  * do not scale well with multiple threads
  * are not supported on all platforms, for instance HotSpot does not have PKCS11 support on 64-bit Windows.
  * can be confusing to configure.

Since this webrev adds intrinsic support for the default com/sun/crypto/provider classes, they are supported on all platforms and there is no additional configuration required. Measurements have shown that they scale very well with multiple threads.

The rest of this mail describes the scope of the intrinsics and summarizes the source file changes.

-- Tom Deneau

Scope of the Intrinsics
-----------------------
When creating a cipher, the application specifies a "transformation" consisting of "algorithm/mode/padding". For more details see http://docs.oracle.com/javase/7/docs/api/javax/crypto/Cipher.html

* These intrinsics kick in only when the algorithm part is "AES". A single block in AES is always 16 bytes and there are intrinsics for encrypting or decrypting a single block. These single-block intrinsics can work with any mode that uses AES and with any of the three AES key sizes (128, 192 or 256 bit).

* A more optimized multi-block intrinsic can kick in if the algorithm/mode is "AES/CBC" (Cipher Block Chaining). Again, all three AES key sizes are supported. There is no technical reason why we couldn't do multi-block intrinsics for the other modes (e.g., ECB), but I want to get some feedback from the reviewers on the implementation before charging off on this path.

* The padding part is handled by Java routines outside of these intrinsics.

Summary of Changes
------------------
http://cr.openjdk.java.net/~tdeneau/aes-intrinsics/webrev.01/

src/cpu/x86/vm/assembler_x86.cpp, hpp
   Defined the AES instructions which are used by the stub routines.

src/cpu/x86/vm/stubGenerator_x86_64.cpp,
   Actual stub code for the AES intrinsics. As described earlier, there are both single-block and multi-block intrinsic stubs.

   Note that the stubs make use of the "expanded key", which gets created each time the key changes. The expanded key is used by both the Java code and the intrinsic AES instructions. The Java code stores the "expanded key" in big-endian 32-bit integers. The x86 AES instructions require the expanded key to be in little-endian 128-bit words. Hence the pshufb instructions to get the key into the little-endian format.

src/cpu/x86/vm/vm_version_x86.cpp, hpp
   Detect and store the AES capability bit in cpuid. A global boolean command line flag UseAES can be used to turn off AES even if the hardware supports it.

src/share/vm/classfile/vmSymbols.hpp
src/share/vm/opto/runtime.cpp, hpp
   The usual definitions of class names, method names and signatures for the Java methods that are being intrinsified, and the signatures for the stubs.

src/share/vm/oops/methodOop.cpp
   Up until now, every intrinsic was replacing a routine that was loaded by the "default" (NULL) class loader. com/sun/crypto/provider is not loaded by the default class loader, so we had to add a check here.

src/share/vm/opto/escape.cpp
   Escape analysis knows about certain stubs, but if it sees a leaf stub it also checks against a predefined list. So the new intrinsic names were added to the list.

src/share/vm/opto/library_call.cpp
src/share/vm/opto/callGenerator.cpp
src/share/vm/opto/doCall.cpp
   The main logic for building up the calls to the stubs at compile time, assuming the platform has a stub and the global flags have not turned these intrinsics off.

   A new helper routine to load a field from an object was added, since we ended up loading fields in a few places.

   For best performance, we wanted to hook into the multi-block encrypt and decrypt methods such as in CipherBlockChaining.java. This code is not AES-specific but handles CBC mode for any algorithm. (The algorithm part is handled by the enclosed "embeddedCipher" object.) Thus at runtime we want to do the equivalent of an instanceof check on embeddedCipher and either call the stub (if it is AESCrypt) or call the original Java code (if it is some other algorithm type). For CipherBlockChaining.decrypt there is a further runtime check that the source and destination are not the same array, a case which, because of the way CBC works, would require cloning the source (the cipher text).

   Vladimir added some infrastructure to generate predicated intrinsics to solve the above problem. A particular intrinsic need only specify that it is predicated, and generate the particular guard node which, if false, will take the Java path. This infrastructure can be used for future intrinsics that have to make such a runtime choice. These changes from Vladimir are in callGenerator.cpp, doCall.cpp, and a small bit in library_call.cpp.

src/share/vm/runtime/globals.hpp
   Global flags were added to
   * turn off either AES encryption or AES decryption intrinsics separately
   * turn off the multi-block CBC/AES intrinsics.
   By default all of the above are on. These are really there for testing; for example, one could encrypt using Java and decrypt using the intrinsics. Also, a UseAES flag to ignore the hardware capability, as described above.
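
A round trip through the standard javax.crypto API is one simple way to exercise the code paths discussed in this mail. The sketch below is illustrative only: the class name AesCbcRoundTrip and the helper run are made up for this example, and it assumes the default provider resolves "AES/CBC/PKCS5Padding" to the com/sun/crypto/provider classes. Whether the intrinsics actually kick in depends on the JIT and the hardware, which this code cannot observe.

```java
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical example class; not part of the webrev.
class AesCbcRoundTrip {

    // Runs one AES/CBC/PKCS5Padding operation in the given mode
    // (Cipher.ENCRYPT_MODE or Cipher.DECRYPT_MODE).
    static byte[] run(int mode, byte[] key, byte[] iv, byte[] input) throws Exception {
        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        cipher.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
        return cipher.doFinal(input);
    }

    public static void main(String[] args) throws Exception {
        SecureRandom random = new SecureRandom();
        byte[] key = new byte[16]; // 128-bit key; 192 and 256 bit are the other AES sizes
        byte[] iv = new byte[16];  // CBC needs one block (16 bytes) of IV
        random.nextBytes(key);
        random.nextBytes(iv);

        byte[] plain = "some text longer than one AES block".getBytes(StandardCharsets.UTF_8);
        byte[] encrypted = run(Cipher.ENCRYPT_MODE, key, iv, plain);
        byte[] decrypted = run(Cipher.DECRYPT_MODE, key, iv, encrypted);

        System.out.println("round trip ok: " + Arrays.equals(plain, decrypted));
    }
}
```

Because the flags only select between the Java path and the stub path, the same program should produce identical results whichever combination of them is enabled.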
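
The runtime choice described above for CipherBlockChaining can be pictured in plain Java. This is only a model of the guard logic, not the C2 code from the webrev: the types BlockCipher, AesCipher and OtherCipher and the method names are hypothetical stand-ins for embeddedCipher and AESCrypt, and the returned strings simply name which path would be taken.

```java
// Hypothetical model of a predicated intrinsic's guard; not HotSpot code.
class PredicatedDispatchSketch {

    // Stand-in for the type of the "embeddedCipher" field.
    interface BlockCipher { }

    // Stand-in for AESCrypt.
    static final class AesCipher implements BlockCipher { }

    // Stand-in for any other algorithm handled by CBC mode.
    static final class OtherCipher implements BlockCipher { }

    // Encrypt guard: the stub is used only when the embedded cipher is AES.
    static String encryptPath(BlockCipher embedded) {
        return (embedded instanceof AesCipher) ? "stub" : "java";
    }

    // Decrypt guard: additionally requires that source and destination are
    // different arrays, since in-place CBC decryption would need a copy of
    // the source.
    static String decryptPath(BlockCipher embedded, byte[] src, byte[] dst) {
        boolean guard = (embedded instanceof AesCipher) && (src != dst);
        return guard ? "stub" : "java";
    }

    public static void main(String[] args) {
        byte[] a = new byte[32];
        byte[] b = new byte[32];
        System.out.println(encryptPath(new AesCipher()));
        System.out.println(encryptPath(new OtherCipher()));
        System.out.println(decryptPath(new AesCipher(), a, b));
        System.out.println(decryptPath(new AesCipher(), a, a));
    }
}
```

When the guard is false the call falls back to the original Java method, so the result is the same on both paths and only the speed differs.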