JDK-8317521 : Enhance detection of integral unsigned comparisons
  • Type: Enhancement
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 22
  • Priority: P4
  • Status: Open
  • Resolution: Unresolved
  • Submitted: 2023-10-04
  • Updated: 2023-10-05
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
Other
tbdUnresolved
Related Reports
Relates :  
Relates :  
Description
When Java code needs to perform an "unsigned int" comparison, the conventional ways to do that is to use a pattern like `(MIN_VALUE ^ x) < (MIN_VALUE ^ y)` or `(MIN_VALUE + x) < (MIN_VALUE + y)` (or any combination permitted by commutativity).

Currently, the first pattern is translated by C2 exactly as written in Java, using two ^ and a signed < instructions. The second, however, is recognized as an "unsigned comparison pattern" and translated directly to an unsigned comparison, removing the two + operations.

Since both pattern are used by Java programmers when they need to express unsigned comparisons, the first pattern should be recognized as an "unsigned comparison pattern" as well.

Similarly for the `long` case.
Comments
Optimizing the " +- Integer.MIN_VALUE <=> y +- Integer.MIN_VALUE" pattern was added only recently with JDK-8276162. [~qamai] maybe it should simply be extended to cover the XOR case as well?
05-10-2023

Suggestion: Have an ideal graph rule which unconditionally changes X^MIN_INT to X+MIN_INT, for both 32 and 64 bits. Leave the intrinsic(s) alone. Maybe add a backend rule which code-generates X+MIN_INT and X-MIN_INT as a bit-xor operation, if one exists in the hardware (true on x86 and ARM), AND it is profitable. Rationale: ADD is more frequently optimized than XOR in the middle end of C2. So convert XOR to ADD. Re-convert to XOR (if profitable) only very late and on platforms where it counts. Consider having the ideal graph rule also for sub-word types. For example: byte b = …; b ^= 0x0080; byte c = 0x0080; b ^= c; might compile to: int b = SignExtendFrom8(…); b ^= 0x0080; int c = SignExtendFrom8(0x0080); b ^= c; // c = 0xFFFFFF80 which in the IR contains XORs of 0x0080 or 0xFFFFFF80. Those can be strength-reduced, at least in some cases, by additional ideal rules. In the case of 0x0080, the reduction to addition only works if the (b^0x0080) node is followed by a sign-extension. Which is probably common. In any case, I think adding rules like this is the high-leverage way to improve things. Not kludgey ad-hoc pattern matching in specific intrinsics. The general rule is (something like) that an XOR of a mask with only leading 1-bits on the right, followed by 0-bits on the left, is the same as an SUB of the same mask, as long as the range of the variable operand is suitably constrained, so that its own corresponding high bits are either all 0-bits or all 1-bits.
04-10-2023