JDK-8286795 : better 64-bit unsigned remainder
  • Type: Enhancement
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 19
  • Priority: P4
  • Status: Open
  • Resolution: Unresolved
  • Submitted: 2022-05-16
  • Updated: 2024-04-29
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
Other
tbdUnresolved
Related Reports
Relates :  
Relates :  
Description
The current implementation of Long.remainderUnsigned appeals to Hackers Delight 9-3, but I think in part it falls short of using Warren's suggested technique, of halving the dividend and doubling the result (with a fixup).

Long.remainderUnsigned, as a stand-alone operation, performs an unsigned division, and then works backwards with a multiply to find out the remainder.  It can instead use Warren's technique directly.  Cut the dividend in half (`>>>1`), take the *remainder* (not the *quotient*, as in the current code), and then repair the approximate result by doubling it and adding the shifted-out bit.  Overflow can be avoided by using available signed-negative values.  The existing flow-free coding style can still be used to perform the fixup.

Here is a sketch of the code which avoids the extra multiply by working directly toward the remainder:

```
assert(dividend >= 0);  // negative dividend handled elsewhere as today
long r = ((dividend >>> 1) % divisor) << 1;  // use % not / here
r += (dividend & 1);   // reintroduce the shifted-out bit
r -= divisor;  // cancel impending overflow; set bit 63 only for a true negative
r += ((r >> (Long.SIZE - 1)) & divisor);  // increment by divisor if negative
assert(r >= 0 && r < dividend);  // must be
return r;
```

The above technique benefits platforms which do not have AD file support for unsigned remainder (UModL).

For platforms which *do* support UModL directly, the assembly code performs a runtime sign check which could in principle be avoided if static type analysis proves it is never taken.

This can be done in part by enhancing `UModLNode::Ideal` to replace the current node by a *signed* mod node, when the inputs are both always sign-positive.  This would then unlock further optimizations on the signed mod IR node.

Likewise, `UDivLNode::Ideal` should convert itself to signed div node if the inputs are both sign-positive.

Further more (on top of the previous suggestion) I suggest that the dynamic test in the assembly code be lifted out into IR.  That way it can be folded more systematically, and also the Ideal methods just mentioned above will apply more frequently, on the relevant IR paths.

Ultimately, the test for a negative divisor should be represented fully in IR flow or c-moves (for both div and mod cases), so that the AD file can handle the case of only positive divisors.  The fancy flow-free code in the assembly code for the negative divisor path should probably be handled by C2's logic for doing flow-free conditionals; that would be a better factoring of concerns than placing little bits of flow-free lore here and there in the assembler.
Comments
Corrected 3rd to last line in the snippet of the description to use signed right shit >> rather than unsigned >>>.
15-12-2022

Thanks [~jrose] for the remark. I'll open a similar core-libs issue for the Java code and assign it to myself.
15-12-2022