JDK-8158040 : G1 card indices overflow with heaps > 1TB
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: 9
  • Priority: P2
  • Status: Resolved
  • Resolution: Not an Issue
  • Submitted: 2016-05-27
  • Updated: 2018-02-01
  • Resolved: 2016-06-10
Description
If one specifies a 2 TB heap with G1, the CardIdx_t type (which is an int) cannot represent all cards in the heap anymore.

That is, 2^31 cards * 2^9 bytes per card = 2^40 bytes = 1 TB is the largest heap a signed 32-bit card index can cover.

There are several calculations in the code (like actually calculating the card index for a given heap word :)) that are in danger of overflowing in this case.

Note that there are applications using G1 with 1 TB heaps right now, and a customer intends to test on larger heaps in the near future.

Since the card size is hardcoded as 512 bytes, there is no workaround.

It would probably be best to use a size_t for it, and to make sure that all operands in calculations producing CardIdx_t values are wide enough as well.
Comments
After doing an initial evaluation and talking with Thomas and Kim, this does not appear to be an issue. It was thought that the CardIdx_t type (which is an int) was used to index cards in the entire heap. It turns out that it is an index into cards within a single region. Since the maximum region size is 32 MB and the card size is a fixed 512 bytes, the "int" cannot overflow. While investigating this CR, I tried changing the card size to 128. There are apparently dependencies in the code on the card size being 512, because it crashed. I filed this enhancement: JDK-8158687, "Card size is fixed at 512 bytes. It would be nice to be able to change this for performance."
04-06-2016

Does changing CardIdx_t from int to size_t fix the problem in the comment above? Would searching for CardTableModRefBS::card_shift and checking for potential problems be enough to solve the problem?
01-06-2016

One particularly problematic place seems to be heapRegionRemSet.cpp:103ff:

  102   size_t hw_offset = pointer_delta((HeapWord*)from, loc_hr->bottom());
  103   CardIdx_t from_card = (CardIdx_t)
  104     hw_offset >> (CardTableModRefBS::card_shift - LogHeapWordSize);

On 1 TB heaps (2^40 bytes), hw_offset is in the range of 2^37 (pointer_delta() implicitly shifts right by 3, the log of the heap word size). In line 103 we assign from_card the result of hw_offset shifted right by 6 (CardTableModRefBS::card_shift is the constant 9, minus LogHeapWordSize = 3). The result is in the range of 2^31. This does not give an overflow yet, but for heaps between 1 and 2 TB we overflow into negative from_card values (which might or might not be bad depending on the following code - I think that it is problematic), and starting at heaps > 2 TB the cast to CardIdx_t chops off significant bits from the result, ultimately losing information. There may be other similar cases. This change should also try to reduce the number of casts in the code. I fixed the CR title to read ">", not ">=".
31-05-2016