JDK-8041956 : remove ByteSize and WordSize classes
  • Type: Enhancement
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 9
  • Priority: P4
  • Status: Closed
  • Resolution: Won't Fix
  • OS: generic
  • CPU: generic
  • Submitted: 2014-04-25
  • Updated: 2020-09-24
  • Resolved: 2019-01-08
Description
This is a placeholder for discussing how we could remove ByteSize and WordSize.

Here is a comment from sizes.hpp that explains what these classes are for:

// The following two classes are used to represent 'sizes' and 'offsets' in the VM;
// they serve as 'unit' types. ByteSize is used for sizes measured in bytes, while
// WordSize is used for sizes measured in machine words (i.e., 32bit or 64bit words
// depending on platform).
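
To make the "unit type" idea concrete, here is a minimal sketch of how such types can stop a byte count from being silently used as a word count. It is not the actual content of sizes.hpp: the enum-based definitions, the to_bytes helper and the kBytesPerWord constant are assumptions made for illustration; only the names ByteSize, WordSize, in_bytes and in_words come from this report.

```cpp
#include <cassert>

// Hypothetical stand-ins for the unit types; NOT the real sizes.hpp definitions.
// A scoped enum with a fixed underlying type has no implicit conversion to int,
// so the compiler rejects accidental mixing of byte and word quantities.
enum class ByteSize : int {};
enum class WordSize : int {};

// Explicit wrap/unwrap helpers (in_ByteSize and in_WordSize are assumed names).
constexpr ByteSize in_ByteSize(int n) { return static_cast<ByteSize>(n); }
constexpr WordSize in_WordSize(int n) { return static_cast<WordSize>(n); }
constexpr int in_bytes(ByteSize s) { return static_cast<int>(s); }
constexpr int in_words(WordSize s) { return static_cast<int>(s); }

// Assumed word width for this sketch: the size of a pointer on the host.
constexpr int kBytesPerWord = static_cast<int>(sizeof(void*));

// Hypothetical conversion from a word count to a byte count.
constexpr ByteSize to_bytes(WordSize w) {
  return in_ByteSize(in_words(w) * kBytesPerWord);
}

// int raw = in_ByteSize(16);   // would not compile: no implicit conversion
```

The price of this safety is the explicit in_bytes or in_words call at every point where a raw integer is needed, which is the verbosity discussed in the rest of this report.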

One problem is that these types add considerable verbosity, for example in assembler code, because getting a byte-sized offset requires a call to in_bytes, e.g.:

  add(cache, cache, in_bytes(ConstantPoolCache::base_offset()));

The other downside is that a number of methods carry an "in_bytes" suffix, e.g.:

  static int pool_holder_offset_in_bytes()  { return offset_of(ConstantPool, _pool_holder); }

An ugly example of how word sizes are used is:

  static int size(int length)                    { return align_object_size(header_size() + length * in_words(ConstantPoolCacheEntry::size())); }

and ConstantPoolCacheEntry::size() is then also used as:

    assert(in_words(ConstantPoolCacheEntry::size()) == 4, "adjust shift below");

Too many in_words calls for my taste; it makes the code difficult to read.
Comments
Runtime Triage: This is not on our current list of priorities. We will consider this feature if we receive additional customer requirements.
08-01-2019

Word-based sizing is ancient history. It was an expedient that let size calculations fit in 32 bits (ints) on 32-bit systems. It also helps avoid overflow, since you gain an extra factor of four before overflowing. All of these are excuses. Ideally we should use a single type for sizes, and that type should simply represent sizes in natural address units (which are bytes). The reason size_t is not exactly right is that we are sometimes sensitive to overflow. Perhaps we should define our own ByteSize (and get rid of WordSize). The difference between size_t and ByteSize would be that operations on ByteSize are resistant to overflow, unlike size_t, for which there is no standard way to defend against overflow. Overflow detection is not just a preference; it is a security mandate.
25-04-2014
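
The overflow-resistant byte size suggested in the comment above might look roughly like the following. This is only a sketch under stated assumptions: the class name CheckedByteSize, its sticky overflow flag and its member names are hypothetical and do not correspond to anything in HotSpot.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Hypothetical overflow-aware byte size; all names here are illustrative only.
// Arithmetic records overflow in a sticky flag instead of wrapping silently.
class CheckedByteSize {
 public:
  explicit constexpr CheckedByteSize(std::size_t v)
      : _value(v), _overflowed(false) {}

  std::size_t value() const { return _value; }
  bool overflowed() const { return _overflowed; }

  CheckedByteSize operator+(CheckedByteSize rhs) const {
    // Unsigned wraparound is well-defined, so compute first and flag afterwards.
    CheckedByteSize r(_value + rhs._value);
    r._overflowed = _overflowed || rhs._overflowed ||
                    rhs._value > std::numeric_limits<std::size_t>::max() - _value;
    return r;
  }

  CheckedByteSize operator*(std::size_t n) const {
    CheckedByteSize r(_value * n);
    // The product overflows exactly when _value exceeds max / n (for n != 0).
    r._overflowed = _overflowed ||
                    (n != 0 && _value > std::numeric_limits<std::size_t>::max() / n);
    return r;
  }

 private:
  std::size_t _value;
  bool _overflowed;
};
```

A caller would check overflowed() once at the end of a size computation rather than after every step; whether a sticky flag, an assertion or a saturating value is the right policy is left open here.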

A laudable goal. Unfortunately, a lot of code deals in words (or worse, HeapWords), and the idea of using types to distinguish units, or explicit sizes in method names, was to avoid confusion when different people with different ideas of what was "correct" looked at each other's code. This was mostly an effort at documentation, since the code (mostly :-) works correctly. Also, and maybe especially if all sizes and offsets are in bytes, a lot of "int"s and "unsigned int"s should be changed to "long"s or "size_t"s.
25-04-2014

In my opinion, all methods returning offsets or sizes should return the values in bytes. A user who wants something else, such as word-size offsets, should do the conversion themselves.
25-04-2014
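
As an illustration of that last suggestion, here is a small sketch in which accessors always report bytes and the occasional word-based caller converts explicitly. The Pool struct and the bytes_to_words helper are hypothetical stand-ins, not HotSpot code, and the word width is assumed to equal the host pointer size.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical structure standing in for a VM metadata object.
struct Pool {
  void* _header;
  void* _pool_holder;

  // Offsets are always reported in bytes, so no "_in_bytes" suffix is needed.
  static std::size_t pool_holder_offset() {
    return offsetof(Pool, _pool_holder);
  }
};

// Caller-side conversion for a user that needs a word count.
// Assumes the byte count is a multiple of the word size.
inline std::size_t bytes_to_words(std::size_t bytes) {
  assert(bytes % sizeof(void*) == 0 && "offset must be word-aligned");
  return bytes / sizeof(void*);
}
```

With this convention the conversion appears only in the few places that genuinely work in words, instead of at every use of an offset.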