JDK-8187091 : ReturnBlobToWrongHeapTest fails because of problems in CodeHeap::contains_blob()
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 10
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2017-09-01
  • Updated: 2019-09-13
  • Resolved: 2017-11-01
JDK 10: 10 b33 (Fixed)
Description
We see failures in test/compiler/codecache/stress/ReturnBlobToWrongHeapTest.java which are caused by problems in CodeHeap::contains_blob() for corner cases with CodeBlobs of zero size:

# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (heap.cpp:248), pid=27586, tid=27587
#  guarantee((char*) b >= _memory.low_boundary() && (char*) b < _memory.high()) failed: The block to be deallocated 0x00007fffe6666f80 is not within the heap starting with 0x00007fffe6667000 and ending with 0x00007fffe6ba000

The problem is that JDK-8183573 replaced

  virtual bool contains_blob(const CodeBlob* blob) const { return low_boundary() <= (char*) blob && (char*) blob < high(); }

by:

  bool contains_blob(const CodeBlob* blob) const { return contains(blob->code_begin()); }

But that may be wrong in the corner case where the size of the CodeBlob's payload is zero (i.e. the CodeBlob consists only of the 'header', the C++ object itself), because in that case CodeBlob::code_begin() points right behind the CodeBlob's header, which is a memory location that no longer belongs to the CodeBlob.

This exact corner case is exercised by ReturnBlobToWrongHeapTest, which allocates CodeBlobs of size zero (i.e. zero 'payload') with the help of sun.hotspot.WhiteBox.allocateCodeBlob() until the CodeCache fills up. The test first fills the 'non-profiled nmethods' CodeHeap. Once the 'non-profiled nmethods' CodeHeap is full, the VM automatically tries to allocate from the 'profiled nmethods' CodeHeap until that fills up as well. But in the CodeCache the 'profiled nmethods' CodeHeap is located right before the 'non-profiled nmethods' CodeHeap. So if the last CodeBlob allocated from the 'profiled nmethods' CodeHeap has a payload size of zero and uses all of the CodeHeap's remaining space, we end up with a CodeBlob whose code_begin() address points right behind its own CodeHeap (i.e. right at the beginning of the adjacent 'non-profiled nmethods' CodeHeap). This causes the above guarantee to fire when we try to free the last allocated CodeBlob (with sun.hotspot.WhiteBox.freeCodeBlob()).
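The mis-attribution described above can be sketched with a small model. This is not HotSpot code: ModelHeap, ModelBlob, the two helper functions and all addresses are hypothetical stand-ins chosen for illustration. Only the two membership checks mirror the old and new contains_blob() bodies quoted in the description.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-in for a CodeHeap: a half-open range [low, high).
struct ModelHeap {
  std::uintptr_t low;   // inclusive lower bound
  std::uintptr_t high;  // exclusive upper bound
  constexpr bool contains(std::uintptr_t p) const { return low <= p && p < high; }
};

// Hypothetical stand-in for a CodeBlob with a payload of zero bytes.
struct ModelBlob {
  std::uintptr_t start;        // address of the blob header
  std::uintptr_t header_size;  // aligned header size
  constexpr std::uintptr_t code_begin() const { return start + header_size; }
};

// Models the check used before JDK-8183573: test the blob's own address.
constexpr bool contains_blob_by_address(const ModelHeap& h, const ModelBlob& b) {
  return h.contains(b.start);
}

// Models the check introduced by JDK-8183573: test the start of the payload.
constexpr bool contains_blob_by_code_begin(const ModelHeap& h, const ModelBlob& b) {
  return h.contains(b.code_begin());
}

// Two adjacent heaps with made-up addresses; 'profiled' sits right before 'non_profiled'.
constexpr ModelHeap profiled{0x1000, 0x2000};
constexpr ModelHeap non_profiled{0x2000, 0x3000};

// A zero-payload blob whose header ends exactly at the end of 'profiled'.
constexpr ModelBlob last_blob{0x1F90, 0x70};

// The address-based check attributes the blob to the heap it was allocated from.
static_assert(contains_blob_by_address(profiled, last_blob), "owner heap found");
// The code_begin()-based check misses the owner and picks the neighbouring heap.
static_assert(!contains_blob_by_code_begin(profiled, last_blob), "owner heap missed");
static_assert(contains_blob_by_code_begin(non_profiled, last_blob), "neighbour picked");
```

In this model, freeing last_blob through the heap selected by the code_begin()-based check would hand a block from 'profiled' to 'non_profiled', which corresponds to the situation the guarantee in heap.cpp rejects.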
Comments
Thank you very much, Volker, for the very good analysis. That is why we don't see this problem in our Nightly testing.
06-09-2017

You currently only see this error in product builds on x86_64, for the following reasons:

- The CodeCache/CodeHeap is allocated in chunks (so-called 'segments') of CodeCacheSegmentSize, which is 128 on x86 with tiered compilation.
- CodeBlob::size() is 112 in the product build (sizeof(CodeBlob) == CodeBlob::header_size() == 104, but it gets aligned to 112 by CodeBlob::align_code_offset()).
- CodeBlob::code_begin() points to '(char*)this + 112'.
- Allocating a CodeBlob of size zero will therefore request 112 bytes from the CodeHeap. The CodeHeap::allocate() method adds another 16 bytes to this (for the HeapBlock header, which is placed right before the actual CodeBlob), which results in a total of 128 bytes. These 128 bytes are exactly the size of a CodeCache segment.
- For a CodeBlob of size zero which is allocated in the last segment of a CodeHeap, CodeBlob::code_begin() will actually point beyond that CodeHeap (and to the beginning of the next CodeHeap, if there is one).

The last allocation looks as follows:

  Extension of CodeHeap 'non-profiled nmethods' failed. Trying to allocate in CodeHeap 'profiled nmethods'.
  CodeCache allocation: addr: 0x00007fcbed0f9f90, size: 0x70

This actually allocates 0x80 bytes (because of the extra HeapBlock header), from 0x00007fcbed0f9f80 (inclusive) to 0x00007fcbed0fa000 (exclusive), in the 'profiled nmethods' CodeHeap:

  [0x7fcbed0fa000 (low_boundrary) - 0x7fcbed514000 (high_boundrary), 0x7fcbed514000 (high)](CodeHeap 'non-profiled nmethods')
  [0x7fcbecce1000 (low_boundrary) - 0x7fcbed0fa000 (high_boundrary), 0x7fcbed0fa000 (high)](CodeHeap 'profiled nmethods')

As you can see, the 'profiled nmethods' CodeHeap is placed right before the 'non-profiled nmethods' CodeHeap. CodeBlob::code_begin() of the last allocated CodeBlob will therefore point to '(char*)this + 112' = 0x00007fcbed0f9f90 + 112 = 0x7fcbed0fa000, which is already in the 'non-profiled nmethods' CodeHeap, although the last allocated CodeBlob (i.e. 0x00007fcbed0f9f90) was actually allocated in the 'profiled nmethods' CodeHeap. This causes the following guarantee to fire (also see the attached hs_err file):

  # guarantee((char*) b >= _memory.low_boundary() && (char*) b < _memory.high()) failed: The block to be deallocated 0x00007fcbed0f9f80 is not within the heap starting with 0x00007fcbed0fa000 and ending with 0x00007fcbed514000

In the non-product builds, sizeof(CodeBlob) == CodeBlob::header_size() is 120 bytes and CodeBlob::size() is 144. That does not fit into a single CodeHeap segment, but it also does not completely fill up the second (i.e. last) segment of the allocated space. CodeBlob::code_begin() will therefore always point into the same, correct CodeHeap from which the CodeBlob on which it is called has been allocated.
06-09-2017
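The size arithmetic in the analysis above can be checked with a few compile-time assertions. This is only a sketch: the constants are the values quoted in the comment, while the helper names bytes_reserved and slack_after_blob are hypothetical, and the round-up-to-whole-segments rule is an assumption about how the space is reserved.

```cpp
#include <cassert>
#include <cstdint>

// Values quoted in the analysis (x86_64 with tiered compilation).
constexpr std::uint64_t kSegmentSize     = 128;  // CodeCacheSegmentSize
constexpr std::uint64_t kHeapBlockHeader = 16;   // added by CodeHeap::allocate()

// Hypothetical helper: bytes reserved for a blob, rounded up to whole segments.
constexpr std::uint64_t bytes_reserved(std::uint64_t blob_size) {
  return (blob_size + kHeapBlockHeader + kSegmentSize - 1) / kSegmentSize * kSegmentSize;
}

// Hypothetical helper: unused bytes between the end of the blob and the end
// of its last segment.
constexpr std::uint64_t slack_after_blob(std::uint64_t blob_size) {
  return bytes_reserved(blob_size) - (blob_size + kHeapBlockHeader);
}

// Product build: CodeBlob::size() == 112 fills one segment exactly, no slack.
static_assert(bytes_reserved(112) == 128, "one full segment");
static_assert(slack_after_blob(112) == 0, "blob ends at the segment boundary");

// Non-product build: CodeBlob::size() == 144 needs two segments and leaves slack.
static_assert(bytes_reserved(144) == 256, "two segments");
static_assert(slack_after_blob(144) == 96, "blob ends inside the last segment");

// Addresses taken from the log output quoted in the analysis.
constexpr std::uint64_t kBlobAddr     = 0x00007fcbed0f9f90;  // allocated CodeBlob
constexpr std::uint64_t kProfiledHigh = 0x00007fcbed0fa000;  // end of 'profiled nmethods'

// The HeapBlock header sits right before the blob.
static_assert(kBlobAddr - kHeapBlockHeader == 0x00007fcbed0f9f80, "block start");
// code_begin() = this + 112 lands exactly on the end of the owning heap.
static_assert(kBlobAddr + 112 == kProfiledHigh, "code_begin() is one past the heap");
```

With zero slack, the first byte after the blob is also the first byte of the next heap, which is why only the product-build sizes trigger the failure.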

[~simonis] Volker, can you attach hs_err file? We do run the test in Nightly as part of tier3 testing. We run it with fastdebug VM and don't see the problem.
05-09-2017

ILW = Guarantee failure in code cache (cannot happen in production), only with code cache allocations of zero size, no workaround = MLH = P3
01-09-2017