JDK-7060842 : UseNUMA crash with UseHugeTLBFS running SPECjvm2008
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: hs22
  • Priority: P3
  • Status: Closed
  • Resolution: Fixed
  • OS: linux
  • CPU: generic
  • Submitted: 2011-06-29
  • Updated: 2011-11-25
  • Resolved: 2011-09-30
JDK 7        JDK 8      Other
7u2 Fixed    8 Fixed    hs22 Fixed
Description
SIGBUS with UseNUMA on Linux with UseHugeTLBFS running SPECjvm2008.

Command line to reproduce:

java -Xmx26g -Xms26g -Xmn22g -XX:+UseParallelOldGC -XX:+AggressiveOpts -XX:+UseLargePages -XX:FreqInlineSize=650 -XX:+TieredCompilation -XX:+UseNUMA -XX:+UseHugeTLBFS -jar SPECjvm2008.jar -ikv compiler.compiler

Choose a heap size close to the maximum that can be run on the test system:
with smaller heap sizes the test case passes.

Comments
EVALUATION See main CR
12-09-2011

EVALUATION http://hg.openjdk.java.net/hsx/hotspot-rt/hotspot/rev/a20e6e447d3d
17-08-2011

EVALUATION http://hg.openjdk.java.net/hsx/hotspot-main/hotspot/rev/a20e6e447d3d
17-08-2011

EVALUATION http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/a20e6e447d3d
06-08-2011

EVALUATION It seems to me this could be a race for the pages. For example, we do madvise(DONTNEED), which uncommits the space; some time later the mutator touches this area, and at that point there are no pages available in the virtual swap and we die. So it seems like a bug in the madvise(DONTNEED) implementation; maybe we should just switch back to doing this with mmap.
29-06-2011

WORK AROUND The problem can be bypassed by disabling UseAdaptiveSizePolicy (-XX:-UseAdaptiveSizePolicy).
29-06-2011

EVALUATION The issue seems to be in MutableNUMASpace::bias_region, which tries to free memory (os::free_memory) by calling ::madvise(x, x, MADV_DONTNEED). If the os::free_memory call in bias_region is replaced with os::commit_memory (which is mmap), as was done in earlier versions (free_memory was previously implemented via mmap on Linux), the test case passes.
29-06-2011
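
For illustration, the two ways of releasing pages discussed in the evaluations above can be sketched in plain C. This is not the HotSpot code: the function names (reserve_region, free_with_madvise, free_with_mmap) are hypothetical, the region uses ordinary anonymous pages rather than hugetlbfs-backed memory, and the sketch does not attempt to reproduce the SIGBUS. It only contrasts dropping pages with madvise(MADV_DONTNEED) against remapping the same range in place with mmap(MAP_FIXED).

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: both release variants expect a page-aligned address. */
static void check_page_aligned(const void *addr) {
    assert((uintptr_t)addr % (uintptr_t)sysconf(_SC_PAGESIZE) == 0);
}

/* Reserve a private anonymous read/write region (normal pages, not
 * hugetlbfs; an assumption made to keep the sketch portable). */
static void *reserve_region(size_t bytes) {
    void *p = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Variant A: tell the kernel the pages are no longer needed. The mapping
 * stays in place; on Linux a later touch of a private anonymous page
 * faults in a fresh zero-filled page. */
static int free_with_madvise(void *addr, size_t bytes) {
    check_page_aligned(addr);
    return madvise(addr, bytes, MADV_DONTNEED);
}

/* Variant B: replace the range with a fresh mapping at the same address.
 * MAP_FIXED discards whatever was mapped there before. */
static int free_with_mmap(void *addr, size_t bytes) {
    check_page_aligned(addr);
    void *p = mmap(addr, bytes, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    return p == addr ? 0 : -1;
}
```

In both variants the old contents are gone and the next access has to obtain a new page. The evaluations above suggest that, with huge pages on the affected system, this later access is where the crash occurred when the pages had been released via madvise.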