JDK-8024838 : Significant slowdown due to transparent huge pages
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: hs25,8
  • Priority: P2
  • Status: Closed
  • Resolution: Fixed
  • Submitted: 2013-09-15
  • Updated: 2014-01-14
  • Resolved: 2013-10-05

Fix Versions
  • JDK 7: 7u60 (Fixed)
  • JDK 8: 8 (Fixed)
  • Other: hs25 (Fixed)
Description
This bug is a follow-up to this thread:

http://mail.openjdk.java.net/pipermail/hotspot-dev/2013-September/010803.html

The changes for JDK-8007074 are causing a significant slowdown in the execution of the jdk_core and jdk_svc tests on a machine running Ubuntu 12.04.1 LTS (64-bit) on Intel Xeon E5345 hardware. The attached image of test execution times shows the jump from under 45 minutes to more than 70 minutes to run the tests at concurrency=4; the jump corresponds to jdk8/tl being updated from jdk8-b104 to jdk8-b106 (which brought in HotSpot changes up to hs25-b48).

From what I can tell, the changes in JDK-8007074 mean that large pages are being used when they weren't previously. Running with -XX:-UseLargePages restores the performance. 
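
A quick way to confirm whether transparent huge pages are actually in play on a given system is to check the kernel policy and the JVM's effective flags before comparing timings. The following is an illustrative sketch only, not taken from the report; the paths are standard Linux/HotSpot ones, but defaults vary by distribution and JDK build:

  # Kernel THP policy: the bracketed value ([always], [madvise] or [never]) is active
  $ cat /sys/kernel/mm/transparent_hugepage/enabled

  # Effective large-page flags for this JVM build
  $ java -XX:+PrintFlagsFinal -version | egrep -i 'LargePages|HugePages'

  # Disabling large pages on the command line of the actual test run allows a timing comparison
  $ java -XX:-UseLargePages -version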
Comments
The uptime on this system is long (158 days), so you may be right about page fragmentation. The system is busy at the moment, but when I get a chance I'll reboot it and see if I can duplicate this issue again.
03-10-2013

This could be a case of large page fragmentation. If there is insufficient contiguous memory to assemble the large pages, the OS may be trying to reorganize memory to coalesce small pages so that it can satisfy the large page request. A simple test would be to reboot the machine experiencing the issue, assuming the downtime is acceptable.
01-10-2013
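
If rebooting the machine is not convenient, the kernel exposes counters that hint at how fragmented memory is and how often THP allocations trigger compaction. This is a hedged suggestion rather than something from the original discussion; the files are standard on recent Linux kernels, though the exact counter names differ between versions:

  # Free pages per order in each zone; few entries at the high orders suggests fragmentation
  $ cat /proc/buddyinfo

  # THP allocation/collapse and compaction counters (names vary by kernel version)
  $ egrep 'thp|compact' /proc/vmstat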

I tried reproducing this on my laptop (2 cores / 8GB, SSD, Ubuntu 13.04, transparent huge pages enabled) but saw run times that were as good as or better with the default settings and -XX:+UseLargePages than with -XX:-UseLargePages. I'll try to get some profiling set up for this on a system with more cores.
17-09-2013

I experienced a similar slowdown on my development system, a 24 core / 32GB Xeon i7 box with no swap configured, running Ubuntu x64 13.04. I have not (yet) tried disabling large pages.
16-09-2013

I initially assumed there was swapping, but vmstat reports si/so as 0, so that does not appear to be the case. This specific system has 8GB, and the agent VMs (x4) run with -Xmx256m. Some tests specify /othervm, so there may be additional VMs running periodically (any additional VMs also inherit -Xmx256m). Clearly THP has an effect on this system; more data from other systems may be needed to help characterize this issue.
16-09-2013
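
For reference, the si/so check mentioned above is the usual vmstat one; the sampling interval here is arbitrary:

  # Report memory and swap activity every 5 seconds; si/so show swap-in/swap-out traffic
  $ vmstat 5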

I can reproduce a regression, although not as large as the one reported here. I do get a big performance hit whenever the processes start swapping or have to evict the cached files, but I've seen the same effect without large pages; maybe it just happens more often with transparent huge pages turned on. To verify the regression I ran with and without large pages using the flags -XX:-UseLargePages and -XX:+UseLargePages. I also verified that this is caused by transparent huge pages by turning off the madvise call to the OS:

$ hg diff
diff --git a/src/os/linux/vm/os_linux.cpp b/src/os/linux/vm/os_linux.cpp
--- a/src/os/linux/vm/os_linux.cpp
+++ b/src/os/linux/vm/os_linux.cpp
@@ -2748,7 +2748,7 @@
   if (UseTransparentHugePages && alignment_hint > (size_t)vm_page_size()) {
     // We don't check the return value: madvise(MADV_HUGEPAGE) may not
     // be supported or the memory may already be backed by huge pages.
-    ::madvise(addr, bytes, MADV_HUGEPAGE);
+    //::madvise(addr, bytes, MADV_HUGEPAGE);
   }
 }

With this change I get the same performance as with -XX:-UseLargePages.
15-09-2013

I'll try to reproduce this.
15-09-2013

One data point is that running the java/io tests with jtreg normally takes about 35 seconds with -concurrency=8 (on an 8 core system). After switching to jdk8-b106, the tests take more than 2 minutes.
15-09-2013
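
For context, a run along the following lines should reproduce that data point. The jtreg options and test path shown are assumptions on my part, not taken from the report, so adjust them to the local setup:

  # Run the java/io regression tests with 8 concurrent agent VMs
  # (-agentvm, -concurrency and the jdk/test path are assumed here)
  $ jtreg -agentvm -concurrency:8 -jdk:$JAVA_HOME jdk/test/java/io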