JDK-8058354 : SPECjvm2008-Derby -2.7% performance regression on Solaris-X64 starting with 9-b29
The SPECjvm2008-Derby benchmark is showing a -2.70% statistically significant performance regression on Solaris-X64 starting with 9-b29. This regression is reproducible, but subsequent reproductions don't indicate statistical significance.
Seems that I accidentally assigned this to myself.
Suggested release note (8u40), in the Known Issues section
When using G1 on Solaris where large pages are requested, the VM does not always use large pages when it could. This may result in significant throughput degradation, particularly on the Solaris x64 platform.
The fix I currently have has the same issue as above, since it uses the os::page_size_for_region() method, which forces strict alignment.
Verified on sc14ia02 that without large pages, the score for jdk9b29 is 1685.47 and the score for the build with 8054818 and 8038423 reverted is 1637.88.
Further, collecting DTLB misses shows that jdk9b29 has many more DTLB miss stalls in memcpy.
May be related to JDK-8064940, i.e. the auxiliary data structures might be allocated in a different order, getting non-large page aligned base addresses, inhibiting large pages for them.
This would be in line with the reduced score improvement, i.e. only parts of the data structures get proper alignment.
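To illustrate the alignment argument above, here is a minimal sketch of how a base address that is not aligned to the large page size reduces the part of a region that whole large pages can back. This is not HotSpot code: the class name, the addresses, the region size, and the 2 MB large page size are all assumptions chosen for illustration.

```java
// Hypothetical illustration only; not taken from HotSpot.
class LargePageCoverage {
    /**
     * Returns how many bytes of [base, base + size) lie on whole pages of
     * pageSize. pageSize must be a power of two.
     */
    static long coveredBytes(long base, long size, long pageSize) {
        long start = (base + pageSize - 1) & -pageSize; // round base up to a page boundary
        long end = (base + size) & -pageSize;           // round the region end down
        return Math.max(0, end - start);
    }

    public static void main(String[] args) {
        long largePage = 2L * 1024 * 1024;       // assumed large page size (2 MB)
        long size = 4L * 1024 * 1024;            // hypothetical 4 MB data structure
        long alignedBase = 0x40000000L;          // hypothetical, 2 MB aligned
        long unalignedBase = alignedBase + 4096; // hypothetical, off by one small page

        System.out.println(coveredBytes(alignedBase, size, largePage));   // 4194304
        System.out.println(coveredBytes(unalignedBase, size, largePage)); // 2097152
    }
}
```

In this sketch, shifting the base by a single 4 KB page halves the large-page-eligible part of the 4 MB region, which matches the idea that only parts of the data structures would benefit.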
Did some testing with enabled and disabled large pages to see if they are the problem:
Configuration Score (normalized to b28-small pages)
b29-largepages 1.04 !!!!
(smallpages means -XX:-UseLargePages, largepages -XX:+UseLargePages)
I.e. there seems to be some problem with getting large pages with b29. Potentially b28 does some implicit pre-touching of data structures that is missing with b29. Will retest with b29 and a fix for pre-touching.
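When comparing runs like these, it may help to record inside the JVM which page-related flags were in effect. The sketch below reads them through `com.sun.management.HotSpotDiagnosticMXBean`; the class and method names `PageFlags` and `flag` are hypothetical. Note that a flag value of `true` only shows that large pages were requested, not that the VM actually obtained them, which is the problem suspected here.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper; reports requested settings, not actual page usage.
class PageFlags {
    /** Returns the current value of a HotSpot VM flag as a string. */
    static String flag(String name) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return bean.getVMOption(name).getValue();
    }

    public static void main(String[] args) {
        System.out.println("UseLargePages  = " + flag("UseLargePages"));
        System.out.println("AlwaysPreTouch = " + flag("AlwaysPreTouch"));
    }
}
```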
In your testing, did you also try to only revert JDK-8038423 (and keep JDK-8054818)?
(JDK-8038423 depends on JDK-8054818)
Did some more investigation of some log files for reruns:
build / total application time (s) / (young) gc time (s)
9b28 / 397 / 17 (4.3%)
9b29 / 390 / 20 (5.1%)
Score difference -10%
That seems too small a difference in GC activity to explain a 10% performance/throughput difference.
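The percentages above can be recomputed from the logged times with a few lines of arithmetic. This is a minimal sketch; the class and method names are hypothetical, and the inputs are the application and young GC times quoted above.

```java
import java.util.Locale;

// Hypothetical helper for recomputing the GC time share from the figures above.
class GcShare {
    /** GC time as a percentage of total application time. */
    static double gcPercent(double gcSeconds, double totalSeconds) {
        return 100.0 * gcSeconds / totalSeconds;
    }

    public static void main(String[] args) {
        double b28 = gcPercent(17, 397);
        double b29 = gcPercent(20, 390);
        System.out.println(String.format(Locale.ROOT, "9b28: %.1f%%", b28)); // 9b28: 4.3%
        System.out.println(String.format(Locale.ROOT, "9b29: %.1f%%", b29)); // 9b29: 5.1%
        System.out.println(String.format(Locale.ROOT,
                "difference: %.1f percentage points", b29 - b28)); // difference: 0.8 percentage points
    }
}
```

The GC share grows by under one percentage point between the two builds, far less than the 10% score difference.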
reverted changes from 2 bugs:
The regression is gone. Hard to tell from profile what has caused the regression.