JDK-8359104 : gc/TestAlwaysPreTouchBehavior.java# fails on Linux
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: 25,26
  • Priority: P2
  • Status: Open
  • Resolution: Unresolved
  • OS: linux
  • CPU: generic
  • Submitted: 2025-06-10
  • Updated: 2025-08-29
JDK 26: Unresolved
Related Reports
Relates :  
Description
The test gc/TestAlwaysPreTouchBehavior.java#ParallelCollector fails on Linux, for example on Linux ppc64le RHEL 9.3.
Failures were also seen on SLES 15.6, so the problem is not limited to RHEL.
The reported RSS is too low, for example:
----------System.out:(2/50)----------
RSS: 257884160
Host available memory: 45643202560
----------System.err:(13/1146)----------
java.lang.RuntimeException: RSS of this process(257884160b) should be bigger than or equal to heap size(268435456b) (available memory: 45643202560): expected 257884160 > 268435456
	at jdk.test.lib.Asserts.fail(Asserts.java:715)
	at jdk.test.lib.Asserts.assertGreaterThan(Asserts.java:403)
	at gc.TestAlwaysPreTouchBehavior.main(TestAlwaysPreTouchBehavior.java:157)
	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
	at java.base/java.lang.reflect.Method.invoke(Method.java:565)
	at com.sun.javatest.regtest.agent.MainWrapper$MainTask.run(MainWrapper.java:138)
	at java.base/java.lang.Thread.run(Thread.java:1474)
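
To put the failure in proportion, the shortfall can be computed from the two numbers in the exception message. The following is only an illustrative sketch; the helper name rss_shortfall is hypothetical and not part of the test.

```python
MIB = 1024 * 1024

def rss_shortfall(rss_bytes, heap_bytes):
    """Return (missing bytes, missing MiB, missing percent of heap)."""
    missing = heap_bytes - rss_bytes
    return missing, missing / MIB, 100.0 * missing / heap_bytes

# Values copied from the exception message above.
missing, missing_mib, pct = rss_shortfall(257884160, 268435456)
print(f"shortfall: {missing} bytes = {missing_mib} MiB ({pct:.2f}% of heap)")
```

For the log above, the reported RSS is 10551296 bytes (about 10 MiB, roughly 3.9%) below the 256 MiB heap size.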

Comments
The kernel optimization (6.2 and newer) https://github.com/torvalds/linux/commit/f1a7941243c102a44e8847e3b94ff4ff3ec56f25 leads to less accurate VmRSS values in /proc/self/status and /proc/self/statm on systems with more than 32 CPUs. Since most of our ppc64le test systems have more than 32 logical CPUs, we are seeing the inaccurate RSS values. A blog post about this issue: https://bkmz.nl/rss_lie.html. Because /proc/self/smaps_rollup provides accurate Rss values, it could be used instead, especially for WhiteBox::rss(). I can't reproduce the higher latency mentioned in the blog post on our systems. However, smaps_rollup does not provide the same set of values that query_process_memory_info reads from status, so we would need to read both.
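
A quick way to compare the two sources on a test machine could look like the sketch below. It is a standalone diagnostic script, not HotSpot code; the helper names read_kb_field and compare_rss are hypothetical, and it assumes the usual "Field:   N kB" line layout of /proc/self/status and /proc/self/smaps_rollup.

```python
import re

def read_kb_field(text, field):
    """Return the value in bytes of a 'Field:   123 kB' line, or None if absent."""
    pattern = rf"^{re.escape(field)}:\s+(\d+)\s+kB\s*$"
    match = re.search(pattern, text, re.MULTILINE)
    return int(match.group(1)) * 1024 if match else None

def compare_rss(status_text, rollup_text):
    """Return (VmRSS from status, Rss from smaps_rollup), both in bytes."""
    return (read_kb_field(status_text, "VmRSS"),
            read_kb_field(rollup_text, "Rss"))

def read_file(path):
    with open(path) as f:
        return f.read()

if __name__ == "__main__":
    try:
        fast, accurate = compare_rss(read_file("/proc/self/status"),
                                     read_file("/proc/self/smaps_rollup"))
        print(f"status VmRSS:     {fast}")
        print(f"smaps_rollup Rss: {accurate}")
    except OSError as e:
        # Not on Linux, or smaps_rollup is not available on this kernel.
        print(f"cannot read /proc files: {e}")
```

Running this for the script's own process only shows whether the two values diverge on a given machine; it does not reproduce the pretouch scenario of the test.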
29-08-2025

Given the inaccuracies of WhiteBox::rss() on Linux ppc64le, should we a) skip the test on this platform, or b) use something 'better' from /proc or from some other API?
25-08-2025

It seems that on Linux WhiteBox::rss() takes its value from here: https://github.com/openjdk/jdk/blob/5cc86738411c36378b89d8f4932a54b3089cf22e/src/hotspot/os/linux/os_linux.cpp#L352 and ultimately from here: https://github.com/openjdk/jdk/blob/5cc86738411c36378b89d8f4932a54b3089cf22e/src/hotspot/os/linux/os_linux.cpp#L2316 (which reads /proc/self/status). So is this too inaccurate? If the first read fails, should we maybe read the RSS value again? Or should the test use the other, hopefully more accurate, values mentioned in the man page?
> If accurate values are required, use /proc/pid/smaps or /proc/pid/smaps_rollup instead, which are much slower but provide accurate, detailed information.
25-08-2025

I did some further investigation. Here are my findings:
The GC test gc/TestAlwaysPreTouchBehavior.java fails because the RSS read from VmRSS in /proc/self/status is too low (https://github.com/openjdk/jdk/blob/c74c60fb8b8aa5c917fc4e1c157cc8083f5797a0/src/hotspot/os/linux/os_linux.cpp#L2316).
Failing on:
  • SLES15 SP6, 6.4.0-150600.23.25-default, hugepage size 2m(default)/1g, THP madvise 2m, shared memory THP never, heap page size 64k
  • RHEL 9.3, 5.14.0-362.13.1.el9_3.ppc64le, hugepage size 16m(default)/16g, THP madvise 16m, shared memory THP never, heap page size 64k
  • RHEL 8.9, 4.18.0-513.11.1.el8_9.ppc64le, hugepage size 16m(default)/16g, THP madvise 16m, shared memory THP never, heap page size 64k
Successful on:
  • SLES15 SP4, 5.14.21-150400.24.55-default, hugepage size 2m(default)/1g, THP madvise 2m, shared memory THP never, heap page size 64k
From the man page:
VmRSS
  Resident set size. Note that the value here is the sum of RssAnon, RssFile, and RssShmem. This value is inaccurate; see /proc/pid/statm.
/proc/pid/statm
  Provides information about memory usage, measured in pages. The columns are:
  resident (2) resident set size (inaccurate; same as VmRSS in /proc/pid/status)
  Some of these values are inaccurate because of a kernel-internal scalability optimization. If accurate values are required, use /proc/pid/smaps or /proc/pid/smaps_rollup instead, which are much slower but provide accurate, detailed information.
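
As the man page excerpt notes, statm reports its values in pages, so the resident column has to be multiplied by the system page size to compare it with VmRSS. A minimal sketch follows; the function name statm_resident_bytes and the sample statm line are hypothetical, and a 64k system page size is an assumption chosen only for the example.

```python
import os

def statm_resident_bytes(statm_text, page_size):
    """Convert the second statm column (resident, in pages) to bytes."""
    fields = statm_text.split()
    return int(fields[1]) * page_size

# Hypothetical sample line; columns: size resident shared text lib data dt.
sample = "65536 3935 120 1 0 4000 0"
print(statm_resident_bytes(sample, 64 * 1024))  # 3935 pages of 64k

if os.path.exists("/proc/self/statm"):
    with open("/proc/self/statm") as f:
        # On a live system, use the real page size reported by the OS.
        print(statm_resident_bytes(f.read(), os.sysconf("SC_PAGE_SIZE")))
```

With 64k pages, 3935 resident pages correspond to the 257884160 bytes reported in the failing run, while the 256 MiB heap would be 4096 such pages. Since statm shares the inaccuracy of VmRSS, this conversion only helps cross-checking and is not a more accurate source.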
25-08-2025

The tests have been failing often on some machines since they were removed from the problem list by JDK-8334513, but not on all ppc64le machines. I guess there is some OS dependency.
09-07-2025

The G1 collector, for example, fails too. Example exception message:
java.lang.RuntimeException: RSS of this process(257818624b) should be bigger than or equal to heap size(268435456b) (available memory: 45229080576): expected 257818624 > 268435456
	at jdk.test.lib.Asserts.fail(Asserts.java:715)
	at jdk.test.lib.Asserts.assertGreaterThan(Asserts.java:403)
	at gc.TestAlwaysPreTouchBehavior.main(TestAlwaysPreTouchBehavior.java:157)
	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
	at java.base/java.lang.reflect.Method.invoke(Method.java:565)
	at com.sun.javatest.regtest.agent.MainWrapper$MainTask.run(MainWrapper.java:138)
	at java.base/java.lang.Thread.run(Thread.java:1474)
So I changed the title of this JBS issue.
12-06-2025

I checked and there were also some failures in
  • gc/TestAlwaysPreTouchBehavior.java#G1
  • gc/TestAlwaysPreTouchBehavior.java#Shenandoah
  • gc/TestAlwaysPreTouchBehavior.java#Z
So far I see failures only on Linux ppc64le.
11-06-2025

Only with Parallel GC? Only on ppc? I cannot reproduce the issue locally on either aarch64 or x64. It also does not fail in our CI (or GHA).
11-06-2025