JDK-8202152 : test/hotspot/jtreg/runtime/whitebox/WBStackSize.java fails
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 11
  • Priority: P2
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2018-04-23
  • Updated: 2018-05-08
  • Resolved: 2018-04-25
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 11
11 b11Fixed
Related Reports
Relates :  
Description
The WBStackSize.java test failed in the most recent CI job:

java.lang.RuntimeException: getThreadFullStackSize value [1052672] should be within 90%..110% of the value returned by HotSpotDiagnosticMXBean
	at WBStackSize.main(WBStackSize.java:97)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:569)
	at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:115)
	at java.base/java.lang.Thread.run(Thread.java:831)

Comments
Updated webrev based on reviews: http://cr.openjdk.java.net/~kvn/8202152/webrev.01/
24-04-2018

To complete picture I ran with -XX:CompilerThreadStackSize=256 and everything working as expected. New thread and 512k stack was allocated for Java "MainThread" thread - nothing was reused.
24-04-2018

ILW = WBStackSize testbug after the feature - lazy allocation of compiler threads; regression; none = MHH = P2
24-04-2018

Mikael V. pointed me to: https://sourceware.org/git/?p=glibc.git;a=blob;f=nptl/allocatestack.c;h=b374f4794d0614ebc3af89edd857b798b7d9c6fd;hb=HEAD#l178 end result is that the test is fixed simply by using the same stacksize for regular Java threads and the compiler threads.
24-04-2018

Or not ... I said: > If it is clever enough it could reuse threads, but it has to ensure that the thread you get via pthread_create honours all of the attributes that were passed to pthread_create. I just realized the behaviour is compliant as the attr stacksize is a minimum. So if I ask for 256K and the OS has one lying around with 1MB then it can hand that to me. So likely this has been happening all the time, but because GC and compiler threads didn't terminate they were never recycled, and most of the other threads all have the same default -Xss stacksize. Can you test with with a CompilerThreadStackSize that is smaller than ThreadStackSize and see if the test still fails.
24-04-2018

Something very bad happening here if Thread 191694 appears to have the same stack region as Thread 191161!
24-04-2018

Ran with -Xlog:os+thread=Debug -XX:CompilerThreadStackSize=1536 [0.173s][info ][os,thread] Thread started (pthread id: 140324772644608, attributes: stacksize: 1536k, guardsize: 0k, detached). [0.173s][info ][os,thread] Thread is alive (tid: 191691, pthread id: 140324772644608). Added compiler thread C1 CompilerThread1 (available memory: 115042MB) [0.173s][debug][os,thread] Thread 191691 stack dimensions: 0x00007f9fe8220000-0x00007f9fe83a1000 (1540k). [0.173s][debug][os,thread] Thread 191691 stack guard pages activated: 0x00007f9fe8220000-0x00007f9fe8224000. Removing compiler thread C1 CompilerThread1 after 8 ms idle time [0.206s][debug][os,thread] Thread 191691 stack guard pages removed: 0x00007f9fe8220000-0x00007f9fe8224000. [0.206s][info ][os,thread] JavaThread exiting (tid: 191691). [0.206s][info ][os,thread] Thread finished (tid: 191691, pthread id: 140324772644608). [0.230s][info ][os,thread] Thread started (pthread id: 140324772644608, attributes: stacksize: 512k, guardsize: 0k, detached). [0.230s][info ][os,thread] Thread is alive (tid: 191694, pthread id: 140324772644608). [0.230s][debug][os,thread] Thread 191694 stack dimensions: 0x00007f9fe8220000-0x00007f9fe83a1000 (1540k). [0.230s][debug][os,thread] Thread 191694 stack guard pages activated: 0x00007f9fe8220000-0x00007f9fe8224000. ThreadStackSize VM option: 524288 Size of protected shadow pages: 90112 Full stack size: 1576960 java.lang.RuntimeException: getThreadFullStackSize value [1576960] should be within 90%..110% of the value
24-04-2018

can you test setting CompilerThreadStackSize explicitly to not 1MB and see if the size of the reported stack changes? Also can you find the logging that corresponds to: log_debug(os, thread)("Thread " UINTX_FORMAT " stack dimensions: " PTR_FORMAT "-" PTR_FORMAT " (" SIZE_FORMAT "k).", os::current_thread_id(), p2i(stack_base() - stack_size()), p2i(stack_base()), stack_size()/1024);
24-04-2018

Failed with debug build. I only saw this failure on Linux so far. The default CompilerThreadStackSize = 1024Kb. It is platform specific: src/hotspot/os_cpu/linux_x86/globals_linux_x86.hpp You need to comment next line to reproduce failure each time: http://hg.openjdk.java.net/jdk/jdk/file/d5cecd70fc0f/src/hotspot/share/compiler/compileBroker.cpp#l316 if (ct->idle_time_millis() < (c1 ? 500 : 100)) return false;
24-04-2018

If it is clever enough it could reuse threads, but it has to ensure that the thread you get via pthread_create honours all of the attributes that were passed to pthread_create.
24-04-2018

Looks like OS may reuse ID but it does not say it can reuse thread itself: "The system may reuse a thread ID after a terminated thread has been joined, or a detached thread has terminated." http://man7.org/linux/man-pages/man7/pthreads.7.html
24-04-2018

Is the test failing in a product or debug build? What is the default CompilerThread stacksize? I'm interested to see if the new thread getting the old thread's stackszie or whether it is the platform default.
24-04-2018

I can see the pthread id apparently being reused. But the OS should never pool threads like this. It may be an OS bug though ... need to see a lot more detail on thread lifecycle. :(
24-04-2018

It is not compiler thread anymore. It exited. Look: tid (threads id) is different - only pthread id the same.
24-04-2018

Or it could be Linux feature which keep pthreads in a pool and reuse it.
24-04-2018

This still makes absolutely no sense to me. A CompilerThread should never be executing the test.
24-04-2018

We don't reuse any threads ever.
24-04-2018

I finally reproduced the problem by commenting out idle time check in code: http://hg.openjdk.java.net/jdk/jdk/rev/fcd5df7aa235#l1.47 ThreadStackSize VM option: 524288 Size of protected shadow pages: 90112 Full stack size: 1052672 Thread: 'MainThread' Id: 19 This is what I see with -Xlog:os+thread=Info -XX:+TraceCompilerThreads and crashing on exception (to get hs_err file). Following the same pthread id 139742357448448 [0.139s][info][os,thread] Thread started (pthread id: 139742357448448, attributes: stacksize: 1024k, guardsize: 0k, detached). [0.139s][info][os,thread] Thread is alive (tid: 189139, pthread id: 139742357448448). Added compiler thread C1 CompilerThread4 (available memory: 115061MB) Removing compiler thread C1 CompilerThread4 after 2 ms idle time [0.158s][info][os,thread] JavaThread exiting (tid: 189139). [0.158s][info][os,thread] Thread finished (tid: 189139, pthread id: 139742357448448). [0.234s][info][os,thread] Thread started (pthread id: 139742357448448, attributes: stacksize: 512k, guardsize: 0k, detached). [0.234s][info][os,thread] Thread is alive (tid: 189146, pthread id: 139742357448448). And from hs_err file (the size is 1Mb instead of asked 512k): =>0x00007f1868735800 JavaThread "MainThread" [_thread_in_vm, id=189146, stack(0x00007f184d836000,0x00007f184d937000)] I talked before that may be we reuse threads because Compiler threads are JavaThread and they are part of next list (from hs_err file): Threads class SMR info: _java_thread_list=0x00007f17d01b2920, length=15, elements={ 0x00007f186801c800, 0x00007f1868531800, 0x00007f1868534000, 0x00007f1868550000, 0x00007f1868553000, 0x00007f1868555800, 0x00007f1868558000, 0x00007f18686aa000, 0x00007f18686a5800, 0x00007f1868735800, 0x00007f17d01da800, 0x00007f17d01dc000, 0x00007f17d01dd800, 0x00007f17d01df000, 0x00007f17d01b1000 }
24-04-2018

I wonder if jtreg ever messes with the stacksize of the thread it creates to run the test? It doesn't appear to the main thread of the JVM: at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:115) at java.base/java.lang.Thread.run(Thread.java:831)
24-04-2018

It doesn't report the ThreadStackSize value that it read, as part of the error message. ThreadStackSize VM option: 524288 == 512KB Size of protected shadow pages: 90112 == 88KB Full stack size: 1052672 == 1028KB So looks like default platform stack size rather than what was set by the test. Strange ...
24-04-2018

[~dholmes] David, what do you mean "actual config size"? It is not rounding. I listed output from failing case: ThreadStackSize VM option: 524288 Size of protected shadow pages: 90112 Full stack size: 1052672 ----------System.err:(13/1080)---------- java.lang.RuntimeException: getThreadFullStackSize value [1052672] should be within 90%..110% of the value returned It compares value of flag ThreadStackSize (set by test with flag -Xss512k) and current thread stack size Thread::current()->stack_size() which returns 1052672.
24-04-2018

The test wording "value returned by HotSpotDiagnosticMXBean" is misleading as the MXBean has no involvement here. The error also doesn't report the actual config size that was read. This may be completely unrelated to the change to the test that was done - which should not be needed as far as I can see as the test should never be run in a compiler thread. Need to see where the original test was failing, what machines etc. It may be a rounding issue with large pages for example. Or some other glitch reading the configured size.
24-04-2018

http://cr.openjdk.java.net/~kvn/8202152/webrev.00/
23-04-2018

The simplest solution is to use the same CompilerThreadStackSize for this test run: diff -r 5d2adef239d6 test/hotspot/jtreg/runtime/whitebox/WBStackSize.java --- a/test/hotspot/jtreg/runtime/whitebox/WBStackSize.java +++ b/test/hotspot/jtreg/runtime/whitebox/WBStackSize.java @@ -29,7 +29,7 @@ * @build sun.hotspot.WhiteBox * @run driver ClassFileInstaller sun.hotspot.WhiteBox * sun.hotspot.WhiteBox$WhiteBoxPermission - * @run main/othervm -Xbootclasspath/a:. -XX:+UnlockDiagnosticVMOptions -XX:+WhiteBoxAPI -Xss512k WBStackSize + * @run main/othervm -Xbootclasspath/a:. -XX:+UnlockDiagnosticVMOptions -XX:+WhiteBoxAPI -Xss512k -XX:CompilerThreadStackSize=512 WBStackSize */
23-04-2018

long actualStackSize = wb.getThreadStackSize(); WB_ENTRY(jlong, WB_GetThreadStackSize(JNIEnv* env, jobject o)) return (jlong) Thread::current()->stack_size(); WB_END Looks like we can't rely on thread's name: boolean isCompilerThread = Thread.currentThread().getName().indexOf(" CompilerThread")
23-04-2018

----------System.out:(3/97)---------- ThreadStackSize VM option: 524288 Size of protected shadow pages: 90112 Full stack size: 1052672 ----------System.err:(13/1080)---------- java.lang.RuntimeException: getThreadFullStackSize value [1052672] should be within 90%..110% of the value returned by HotSpotDiagnosticMXBean at WBStackSize.main(WBStackSize.java:97)
23-04-2018