JDK-8231031 : runtime/ReservedStack/ReservedStackTest.java fails after jsr166 refresh
  • Type: Bug
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 14
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2019-09-15
  • Updated: 2025-01-24
  • Resolved: 2021-05-13
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 17
17 b23Fixed
Related Reports
Duplicate :  
Relates :  
Relates :  
Relates :  
Sub Tasks
JDK-8231033 :  
Description
After the push for JDK-8229442 and related jsr166 updates the ReservedStackTest has started to fail:

Java HotSpot(TM) 64-Bit Server VM warning: Potentially dangerous stack overflow in ReservedStackAccess annotated method java.util.concurrent.locks.ReentrantLock$Sync.lock()V [1]
Java HotSpot(TM) 64-Bit Server VM warning: Potentially dangerous stack overflow in ReservedStackAccess annotated method java.util.concurrent.locks.ReentrantLock.lock()V [1]
Java HotSpot(TM) 64-Bit Server VM warning: Potentially dangerous stack overflow in ReservedStackAccess annotated method ReservedStackTest$ReentrantLockTest.lockAndCall(I)V [1]
java.lang.Error: FAILED: ReentrantLock 391 looks corrupted
	at ReservedStackTest$RunWithSOEContext.run(ReservedStackTest.java:204)
	at java.base/java.lang.Thread.run(Thread.java:830)
STATUS:Failed.`main' threw exception: java.lang.Error: FAILED: ReentrantLock 391 looks corrupted
Comments
Changeset: d215743a Author: David Holmes <dholmes@openjdk.org> Date: 2021-05-13 01:14:38 +0000 URL: https://git.openjdk.java.net/jdk/commit/d215743a91555c4edabd116b1899765d5a283dc7
13-05-2021

I'm switching this to a hotspot->runtime test bug and updating the test so that it prohibits inlining of ReentrantLock.lock(). If that seems to make the test more reliable (initial results are promising) then I will create a PR for this change.
11-05-2021

[~martin] Note my last comment was not in response to yours, as I have been preparing that comment for the past 90 minutes.
11-01-2021

Okay here is why I think this test is failing and why Fred's fix changes things. There is IMO a flaw with the RSA mechanism where the compiler promotes the RSA annotation to the method which inlines a RSA annotated method. If such a method triggers a StackOverflowError then it will be incorrectly flagged as "Potentially dangerous stack overflow in ReservedStackAccess annotated method ...". So in the test the lockAndcall method is designed to call itself recursively until we get a SOE, but lockAndCall invokes lock() which is a RSA annotated method that gets inlined into lockAndCall(). Hence lockAndCall() is treated as a RSA method by the JIT and so when the expected SOE happens it gets incorrectly flagged as just described and the test fails. With Fred's fix lock() is no longer annotated as a RSA method, so as long as the real RSA method is not recursively inlined into lockAndCall(), then the SOE is no longer treated as being RSA related and so the test passes. The intent with RSA is that we annotate the actual method that must not encounter the SOE that will lead to state corruption. Hence with the current code we should only annotate initialTryLock and tryAcquire variants, rather than the top-level lock() etc. But that is not a correctness issue, more an optimization. I don't see any good solution to the problem caused by inlining - if we inline a RSA method we still need to apply the stack guard for that method, and we don't want to disable inlining of such methods. We also see the warning for the lock() method - which suggests that the RSA mechanism is actually failing. However, I think the problem there is that because lockAndCall gets flagged as RSA the reserved page is partially consumed by lockAndCall, not leaving enough stack for the lock() method to successfully complete. I think the only way to make the test reliable is to always force inlining off.
11-01-2021

Is it really the consensus of the hotspot team that we should apply the patch in https://bugs.openjdk.java.net/browse/JDK-8231031?focusedCommentId=14289938&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14289938 IIRC there was a deliberate decision not to apply this to jsr166. (but why?)
11-01-2021

This issue seems to have fallen through the cracks and we have not applied the changes that Fred had suggested (not withstanding the imperfections of the RSA mechanism).
11-01-2021

I think the problem is that the ReservedStackAccess annotation is adopted by the whole nmethod but that stack execution space should only be available to calls underneath the annotated inline frame. Every other call site in the nmethod would need to be guarded by a check whether the current frame is in the reserved space and should throw an SOE instead of executing the call. Basically execution can't be allowed to escape the current frame in that case. Anyway, I understand that it's a difficult problem.
27-05-2020

I keep hoping we can someday banish StackOverflowError entirely by implementing growable or splittable stacks. But that's very hard. (OTOH, Loom team needs to implement similar features?) We the jsr166 maintainers are not entirely happy with ReservedStackAccess, but will include changes that hotspot team have consensus agreement on, but we don't have such a consensus today.
27-05-2020

So is there any plan to fix the ReservedStackAccess implementation? It seems like it's currently broken to me. I put a longer comment over at https://bugs.openjdk.java.net/browse/JDK-8213567?focusedCommentId=14340995&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14340995 but the gist is that if the jtreg test of ReservedStackAccess requires an exclude directive to work correctly then how would we expect ReservedStackAccess to work correctly in production?
27-05-2020

Assigning to Martin to integrate into the JSR 166 CVS repo.
06-02-2020

Just guessing... I wonder if and which class-loading methods also need @ReservedStackAccess? Or whether the approach may always fail in the presence of class-loading within a lock.
24-09-2019

Test now passes on all platforms as expected. Moving back to core-libs. A reminder that the test will need to be removed from the hotspot/jtreg/ProblemList.txt at the same time as the code is fixed. Thanks.
17-09-2019

I'm running Fred's patch through some additional testing and then will likely hand bug back to Doug/Martin. I'm also perplexed somewhat by the fix. I can see that these adjustments on the annotation placement make more sense as they minimise the protected region, but I don't see how they can be functionally incorrect in their current placement. Perhaps it is simply a matter of realigning things with how the test tries to force the error. ??
17-09-2019

Thanks. We'll incorporate these. I apparently don't understand the mechanics that make one way to cover paths more effective than another, so will just take your advice.
17-09-2019

Some adjustments are required on the set of methods being annotated with @ReservedStackAccess. diff --git a/src/java.base/share/classes/java/util/concurrent/locks/ReentrantLock.java b/src/java.base/share/classes/java/util/concurrent/locks/ReentrantLock.java --- a/src/java.base/share/classes/java/util/concurrent/locks/ReentrantLock.java +++ b/src/java.base/share/classes/java/util/concurrent/locks/ReentrantLock.java @@ -147,13 +147,11 @@ */ abstract boolean initialTryLock(); - @ReservedStackAccess final void lock() { if (!initialTryLock()) acquire(1); } - @ReservedStackAccess final void lockInterruptibly() throws InterruptedException { if (Thread.interrupted()) throw new InterruptedException(); @@ -161,7 +159,6 @@ acquireInterruptibly(1); } - @ReservedStackAccess final boolean tryLockNanos(long nanos) throws InterruptedException { if (Thread.interrupted()) throw new InterruptedException(); @@ -220,6 +217,7 @@ static final class NonfairSync extends Sync { private static final long serialVersionUID = 7316153563782823691L; + @ReservedStackAccess final boolean initialTryLock() { Thread current = Thread.currentThread(); if (compareAndSetState(0, 1)) { // first attempt is unguarded @@ -238,6 +236,7 @@ /** * Acquire for non-reentrant cases after initialTryLock prescreen */ + @ReservedStackAccess protected final boolean tryAcquire(int acquires) { if (getState() == 0 && compareAndSetState(0, acquires)) { setExclusiveOwnerThread(Thread.currentThread()); @@ -256,6 +255,7 @@ /** * Acquires only if reentrant or queue is empty. */ + @ReservedStackAccess final boolean initialTryLock() { Thread current = Thread.currentThread(); int c = getState(); @@ -276,6 +276,7 @@ /** * Acquires only if thread is first waiter or empty */ + @ReservedStackAccess protected final boolean tryAcquire(int acquires) { if (getState() == 0 && !hasQueuedPredecessors() && compareAndSetState(0, acquires)) {
17-09-2019

It is not yet determined this is a test bug. The j.u.c changes may have broken the default configuration for reserved stack access i.e. it now uses more stack.
16-09-2019

The test has been excluded so dropping back to P3. Note the test/hotspot/jtreg/ProblemList.txt will need to be updated as part of this patch.
16-09-2019

I'm unclear why the test has: @Override @jdk.internal.vm.annotation.ReservedStackAccess public void run() { as the annotation is restricted to trusted classes by default and the test does not change that flag setting. ??? The warning on lockAndcall is likely just an inlining artifact. The test explicitly restricts the inlining depth so the recent changes may have had an impact on that. I'll take a look at how this is supposed to work and see what changes might have caused the problem.
16-09-2019

At this point I'm stuck and thinking this is more likely a hotspot than a libraries bug. David, can you find a better owner? We should exclude the test for now.
16-09-2019

I'm seeing: OpenJDK 64-Bit Server VM warning: Potentially dangerous stack overflow in ReservedStackAccess annotated method ReservedStackTest$ReentrantLockTest.lockAndCall(I)V [2] but that makes no sense to me. ReservedStackTest$ReentrantLockTest.lockAndCall has no such annotation... A VM bug? An inlining artifact?
16-09-2019

I looked a bit at this failing test. We're going to need help from hotspot/VM folk. ReentrantLock still has @ReservedStackAccess that should prevent SOE in the middle of locking. This test should use the captured SOE for diagnosis, but just throws it away. Not sure why a test method is annotated @ReservedStackAccess . Here's an experimental change: --- a/test/hotspot/jtreg/runtime/ReservedStack/ReservedStackTest.java +++ b/test/hotspot/jtreg/runtime/ReservedStack/ReservedStackTest.java @@ -141,16 +141,13 @@ public String getResult() { if (!stackOverflowErrorReceived) { - return "ERROR: Not conclusive test: no StackOverflowError received"; + return "ERROR: Inconclusive test: no StackOverflowError received"; } for (int i = 0; i < LOCK_ARRAY_SIZE; i++) { if (lockArray[i].isLocked()) { if (!lockArray[i].isHeldByCurrentThread()) { - StringBuilder s = new StringBuilder(); - s.append("FAILED: ReentrantLock "); - s.append(i); - s.append(" looks corrupted"); - return s.toString(); + if (soe != null) soe.printStackTrace(); + return String.format("FAILED: ReentrantLock %d looks corrupted", i); } } } @@ -191,7 +188,7 @@ } @Override - @jdk.internal.vm.annotation.ReservedStackAccess + //@jdk.internal.vm.annotation.ReservedStackAccess public void run() { counter = 0; decounter = deframe; After this, I can see the top of the stack: java.lang.StackOverflowError at java.base/java.util.concurrent.locks.AbstractOwnableSynchronizer.setExclusiveOwnerThread(AbstractOwnableSynchronizer.java:74) at java.base/java.util.concurrent.locks.ReentrantLock$NonfairSync.initialTryLock(ReentrantLock.java:227) at java.base/java.util.concurrent.locks.ReentrantLock$Sync.lock(ReentrantLock.java:152) at java.base/java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:324) at ReservedStackTest$ReentrantLockTest.lockAndCall(ReservedStackTest.java:170) It's not surprising that the lock looks corrupted if it throws while trying to assign itself as owner. But isn't ReservedStackAccess supposed to prevent that?
16-09-2019