JDK-6699669 : Hotspot server leaves synchronized block with monitor in bad state
  • Type: Bug
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 6,6u7,7
  • Priority: P2
  • Status: Closed
  • Resolution: Fixed
  • OS: generic,linux
  • CPU: generic,x86
  • Submitted: 2008-05-09
  • Updated: 2019-11-18
  • Resolved: 2011-03-08
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 6 JDK 7 Other
6u14Fixed 7 b56Fixed hs14Fixed
Related Reports
Duplicate :  
Relates :  
Relates :  
Relates :  
Description
FULL PRODUCT VERSION :
java version "1.6.0_03"
Java(TM) SE Runtime Environment (build 1.6.0_03-b05)
Java HotSpot(TM) Server VM (build 1.6.0_03-b05, mixed mode)

Also present in the original 1.6.0 b105

FULL OS VERSION :
Linux 2.4.x
Linux 2.6.x
Windows XP
Windows 2003

EXTRA RELEVANT SYSTEM CONFIGURATION :
Only present when using server VM. Client VM does not produce the problem, and -Xint (disabling compiled code) prevents the problem.

A DESCRIPTION OF THE PROBLEM :
I stumbled on this problem when a thread starting blocking on a synchronized method, but it was the only thread attempting to obtain the monitor on the object. It manifests as a monitor that one thread says will say another thread is locking, but the other thread won't be locking it. If the thread that is blamed exits, it then blames another random thread.

For reasons I don't understand, two important conditions are needed: a loop with many iterations (I was guessing this had to do with the compile threshold), and a lot of local variables. If the loop with many iterations is executed within a synchronized block inside a method with a lot of local variables, when it leaves the synchronized block, something is wrong with the monitor. If it enters a second synchronized block on the same object, the management package will show it holding the monitor after it exits the block, but with a null stack frame. If the thread just continues on and a different thread tries to obtain the monitor, the second thread will say the first thread still holds the monitor, although the management package shows that it does not. If the first thread exits, it will then say a different thread holds the monitor, sometimes even itself.

I have included two programs demonstrating both variations of the problem.

THE PROBLEM WAS REPRODUCIBLE WITH -Xint FLAG: No

THE PROBLEM WAS REPRODUCIBLE WITH -server FLAG: Yes

STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Run the included programs, LockTest and LockTest2 using the server VM

EXPECTED VERSUS ACTUAL BEHAVIOR :
If the bug were not present, it would exit the main() function. Instead, the second thread blocks forever waiting for the monitor to be released.

The main() function runs the faulty code in a thread (call it thread A). After the function runs, thread A sleeps for 10 seconds while main starts a second thread (thread B) to run the faulty code a second time. While waiting for the second thread to finish, main() prints the state of thread B and the lock it is waiting on. For the first 10 seconds of its life, thread B is waiting on thread A. When thread A finishes sleeping and exits, thread B will claim another thread holds the monitor (ironically, it often blames itself, but sometimes the compiler thread).

Using the client VM, the message "Exiting main()" should appear in the messages, and the program should end after the two threads finish sleeping for 10 seconds. This is the expected behavior.

I first noticed the problem running under 1.6.0 b105, and updated to 1.6.0_03, but the problem was still present. I've tested under the various OSes listed above. I tried the program (sans management package stuff) under 1.5.0_04 on an XP machine, and it didn't exhibit the bug. Just to double-check, I ran the same class file under 1.6.0_03 and it failed.
ERROR MESSAGES/STACK TRACES THAT OCCUR :
Output for LockTest:
main: About to start first thread: Thread-0
Thread-0: Entering test()
Thread-0: Checkpoint #0, number of monitors: 1
Thread-0: MonitorInfo[0]=java.lang.Object@14b7453, StackDepth=3, frame=LockTest.test(LockTest.java:28)
Thread-0: Checkpoint #1, number of monitors: 0
Thread-0: Checkpoint #2, number of monitors: 1
Thread-0: MonitorInfo[0]=java.lang.Object@14b7453, StackDepth=3, frame=LockTest.test(LockTest.java:35)
Thread-0: Checkpoint #3, number of monitors: 1
Thread-0: MONITOR HELD AFTER EXITING SYNCHRONIZED BLOCK!
Thread-0: MonitorInfo[0]=java.lang.Object@14b7453, StackDepth=-1, frame=No Frame
Thread-0: Leaving test()
main: About to start second thread: Thread-1
Thread-1: Entering test()
Thread-1 state: BLOCKED, LockName: java.lang.Object@14b7453, LockOwnerName: Thread-0
Thread-1 state: BLOCKED, LockName: java.lang.Object@14b7453, LockOwnerName: Thread-0
Thread-1 state: BLOCKED, LockName: java.lang.Object@14b7453, LockOwnerName: Thread-0
Thread-1 state: BLOCKED, LockName: java.lang.Object@14b7453, LockOwnerName: Thread-0
Thread-1 state: BLOCKED, LockName: java.lang.Object@14b7453, LockOwnerName: Thread-0
Thread-1 state: BLOCKED, LockName: java.lang.Object@14b7453, LockOwnerName: Thread-0
Thread-1 state: BLOCKED, LockName: java.lang.Object@14b7453, LockOwnerName: Thread-0
Thread-1 state: BLOCKED, LockName: java.lang.Object@14b7453, LockOwnerName: Thread-0
Thread-1 state: BLOCKED, LockName: java.lang.Object@14b7453, LockOwnerName: Thread-0
Thread-0: exiting
Thread-1 state: BLOCKED, LockName: java.lang.Object@14b7453, LockOwnerName: Thread-1
Thread-1 state: BLOCKED, LockName: java.lang.Object@14b7453, LockOwnerName: Thread-1
Thread-1 state: BLOCKED, LockName: java.lang.Object@14b7453, LockOwnerName: Thread-1
...

Output for LockTest2:
main: About to start first thread: Thread-0
Thread-0: Number of monitors: 0
Thread-0: Entering test()
Thread-0: Leaving test(), n=-2014260032
Thread-0: Number of monitors: 0
main: About to start second thread: Thread-1
Thread-1: Number of monitors: 0
Thread-1 state: BLOCKED, LockName: LockTest2@1034bb5, LockOwnerName: Thread-0
Thread-1 state: BLOCKED, LockName: LockTest2@1034bb5, LockOwnerName: Thread-0
Thread-1 state: BLOCKED, LockName: LockTest2@1034bb5, LockOwnerName: Thread-0
Thread-1 state: BLOCKED, LockName: LockTest2@1034bb5, LockOwnerName: Thread-0
Thread-1 state: BLOCKED, LockName: LockTest2@1034bb5, LockOwnerName: Thread-0
Thread-1 state: BLOCKED, LockName: LockTest2@1034bb5, LockOwnerName: Thread-0
Thread-1 state: BLOCKED, LockName: LockTest2@1034bb5, LockOwnerName: Thread-0
Thread-1 state: BLOCKED, LockName: LockTest2@1034bb5, LockOwnerName: Thread-0
Thread-1 state: BLOCKED, LockName: LockTest2@1034bb5, LockOwnerName: Thread-0
Thread-0: exiting
Thread-1 state: BLOCKED, LockName: LockTest2@1034bb5, LockOwnerName: CompilerThread1
Thread-1 state: BLOCKED, LockName: LockTest2@1034bb5, LockOwnerName: CompilerThread1
Thread-1 state: BLOCKED, LockName: LockTest2@1034bb5, LockOwnerName: CompilerThread1
Thread-1 state: BLOCKED, LockName: LockTest2@1034bb5, LockOwnerName: CompilerThread1
Thread-1 state: BLOCKED, LockName: LockTest2@1034bb5, LockOwnerName: CompilerThread1
...

REPRODUCIBILITY :
This bug can be reproduced always.

---------- BEGIN SOURCE ----------
*** LockTest.java ***
import java.lang.management.ThreadInfo;
import java.lang.management.ManagementFactory;

public class LockTest
{
  private Object syncObj = new Object();

  public LockTest()
  {  }

  public void test()
  {

    int localVar0, localVar1, localVar2, localVar3, localVar4, localVar5, localVar6, localVar7, localVar8, localVar9;
    int localVar10, localVar11, localVar12, localVar13, localVar14, localVar15, localVar16, localVar17, localVar18, localVar19;
    int localVar20, localVar21, localVar22, localVar23, localVar24, localVar25, localVar26, localVar27, localVar28, localVar29;
    int localVar30, localVar31, localVar32, localVar33, localVar34, localVar35, localVar36, localVar37, localVar38, localVar39;
    int localVar40, localVar41, localVar42, localVar43, localVar44, localVar45, localVar46, localVar47, localVar48, localVar49;
    
    String threadName = Thread.currentThread().getName();
    System.out.println(threadName + ": Entering test()");

    synchronized (syncObj)
    {
      int n = 0;
      for (int i = 0; i < 10000000; i++)
        n += i;
      logMonitors((n & 1) >>> (n & 1));
    } // end sync
    
    logMonitors(1);
    
    synchronized (syncObj)
    {
      logMonitors(2);
    }
    logMonitors(3);

    System.out.println(threadName + ": Leaving test()");
    return;
  }
  
  public static void logMonitors(int n)
  {
    String threadName = Thread.currentThread().getName();
    java.lang.management.MonitorInfo[] monitors = java.lang.management.ManagementFactory.getThreadMXBean()
                                                  .getThreadInfo(new long[] { Thread.currentThread().getId() }, true, true)[0].getLockedMonitors();
    System.out.println(threadName + ": Checkpoint #" + n + ", number of monitors: " + monitors.length);
    if ((n == 3) && (monitors.length > 0))
      System.out.println(threadName + ": MONITOR HELD AFTER EXITING SYNCHRONIZED BLOCK!");
    for (int i = 0; i < monitors.length; i++)
    {
      java.lang.management.MonitorInfo monitor = monitors[i];
      System.out.println(threadName + ": MonitorInfo[" + i + "]=" + monitor.toString() + ", StackDepth=" + monitor.getLockedStackDepth()
                         + ", frame=" + ((monitor.getLockedStackFrame() == null) ? "No Frame" : monitor.getLockedStackFrame().toString()));
    }
    return;
  }
  
  public static void main(String[] argv)
  {
    LockTest lt = new LockTest();
    RunLockTest rlt = new RunLockTest(lt, 10000);
    Thread t = new Thread(rlt);
    System.out.println("main: About to start first thread: " + t.getName());
    t.start();
    
    synchronized (rlt)
    {
      while (!rlt.finishedTest)
      {
        try
        {  rlt.wait();  }
        catch (Exception e)
        {  }
      }
    }
    
    rlt = new RunLockTest(lt, 0);
    t = new Thread(rlt);
    System.out.println("main: About to start second thread: " + t.getName());
    t.start();
    while (t.getState() != Thread.State.TERMINATED)
    {
      try
      {  Thread.currentThread().sleep(1000);  }
      catch (Exception e)
      {  }
      ThreadInfo[] tis = ManagementFactory.getThreadMXBean().getThreadInfo(new long[] { t.getId() }, true, true);
      if (tis.length == 0)
        System.out.println("Unable to find the thread " + t.getName());
      else
      {
        ThreadInfo ti = tis[0];
        if (ti == null)
          System.out.println("Unable to get info for thread: " + t.getName());
        else
          System.out.println(t.getName() + " state: " + ti.getThreadState().toString() +
                             ", LockName: " + ti.getLockName() + ", LockOwnerName: " + ti.getLockOwnerName());
      }
    }
    // Never gets here
    System.out.println("Exiting main()");
    return;
  }
  
  public static class RunLockTest implements Runnable
  {
    public LockTest lt;
    public boolean finishedTest = false;
    public long sleepLength;
    
    public RunLockTest(LockTest _lt, long _sleepLength)
    {
      lt = _lt;
      sleepLength = _sleepLength;
    }
    
    public void run()
    {
      lt.test();
      synchronized (this)
      {
        finishedTest = true;
        this.notify();
      }
      if (sleepLength > 0)
      {
        try
        {  Thread.currentThread().sleep(sleepLength);  }
        catch (Exception e)
        {  }
      }
      System.out.println(Thread.currentThread().getName() + ": exiting");
      return;
    }
  }
  
}



*** LockTest2.java ***
import java.lang.management.ThreadInfo;
import java.lang.management.ManagementFactory;

public class LockTest2
{
  private Object syncObj = new Object();

  public LockTest2()
  {  }

  public synchronized void test()
  {

    int localVar0, localVar1, localVar2, localVar3, localVar4, localVar5, localVar6, localVar7, localVar8, localVar9;
    int localVar10, localVar11, localVar12, localVar13, localVar14, localVar15, localVar16, localVar17, localVar18, localVar19;
    int localVar20, localVar21, localVar22, localVar23, localVar24, localVar25, localVar26, localVar27, localVar28, localVar29;
    int localVar30, localVar31, localVar32, localVar33, localVar34, localVar35, localVar36, localVar37, localVar38, localVar39;
    int localVar40, localVar41, localVar42, localVar43, localVar44, localVar45, localVar46, localVar47, localVar48, localVar49;
    
    String threadName = Thread.currentThread().getName();
    System.out.println(threadName + ": Entering test()");

    int n = 0;
    for (int i = 0; i < 10000000; i++)
      n += i;
    System.out.println(threadName + ": Leaving test(), n=" + n);
    return;
  }
  
  public static void logMonitors()
  {
    String threadName = Thread.currentThread().getName();
    java.lang.management.MonitorInfo[] monitors = java.lang.management.ManagementFactory.getThreadMXBean()
                                                  .getThreadInfo(new long[] { Thread.currentThread().getId() }, true, true)[0].getLockedMonitors();
    System.out.println(threadName + ": Number of monitors: " + monitors.length);
    for (int i = 0; i < monitors.length; i++)
    {
      java.lang.management.MonitorInfo monitor = monitors[i];
      System.out.println(threadName + ": MonitorInfo[" + i + "]=" + monitor.toString() + ", StackDepth=" + monitor.getLockedStackDepth()
                         + ", frame=" + ((monitor.getLockedStackFrame() == null) ? "No Frame" : monitor.getLockedStackFrame().toString()));
    }
    return;
  }
  
  public static void main(String[] argv)
  {
    LockTest2 lt = new LockTest2();
    RunLockTest rlt = new RunLockTest(lt, 10000);
    Thread t = new Thread(rlt);
    System.out.println("main: About to start first thread: " + t.getName());
    t.start();
    
    synchronized (rlt)
    {
      while (!rlt.finishedTest)
      {
        try
        {  rlt.wait();  }
        catch (Exception e)
        {  }
      }
    }
    
    rlt = new RunLockTest(lt, 0);
    t = new Thread(rlt);
    System.out.println("main: About to start second thread: " + t.getName());
    t.start();
    while (t.getState() != Thread.State.TERMINATED)
    {
      try
      {  Thread.currentThread().sleep(1000);  }
      catch (Exception e)
      {  }
      ThreadInfo[] tis = ManagementFactory.getThreadMXBean().getThreadInfo(new long[] { t.getId() }, true, true);
      if (tis.length == 0)
        System.out.println("Unable to find the thread " + t.getName());
      else
      {
        ThreadInfo ti = tis[0];
        if (ti == null)
          System.out.println("Unable to get info for thread: " + t.getName());
        else
          System.out.println(t.getName() + " state: " + ti.getThreadState().toString() +
                             ", LockName: " + ti.getLockName() + ", LockOwnerName: " + ti.getLockOwnerName());
      }
    }
    // Never gets here
    System.out.println("Exiting main()");
    return;
  }
  
  public static class RunLockTest implements Runnable
  {
    public LockTest2 lt;
    public boolean finishedTest = false;
    public long sleepLength;
    
    public RunLockTest(LockTest2 _lt, long _sleepLength)
    {
      lt = _lt;
      sleepLength = _sleepLength;
    }
    
    public void run()
    {
      logMonitors();
      lt.test();
      logMonitors();
      synchronized (this)
      {
        finishedTest = true;
        this.notify();
      }
      if (sleepLength > 0)
      {
        try
        {  Thread.currentThread().sleep(sleepLength);  }
        catch (Exception e)
        {  }
      }
      System.out.println(Thread.currentThread().getName() + ": exiting");
      return;
    }
  }
  
}

---------- END SOURCE ----------

CUSTOMER SUBMITTED WORKAROUND :
I found this out while isolating the bug that it was sensitive to the number of local variables. When I removed code while trying to come up with a test case, I usually removed the local variables with them, causing the bug to disappear (to my frustration). After much gnashing of teeth, I found out that this and the large loops were the key pieces. Reducing the number of local variables makes the problem go away. For example, on a Windows 2003 machine I tried it on, commenting out the lines declaring localVars 30-49 fixed it, and on a Linux 2.4 machine, commenting out the line declaring localVars 40-49 fixed it.

I suppose running with -Xint would also fix it, but changing the code is a better solution if at all possible.

Comments
EVALUATION http://hg.openjdk.java.net/jdk7/hotspot-rt/hotspot/rev/b9fba36710f2
07-04-2009

SUGGESTED FIX The following suggested fix was provided by SAP via CR 6712847 - which was closed as being a duplicate of this CR, 6699669. Here again, is the suggested fix from SAP: -- SAP's initial proposed fix was setting _owner to the sender frame sp. Changing SharedRuntime::OSR_migration_begin in vm/runtime/sharedRuntime.cpp from ObjectSynchronizer::inflate_helper(kptr2->obj()); to ObjectSynchronizer::inflate_helper(kptr2->obj())->set_owner((void*)fr.sender_sp()); Further discovery, however, indicates that ObjectMonitor::set_owner() not only resets _owner but rather unexpectedly alters some other fields as well. Licensee has changed their fix to account for this: sharedRuntime.cpp: ObjectMonitor *m = ObjectSynchronizer::inflate_helper(kptr2->obj()); if (m->owner() != thread) { m->set_owner((void*)fr.sender_sp(), true); } objectMonitor.inline.hpp: inline void ObjectMonitor::set_owner(void* owner, bool set_only_owner) { _owner = owner; if (!set_only_owner) { _recursions = 0; _count = 0; } }
02-10-2008

EVALUATION It appears the highest_lock field should be updated when locks are inflated. The change to remove the OSR adapter frame exposed this problem.
12-05-2008

EVALUATION I've reproduced this on 64 bit. It seems like the _owner field of a monitor points at a stack lock from a frame that doesn't exist anymore. I think inflation should be correcting the owner to be a pointer to the Thread to avoid this problem.
12-05-2008