JDK-6625723 : Excessive ThreadLocal storage used by ReentrantReadWriteLock
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.util.concurrent
  • Affected Version: 7
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • OS: generic
  • CPU: generic
  • Submitted: 2007-11-04
  • Updated: 2013-08-22
  • Resolved: 2008-03-22
  • Fixed in JDK 6: 6u51
  • Fixed in JDK 7: 7 b25
Description
As of jdk6, each ReentrantReadWriteLock stores a value in a ThreadLocal
for every Thread that acquires its read lock.
If m is the number of locks and n is the number of threads that use each lock,
then the memory overhead is O(m*n).
Although annoying, this may be acceptable when either m or n is small,
but some users have large values for both m and n, and for them
this memory overhead is a showstopper.  For example, see

http://forum.java.sun.com/thread.jspa?threadID=5114887

(rest of description is text of user complaint)

I'm a happy user of the java 5 concurrency utilities, especially read/write locks. We have a system with hundreds of thousands of objects (each protected by a read/write lock) and hundreds of threads. I tried to upgrade the system to jdk6 today and, to my surprise, most of the memory reported by jmap -histo was used by thread locals and the locks' internal objects...

As it turns out, in java 5 every lock had just a counter of readers and writers. In java 6, it seems that every lock has a separate thread local of its own, which means that 2 objects are allocated per lock for each thread that ever touches it... In our case, memory usage has gone up by 600MB just because of that.

I have attached a small test program below. Running it under jdk5 gives the following results (used heap in KB):

Memory at startup 114
After init 4214
One thread 4214
Ten threads 4216


With jdk6 the results are:

Memory at startup 124
After init 5398
One thread 8638
Ten threads 39450


This problem alone makes jdk6 completely unusable for us. What I'm considering is taking the ReentrantReadWriteLock implementation from JDK5 and using it with the rest of JDK6. There are two basic choices: either renaming it and changing our code to allocate the other class (cleanest from a deployment point of view), or putting a different version in the bootclasspath. Will renaming the class (and moving it to a different package) work correctly with jstack/deadlock detection tools, or do they expect only the JDK implementation of Lock? Is there any code in the new jdk that depends on a particular implementation of RRWL?

Why was this change made, btw? The only reason I can see is to stop threads from releasing a read lock taken by another thread. This is a nice feature, but is it worth wasting a gigabyte of heap? How would this scale to a really big number of threads?
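
The per-thread ownership check asked about here can be observed directly. The following is a minimal sketch, assuming a JDK 6 or later runtime; the names ReadLockOwnershipDemo and foreignUnlockRejected are hypothetical and chosen only for illustration. One thread takes the read lock, and a different thread then tries to release it:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantReadWriteLock;

final class ReadLockOwnershipDemo {

  // Returns true if releasing a read lock held only by another thread is rejected.
  static boolean foreignUnlockRejected() {
    final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    final CountDownLatch acquired = new CountDownLatch(1);
    final CountDownLatch done = new CountDownLatch(1);

    // The holder thread acquires the read lock and keeps it until told to stop.
    Thread holder = new Thread(new Runnable() {
      public void run() {
        lock.readLock().lock();
        acquired.countDown();
        try {
          done.await();
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        } finally {
          lock.readLock().unlock();
        }
      }
    });
    holder.start();

    boolean rejected = false;
    try {
      acquired.await();
      try {
        lock.readLock().unlock(); // the calling thread never acquired the read lock
      } catch (IllegalMonitorStateException e) {
        rejected = true;
      }
      done.countDown();
      holder.join();
    } catch (InterruptedException e) {
      done.countDown();
      Thread.currentThread().interrupt();
    }
    return rejected;
  }

  public static void main(String[] args) {
    System.out.println("Foreign unlock rejected: " + foreignUnlockRejected());
  }
}
```

To reject the foreign unlock, the lock has to know how many read holds each thread has. That per-thread record is the bookkeeping this report is about.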

Test program

import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.*;
 
public class LockTest {
 
  static AtomicInteger counter = new AtomicInteger(0);
  static Object foreverLock = new Object();
  
  
  public static void main(String[] args) throws Exception {
 
    dumpMemory("Memory at startup ");
    
    final ReadWriteLock[] locks = new ReadWriteLock[50000];
    for ( int i =0; i < locks.length; i++ ) {
      locks[i] = new ReentrantReadWriteLock();
    }
    dumpMemory("After init ");
    
    Runnable run = new Runnable() {
      public void run() {
        for ( int i =0; i< locks.length; i++ ) {
          locks[i].readLock().lock();
          locks[i].readLock().unlock();
        }
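        counter.incrementAndGet();
        // Park forever so the thread stays alive and its thread locals stay reachable.
        synchronized(foreverLock) {
          try {
            while (true) {
              foreverLock.wait(); // loop guards against spurious wakeups
            }
          } catch (InterruptedException e) {
            e.printStackTrace();
          }
        }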
      }
    };
    
    
    new Thread(run).start();
    
    while ( counter.get() != 1 ) {
      Thread.sleep(1000);
    }
    
    dumpMemory("One thread ");
    
    for ( int i =0; i < 9; i++ ) {
      new Thread(run).start();
    }
    
    while ( counter.get() != 10 ) {
      Thread.sleep(1000);
    }
    
    dumpMemory("Ten threads ");
    
    System.exit(0);
    
  }
  
 
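  // Prints the label followed by the used heap (total minus free) in KB.
  // System.gc() is only a hint, so it is requested several times first.
  private static void dumpMemory(String txt ) {
    System.gc();
    System.gc();
    System.gc();
    long usedKb = (Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory()) / 1024;
    System.out.println(txt + usedKb);
  }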
  
}
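
The figures quoted in the complaint give a rough per-lock, per-thread cost. The sketch below only does the arithmetic on those reported numbers. The class OverheadEstimate and its method are hypothetical helpers, and the inputs are approximate because they come from totalMemory() minus freeMemory() after a requested GC.

```java
final class OverheadEstimate {

  // Extra heap in bytes per (lock, thread) pair, given two dumpMemory readings in KB.
  static double bytesPerLockPerThread(long beforeKb, long afterKb, int locks, int threads) {
    return (afterKb - beforeKb) * 1024.0 / ((long) locks * threads);
  }

  public static void main(String[] args) {
    int locks = 50000; // size of the locks array in LockTest

    // jdk6 readings from the report: After init 5398, One thread 8638, Ten threads 39450
    System.out.println("jdk6, one thread:  " + bytesPerLockPerThread(5398, 8638, locks, 1));
    System.out.println("jdk6, ten threads: " + bytesPerLockPerThread(5398, 39450, locks, 10));

    // jdk5 readings from the report: After init 4214, Ten threads 4216
    System.out.println("jdk5, ten threads: " + bytesPerLockPerThread(4214, 4216, locks, 10));
  }
}
```

With the reported numbers this works out to roughly 66 to 70 bytes per lock per thread under jdk6, against a negligible amount under jdk5. That is consistent with a cost that grows as O(m*n).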

Comments
http://hg.openjdk.java.net/jdk7/hotspot/jdk/rev/da49dce73a07
13-06-2013

EVALUATION Yes, using ThreadLocal storage even after a lock is no longer used in a thread is unacceptable, and must be fixed, even at the cost of introducing more runtime overhead. Fortunately, this overhead can be minimized, for example to only occur on contended locks.
04-11-2007
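
One way to avoid keeping ThreadLocal storage once a thread no longer holds a lock is to remove the thread's entry as soon as its hold count returns to zero. The following is a minimal sketch of that idea only. It is not taken from the changeset linked above, it does not attempt the contended-lock optimization mentioned in the evaluation, and the names PerThreadHoldCount and HoldCounter are assumptions made for illustration.

```java
final class PerThreadHoldCount {

  // Mutable per-thread counter object.
  static final class HoldCounter {
    int count;
  }

  // Created lazily by get(); removed again whenever the count is back at zero.
  private final ThreadLocal<HoldCounter> holds = new ThreadLocal<HoldCounter>() {
    @Override
    protected HoldCounter initialValue() {
      return new HoldCounter();
    }
  };

  // Record one more hold for the current thread.
  void acquired() {
    holds.get().count++;
  }

  // Record one release; drops the thread-local entry when the last hold goes away.
  void released() {
    HoldCounter h = holds.get();
    if (h.count <= 1) {
      holds.remove(); // nothing is retained for this thread after the last release
      if (h.count <= 0) {
        throw new IllegalMonitorStateException("not held by current thread");
      }
    }
    h.count--;
  }

  // Number of holds by the current thread.
  int count() {
    int c = holds.get().count;
    if (c == 0) {
      holds.remove(); // get() may have created an entry; do not leave it behind
    }
    return c;
  }

  public static void main(String[] args) {
    PerThreadHoldCount c = new PerThreadHoldCount();
    c.acquired();
    c.acquired();
    System.out.println("holds after two acquires: " + c.count());
    c.released();
    c.released();
    System.out.println("holds after two releases: " + c.count());
  }
}
```

With this shape, a thread that merely passes through a lock, as each thread in LockTest does, leaves no per-lock entry behind. The price is a ThreadLocal set and remove around each uncontended acquire and release, which is the kind of runtime overhead the evaluation refers to.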