JDK-6404306 : IllegalArgumentException:Invalid thread ID occurs in 6.0b76

Details
Type:
Bug
Submit Date:
2006-03-27
Status:
Closed
Updated Date:
2013-06-21
Project Name:
JDK
Resolved Date:
2006-05-04
Component:
hotspot
OS:
windows_xp
Sub-Component:
runtime
CPU:
x86
Priority:
P4
Resolution:
Fixed
Affected Versions:
6
Fixed Versions:

Related Reports
Backport:
Relates:
Relates:
Relates:

Sub Tasks

Description
A thread ID returned by ThreadMXBean.getAllThreadIds() is sometimes zero.
The attached program passes each thread ID obtained from getAllThreadIds() to getThreadInfo().
When the passed thread ID is zero, an IllegalArgumentException is thrown.

REPRODUCE:
 1) Compile the attached test program: ThreadInfoGetter.java
 2) Launch "java ThreadInfoGetter"
K:\shares2\threadid-becomes-zero>java ThreadInfoGetter
java.lang.IllegalArgumentException: Invalid thread ID entry
        at sun.management.ThreadImpl.getThreadInfo0(Native Method)
        at sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:147)
        at sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:128)
        at ThreadInfoGetter.run(ThreadInfoGetter.java:33)
---
Invalid thread ID entries = [0, 8, 5, 4, 3, 2]
Loop count = 19800

K:\shares2\threadid-becomes-zero>java -version
java version "1.6.0-beta2"
Java(TM) SE Runtime Environment (build 1.6.0-beta2-b76)
Java HotSpot(TM) Client VM (build 1.6.0-beta2-b76, mixed mode)

FREQUENCY:
  2 out of 10 trials with the test program

INVESTIGATION:
 When AttachCurrentThread() is called, there seems to be a short period during which the thread ID is zero
 (the thread ID is not yet assigned).
 If the unassigned thread ID is read during this short period, the IllegalArgumentException occurs.
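
For illustration, a probe along the following lines exercises the same path. This is only a sketch: it is not the attached ThreadInfoGetter.java (which is not reproduced in this report), and the class name ThreadIdProbe and method findRejectedIds are made up for this example. On a VM that contains the fix the list is expected to stay empty.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical probe; NOT the ThreadInfoGetter.java attached to this report.
class ThreadIdProbe {
    // Repeatedly enumerates thread IDs and collects those rejected by getThreadInfo().
    static List<Long> findRejectedIds(int loops) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<Long> rejected = new ArrayList<>();
        for (int i = 0; i < loops; i++) {
            for (long id : mx.getAllThreadIds()) {
                try {
                    // A null result only means the thread has already exited.
                    mx.getThreadInfo(id);
                } catch (IllegalArgumentException e) {
                    // Thrown for IDs that are not positive, e.g. an unassigned ID of zero.
                    rejected.add(id);
                }
            }
        }
        return rejected;
    }

    public static void main(String[] args) {
        System.out.println("Rejected thread IDs = " + findRejectedIds(20000));
    }
}
```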

                                    

Comments
EVALUATION

The specification for Thread.getId states that it returns a positive value. Hence a zero return value is not correct. It is apparent from the Thread code and the sun.management.ThreadImpl code that a zero thread id is invalid.
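
A small check of that contract, as a sketch: the class name ZeroIdCheck and the helper rejectsId are invented for this example. It relies only on the documented behaviour that ThreadMXBean.getThreadInfo(long) throws IllegalArgumentException when the ID is not positive.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

// Hypothetical helper class; names are illustrative only.
class ZeroIdCheck {
    // Returns true if the management API refuses the given thread ID.
    static boolean rejectsId(long id) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        try {
            mx.getThreadInfo(id);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        long self = Thread.currentThread().getId();
        System.out.println("current thread ID is positive: " + (self > 0));
        System.out.println("ID 0 rejected: " + rejectsId(0));
        System.out.println("own ID rejected: " + rejectsId(self));
    }
}
```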
                                     
2006-03-28
To make a Java call, a native thread has to look like a Java thread: it has to have a Java thread object and reside in the threads list. The thread must exist in the threads list prior to the Java call so that GC and safepoints will work correctly if they occur during the Java call (whether triggered by the current thread or not).

So attach_current_thread:
- creates a JavaThread object
- initializes its basic VM state and TLS
- under the Threads_lock:
     - initializes active handles
     - adds it to the Threads list
- creates a default initialized java.lang.Thread object
- binds the JavaThread to the Thread
- sets the Thread's priority
- binds the Thread to the JavaThread
- if a name is supplied then
  - creates a Java String for the name
  - invokes the Thread(ThreadGroup tg, String name) constructor
- else
  - invokes the Thread(ThreadGroup tg, Runnable r) constructor (which
    creates a default name)
- sets the daemon status of the Thread
- invokes ThreadGroup.add passing in the new Thread object (this
  emulates what would occur when a normal Thread was started)
- sets the Thread state to runnable
- informs JVMTI and JVMPI of a "thread start" event

Note there is a little bit of trickery here. The Thread constructor expects to ascertain some properties of the new Thread from the current "parent" Thread. But in this case the current Thread IS the new Thread, so it has to have all the right properties set for when the constructor queries it for its own attributes. 

The problems: during the Java call to the Thread constructor the newly attached thread has a partly constructed Thread object:
 1. the name is null until after the name is assigned
 2. the group is null until assigned
 3. the context classloader is null until assigned (but may be null
    anyway)
 4. the TID is zero until assigned

So far we have seen (1) trigger a VM crash due to accessing the null name (fixed by an explicit null check), and (4) cause this IllegalArgumentException in the ThreadMXBean code. I think (3) is safe because the CCL can be null. (2) might also cause a problem because a null group is only expected for threads that have terminated; but at this stage the Thread won't be found by any code that enumerates the ThreadGroups in the system, so this would only be a problem if the VM cared, and I don't think it does.
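
The ordering problem can be shown with a deliberately simplified, single-threaded model. This is a sketch, not HotSpot code: ModelThread, threadsList and attachAndObserve are hypothetical names, and the "constructor" is reduced to two field assignments. The point is only that an object published to a shared list before its fields are assigned is observable with default values.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of "publish first, construct later"; not VM code.
class PartlyBuiltThreadDemo {
    // Hypothetical stand-in for java.lang.Thread with only two of the fields discussed.
    static final class ModelThread {
        String name; // null until assigned
        long tid;    // zero until assigned
    }

    static final List<ModelThread> threadsList = new ArrayList<>();
    static long nextTid = 0;

    // Returns { TID seen while attaching, TID seen after construction }.
    static long[] attachAndObserve(String name) {
        ModelThread t = new ModelThread();
        threadsList.add(t);      // the thread becomes visible to enumerators here
        long during = t.tid;     // what an enumerator would read in the window
        t.name = name;           // the "constructor" runs only afterwards
        t.tid = ++nextTid;
        return new long[] { during, t.tid };
    }

    public static void main(String[] args) {
        long[] seen = attachAndObserve("attached-thread");
        System.out.println("TID during attach = " + seen[0] + ", after = " + seen[1]);
    }
}
```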

Possible solutions for this include:

(a) perform the necessary initialization in native code so that the Thread object is never observed with default initialized fields

Setting the TID is possible but the problem is giving it a valid TID value: it must be unique across all live threads and it mustn't change. We could maintain a native "nextTID" value that starts at max and works down to avoid conflict with the Thread version that starts at zero and works up.

Setting the name might be possible, if creating a Java String isn't a Java call. It doesn't appear to be but I'm not clear on how the allocation is handled and whether GC might get involved. Setting the right name is harder because we don't know what count the Thread class is up to. This might seem minor but someone is bound to complain if they suddenly get Thread-49875645321 instead of Thread-5. 
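
The "nextTID" idea for option (a) could look roughly like this when modelled in Java. It is a sketch under assumptions: TwoEndedTidAllocator is a hypothetical class, the upward counter merely imitates the one in the Thread class, and the downward counter stands in for the proposed native value. The two ranges cannot meet in practice because they start at opposite ends of the long range.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical model of two TID sources that cannot collide in practice.
class TwoEndedTidAllocator {
    // Imitates the Thread class counter: starts at zero and works up.
    private final AtomicLong javaSide = new AtomicLong(0);
    // Imitates the proposed native "nextTID": starts at max and works down.
    private final AtomicLong nativeSide = new AtomicLong(Long.MAX_VALUE);

    long nextJavaTid() {
        return javaSide.incrementAndGet();   // 1, 2, 3, ...
    }

    long nextNativeTid() {
        return nativeSide.getAndDecrement(); // MAX_VALUE, MAX_VALUE - 1, ...
    }

    public static void main(String[] args) {
        TwoEndedTidAllocator ids = new TwoEndedTidAllocator();
        System.out.println("Java-side TID:   " + ids.nextJavaTid());
        System.out.println("native-side TID: " + ids.nextNativeTid());
    }
}
```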

(b) Make the sequence of attaching the partly initialized Thread and completing the call to the constructor, atomic.

Holding any lock whilst making the up-call into Java would be very risky. To see why the risk is unacceptable, note that the Thread constructor invokes methods on the installed SecurityManager, and these are non-final methods, so to all intents and purposes we could end up executing application-defined code, which could do anything.

(c) have the Thread object construction performed by the VM thread and hand it back to the native thread

This is certainly possible, but the performance implications make it impractical. For a point of reference: asking the VMThread to obtain the stack trace of the current thread is 10X slower than getting the current thread's stack trace directly. We do not want to bring the VM to a safepoint each time JNI_AttachCurrentThread is called.

(d) allow the native thread to temporarily impersonate an existing fully-initialized Thread  

When the VM is created we create a java.lang.Thread object with a valid name (eg "jni_attaching_thread") and TID. Instead of binding the JavaThread with the newly allocated Thread object we bind it with this pre-existing object for the duration of the constructor call. (Note the preexisting Thread would have to satisfy the "parent" role that the Thread constructor expects.) After the constructor we re-bind the JavaThread to the now fully constructed new Thread.

This seems doable. The problem is dealing with concurrent JNI_AttachCurrentThread calls. If the pre-initialized Thread is shared then we need to serialize JNI_AttachCurrentThread by taking a lock. This is less risky with a dedicated lock than with the ThreadList lock, but still holds some risk. There are also performance implications if we serialize JNI_AttachCurrentThread.

(e) patch the library code to skip Threads with default initialized fields (in this case have the ThreadMXBean ignore any thread with a TID of zero)

A simple and immediate fix, but probably short-sighted. It would probably come as a surprise to most library developers that they might encounter partly initialized Thread objects, and it seems likely that other bugs like this exist in library code.
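
At the caller level, option (e) amounts to something like the following. This is a sketch only: ZeroTidFilter and its methods are hypothetical, and it shows the filtering idea in user code rather than the actual change that would be made inside the ThreadMXBean implementation.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.Arrays;

// Hypothetical illustration of option (e): ignore threads whose TID is unassigned.
class ZeroTidFilter {
    // Keeps only IDs that satisfy the "positive value" contract of Thread.getId().
    static long[] dropUnassigned(long[] ids) {
        return Arrays.stream(ids).filter(id -> id > 0).toArray();
    }

    // Queries thread info without ever passing a non-positive ID.
    static ThreadInfo[] threadInfoSkippingUnassigned() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        return mx.getThreadInfo(dropUnassigned(mx.getAllThreadIds()));
    }

    public static void main(String[] args) {
        long[] reported = {0, 8, 5, 4, 3, 2}; // the ID array printed in this report
        System.out.println(Arrays.toString(dropUnassigned(reported)));
        System.out.println(threadInfoSkippingUnassigned().length + " ThreadInfo entries");
    }
}
```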

(f) hide the partly initialized thread from those parts of the system that might fail if they encounter a partly initialized Thread

The list of JavaThreads is obtained using the ThreadsListEnumerator, and this gets used by three clients:
 - the Thread class (getAllStackTraces()) and ThreadMXBean class - both via JVM_GetAllThreads; and
 - JVMTI via JVMTIEnv::GetAllThreads

We could add state to JavaThread that tracks if the thread "is attaching", the enumerator could then skip threads that are attaching.

This solution would be the obvious choice if not for one thing: the attaching thread might be a thread of interest to Thread.getAllStackTraces, or the JVMTI client. The reason being that, as stated previously, the thread could end up executing arbitrary application code via the Thread constructor and its calls to the SecurityManager; further the thread may hold monitor locks due to calls on ThreadGroup and/or SecurityManager. Note: VM operations like dumping stacks, tracing deadlocks etc do not use the ThreadsListEnumerator so would have direct control over the threads they see - and must take care to watch for things like NULL thread names.
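
The "is attaching" state and the skipping enumerator described for option (f) can be modelled as below. This is a Java sketch of the idea, not the HotSpot change: AttachingFilterDemo, ModelJavaThread and the attaching field are hypothetical names, and a synchronized method stands in for the Threads_lock.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of option (f); the real thread list lives in native VM code.
class AttachingFilterDemo {
    static final class ModelJavaThread {
        final String label;
        // True from the moment the thread joins the list until construction completes.
        volatile boolean attaching = true;

        ModelJavaThread(String label) {
            this.label = label;
        }
    }

    private final List<ModelJavaThread> threads = new ArrayList<>();

    synchronized void add(ModelJavaThread t) {
        threads.add(t);
    }

    // The enumerator skips any thread that is still attaching.
    synchronized List<ModelJavaThread> enumerate() {
        List<ModelJavaThread> visible = new ArrayList<>();
        for (ModelJavaThread t : threads) {
            if (!t.attaching) {
                visible.add(t);
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        AttachingFilterDemo vm = new AttachingFilterDemo();
        ModelJavaThread t = new ModelJavaThread("native-thread");
        vm.add(t);
        System.out.println("visible while attaching: " + vm.enumerate().size());
        t.attaching = false; // Thread object is now fully constructed
        System.out.println("visible after attach:    " + vm.enumerate().size());
    }
}
```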

I think (f) is the way to proceed, but it requires feedback from the MMX, Thread and JVMTI folk as to whether this change in behaviour would be acceptable. The MMX folk should be okay with the change as they are the ones getting the exception because of this. JVMTI might be more concerned, but note that during the constructor call the JVMTI "thread started" events have not been posted for this thread anyway.
                                     
2006-04-04
Despite the small timing window in which this failure can theoretically occur, it seems that it can occur reliably on some systems because the launcher thread becomes the DestroyJavaVM thread.
                                     
2006-04-07
SUGGESTED FIX

Track threads in the process of attaching to the VM and elide those threads from the set of live threads.
                                     
2006-04-13
After discussions involving JVMTI folk I am fixing the current specific problem by eliding any threads in the process of attaching from the set of threads returned by JVM_GetAllThreads. This impacts the MMX code and the Thread.dumpStacks code, but has no effect on JVMTI code.

The more general problem of partially constructed attaching threads is being tracked under CR 641293
                                     
2006-04-13
EVALUATION

There is a small window of time during which a native thread that is in the process of attaching to the VM can be seen to have a partly initialized Thread object. This leads to a null name and a zero TID for the thread, neither of which is a valid value. This problem arises not only for application use of JNI but also for the VM itself, due to the way in which the initial startup thread detaches itself and then re-attaches to become the DestroyJavaVM thread. It is the thread becoming the DestroyJavaVM thread that actually triggers the failure in the test program. Such threads should not be visible to the management code whilst in this state.
                                     
2006-04-17


