JDK-8015603 : Misleading error message when failing to create a Java thread
  • Type: Bug
  • Component: hotspot
  • Sub-Component: runtime
  • Priority: P3
  • Status: Closed
  • Resolution: Duplicate
  • Submitted: 2013-05-29
  • Updated: 2013-05-29
  • Resolved: 2013-05-29
Related Reports
Duplicate :  
Description
When we fail to create a Java thread we fail with a message that looks something like:

Exception in thread "Thread-0" java.lang.OutOfMemoryError: unable to create new native thread
  at java.lang.Thread.start0(Native Method)
  at java.lang.Thread.start(Thread.java:691)
  at com.sun.tools.jdi.EventQueueImpl.startTimerThread(EventQueueImpl.java:140)
  at com.sun.tools.jdi.EventQueueImpl.removeUnfiltered(EventQueueImpl.java:185)
  at com.sun.tools.jdi.EventQueueImpl.remove(EventQueueImpl.java:96)
  at nsk.jdi.ClassPrepareEvent.referenceType.refType001$1EventHandler.run(refType001.java:99)

I find this confusing since it makes me think that we failed to create a thread because we ran out of memory. But the real reason is often (if not always) something else.

Here is what happens:

JVM_StartThread() in jvm.cpp creates a new JavaThread. The JavaThread constructor calls os::create_thread(), which does "new OSThread(NULL, NULL)". This allocation could of course fail if we are out of memory, but since OSThread is a CHeapObj its overloaded operator new calls "vm_exit_out_of_memory(size, OOM_MALLOC_ERROR, "AllocateHeap");" when the allocation fails. So the message "unable to create new native thread" can never result from this failed allocation.

If the allocation succeeds we call thread->set_osthread() with the newly created OSThread. But later in the os::create_thread() method we call "thread->set_osthread(NULL)" if we encounter any issues, such as not being able to create a pthread or reaching the thread limit.

Eventually JVM_StartThread() checks if osthread is NULL and then throws the OOME with the message "unable to create new native thread".
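
The control flow described above can be modelled in a few lines of standalone C++. This is a simplified sketch, not HotSpot code: the two structs only borrow the names OSThread and JavaThread, and the FailureCause enum and the functions create_thread() and start_thread() are hypothetical stand-ins chosen for illustration. It shows how distinct failure causes collapse into a single NULL check and therefore into one message.

```cpp
#include <string>

// Hypothetical stand-ins for the HotSpot types named above; not the real classes.
struct OSThread {};

struct JavaThread {
    OSThread* osthread = nullptr;
    void set_osthread(OSThread* t) { osthread = t; }
};

// Simulated reasons why native thread creation might fail (illustrative only).
enum class FailureCause { None, ThreadLimitReached, PthreadCreateFailed };

// Models os::create_thread(): on any failure the OSThread is discarded and
// the osthread field is reset to null, so the cause is not recorded anywhere.
void create_thread(JavaThread& thread, FailureCause simulated) {
    OSThread* os = new OSThread();
    thread.set_osthread(os);
    if (simulated != FailureCause::None) {
        delete os;
        thread.set_osthread(nullptr);
    }
}

// Models the check in JVM_StartThread(): a null osthread always maps to the
// same out-of-memory message, whatever went wrong underneath.
std::string start_thread(FailureCause simulated) {
    JavaThread thread;
    create_thread(thread, simulated);
    if (thread.osthread == nullptr) {
        return "java.lang.OutOfMemoryError: unable to create new native thread";
    }
    delete thread.osthread;
    return "started";
}
```

Calling start_thread() with FailureCause::ThreadLimitReached or FailureCause::PthreadCreateFailed yields the identical string, which is the loss of information this report is about.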

So, it seems to me like "OutOfMemoryError: unable to create new native thread" in most cases does not mean that we are out of memory. It means that we have other issues with creating threads. It would be good to know the real reason why we could not create a thread.

I would suggest that we try to propagate information up from os::create_thread() about what actually went wrong. That way its callers can produce a more useful error message when they detect a failure. JVM_StartThread() is where this happens most commonly, but there are more places in the VM where we (directly or indirectly) create new JavaThreads and print the same message:

$ grep -nr "unable to create new native thread" src
src/share/vm/compiler/compileBroker.cpp:915:                                    "unable to create new native thread");
src/share/vm/gc_implementation/shared/concurrentGCThread.cpp:209:                                    "unable to create new native thread");
src/share/vm/prims/jvm.cpp:2842:        "unable to create new native thread");
src/share/vm/prims/jvm.cpp:2845:              "unable to create new native thread");
src/share/vm/runtime/os.cpp:361:                                      "unable to create new native thread");
src/share/vm/runtime/serviceThread.cpp:69:                                    "unable to create new native thread");
src/share/vm/services/attachListener.cpp:490:                                    "unable to create new native thread");
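
One way the suggested propagation could look is a small result value returned to the caller instead of a bare NULL osthread. The sketch below is only an illustration of that idea under stated assumptions: CreateError, CreateResult, describe() and failure_message() are hypothetical names, not existing HotSpot APIs, and the set of error categories is made up.

```cpp
#include <string>

// Hypothetical failure categories; a real set would depend on the platform.
enum class CreateError { None, AllocationFailed, ThreadLimitReached, NativeCreateFailed };

// Hypothetical result type that a create_thread-like function could return
// instead of signalling failure only through a null osthread.
struct CreateResult {
    CreateError error = CreateError::None;
    int os_errno = 0;  // raw OS error code, 0 if not applicable
    bool ok() const { return error == CreateError::None; }
};

// Maps a category to human-readable text.
const char* describe(CreateError e) {
    switch (e) {
        case CreateError::None:               return "no error";
        case CreateError::AllocationFailed:   return "allocation of thread data failed";
        case CreateError::ThreadLimitReached: return "thread or process limit reached";
        case CreateError::NativeCreateFailed: return "native thread creation failed";
    }
    return "unknown error";
}

// Builds the message a caller could attach to the exception it throws.
std::string failure_message(const CreateResult& r) {
    std::string msg = "unable to create new native thread";
    if (!r.ok()) {
        msg += ": ";
        msg += describe(r.error);
        if (r.os_errno != 0) {
            msg += " (errno " + std::to_string(r.os_errno) + ")";
        }
    }
    return msg;
}
```

With such a value, each of the call sites listed above could keep the familiar prefix and append the specific cause.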

Making this a P3 since it often happens in our testing and causes confusion.

Impact: M (no crash, but we lose a lot of time investigating the wrong issue again and again)
Likelihood: M (happens every time we get thread creation issues)
Workaround: H (there is no way to find the real error message)

ILW=MMH -> P3
Comments
As discussed in JDK-7182040, we cannot discern exactly why pthread_create fails, as the two common causes produce the same EAGAIN error code. Suggestions for improving the error message are welcome. The fact that allocation of JavaThread/OSThread can never actually fail is a different matter. But given that we are far more likely to fail to create/start the native thread in low memory conditions, this is less of a concern and falls under the general umbrella of "the VM should not abort when it runs out of memory".
29-05-2013
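
To illustrate the EAGAIN ambiguity mentioned in the comment above, here is a minimal sketch of a helper that turns the return value of pthread_create() into a diagnostic string. The function name describe_pthread_create_error() is hypothetical. The wording for EAGAIN follows the POSIX description, which covers both insufficient resources and a system-imposed thread limit under the same code.

```cpp
#include <cerrno>
#include <cstring>
#include <string>

// Turns the return value of pthread_create() into a diagnostic string.
// pthread_create() returns the error number directly rather than setting errno.
std::string describe_pthread_create_error(int rc) {
    if (rc == 0) {
        return "thread created";
    }
    std::string msg = std::strerror(rc);
    if (rc == EAGAIN) {
        // Both common causes share this code, so the message can only list them.
        msg += " (EAGAIN: insufficient resources, or a system-imposed thread"
               " limit was reached; the code alone does not say which)";
    }
    return msg;
}
```

A caller would pass in the integer returned by pthread_create(). Even then the message can only name both candidate causes, which is the limitation the comment describes.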

What version of hotspot are you running when this failed? Can you add that to Affects Version?
29-05-2013

Running out of memory on OSThread creation is supposed to lead to an OOME being thrown, NOT vm_exit_out_of_memory! Something has gone awry. In all cases the "unable to create new native thread" message is supposed to reflect that either the Thread instance (JavaThread, CompilerThread etc.) or the OSThread could not be created, or that OS-level thread creation or start encountered an error. These CHeap allocations, for OSThread and Thread subclasses, should be using the "no throw" variant of new. But as I said, something has gone awry here because things are not wired up the way they should be. The real reason we end up throwing this OOME is normally that we have hit the thread/process limit due to a "bad" ulimit setting. Like most things in the VM, it would be nice to pass that error information up, but we have no mechanism for doing so (though one could imagine implementing an "errno"-type per-thread mechanism for VM-internal errors).
29-05-2013
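
The per-thread "errno"-type mechanism imagined in the last comment could be sketched with C++ thread_local storage. Everything here is an assumption made for illustration: VmError, last_vm_error, try_create_thread() and start_thread_message() are hypothetical names, and the failure is simulated rather than coming from a real thread-creation call.

```cpp
#include <string>

// Hypothetical VM-internal error codes (illustrative only).
enum class VmError { None, ThreadLimitReached, NativeThreadCreateFailed };

// One slot per thread, analogous to errno: the callee records the reason,
// the caller reads it after seeing a failure.
thread_local VmError last_vm_error = VmError::None;

// Simulated creation routine: returns false on failure and records why.
bool try_create_thread(bool simulate_limit_hit) {
    last_vm_error = VmError::None;
    if (simulate_limit_hit) {
        last_vm_error = VmError::ThreadLimitReached;
        return false;
    }
    return true;
}

// Caller side: keeps the existing message and appends the recorded reason.
std::string start_thread_message(bool simulate_limit_hit) {
    if (try_create_thread(simulate_limit_hit)) {
        return "started";
    }
    std::string msg = "unable to create new native thread";
    if (last_vm_error == VmError::ThreadLimitReached) {
        msg += " (thread/process limit reached)";
    }
    return msg;
}
```

Because the slot is thread-local, a failure recorded on one thread cannot be overwritten by another thread before the caller reads it, and no function signature along the call chain has to change.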