JDK-8157010 : [Solaris] Clean out incorrect usage of library-level thread priority functions
  • Type: Enhancement
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 9
  • Priority: P4
  • Status: Closed
  • Resolution: Won't Fix
  • OS: solaris
  • Submitted: 2016-05-16
  • Updated: 2017-12-18
  • Resolved: 2017-12-18
Related Reports
Relates :  
Relates :  
Relates :  
Relates :  
Description
Probing deeper into this much of the Solaris priority code is "broken" and has likely been that way for a very long time**. Starting with this comment:

// Note: There are three priority scales used on Solaris. Java priotities
// which range from 1 to 10, libthread "thr_setprio" scale which range
// from 0 to 127, and the current scheduling class of the process we
// are running in. This is typically from -60 to +60.

There is no "thr_setprio" scale that ranges from 0 to 127, yet that is what we use in the java-to-native mapping table. Consequently in os::set_native_priority:

  if (!fxcritical) {
    // Use thr_setprio only if we have a priority that thr_setprio understands
    status = thr_setprio(thread->osthread()->thread_id(), newpri);
  }

  int lwp_status =
          set_lwp_class_and_priority(osthread->thread_id(),
                                     osthread->lwp_id(),
                                     newpri,
                                     fxcritical ? fxLimits.schedPolicy : myClass,
                                     !fxcritical);
  if (lwp_status != 0 && fxcritical) {
    // Try again, this time without changing the scheduling class
    newpri = java_MaxPriority_to_os_priority;
    lwp_status = set_lwp_class_and_priority(osthread->thread_id(),
                                            osthread->lwp_id(),
                                            newpri, myClass, false);
  }
  status |= lwp_status;
  return (status == 0) ? OS_OK : OS_ERR;

when we execute thr_setprio it always fails with EINVAL if we pass 127 (which is what all Java priorities >= 5 map to). And consequently the function always returns OS_ERR, even though the LWP priority setting was fine - but nothing checks the return value of os::set_native_priority! It also means that in this code:

OSReturn os::get_native_priority(const Thread* const thread, int *priority_ptr) {
  int p;
  if (!UseThreadPriorities) {
    *priority_ptr = NormalPriority;
    return OS_OK;
  }
  int status = thr_getprio(thread->osthread()->thread_id(), &p);
...

thr_getprio returns a priority that bears no relation to the one from the mapping table, nor that of the LWP. os::get_priority attempts to map the value back into a Java priority, using the mapping table. That won't work as intended but given all threads by default will have the same actual priority they will get back the same value - albeit unrelated to any Java or OS priority notion.

Then we have the following comment block:

// Interface for setting lwp priorities.  If we are using T2 libthread,
// which forces the use of BoundThreads or we manually set UseBoundThreads,
// all of our threads will be assigned to real lwp's.  Using the thr_setprio
// function is meaningless in this mode so we must adjust the real lwp's priority

Given we only deal with T2 and bound threads this reinforces that the use of thr_setprio is pointless, as is the use of thr_getprio.

There is also use of Thread::get_priority in VMOperation in that the calling thread and its priority are stored as part of the VMOperation. The value is otherwise unused and presumably was put in place either to prepare for priority-based VM operations, or sometime in the distant past we had priority-ordered VM operations.

** I can find no reference to a 0 to 127 priority range on Solaris going back to the Solaris 8 multi-threading programming guide: http://docs.oracle.com/cd/E19455-01/806-5257/index.html 

In summary, when UseThreadPriorities is true we should only be dealing with LWP related priorities. The existing mapping table needs to be replaced - and dynamically initialized based on the actual scheduling class of the LWPs in the process. The VMOperation use of priority should be removed, as should any other related "dead" code - see comments.

Comments
This is not on our list of current priorities. If there are additional specific customer requirements, we will consider reopening this issue. Closing as WNF.
18-12-2017

Closing bugs because they currently don't have resources to address them is incorrect - that may change in the future, if the bug is real closing it means that it will most likely *never* get fixed, even if resources subsequently become available.
30-03-2017

Not a priority, closing as WNF.
30-03-2017

Note: this "bad code" causes no functional harm as the errors are ignored and all the real work is done by the lwp priority routines.
16-05-2016

The osthread->set_vm_created logic is also dead code. It isn't clear whether there was once a notion of not trying to change the priority of JNI attached threads, but the code is never checked.
16-05-2016