JDK-6671882 : memory access after free in solaris/vm/os_solaris.cpp
  • Type: Bug
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 6
  • Priority: P3
  • Status: Closed
  • Resolution: Fixed
  • OS: solaris_10
  • CPU: sparc
  • Submitted: 2008-03-06
  • Updated: 2012-10-08
  • Resolved: 2008-05-21
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 6 JDK 7 Other
6u11-revFixed 7Fixed hs10Fixed
Description
<summary>
Inspection of the code in hotspot/src/os/solaris/vm/os_solaris.cpp
shows memory is free'd, but there are accesses through the memory
pointer in the next statement, leading to undefined behavior.
This is in isT2_libthread():

    FREE_C_HEAP_ARRAY(char, lwpArray);
    lwpSize = lwpArray->pr_nent * lwpArray->pr_entsize;

</summary>

FULL PRODUCT VERSION :
java version "1.6.0-beta2"
Java(TM) SE Runtime Environment (build 1.6.0-beta2-b86)
Java HotSpot(TM) Server VM (build 1.6.0-beta2-b86, mixed mode)

We're using ../jdk1.6.0/jre/lib/i386/server/libjvm.so as our shared lib we load
in the application.



FULL OS VERSION :
SunOS hotspot 5.10 Generic_118833-36 sun4v sparc SUNW,Sun-Fire-T1000

EXTRA RELEVANT SYSTEM CONFIGURATION :
CONTEXT :
---------
We use the JNI invocation API to load the VM into a native application. Running Solaris 10 with a 64 bit compiled native application using the VM (Hotspot / JRE 6u1 / JNI 1.6) we encounter a problem when calling the JNI function JNI_CreateJavaVM(). Important in this
context is that our native application uses the async. I/O API.



A DESCRIPTION OF THE PROBLEM :
PROBLEM :
---------
In the above context we see a single malloc() requesting +/- 94 Mb of memory. As our application runs under strict
memory requirements we did some further investigation and came to the below analysis :

In the debugger we see :

        [...]

=>  [4] malloc(size = 954437177U)
    [5] os::malloc(0x38e38e39, 0xd4b08d60, 0x388, 0x5a0e34, 0xfd9e6000, 0x4000), at 0xfd4451ec
    [6] isT2_libthread(0x8f000, 0x14258, 0xfd9e6000, 0xfd956d4a, 0x1df3c8, 0x38e38e39), at 0xfd806d70
    [7] os::Solaris::libthread_init(0x0, 0xfee3f2e8, 0xfd9f2f88, 0xfd9f2f8c, 0x80000, 0xfd9e6000), at 0xfd44af9c
    [8] os::init_2(0x2000, 0x80000, 0x16c00, 0x18000, 0xfd9e6000, 0xfd9fa160), at 0xfd44ada4
    [9] Threads::create_vm(0xfd9f3066, 0xfdffbc9b, 0x0, 0x15cb64, 0x59f524, 0xfd9e6000), at 0xfd889598
    [10] JNI_CreateJavaVM(0xd4ba73c0, 0xd4ba73c8, 0xd569a7b4, 0xfd9fa284, 0xfd446ad4, 0xfd9e6000), at 0xfd44688c

        [...]

When browsing the hotspot source code, we did check the file pointed out in the
VM log :

../solaris/vm/os_solaris.cpp#lwpArray
   4209 bool isT2_libthread() {
   4210   int i, rslt;
   4211   static prheader_t * lwpArray = NULL;
   4212   static int lwpSize = 0;
   4213   static int lwpFile = -1;
   4214   lwpstatus_t * that;
   4215   int aslwpcount;
   4216   char lwpName [128];
   4217   bool isT2 = false;
   4218
   4219 #define ADR(x)  ((uintptr_t)(x))
   4220 #define LWPINDEX(ary,ix)   ((lwpstatus_t *)(((ary)->pr_entsize * (ix)) + (ADR((ary) + 1))))
   4221
   4222   aslwpcount = 0;
   4223   lwpSize = 16*1024;
   4224   lwpArray = ( prheader_t *)NEW_C_HEAP_ARRAY (char, lwpSize);
   4225   lwpFile = open ("/proc/self/lstatus", O_RDONLY, 0);
   4226   if (lwpArray == NULL) {
   4227       if ( ThreadPriorityVerbose ) warning ("Couldn't allocate T2 Check array\n");
   4228       return(isT2);
   4229   }
   4230   if (lwpFile < 0) {
   4231       if ( ThreadPriorityVerbose ) warning ("Couldn't open /proc/self/lstatus\n");
   4232       return(isT2);
   4233   }
   4234   for (;;) {
   4235     lseek (lwpFile, 0, SEEK_SET);
   4236     rslt = read (lwpFile, lwpArray, lwpSize);
   4237     if ((lwpArray->pr_nent * lwpArray->pr_entsize) <= lwpSize) {
   4238       break;
   4239     }
   4240     FREE_C_HEAP_ARRAY(char, lwpArray);
   4241     lwpSize = lwpArray->pr_nent * lwpArray->pr_entsize;
   4242     lwpArray = ( prheader_t *)NEW_C_HEAP_ARRAY (char, lwpSize);
   4243     if (lwpArray == NULL) {
   4244         if ( ThreadPriorityVerbose ) warning ("Couldn't allocate T2 Check array\n");
   4245         return(isT2);
   4246     }
   4247   }
   4248
   4249   // We got a good snapshot - now iterate over the list.
   4250   for (i = 0; i < lwpArray->pr_nent; i++ ) {
   4251     that = LWPINDEX(lwpArray,i);
   4252     if (that->pr_flags & PR_ASLWP) {
   4253       aslwpcount++;
   4254     }
   4255   }
   4256   if ( aslwpcount == 0 ) isT2 = true;
   4257
   4258   FREE_C_HEAP_ARRAY(char, lwpArray);
   4259   close (lwpFile);
   4260   if ( ThreadPriorityVerbose ) {
   4261     if ( isT2 ) tty->print_cr("We are running with a T2 libthread\n");
   4262     else tty->print_cr("We are not running with a T2 libthread\n");
   4263   }
   4264   return (isT2);
   4265 }

The problem is L4240-L4242. First the call FREE_C_HEAP_ARRAY(char, lwpArray); at L4240, this results in a free() at the end on lwpArray.

 What we see is that this freed memory already gets assigned to another thread which does write something else in it. Depending on the data we get insane memory requests. Of course this does not always works out the wrong way, sometimes the data at the address pointed by lwpArray is still valid. The free should be done after using the value in the
allocated memory.

REQUEST :
---------
As the usage of already freed memory leads to inconsistent memory allocation behavior we would like to see the above being fixed.


THE PROBLEM WAS REPRODUCIBLE WITH -Xint FLAG: Did not try

THE PROBLEM WAS REPRODUCIBLE WITH -server FLAG: Yes

STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :

Quite difficult as it happens in our appilication which is heavily
using async I/o / Threading with the VM as an embedded extension Can not
just give a couple of lines of Java

Pls. have a look at the code. By doing a simple code inspection one
can see the memory is being used after doing a free() on the same memory,
just swap the 2 lines of code.

ERROR MESSAGES/STACK TRACES THAT OCCUR :

not a crash. We see >100 Mb of memory being allocated on VM boot. This is
somewhat difficult to explain to customers running this in a small setup.

REPRODUCIBILITY :
This bug can be reproduced always.

CUSTOMER SUBMITTED WORKAROUND :

disable async I/O. Then the L2 thread calls are not done. This is however
not acceptable as performance really goes down doing so.

Comments
EVALUATION The webrev for this CR is located at http://javaweb.sfbay/~xl116366/webrev/6671882/. It was fixed in hs13. We might consider to back port this fix to hs11 which is shipped with 6u10 release.
02-07-2008

EVALUATION Needs to re-inspect the code and make sure every return path won't have memory leak and usage of dangling pointer.
31-03-2008

SUGGESTED FIX Swap these two lines: 4240 FREE_C_HEAP_ARRAY(char, lwpArray); 4241 lwpSize = lwpArray->pr_nent * lwpArray->pr_entsize;
06-03-2008