<summary> Inspection of the code in hotspot/src/os/solaris/vm/os_solaris.cpp shows memory is free'd, but there are accesses through the memory pointer in the next statement, leading to undefined behavior. This is in isT2_libthread(): FREE_C_HEAP_ARRAY(char, lwpArray); lwpSize = lwpArray->pr_nent * lwpArray->pr_entsize; </summary> FULL PRODUCT VERSION : java version "1.6.0-beta2" Java(TM) SE Runtime Environment (build 1.6.0-beta2-b86) Java HotSpot(TM) Server VM (build 1.6.0-beta2-b86, mixed mode) We're using ../jdk1.6.0/jre/lib/i386/server/libjvm.so as our shared lib we load in the application. FULL OS VERSION : SunOS hotspot 5.10 Generic_118833-36 sun4v sparc SUNW,Sun-Fire-T1000 EXTRA RELEVANT SYSTEM CONFIGURATION : CONTEXT : --------- We use the JNI invocation API to load the VM into a native application. Running Solaris 10 with a 64 bit compiled native application using the VM (Hotspot / JRE 6u1 / JNI 1.6) we encounter a problem when calling the JNI function JNI_CreateJavaVM(). Important in this context is that our native application uses the async. I/O API. A DESCRIPTION OF THE PROBLEM : PROBLEM : --------- In the above context we see a single malloc() requesting +/- 94 Mb of memory. As our application runs under strict memory requirements we did some further investigation and came to the below analysis : In the debugger we see : [...] => [4] malloc(size = 954437177U) [5] os::malloc(0x38e38e39, 0xd4b08d60, 0x388, 0x5a0e34, 0xfd9e6000, 0x4000), at 0xfd4451ec [6] isT2_libthread(0x8f000, 0x14258, 0xfd9e6000, 0xfd956d4a, 0x1df3c8, 0x38e38e39), at 0xfd806d70 [7] os::Solaris::libthread_init(0x0, 0xfee3f2e8, 0xfd9f2f88, 0xfd9f2f8c, 0x80000, 0xfd9e6000), at 0xfd44af9c [8] os::init_2(0x2000, 0x80000, 0x16c00, 0x18000, 0xfd9e6000, 0xfd9fa160), at 0xfd44ada4 [9] Threads::create_vm(0xfd9f3066, 0xfdffbc9b, 0x0, 0x15cb64, 0x59f524, 0xfd9e6000), at 0xfd889598 [10] JNI_CreateJavaVM(0xd4ba73c0, 0xd4ba73c8, 0xd569a7b4, 0xfd9fa284, 0xfd446ad4, 0xfd9e6000), at 0xfd44688c [...] When browsing the hotspot source code, we did check the file pointed out in the VM log : ../solaris/vm/os_solaris.cpp#lwpArray 4209 bool isT2_libthread() { 4210 int i, rslt; 4211 static prheader_t * lwpArray = NULL; 4212 static int lwpSize = 0; 4213 static int lwpFile = -1; 4214 lwpstatus_t * that; 4215 int aslwpcount; 4216 char lwpName [128]; 4217 bool isT2 = false; 4218 4219 #define ADR(x) ((uintptr_t)(x)) 4220 #define LWPINDEX(ary,ix) ((lwpstatus_t *)(((ary)->pr_entsize * (ix)) + (ADR((ary) + 1)))) 4221 4222 aslwpcount = 0; 4223 lwpSize = 16*1024; 4224 lwpArray = ( prheader_t *)NEW_C_HEAP_ARRAY (char, lwpSize); 4225 lwpFile = open ("/proc/self/lstatus", O_RDONLY, 0); 4226 if (lwpArray == NULL) { 4227 if ( ThreadPriorityVerbose ) warning ("Couldn't allocate T2 Check array\n"); 4228 return(isT2); 4229 } 4230 if (lwpFile < 0) { 4231 if ( ThreadPriorityVerbose ) warning ("Couldn't open /proc/self/lstatus\n"); 4232 return(isT2); 4233 } 4234 for (;;) { 4235 lseek (lwpFile, 0, SEEK_SET); 4236 rslt = read (lwpFile, lwpArray, lwpSize); 4237 if ((lwpArray->pr_nent * lwpArray->pr_entsize) <= lwpSize) { 4238 break; 4239 } 4240 FREE_C_HEAP_ARRAY(char, lwpArray); 4241 lwpSize = lwpArray->pr_nent * lwpArray->pr_entsize; 4242 lwpArray = ( prheader_t *)NEW_C_HEAP_ARRAY (char, lwpSize); 4243 if (lwpArray == NULL) { 4244 if ( ThreadPriorityVerbose ) warning ("Couldn't allocate T2 Check array\n"); 4245 return(isT2); 4246 } 4247 } 4248 4249 // We got a good snapshot - now iterate over the list. 4250 for (i = 0; i < lwpArray->pr_nent; i++ ) { 4251 that = LWPINDEX(lwpArray,i); 4252 if (that->pr_flags & PR_ASLWP) { 4253 aslwpcount++; 4254 } 4255 } 4256 if ( aslwpcount == 0 ) isT2 = true; 4257 4258 FREE_C_HEAP_ARRAY(char, lwpArray); 4259 close (lwpFile); 4260 if ( ThreadPriorityVerbose ) { 4261 if ( isT2 ) tty->print_cr("We are running with a T2 libthread\n"); 4262 else tty->print_cr("We are not running with a T2 libthread\n"); 4263 } 4264 return (isT2); 4265 } The problem is L4240-L4242. First the call FREE_C_HEAP_ARRAY(char, lwpArray); at L4240, this results in a free() at the end on lwpArray. What we see is that this freed memory already gets assigned to another thread which does write something else in it. Depending on the data we get insane memory requests. Of course this does not always works out the wrong way, sometimes the data at the address pointed by lwpArray is still valid. The free should be done after using the value in the allocated memory. REQUEST : --------- As the usage of already freed memory leads to inconsistent memory allocation behavior we would like to see the above being fixed. THE PROBLEM WAS REPRODUCIBLE WITH -Xint FLAG: Did not try THE PROBLEM WAS REPRODUCIBLE WITH -server FLAG: Yes STEPS TO FOLLOW TO REPRODUCE THE PROBLEM : Quite difficult as it happens in our appilication which is heavily using async I/o / Threading with the VM as an embedded extension Can not just give a couple of lines of Java Pls. have a look at the code. By doing a simple code inspection one can see the memory is being used after doing a free() on the same memory, just swap the 2 lines of code. ERROR MESSAGES/STACK TRACES THAT OCCUR : not a crash. We see >100 Mb of memory being allocated on VM boot. This is somewhat difficult to explain to customers running this in a small setup. REPRODUCIBILITY : This bug can be reproduced always. CUSTOMER SUBMITTED WORKAROUND : disable async I/O. Then the L2 thread calls are not done. This is however not acceptable as performance really goes down doing so.
|