JDK-6480378 : Backport 5065001, 6259348 and others to 5.0 update release
  • Type: Bug
  • Component: client-libs
  • Sub-Component: java.awt
  • Affected Version:
    1.4.2_06,5.0,5.0u9,5.0u12,5.0u6,5.0u11,5.0u7,5.0u8
  • Priority: P1
  • Status: Resolved
  • Resolution: Fixed
  • OS:
    windows_2000,windows_2003,windows_xp
  • CPU: x86
  • Submitted: 2006-10-10
  • Updated: 2014-02-27
  • Resolved: 2007-07-03
The Version table provides details about the release in which this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

Other: 5.0u14 b01 (Fixed)
JDK 6: 6-pool (Resolved)
JDK 7: 7 (Resolved)
Description
Customer reports they are running into bug 6429965. They submitted 6 HotSpot error files and a testcase. Customer's data and testcase are at:
/net/cores.central/cores/65182343.

My analysis shows traces like the one in that bug report. The active thread is

 =>0x00a41e48 JavaThread "Finalizer" daemon [_thread_in_native, id=740]

with a stack

Stack: [0x061f0000,0x062f0000),     sp=0x062ef844,     free space=1022k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C  [ntdll.dll+0x10f3]
j  java.awt.Font.pDispose()V+0
J  java.awt.Font.finalize()V
v  ~RuntimeStub::alignment_frame_return Runtime1 stub
v  ~StubRoutines::call_stub
V  [jvm.dll+0x86401]
V  [jvm.dll+0xdb172]
V  [jvm.dll+0x862d2]
V  [jvm.dll+0x8b623]
C  [java.dll+0x2006]
J  java.lang.ref.Finalizer.runFinalizer()V
J  java.lang.ref.Finalizer.access$100(Ljava/lang/ref/Finalizer;)V
v  ~RuntimeStub::alignment_frame_return Runtime1 stub
j  java.lang.ref.Finalizer$FinalizerThread.run()V+11

hs_err_pid360.log is a little different, as it crashed in a different spot in ntdll.dll.

Here is more information from the customer.

Background:
Latest release of started deployment in July, with subsequent deployments since then. This release includes an upgrade from Java 1.4.2 to Java 1.5.0. A high profile customer was upgraded to this new release the weekend of 30-Sep. This customer experienced application crashes three times during the week following the upgrade. This required us to undertake the undesirable action of rolling back the customer to the last known good release.

Problem:
HotSpot error files obtained from the customer machines indicated an Access Violation was occurring in the JVM, 1.5.0_07-b03. Specifically, the problem occurred in the Garbage Collector's Finalizer thread in the Windows DLL ntdll.dll, at ntdll.dll + 0x10f3.

Analysis:
The team analyzed the stack trace from the HotSpot error file, identified the exported function call at ntdll.dll offset 0x10f3, and analyzed the JVM source code.

Defect:
There is a race condition in the collaboration between the scheduleDelete() method of Awt_Object.cpp and the WM_AWT_DISPOSE window message processing of Awt_Toolkit.cpp:

If the Finalizer thread is blocked in AwtObject::scheduleDelete() immediately after posting WM_AWT_DISPOSE, then Awt_Toolkit processing of WM_AWT_DISPOSE deletes the AwtObject instance before AwtObject has completed execution of its scheduleDelete() method.

When the Finalizer thread resumes execution within AwtObject::scheduleDelete(), the destructor for CriticalSection::Lock is automatically invoked. Because the AwtObject instance has already been deleted, the destructor operates on memory that is no longer valid, resulting in the Access Violation Exception within the JVM. See enclosed HotSpot error file for details:

<<hs_err_pid436.log>>
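The ordering described above can be modelled with a minimal, single-threaded C++ sketch. This is not the AWT source: ModelObject, ScopedLock and runScheduleDelete are hypothetical names, the "delete" is only a flag so that the sketch never touches freed memory, and the preemption is simulated by running the dispose handler at a chosen point.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for AwtObject; not the real AWT class.
struct ModelObject {
    bool alive = true;      // set to false where the real code would run `delete`
    bool lockHeld = false;  // stands in for the critical section owned by the object
};

// Scoped lock in the style of CriticalSection::Lock: it releases in its destructor.
class ScopedLock {
public:
    ScopedLock(ModelObject& obj, std::vector<std::string>& log) : obj_(obj), log_(log) {
        assert(!obj_.lockHeld);  // the model lock is not re-entrant
        obj_.lockHeld = true;
        log_.push_back("enter");
    }
    ~ScopedLock() {
        // In the reported crash this release touches an already deleted object.
        log_.push_back(obj_.alive ? "leave" : "leave-after-delete");
        obj_.lockHeld = false;
    }
private:
    ModelObject& obj_;
    std::vector<std::string>& log_;
};

// Models scheduleDelete(). When toolkitRunsBeforeUnlock is true, the dispose
// handler runs after the message is posted but before the lock is released,
// which is the interleaving described in the report.
std::vector<std::string> runScheduleDelete(bool toolkitRunsBeforeUnlock) {
    std::vector<std::string> log;
    ModelObject obj;
    auto handleDispose = [&] {
        obj.alive = false;  // the real handler deletes the object here
        log.push_back("dispose");
    };
    {
        ScopedLock lock(obj, log);
        log.push_back("post");  // stands in for posting WM_AWT_DISPOSE
        if (toolkitRunsBeforeUnlock) handleDispose();
    }  // ~ScopedLock runs here
    if (!toolkitRunsBeforeUnlock) handleDispose();
    return log;
}
```

With toolkitRunsBeforeUnlock set to true the log ends in "leave-after-delete", the step that corresponds to the crash in ntdll.dll. With false, the lock is released before the dispose is handled.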

Reproducing the JVM Defect:
Based on the analysis of the code identified above, the team has developed a simple test program. The test program shows that the OS can schedule another thread while a thread is inside a critical section, i.e. that executing a critical section is not an atomic operation. The awt_Font call to scheduleDelete(), which posts the message to destroy the object itself, is at times suspended by the OS just after it has posted the message but before CriticalSection::~Lock() has been called. The awt_Toolkit thread then processes the message and deletes the object. At that point CriticalSection::~Lock() executes illegally, in a GPF (General Protection Fault) zone, and causes ::LeaveCriticalSection() to crash, as indicated in the HotSpot error log. See the enclosed zip file containing sources, executable, and makefile for this test program.
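
The customer's test program is not reproduced here. As a rough illustration of the same point, the following portable C++ sketch shows that a second thread makes progress while the first is still inside its critical section. It assumes std::mutex as a stand-in for the Windows critical section, and the function name otherThreadRanInsideCriticalSection is hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>

// Returns true if the second thread handled the "message" while the first
// thread was still holding the lock.
bool otherThreadRanInsideCriticalSection() {
    std::mutex section;                // stands in for the critical section
    std::atomic<bool> posted{false};   // "message posted"
    std::atomic<bool> handled{false};  // "message handled by the other thread"
    bool handledWhileLocked = false;

    // Plays the role of the toolkit thread: it never takes `section`,
    // so holding the lock does nothing to stop it.
    std::thread toolkit([&] {
        while (!posted.load()) std::this_thread::yield();
        handled.store(true);
    });

    {
        std::lock_guard<std::mutex> lock(section);
        posted.store(true);
        // Models the posting thread being suspended right after the post.
        while (!handled.load()) std::this_thread::yield();
        handledWhileLocked = handled.load();
    }  // the lock is released only here, after the other thread has run

    toolkit.join();
    assert(posted.load() && handled.load());
    return handledWhileLocked;
}
```

A lock only excludes threads that try to take the same lock. The message handler in this sketch does not take it, so it is free to run, and in the real code to delete the object, before the posting thread leaves the critical section.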

The files are at /net/cores.central/cores/65182343

After some discussion we have decided to completely backport some of the fixes (5065001, 6259348 and others) to 5.0 update release. So I'm changing this CR synopsis to match this change.

Comments
EVALUATION This backport must include the fixes for the following bugs:
- 5065001
- 6229122 - performance regression caused by the fix for 5065001
- 6259348 - addition to the two fixes above
- 6215905 - regression of 5065001 (some unnecessary statements were added)
- 6225679 - regression of 5065001 (a wrong check was added)
- 6275359 - regression of 5065001 (some security checks were moved into the privileged thread)
23-10-2006

SUGGESTED FIX The webrev of the fix can be found here: http://sa.sfbay.sun.com/projects/awt_data/5.0u12/6480378
23-10-2006

EVALUATION I do remember at least these CRs related to Win32 crashes, fixed in 6.0: 5065001, 6229122, 6259348. All of them should be backported.
16-10-2006