JDK-6174443 : VM crashes with core on Solaris 9 during hotspot compilation (1.4.2_04)
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 1.4.2_04,5.0,5.0u1
  • Priority: P2
  • Status: Resolved
  • Resolution: Fixed
  • OS: solaris_8,solaris_9
  • CPU: generic,sparc
  • Submitted: 2004-10-05
  • Updated: 2010-05-09
  • Resolved: 2005-03-17
Version table:
  • Other: 1.4.2_09 (Fixed)
  • JDK 6: 6 b28 (Fixed)
Description
The HotSpot JVM crashes with a core dump shortly after startup. If a second start succeeds, the system can run for a few days.

DESCRIPTIONEND
TESTCASEBEGIN
a) The system crashes with a core dump and prints this message in the console window:
		#
		# HotSpot Virtual Machine Error, Internal Error
		# Please report this error at
		# http://java.sun.com/cgi-bin/bugreport.cgi
		#
		# Java VM: Java HotSpot(TM) Server VM (1.4.2_04-b05 compiled mode)
		#
		# Error ID: 434F44452255464645520E435050005E 01
		#
		# Problematic Thread: prio=5 tid=0x000ed130 nid=0x8 runnable
		#
	b) Information obtained from the core dump with the pstack utility leads us to believe that the problem occurs in the HotSpot optimizing compiler. According to the output of -XX:+PrintCompilation, the system failed while compiling the bytecode of some class to native code.
The class names that the system fails to compile are different every time. An attempt to add such class/method names did not solve the problem: the system still fails at different classes/methods.

VM version:
	 Java HotSpot(TM) Server VM (1.4.2_04-b05 compiled mode)
VM Options:
	MEM_ARGS=-Xms1024m -Xmx1024m -XX:MaxPermSize=256m -XX:PermSize=128m
	JAVA_OPTIONS= -Xverify:none -D{edocs specific application properties} -Dlog4j.debug -XX:+PrintCompilation -Xbatch -Xcomp -Xnoclassgc -Dweblogic.jsp.windows.caseSensitive=true
Platform: 
	System = SunOS
	Node = node_name_here
	Release = 5.9
	KernelID = Generic_117171-0
	Machine = sun4u	
	BusType = <unknown>
	Serial = <unknown>
	Users = <unknown>
	OEM# = 0
	Origin# = 1
	NumCPU = 4

Here is the pstack output from core :

core 'core' of 13714:   /export/home/bea/jdk142_04/bin/java -server -Xms1024m -Xmx1024m -XX:Ma
-----------------  lwp# 8 / thread# 8  --------------------
 ff31f63c _lwp_kill (6, 0, a93fd890, 0, 1, a93fd10c) + 8
 ff2b6ce0 abort    (0, a93fd920, 0, fffffff8, 0, a93fd949) + 100
 ff098498 void os::abort(int) (1, ff139002, a93fe1a0, ff16da1a, ff16d9bd, ff0000) + 80
 fef7e454 void report_error(int,const char*,int,const char*,const char*,...) (a93fe1bc, ff1c6ab4, ff138dcc, ff131028, ff138e15, a93fe300) + 668
 fef7d9dc void report_fatal(const char*,int,const char*,...) (ff130fe3, 5e, ff131029, ff130fcc, 214, 25de9e0) + 58
 fed80ef0 CodeBuffer::CodeBuffer(int,int,int,int,int,int,BufferBlob*,relocInfo*,RelocateBuffer*,int,OopRecorder*,const char*,int) (ff170000, 0, c00, 1000, 400, 0) + c0
 fedd3bbc void Compile::Fill_buffer() (a93ff500, 25de538, 10a, 214, 2ec2480, 2ec2bad) + 148
 fedd99bc void Compile::Output() (2ec22d0, 2ec2a80, 3, 0, 0, 0) + 8e8
 fedd2e84 void Compile::Code_Gen() (a93ff500, ff1335c4, a93ff414, ff170000, 0, 0) + 53c
 fee008e8 Compile::Compile(ciEnv*,ciScope*,ciMethod*,int,int,int) (ff1333f9, 2d7963c, 25dd004, 28defac, ffffffff, 1) + be0
 fedfd08c void C2Compiler::compile_method(ciEnv*,ciScope*,ciMethod*,int,int) (2b880, a93ffd1c, 0, 2939a90, ffffffff, 0) + 64
 fedfc850 void CompileBroker::invoke_compiler_on_method(CompileTask*) (53d2, 0, ffffffff, ff1aee50, ff1bbbe4, ed130) + 61c
 feeac1f8 void CompileBroker::compiler_thread_loop() (ff133c01, ff1af218, ed130, ed6e0, 306d10, fee69254) + 428
 fee6927c void JavaThread::run() (ed130, 8, 40, 0, 40, 0) + 284
 fee6575c _start   (ed130, 0, 0, 0, 0, 0) + 134
 ff3857b4 _lwp_start (0, 0, 0, 0, 0, 0)
###@###.### 10/5/04 18:38 GMT

Comments
EVALUATION

This bug is a dup of 4925292.
###@###.### 10/8/04 14:04 GMT

The fix for 4925292 is not sufficient for some instances of this failure.

First, it would be advisable in the long term to extend the 4925292 change to have "soft failure" handling on most of the CodeBuffer allocations and resizings. However, we'd first need to bake that into the current release before backporting to a 1.4.2 update.

Second, the test that turns off compilation due to insufficient codeCache space is too weak. HotSpot tests against the remaining available space rather than the safer alternative, the largest segment available in the codeHeap. This fix would probably be sufficient to avoid most observations of this bug.

Third, it was observed that the CI fails to free the associated codeBlob if a dependency violation causes a compilation to abort the creation of an nmethod. We have seen one example of such a leak that accelerates the filling of the codeCache such that this bug is triggered. This problem could also be easily fixed in an update.
###@###.### 2005-1-24 22:48:55 GMT

We have implemented a fix for the third issue listed above. Additionally, we have fixed a fourth issue, which caused trivial dependency failures in the systemDictionary check in ciEnv.cpp. Those two fixes should be sufficient to cure the customer's problem as described here and in the associated escalation. As a result, we will close this bug as fixed.

It should be noted that, of the other two problems listed above, the first was not fixed because the solution was too complicated for a backport. The second was also not fixed because an acceptable solution could not be found. Those problems remain and will have to be addressed under a separate, yet to be opened, bug.
###@###.### 2005-03-08 16:22:28 GMT
08-10-2004
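
The second point of the evaluation (testing total remaining space versus the largest available segment) can be illustrated with a small model. This is only a sketch under stated assumptions: `FreeListModel`, `fits_by_total`, and `fits_by_largest_segment` are hypothetical names invented for illustration and are not HotSpot's CodeHeap API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical model of a fragmented code heap: only the sizes of its free
// segments are tracked. This is NOT HotSpot's CodeHeap class.
struct FreeListModel {
  std::vector<std::size_t> free_segments;  // size of each free segment, in bytes

  void add_free_segment(std::size_t size) {
    assert(size > 0);  // an empty segment carries no information
    free_segments.push_back(size);
  }

  // Sum of all free segments: the "remaining available space".
  std::size_t total_free() const {
    return std::accumulate(free_segments.begin(), free_segments.end(),
                           std::size_t{0});
  }

  // Size of the largest single free segment, or 0 if there is none.
  std::size_t largest_free_segment() const {
    if (free_segments.empty()) return 0;
    return *std::max_element(free_segments.begin(), free_segments.end());
  }
};

// Weaker test: passes whenever the total free space covers the request,
// even if no single segment is big enough to hold it.
inline bool fits_by_total(const FreeListModel& heap, std::size_t request) {
  return heap.total_free() >= request;
}

// Safer test: passes only if one contiguous segment can hold the request.
inline bool fits_by_largest_segment(const FreeListModel& heap,
                                    std::size_t request) {
  return heap.largest_free_segment() >= request;
}
```

With three free segments of 200K each, a 500K request passes the total-space test but fails the largest-segment test, which is the kind of case in which an allocation could still fail after the weaker check has passed.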

WORK AROUND

This bug is a dup of 4925292, which was fixed in 5.0. There are several possible workarounds. The simplest is to use -client. The other workarounds involve changing some CodeCache parameters.

The VM tries to avoid crashing from running out of CodeCache by reserving a minimum amount of space before it will try a compile. This system is imperfect, and in 1.4.2 and earlier, if we run out at a particular spot, we can die. You can attempt to work around this by changing the following variables: ReservedCodeCacheSize and CodeCacheMinimumFreeSpace. These values default to 32M and 500K respectively. CodeCacheMinimumFreeSpace is the amount of space we try to leave available when we back off on compiles, so increasing this to, say, 1M is the best hope for a workaround. You could also increase the total CodeCache size. For example, to allow 64M for the CodeCache and 2M for the "warning track", we'd have JVM arguments like:

	-XX:CodeCacheMinimumFreeSpace=2M -XX:ReservedCodeCacheSize=64M

Hope this helps.
###@###.### 10/8/04 14:04 GMT
08-10-2004
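
The "warning track" described in the workaround can be modeled as a simple threshold check. The sketch below is an assumption-laden illustration, not the VM's actual logic: `CodeCacheSettings`, `quoted_defaults`, and `should_attempt_compile` are hypothetical names, and whether the real comparison is strict or non-strict is not stated in this report.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kKB = 1024;
constexpr std::size_t kMB = 1024 * kKB;

// Illustrative stand-ins for the two tunables named in the workaround.
struct CodeCacheSettings {
  std::size_t reserved_size;       // models ReservedCodeCacheSize
  std::size_t minimum_free_space;  // models CodeCacheMinimumFreeSpace
};

// Defaults quoted in the workaround text: 32M and 500K.
inline CodeCacheSettings quoted_defaults() {
  return CodeCacheSettings{32 * kMB, 500 * kKB};
}

// Returns true if a new compile would still be attempted, i.e. the free
// space has not yet dropped below the reserved "warning track".
// Assumption: a non-strict comparison; the real VM may differ.
inline bool should_attempt_compile(const CodeCacheSettings& settings,
                                   std::size_t used) {
  assert(used <= settings.reserved_size);
  return settings.reserved_size - used >= settings.minimum_free_space;
}
```

In this model, with the quoted defaults and only 400K free, compiles are no longer attempted. Raising the settings to 64M and 2M, as in the example arguments, makes the same usage level pass again, and compiles stop sooner as the larger cache fills.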

SUGGESTED FIX

+++ codeBuffer.cpp	Fri Sep 19 10:50:49 2003
@@ -50,11 +50,12 @@
                        relocInfo *locs_memory, RelocateBuffer *locs_stub_memory,
                        bool auto_free_blob, OopRecorder *oop_recorder, const char* name,
-                       bool allow_resizing) {
+                       bool allow_resizing,
+                       bool soft_fail) {
   // Compute maximal alignment
   int alignSize = MAX2((intx) sizeof(jdouble), CodeEntryAlignment);
   assert(is_power_of_2(alignSize), "");
   // Keep original instSize since Stubs are not oopSize aligned.
@@ -86,13 +87,39 @@
   // Warning: This memory will not be release when the CodeBuffer
   // is destroyed unless you set auto_free_blob!
   if ( blob == NULL ) {
     if (name == NULL) name = "CodeBuffer constructor";
     BufferBlob* newblob = BufferBlob::create(totalSize + instsSlop, name);
-    if( newblob == NULL ) fatal1( "CodeCache: no room for %s", name);
+    if( newblob == NULL ) {
+      if (!soft_fail) {
+        fatal1( "CodeCache: no room for %s", name);
+      }
+      _blob = NULL;
+      insts = NULL;
+      _instsStart = NULL;
+      _instsStart = NULL;
+      _instsEnd = NULL;
+      _instsOverflow = NULL;
+      _instsEnd_before_stubs = NULL;
+      _instsOverflow_before_stubs = NULL;
+      _stubsStart = NULL;
+      _stubsEnd = NULL;
+      _stubsOverflow = NULL;
+      _constStart = NULL;
+      _constEnd = NULL;
+      _constOverflow = NULL;
+      _locsStart = NULL;
+      _locsEnd = NULL;
+      _locsOverflow = NULL;
+      _stubsReloc = NULL;
+      _stubs_reloc_count = 0;
+      _stubs_reloc_alloc = 0;
+      return;
+    } else {
     insts = newblob->instructions_begin();
     _blob = newblob;
+    }
   } else {
     // [RGV] are there any fields in the blob needing re-initialization
     // if we reuse it?
     _blob = blob;
+++ codeBuffer.hpp	Fri Sep 19 09:45:03 2003
@@ -112,11 +112,12 @@
              int locsStubSize, bool needs_oop_recorder,
              BufferBlob *blob = NULL, relocInfo *locs_memory = NULL,
              RelocateBuffer *locs_stub_memory = NULL, bool auto_free_blob = false,
              OopRecorder* oop_recorder = NULL, const char* name = NULL,
-             bool allow_resizing = false);
+             bool allow_resizing = false,
+             bool soft_fail = false);
   ~CodeBuffer();
   static int insts_memory_size(int instsSize);
   static int locs_memory_size (int locsSize);
+++ output.cpp	Fri Sep 19 09:45:06 2003
@@ -754,12 +754,21 @@
   }
   // nmethod and CodeBuffer count stubs as part of method's code.
   _code_buffer = new CodeBuffer(code_req, locs_req, stub_req, const_req, 0, false, 0, 0, 0,
                                 true /* Auto Free the buffer */,
-                                NULL, NULL, labels_not_set);
+                                NULL, NULL, labels_not_set, true /* soft failure */);
+  // Have we run out of code space?
+  if (_code_buffer->code_capacity() == 0) {
+    UseInterpreter = true;
+    UseCompiler = false;
+    AlwaysCompileLoopMethods = false;
+    record_failure("CodeCache is full");
+    warning("CodeCache is full. Compiling has been disabled");
+    return;
+  }
   _code_base = _code_buffer->code_begin();
   _code_buffer->set_oop_recorder(recorder()->oop_recorder());
   // fill in the nop array for bundling computations
   MachNode *_nop_list[Bundle::_nop_count];
###@###.### 10/8/04 14:04 GMT

This partial fix frees unregistered code blobs. The decompile count is incremented if a code blob goes unregistered (not needed for 1.4.2 backport). Some printing cleanups.
http://analemma.sfbay.sun.com/net/prt-archiver.sfbay/data/archived_workspaces/main/c2_baseline/2005/20050228175222.rasbold.c2_baseline5/workspace/webrevs/webrev-2005.03.01/index.html
###@###.### 2005-03-01 17:37:50 GMT

See the PRT webrev below for the second part of the suggested fix. This fix avoids trivial dependency failures after compilation in the systemDictionary check.
http://analemma.sfbay.sun.com/net/prt-archiver.sfbay/data/archived_workspaces/main/c2_baseline/2005/20050307104202.rasbold.c2_baseline5/workspace/webrevs/webrev-2005.03.07/index.html
###@###.### 2005-03-08 16:22:29 GMT
08-10-2004
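
The "soft failure" pattern in the patch above (a constructor that leaves the object empty instead of aborting, plus a caller that checks the capacity and disables compilation) can be sketched in isolation. Everything below is a simplified stand-in written for illustration: `Arena`, `Buffer`, `CompilerState`, and `try_compile` are hypothetical names, not HotSpot classes, and installing the compiled code is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical fixed-size arena standing in for the code cache.
class Arena {
 public:
  explicit Arena(std::size_t capacity) : capacity_(capacity), used_(0) {}

  // Reserves n bytes; returns false (and changes nothing) if there is no room.
  bool reserve(std::size_t n) {
    if (n > capacity_ - used_) return false;
    used_ += n;
    return true;
  }

  void release(std::size_t n) {
    assert(n <= used_);
    used_ -= n;
  }

  std::size_t used() const { return used_; }

 private:
  std::size_t capacity_;
  std::size_t used_;
};

// Buffer that either aborts on allocation failure (the old behavior) or,
// with soft_fail set, is left empty so the caller can recover.
class Buffer {
 public:
  Buffer(Arena& arena, std::size_t size, bool soft_fail = false)
      : arena_(arena), capacity_(0) {
    if (arena_.reserve(size)) {
      capacity_ = size;
      return;
    }
    if (!soft_fail) std::abort();  // models the fatal1() path
    // Soft failure: capacity_ stays 0 and nothing is reserved.
  }

  // Releasing in the destructor means an aborted compile cannot leak space.
  ~Buffer() {
    if (capacity_ != 0) arena_.release(capacity_);
  }

  Buffer(const Buffer&) = delete;
  Buffer& operator=(const Buffer&) = delete;

  std::size_t capacity() const { return capacity_; }

 private:
  Arena& arena_;
  std::size_t capacity_;
};

// Models the UseCompiler switch that the patch flips off.
struct CompilerState {
  bool use_compiler = true;
};

// Returns true if the compile got its buffer. When the arena is full,
// compilation is disabled for all later calls instead of aborting.
inline bool try_compile(Arena& arena, std::size_t code_size,
                        CompilerState& state) {
  if (!state.use_compiler) return false;
  Buffer buffer(arena, code_size, /*soft_fail=*/true);
  if (buffer.capacity() == 0) {
    state.use_compiler = false;  // "Compiling has been disabled"
    return false;
  }
  return true;  // buffer is released here; installation is not modeled
}
```

The design choice mirrors the patch: the failure is reported through an observable empty state (capacity of zero) rather than through process termination, so the caller decides how to degrade.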