JDK-5081881 : SIG11 regularly in 1.4.2_05 on Linux.
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: 1.4.2_05
  • Priority: P1
  • Status: Closed
  • Resolution: Duplicate
  • OS: linux_redhat_3.0
  • CPU: x86
  • Submitted: 2004-08-03
  • Updated: 2005-01-12
  • Resolved: 2005-01-12
Description
The customer is seeing regular SIG 11 errors in their application on both Solaris and Linux.  The cause isn't clear, and the failure doesn't reproduce the same way on both: it seems to happen differently depending on which OS the application runs on.  The cause seems to be a bug in the GC system, and indeed they also fairly regularly see exits with this message:

	Fatal: CMSMarkStack is full

even when running with -XX:CMSMarkStackSize=64M.  These may or may not be related problems.  One problem seems to exist when they use CMS, but another surfaces when they switch it off, so disabling CMS is not a workaround.

They see two different but related Error IDs:

	434F4E43555252454E542D41524B335745455027454E45524154494F4E0E4850500089
	- which appears to mean the same as the "CMSMarkStack is full" message, and

	4F530E43505002EF
	- which appears to suggest a stack corruption?
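
The two Error IDs above can be read mechanically. What follows is a minimal sketch, not an official tool: it assumes each ID is the hex encoding of a source file name with 0x20 subtracted from every character, followed by a two-byte big-endian line number. That layout is inferred from the two IDs in this report rather than taken from JVM documentation, and the function name decode_error_id is made up for illustration. Under that assumption the first ID decodes to concurrentMarkSweepGeneration.hpp, line 137, and the second to os.cpp, line 751.

```python
from typing import Tuple


def decode_error_id(error_id: str) -> Tuple[str, int]:
    """Decode a 1.4.2-style Error ID into (source file name, line number).

    Assumed layout (inferred from the IDs in this report, not documented):
      - every byte except the last two is a file-name character minus 0x20
      - the last two bytes are a big-endian line number
    """
    raw = bytes.fromhex(error_id)
    if len(raw) < 3:
        raise ValueError("Error ID too short to hold a name and a line number")
    name_bytes, line_bytes = raw[:-2], raw[-2:]
    # Undo the assumed 0x20 offset to recover the printable characters.
    file_name = "".join(chr(b + 0x20) for b in name_bytes)
    line = int.from_bytes(line_bytes, "big")
    return file_name, line


if __name__ == "__main__":
    ids = (
        "434F4E43555252454E542D41524B335745455027454E45524154494F4E0E4850500089",
        "4F530E43505002EF",
    )
    for error_id in ids:
        file_name, line = decode_error_id(error_id)
        print(f"{file_name}:{line}")
    # prints:
    # concurrentMarkSweepGeneration.hpp:137
    # os.cpp:751
```

If the reading is right, the first ID comes from CMS code, which fits the CMSMarkStack message, while the second comes from the OS layer and says nothing about CMS by itself.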

_dl_sysinfo_int80 also appears a *lot* in any backtraces from gdb.  I've attached a large amount of debugging data.

They are not using JNI or native threads, and unfortunately, the application here is *huge*, so a test case seems unlikely.

Pstack output looks like:

15492: /home/xcolldev/java_1.4.2_05/j2re1.4.2_05/bin/java -server -Djava.security.policy=/sbcimp/dyn/data/RISK/XCOLL/LINUXUATBUILT/...
(No symbols found)
0xf65ebc32: ???? (805d314, 805d2fc, f6477d44, 805d314, 805d2fc, 0) + a4
0xf6222e1d: ???? (805d314, 805d2fc, f6477d44, 805efd8, 805efd8, feffa550) + 64
0xf62111c6: ???? (805d2c8, 0, 0, f6477d44, 805efd8, f6164670) + 20
0xf629d0a1: ???? (805813c, f61602d4, 805f078, 10004, f6380e2a, 0)
0xf616037e: ???? (f64662a0, f65c4e58, feffc704, 805830c, f6164670, 805f3ac) + 2060
0x08049b33: ???? (8, 805844c, feffc770, 0, f65c4e58, f6600020) + 40
0xf64a5748: ???? (8049250, 1a, feffc704, 8048dc0, 805430c, f65f7f50) + 1003908



###@###.### 2004-09-01: removed CMS reference from synopsis.
The crash does not need CMS.

Comments
EVALUATION See comments section. ###@###.### 2005-1-12 23:58:33 GMT
12-01-2005

PUBLIC COMMENTS At least one bug in GC/CMS in 1.4.2_06 caused SIG 11 crashes.
02-09-2004