JDK-6572927 : nsk/regression/b4687586 occasionally hangs on solaris-i586
  • Type: Bug
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 7
  • Priority: P3
  • Status: Closed
  • Resolution: Duplicate
  • OS: solaris_10
  • CPU: x86
  • Submitted: 2007-06-22
  • Updated: 2010-08-19
  • Resolved: 2007-06-28
Related Reports
Duplicate :  
Description
/net/gtee.sfbay.sun.com/export/gtee2.0/suites/6.0/vm/src/nsk/regression/b4687586/b4687586.java timed out on a GC_Baseline nightly, so I tried to reproduce it.  I've had it hang twice now in 10000+ reruns.  I've attached the logs with the output of sending a SIGQUIT to the process, and a jstack of the second one, in case there's more detail in that.  

I've also attached the result of a jstack that David Holmes got in his preliminary investigations.  That failure seems different.  I'll put it here for now, but if necessary it should be spun off into its own bug.

The source for the test is in 

    /net/gtee.sfbay.sun.com/export/gtee2.0/suites/6.0/vm/src/nsk/regression/b4687586/b4687586.java

and I've been running it with

    $ $Deployed/JDK-7/bin/java -Xmx128m -XX:+PrintGCTimeStamps -XX:+PrintGCDetails b4687586 

I've been running on perf-lx10.SFBay, which is a 2x1.4GHz Solaris 10 i86pc box with 4GB of memory.  Each run of the test takes about 2 seconds.  That command line happens to get the -XX:+UseParallelGC collector and the tiered runtime compiler, but one of the attached hangs is from earlier testing with the -XX:+UseSerialGC collector and the client runtime compiler.  Both of the hangs I have are with a fastdebug build.  I just started trying the product build but haven't seen any hangs yet.  If I see one, I'll attach its stack trace to this bug.  I have a gcore from the first hang, if that would be useful.
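
The rerun procedure described above can be sketched as a small bash loop. This is only an illustration, not the script that was actually used: the function names `run_with_deadline` and `hunt_for_hang`, the deadline values, and the `run-N.log` file names are all assumptions. It relies on a HotSpot JVM printing a thread dump to its output when it receives SIGQUIT.

```shell
#!/bin/bash
# Sketch only: helper names, log names, and deadlines are hypothetical.

# run_with_deadline SECS LOG CMD...
# Runs CMD with its output captured in LOG. Returns CMD's exit status, or
# 124 if CMD is still alive after SECS seconds (treated here as a hang).
run_with_deadline() {
  local deadline="$1" log="$2"
  shift 2
  "$@" > "$log" 2>&1 &
  local pid="$!" waited=0
  while kill -0 "$pid" 2>/dev/null; do
    if [ "$waited" -ge "$deadline" ]; then
      kill -QUIT "$pid" 2>/dev/null || true   # a JVM logs a thread dump on SIGQUIT
      sleep 2                                 # give the dump time to reach LOG
      kill -KILL "$pid" 2>/dev/null || true
      wait "$pid" 2>/dev/null || true
      return 124
    fi
    sleep 1
    waited=$((waited + 1))
  done
  wait "$pid"
}

# hunt_for_hang MAX_RUNS SECS CMD...
# Reruns CMD until one run hangs; keeps only the log of the hung run.
hunt_for_hang() {
  local max_runs="$1" deadline="$2" i=1 rc
  shift 2
  while [ "$i" -le "$max_runs" ]; do
    rc=0
    run_with_deadline "$deadline" "run-$i.log" "$@" || rc=$?
    if [ "$rc" -eq 124 ]; then
      echo "run $i hung; see run-$i.log"
      return 1
    fi
    rm -f "run-$i.log"
    i=$((i + 1))
  done
  echo "no hang in $max_runs runs"
}
```

With these helpers, the command line above could be driven as, for example:

    $ hunt_for_hang 10000 60 $Deployed/JDK-7/bin/java -Xmx128m -XX:+PrintGCTimeStamps -XX:+PrintGCDetails b4687586

Note that the sketch kills the hung process after requesting the thread dump; to take a jstack or gcore of the hang, the `kill -KILL` step would have to be dropped so the process stays around.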

I tried to reproduce the problem on solaris-sparc, but was unable to in 5000 runs.  Xiaobin tried to reproduce the problem on prt-sol-x64-1.sfbay and was unable to in 8000 runs.

I've also seen the JVM exit without a trace (in particular without the output from Universe::print() that -XX:+PrintGCDetails should produce), but that might be a different bug.  Be on the lookout for those.
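
One way to watch for those silent exits is to check each run's captured output for the exit-time heap summary. The sketch below assumes that the summary begins with a line reading exactly `Heap`, which is how -XX:+PrintGCDetails output usually looks, but that should be verified against a log from a known-good run. The function name `exited_without_trace` is hypothetical.

```shell
#!/bin/bash
# exited_without_trace LOG
# Succeeds (returns 0) when LOG lacks the exit-time heap summary.
# Assumption: the summary starts with a line that is exactly "Heap".
# A missing or unreadable LOG is also reported as "without trace".
exited_without_trace() {
  ! grep -q '^Heap$' "$1"
}
```

A rerun loop could call this on the log of every run that exits on its own and keep the log whenever the check succeeds.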

I'm assigning this to runtime, because I don't think it's a GC problem but it could be a library problem.  I haven't tried simplifying the test, e.g., to exclude the AccessControl and Permissions library calls.
I've gotten a few more hangs.  I do notice that I've *only* gotten hangs from the fastdebug build.  I haven't seen a failure with a product build.  Maybe that's a clue.

Comments
EVALUATION The fix for 6571496 fixed the hang problem seen by the submitter. Close this as duplicate.
28-06-2007

EVALUATION I could also reproduce the hang when running the 5847th instance on perf-lx10.sfbay. "pstack <pid>" shows the same stack trace as bug 6571496. So I will close this bug as duplicate of bug 6571496 unless Peter disagrees.
27-06-2007

EVALUATION The classloading related hang is being addressed by CR 6571496.
26-06-2007