United StatesChange Country, Oracle Worldwide Web Sites Communities I am a... I want to...
JDK-8026848 : -XX:+G1SummarizeRSetStats can result in wrong exit code and crash

Details
Type:
Bug
Submit Date:
2013-10-18
Status:
Resolved
Updated Date:
2013-10-29
Project Name:
JDK
Resolved Date:
2013-10-22
Component:
hotspot
OS:
Sub-Component:
gc
CPU:
Priority:
P3
Resolution:
Fixed
Affected Versions:
hs25,8
Fixed Versions:
hs25 (b56)

Related Reports
Backport:

Sub Tasks

Description
Several tests failed on Windows x86_64 because of running jvm with -XX:+G1SummarizeRSetStats (and other keys) resulted in nonzero exit code. In original run -1073741819 exit code was got. (If you do it in cygwin instead of cmd you'll get 127.) Looks like enabling G1SummarizeRSetStats corrupts exit Code somehow as apart from exit code everything looks fine.

Then I added System.exit(33) to the end of main method and things went wrong. I started to constantly get crash with EXCEPTION_ACCESS_VIOLATION in msvcr100.dll. Attaching hs_err. Minidumps can be found in C:\Cygwin\export\local\repr1 at spbef05.

Steps to reproduce: 
1.) go to spbef05.ru.oracle.com 
2.) open cmd 
3.) cd C:\Cygwin\export\local\repr1
4.) .\jdk\bin\java  -XX:+UseG1GC -XX:-UseCompressedOops -Xmn4m -XX:InitiatingHeapOccupancyPercent=100 -XX:+PrintGC -XX:+UnlockDiagnosticVMOptions -XX:G1HeapRegionSize=1M -XX:+G1SummarizeRSetStats VerifySummaryOutput 0

Looks like issue is induced by this changeset: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/c319b188c7b2

According to hs_err trouble occurs here:
Stack: [0x0000000002700000,0x0000000002800000],  sp=0x00000000027fdf60,  free space=1015k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C  [msvcr100.dll+0x5346e]
C  [msvcr100.dll+0x339e1]
C  [msvcr100.dll+0x33a35]    <-- This is __imp_vsnprintf I believe
V  [jvm.dll+0x255154]  outputStream::print_cr+0xb4
V  [jvm.dll+0x3ae550]  HRRSStatsIter::print_summary_on+0x130
V  [jvm.dll+0x3aec59]  G1RemSetSummary::print_on+0x2c9
V  [jvm.dll+0x3ada55]  G1RemSet::print_summary_info+0x55
V  [jvm.dll+0x39587c]  G1CollectedHeap::print_tracing_info+0x3c
V  [jvm.dll+0x1ee1cd]  before_exit+0x1fd
V  [jvm.dll+0x16b9e2]  JVM_Halt+0x82
C  0x0000000002be302e

Crash occurs at the following line of g1RemSetSummary.cpp file: 
189   void print_rs_mem_info_on(outputStream * out, size_t total) {
190     out->print_cr("    %8dK (%5.1f%%) by %zd %s regions", round_to_K(rs_mem_size()), rs_mem_size_percent_of(total), amount(), _name);
191   }

..while this inlined function prints information about "RegionTypeCounter _old;". Look at line 280 in the same file:
272   void print_summary_on(outputStream* out) {
273     RegionTypeCounter* counters[] = { &_young, &_humonguous, &_free, &_old, NULL };
274 
275     out->print_cr("\n Current rem set statistics");
276     out->print_cr("  Total per region rem sets sizes = "SIZE_FORMAT"K."
277                   " Max = "SIZE_FORMAT"K.",
278                   round_to_K(total_rs_mem_sz()), round_to_K(max_rs_mem_sz()));
279     for (RegionTypeCounter** current = &counters[0]; *current != NULL; current++) {
280       (*current)->print_rs_mem_info_on(out, total_rs_mem_sz());
281     }

I failed to understand why call of outputStream::print_cr crashes. All parameters it gets look good: pattern line is ok, two integers and double can't be ok, string "_name" equals to "Old" and is terminated with zero. Probably the clue is in outputStream::do_vsnprintf method that I don't see in stacktrace by some reason.

This crash can't be considered as crucial issue as it only occurs with -XX:+G1SummarizeRSetStats key that's not public. Nethertheless this is crash.

RULE gc/g1/TestSummarizeRSetStats.java Exception java.lang.RuntimeException: Expected to get exit value of [0]
RULE gc/g1/TestSummarizeRSetStatsPerRegion.java Exception java.lang.RuntimeException: Expected to get exit value of [0]
RULE gc/g1/TestSummarizeRSetStatsThreads.java Exception java.lang.RuntimeException: Expected to get exit value of [0]

All failures happened on the same host. I didn't check how it's reproducible on other hosts. Probably msvcr100.dll is guilty.
                                    

Comments
Have been able to reproduce this issue on my local windows laptop. Will try to rerun with updated format string, the current one uses %d to print size_t which is most likely the cause for the issue. 
                                     
2013-10-18
Build using SIZE_FORMAT instead of %d doesn't crash. Will prepare a fix for review next week.
                                     
2013-10-18
URL:   http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/c7f403b05168
User:  stefank
Date:  2013-10-22 13:45:37 +0000

                                     
2013-10-22
URL:   http://hg.openjdk.java.net/hsx/hsx25/hotspot/rev/c7f403b05168
User:  amurillo
Date:  2013-10-25 23:52:44 +0000

                                     
2013-10-25



Hardware and Software, Engineered to Work Together