Relates :
|
|
Relates :
|
|
Relates :
|
FULL PRODUCT VERSION : JRE version: Java(TM) SE Runtime Environment (8.0_144-b01) (build 1.8.0_144-b01) Java VM: Java HotSpot(TM) 64-Bit Server VM (25.144-b01 mixed mode linux-amd64 ) FULL OS VERSION : Linux foo 3.10.0-327.28.3.el7.x86_64 #1 SMP Fri Aug 12 13:21:05 EDT 2016 x86_64 x86_64 x86_64 GNU/Linux A DESCRIPTION OF THE PROBLEM : I am creating a new bug report as I cannot update the old one (https://bugs.java.com/view_bug.do?bug_id=8172756) with new information. The current theory is that the crashes stem from card cache returning invalid results because of the overflow here: http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/file/69aec2ca5d90/src/share/vm/gc_implementation/g1/heapRegionRemSet.cpp#l436 This can at least give from_card of -1 which is also used to mean that cache is empty (bonus: t also seems heaps of size > 2^40 should overlap in that cache.) In the case of overflow to -1, we can then end up with a situation where some region end up being reclaimed even when there are live pointers to objects in it. When those pointers are then later on followed in a marking phase, the pointers points into a (middle of) some new object and calling oopDesc::size() on that "object" will follow some trash pointers and cause crash. THE PROBLEM WAS REPRODUCIBLE WITH -Xint FLAG: Did not try THE PROBLEM WAS REPRODUCIBLE WITH -server FLAG: Yes STEPS TO FOLLOW TO REPRODUCE THE PROBLEM : We've been unable to reproduce the crash with a simple program, but we're able to trigger the assert assert(contains_reference(from), err_msg("We just found " PTR_FORMAT " in the FromCardCache", from)); with this small program, fastdebug JVM and specific command line parameters (listed in the code comments) https://gist.github.com/jmiettinen/62506fd75855edb346d88cc95697bf1e Currently, we're able to reproduce the actual failurearound once a week by running production software with a forced load pattern. We've run a patched OpenJDK JVM now without failures for a few week. The patch we tried was changing values in the card table from int to long to avoid overflow (https://gist.github.com/jmiettinen/1e1e8a0c71c2d1ae664335f1e875f227). ERROR MESSAGES/STACK TRACES THAT OCCUR : # # A fatal error has been detected by the Java Runtime Environment: # # SIGSEGV (0xb) at pc=0x00007f24c84669f4, pid=289114, tid=0x00007c1152831700 # # JRE version: Java(TM) SE Runtime Environment (8.0_144-b01) (build 1.8.0_144-b01) # Java VM: Java HotSpot(TM) 64-Bit Server VM (25.144-b01 mixed mode linux-amd64 ) # Problematic frame: # V [libjvm.so+0x4cb9f4] oopDesc::size()+0x44 # # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again # # If you would like to submit a bug report, please visit: # http://bugreport.java.com/bugreport/crash.jsp # --------------- T H R E A D --------------- Current thread (0x00007f24c0420800): ConcurrentGCThread [stack: 0x00007c1152731000,0x00007c1152832000] [id=289624] siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x00000c9a00000dd4 Registers: RAX=0x00007f24c8f55cf9, RBX=0x00007f24c0477dd0, RCX=0x0000000000000001, RDX=0x0000000000000000 RSP=0x00007c1152830950, RBP=0x00007c1152830950, RSI=0x0000000000000000, RDI=0x00000c9a00000dcc R8 =0x00007cf618c10000, R9 =0x0000000000000001, R10=0x00007d3f80262f08, R11=0x00007f24c0474850 R12=0x00007cf618c10000, R13=0x00007be38f10eea0, R14=0x00007f24c03ff070, R15=0x00007f24c0474850 RIP=0x00007f24c84669f4, EFLAGS=0x0000000000010246, CSGSFS=0x0000000000000033, ERR=0x0000000000000004 TRAPNO=0x000000000000000e Top of Stack: (sp=0x00007c1152830950) 0x00007c1152830950: 00007c11528309b0 00007f24c85f4360 0x00007c1152830960: 00007c11528309b0 00007f23fc005010 0x00007c1152830970: 00007d3f80262e98 000000000000000e 0x00007c1152830980: 00007e31e25a9bf0 00007e31e25bc010 0x00007c1152830990: 00007e31e27b3a18 00007c1152830aa0 0x00007c11528309a0: 0000000000000001 00007f24c8f55cf9 0x00007c11528309b0: 00007c11528309f0 00007f24c88a6ee8 0x00007c11528309c0: 0000000300040001 00007e31e25b39f8 0x00007c11528309d0: 00007f24c0477dd0 00007c1152830af0 0x00007c11528309e0: 0000000101840000 0000004061000000 0x00007c11528309f0: 00007c1152830a20 00007f24c8469d34 0x00007c1152830a00: 000000000023acb3 0000004060cb673f 0x00007c1152830a10: 00007f24c03ff0f8 00007f24c0477dd0 0x00007c1152830a20: 00007c1152830b60 00007f24c846341e 0x00007c1152830a30: 3fe0000000000000 0000000000000000 0x00007c1152830a40: 00007f24c9777c53 00007be3b0287ea0 0x00007c1152830a50: 00007f24c0477ee0 00007c1152830af0 0x00007c1152830a60: 0100000000000000 4024000000000000 0x00007c1152830a70: 00007f24c0420801 000000000023fd02 0x00007c1152830a80: 3fefbb9dc59efb46 3fefbb9dc59efb46 0x00007c1152830a90: 00007f24c03ff110 00007e31e4000000 0x00007c1152830aa0: 00007f24c8ef6ad0 000000000100e400 0x00007c1152830ab0: 00007be32f3b4dd0 00007f24c8f006a0 0x00007c1152830ac0: 0000000000000b00 00007c1152830aa0 0x00007c1152830ad0: 00007f24c03ff070 00007f24c002a990 0x00007c1152830ae0: 00007f24c0477dd0 0000000000000000 0x00007c1152830af0: 00007f24c8ef6a50 00007f24c03ff0f8 0x00007c1152830b00: 00007f24c03ff070 00007f24c0477dd0 0x00007c1152830b10: 00007f24c8ef6ab0 00007f24c0477dd0 0x00007c1152830b20: 00007f24c002a990 0000000000000000 0x00007c1152830b30: 0000000000000000 0000000000000000 0x00007c1152830b40: 00007f24c0477dd0 00007f1db84d6a60 Instructions: (pc=0x00007f24c84669f4) 0x00007f24c84669d4: d1 48 d3 e7 48 03 38 8b 4f 08 83 f9 00 7e 19 f6 0x00007f24c84669e4: c1 01 75 18 89 c8 c9 c1 f8 03 c3 90 48 8b 7f 08 0x00007f24c84669f4: 8b 4f 08 83 f9 00 7f e7 75 1a 66 90 48 8b 07 4c 0x00007f24c8466a04: 89 c6 4c 8b 98 e8 00 00 00 c9 41 ff e3 0f 1f 80 Register to memory mapping: RAX=0x00007f24c8f55cf9: <offset 0xfbacf9> in /usr/lib/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so at 0x00007f24c7f9b000 RBX=0x00007f24c0477dd0 is an unknown value RCX=0x0000000000000001 is an unknown value RDX=0x0000000000000000 is an unknown value RSP=0x00007c1152830950 is an unknown value RBP=0x00007c1152830950 is an unknown value RSI=0x0000000000000000 is an unknown value RDI=0x00000c9a00000dcc is an unknown value R8 =0x00007cf618c10000 is pointing into object: 0x00007cf618800018 [J - klass: {type array long} - length: 1048576 R9 =0x0000000000000001 is an unknown value R10=0x00007d3f80262f08 is an oop [J - klass: {type array long} - length: 12 R11=0x00007f24c0474850 is an unknown value R12=0x00007cf618c10000 is pointing into object: 0x00007cf618800018 [J - klass: {type array long} - length: 1048576 R13=0x00007be38f10eea0 is an unknown value R14=0x00007f24c03ff070 is an unknown value R15=0x00007f24c0474850 is an unknown value Stack: [0x00007c1152731000,0x00007c1152832000], sp=0x00007c1152830950, free space=1022k Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) V [libjvm.so+0x4cb9f4] oopDesc::size()+0x44 V [libjvm.so+0x659360] CMTask::deal_with_reference(oopDesc*)+0x180 V [libjvm.so+0x90bee8] ObjArrayKlass::oop_oop_iterate_nv(oopDesc*, G1CMOopClosure*)+0x148 V [libjvm.so+0x4ced34] CMBitMapClosure::do_bit(unsigned long)+0x74 V [libjvm.so+0x4c841e] CMTask::do_marking_step(double, bool, bool)+0xa9e V [libjvm.so+0x4cf673] CMConcurrentMarkingTask::work(unsigned int)+0x293 V [libjvm.so+0xaeccff] GangWorker::loop()+0xcf V [libjvm.so+0x92a338] java_start(Thread*)+0x108 <SAME AS IN THE LINKED TICKET> REPRODUCIBILITY : This bug can be reproduced occasionally. ---------- BEGIN SOURCE ---------- // Running the test: // javac FromCardCacheTest.java // /opt/jdk1.8.0_144_fastdebug/bin/java -classpath . -Xms20M -Xmx20M -XX:HeapBaseMinAddress=2199011721216 -XX:+UseG1GC -verbose:gc FromCardCacheTest import java.lang.management.GarbageCollectorMXBean; import java.lang.management.ManagementFactory; import java.lang.reflect.Field; import sun.misc.Unsafe; public class FromCardCacheTest { private static int nArrays = 24; private static int arraySize = 131068; // object array, a bit less than half of a 1M region private static int byteArraySize = 524270; // byte array, a bit less than half of a 1M region private static final long heapBase = 2199011721216L; // less than 2TB public static void main(String[] args) { while (runTest()) { System.gc(); try { Thread.sleep(2000); } catch(Exception e) {} } } public static boolean runTest() { GarbageCollectorMXBean oldGc = null; GarbageCollectorMXBean youngGc = null; for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) { if (gc.getName().equals("G1 Old Generation")) oldGc = gc; if (gc.getName().equals("G1 Young Generation")) youngGc = gc; } if (oldGc == null) { System.out.println("G1 oldGc = null"); return false; } if (youngGc == null) { System.out.println("G1 youngGc = null"); return false; } long oldGcCount = oldGc.getCollectionCount(); System.out.println("Running test"); Object[][] arrays = new Object[nArrays][]; for (int i = 0; i < nArrays; i++) { arrays[i] = new Object[arraySize]; } Object[][] arrayWithCardMinus1 = new Object[1][]; arrayWithCardMinus1[0] = findArray(arrays, true, true); Object[] otherArray = findArray(arrays, false, false); if (arrayWithCardMinus1[0] == null) { System.out.println("Array with card -1 not found. Trying again."); return true; } else { System.out.println("Found an array with card -1."); } if (otherArray == null) { System.out.println("otherArray = null."); return true; } System.out.println("Modifying the last card in the array..."); final int modifyEntries = 10; final long youngGcCount = youngGc.getCollectionCount(); for (int i = 0; i < modifyEntries; i++) { if (oldGc.getCollectionCount() > oldGcCount) { System.out.println("Old gc done. Trying again."); return true; } final int arrayIndex = arraySize - i - 1; arrayWithCardMinus1[0][arrayIndex] = new Object[]{new byte[byteArraySize]}; final int card = getCard32Bit(getAddress(arrayWithCardMinus1[0]) + 4 * arrayIndex); System.out.println(arrayIndex + " " + String.format("0x%016x", getAddress(arrayWithCardMinus1[0]) + 16 + 4 * arrayIndex) + " set to " + String.format("0x%016x", getAddress(arrayWithCardMinus1[0][arrayIndex])) + ", card: " + card); if (card != -1) { arrayWithCardMinus1[0] = findArray(arrays, true, true); if (arrayWithCardMinus1[0] == null) { System.out.println("Array with card -1 not found. Trying again."); return true; } } try { Thread.sleep(100); } catch(Exception e) {} } System.out.println("Producting garbage to run gc."); final long t1 = System.currentTimeMillis(); oldGcCount = oldGc.getCollectionCount(); while (System.currentTimeMillis() - t1 < 60000) { if (oldGc.getCollectionCount() > oldGcCount) { System.out.println("Old gc done. Trying again."); return true; } otherArray[0] = new byte[30000]; otherArray[0] = null; try { Thread.sleep(100); } catch(Exception e) {} } System.out.println("The crash didn't reproduce. Trying again."); return true; } private static Object[] findArray(Object[][] arrays, boolean withCardMinus1, boolean printAddresses) { for (int i = 0; i < arrays.length; i++) { if (arrays[i] == null) continue; final long startAddress = getAddress(arrays[i]); final long lastCardAddress = startAddress + 4 * arrays[i].length; final int card = getCard32Bit(lastCardAddress); if (printAddresses) { System.out.println("array " + i + ", startAddress: " + String.format("0x%016x", startAddress) + ", lastCardAddress: " + String.format("0x%016x", lastCardAddress) + ", card: " + card); } if ((withCardMinus1 && card == -1) || (!withCardMinus1 && card != -1)) { Object[] foundArray = arrays[i]; arrays[i] = null; return foundArray; } } return null; } // Unsafe is only used for reading object addresses. private static final Unsafe UNSAFE; static { try { Field e = Unsafe.class.getDeclaredField("theUnsafe"); e.setAccessible(true); UNSAFE = (Unsafe)e.get((Object)null); } catch (IllegalAccessException | NoSuchFieldException var1) { throw new Error(var1); } } static Object[] addrArray = new Object[]{null}; static final long baseOffset = UNSAFE.arrayBaseOffset(Object[].class); static final int pageSize = UNSAFE.pageSize(); public static long getAddress(Object o) { addrArray[0] = o; final long address = (UNSAFE.getInt(addrArray, baseOffset) << 3) + heapBase - pageSize + 1024*1024L; addrArray[0] = null; return address; } public static int getCard32Bit(final long address) { final int card = (int)(address >> 9); return card; } } ---------- END SOURCE ----------
|