gunter.haug@sap.com reported the following crash:

Native frames: (J=compiled Java code, j=interpreted, V=VM code (C/C++), v=VM code (generated), C=native code)
V [libjvm.so+0xffffffff] CardTableModRefBS::process_chunk_boundaries(Space*,DirtyCardToOopClosure*,MemRegion,MemRegion,signed char**,unsigned long,unsigned long)+0xd4 (sp=0x000000011089eb40) (pc=0x0900000293b83154)
V [libjvm.so+0xffffffff] CardTableModRefBS::process_stride(Space*,MemRegion,int,int,OopsInGenClosure*,CardTableRS*,signed char**,unsigned long,unsigned long)+0x1a0 (sp=0x000000011089ec30) (pc=0x0900000293b82f20)
V [libjvm.so+0xffffffff] CardTableModRefBS::non_clean_card_iterate_parallel_work(Space*,MemRegion,OopsInGenClosure*,CardTableRS*,int)+0xe4 (sp=0x000000011089ed70) (pc=0x0900000293b827e4)
V [libjvm.so+0xffffffff] CardTableModRefBS::non_clean_card_iterate_possibly_parallel(Space*,MemRegion,OopsInGenClosure*,CardTableRS*)+0x54 (sp=0x000000011089ee60) (pc=0x0900000293b80db4)
V [libjvm.so+0xffffffff] CardTableRS::younger_refs_in_space_iterate(Space*,OopsInGenClosure*)+0x80 (sp=0x000000011089ef20) (pc=0x0900000293b892e0)
V [libjvm.so+0xffffffff] Generation::younger_refs_in_space_iterate(Space*,OopsInGenClosure*)+0x3c (sp=0x000000011089efc0) (pc=0x0900000293a306bc)
V [libjvm.so+0xffffffff] ConcurrentMarkSweepGeneration::younger_refs_iterate(OopsInGenClosure*)+0x4c (sp=0x000000011089f030) (pc=0x09000002933c956c)
V [libjvm.so+0xffffffff] CardTableRS::younger_refs_iterate(Generation*,OopsInGenClosure*)+0x4c (sp=0x000000011089f0b0) (pc=0x0900000293b88d2c)
V [libjvm.so+0xffffffff] GenCollectedHeap::gen_process_roots(int,bool,bool,SharedHeap::ScanningOption,bool,OopsInGenClosure*,OopsInGenClosure*,CLDClosure*)+0x19c (sp=0x000000011089f120) (pc=0x090000029341795c)
V [libjvm.so+0xffffffff] ParNewGenTask::work(unsigned int)+0x1c8 (sp=0x000000011089f230) (pc=0x0900000293a3a6c8)
V [libjvm.so+0xffffffff] GangWorker::loop()+0x164 (sp=0x000000011089f3e0) (pc=0x0900000293b850c4)
V [libjvm.so+0xffffffff] GangWorker::run()+0x58 (sp=0x000000011089f4c0) (pc=0x0900000293b84eb8)
V [libjvm.so+0xffffffff] java_start(Thread*)+0x1b8 (sp=0x000000011089f540) (pc=0x0900000292fd2f38)
C [libpthread.a+0xffffffff] _pthread_body+0xec (sp=0x000000011089f790) (pc=0x0900000000520fec)

This crash has occurred sporadically for several years, but only on non-TSO platforms.

- It only happens in opt builds.
- Analysis of the assembly code showed the actual crash site to be an array store through a pointer that is passed as an argument to process_chunk_boundaries.
- That pointer is calculated in CardTableModRefBS::get_LNC_array_for_space.
- CardTableModRefBS::get_LNC_array_for_space does not enforce memory ordering on accesses to _last_LNC_resizing_collection[i], so a pointer to a not-yet-initialized structure can become visible to other threads.

Solution: Use OrderAccess::load_acquire and OrderAccess::release_store for accessing _last_LNC_resizing_collection[i].
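
The publication pattern behind the proposed solution can be sketched as follows. This is a minimal illustration using standard C++ atomics, not the HotSpot change itself: std::memory_order_release and std::memory_order_acquire stand in for OrderAccess::release_store and OrderAccess::load_acquire, and the names ChunkArray, Slot, publish and try_get are hypothetical and do not appear in the JVM sources. The real code also involves locking and resizing logic that is omitted here.

```cpp
#include <atomic>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the per-space chunk array (not a HotSpot type).
struct ChunkArray {
    std::vector<int> cards;
};

// Hypothetical slot: a payload pointer guarded by a "valid for collection N" flag.
// The flag plays the role that _last_LNC_resizing_collection[i] plays in the report.
struct Slot {
    ChunkArray* array = nullptr;      // payload, fully written before the flag
    std::atomic<int> resized_in{-1};  // collection id for which `array` is valid
};

// Writer side: initialize the payload completely, then publish the flag
// with release semantics so the earlier stores cannot be reordered after it.
void publish(Slot& slot, int collection, std::size_t n) {
    slot.array = new ChunkArray{std::vector<int>(n, 0)};
    slot.resized_in.store(collection, std::memory_order_release);
}

// Reader side: load the flag with acquire semantics. If it matches, every
// store the writer made before the release store is guaranteed to be visible.
ChunkArray* try_get(Slot& slot, int collection) {
    if (slot.resized_in.load(std::memory_order_acquire) == collection) {
        return slot.array;
    }
    return nullptr;  // not published for this collection yet
}
```

With plain loads and stores in place of the acquire/release pair, a weakly ordered CPU may make the flag update visible before the payload, which matches the symptom described above. On TSO hardware stores become visible in program order, which would explain why the crash is not seen there.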