JDK-8166207 : Use of Copy::conjoint_oops_atomic in global mark stack causes crashes on arm64
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: 9
  • Priority: P2
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2016-09-16
  • Updated: 2021-09-20
  • Resolved: 2016-09-19
JDK 9
9 b139: Fixed
Description
hotspot/src/share/vm/oops/klass.inline.hpp:63), pid=18062, tid=18087
assert(!is_null(v)) failed: narrow klass value can never be zero

Current thread (0x0000007f9028f000):  VMThread "VM Thread" [stack: 0x0000007f29343000,0x0000007f29443000] [id=18087]

Stack: [0x0000007f29343000,0x0000007f29443000],  sp=0x0000007f2943fbe0,  free space=1010k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x10c4ea4]  VMError::report_and_die(int, char const*, char const*, std::__va_list, Thread*, unsigned char*, void*, void*, char const*, int, unsigned long)+0x140
V  [libjvm.so+0x10c5aac]  VMError::report_and_die(Thread*, char const*, int, char const*, char const*, std::__va_list)+0x54
V  [libjvm.so+0x6ba3b0]  report_vm_error(char const*, int, char const*, char const*, ...)+0xe0
V  [libjvm.so+0xcd7dac]  MarkSweep::mark_object(oop)+0x448
V  [libjvm.so+0xcd8978]  void MarkSweep::mark_and_push<unsigned int>(unsigned int*)+0x114
V  [libjvm.so+0xcdbca8]  void InstanceRefKlass::oop_oop_iterate_ref_processing<true, MarkAndPushClosure>(oop, MarkAndPushClosure*)+0x6b0
V  [libjvm.so+0xcdbe68]  void InstanceRefKlass::oop_oop_iterate<true, MarkAndPushClosure>(oop, MarkAndPushClosure*)+0x18c
V  [libjvm.so+0xcd5b4c]  InstanceRefKlass::oop_oop_iterate_nv(oop, MarkAndPushClosure*)+0x34
V  [libjvm.so+0xcd2490]  MarkSweep::follow_stack()+0x3ec
V  [libjvm.so+0xcd3798]  MarkSweep::FollowRootClosure::do_oop(oop*)+0x134
V  [libjvm.so+0xdda2c0]  OopMapSet::all_do(frame const*, RegisterMap const*, OopClosure*, void (*)(oop*, oop*), OopClosure*)+0x374
V  [libjvm.so+0x784174]  frame::oops_code_blob_do(OopClosure*, CodeBlobClosure*, RegisterMap const*)+0x44
V  [libjvm.so+0x102fa0c]  JavaThread::oops_do(OopClosure*, CodeBlobClosure*)+0x1dc
V  [libjvm.so+0x1038f7c]  Threads::possibly_parallel_oops_do(bool, OopClosure*, CodeBlobClosure*)+0x58
V  [libjvm.so+0x8312bc]  G1RootProcessor::process_strong_roots(OopClosure*, CLDClosure*, CodeBlobClosure*)+0x90
V  [libjvm.so+0x7f3010]  G1MarkSweep::mark_sweep_phase1(bool&, bool)+0x3a8
V  [libjvm.so+0x7f6658]  G1MarkSweep::invoke_at_safepoint(ReferenceProcessor*, bool)+0x160
V  [libjvm.so+0x7b5944]  G1CollectedHeap::do_full_collection(bool, bool)+0x6b4
V  [libjvm.so+0x7b6994]  G1CollectedHeap::satisfy_failed_allocation_helper(unsigned long, unsigned char, bool, bool, bool, bool*)+0x12c
V  [libjvm.so+0x7b6abc]  G1CollectedHeap::satisfy_failed_allocation(unsigned long, unsigned char, bool*)+0x114
V  [libjvm.so+0x110847c]  VM_G1CollectForAllocation::doit()+0xcc
V  [libjvm.so+0x11066c0]  VM_Operation::evaluate()+0xb0
V  [libjvm.so+0x11025b8]  VMThread::evaluate_operation(VM_Operation*)+0x134
V  [libjvm.so+0x1102ffc]  VMThread::loop()+0x500
V  [libjvm.so+0x1103294]  VMThread::run()+0xd4
V  [libjvm.so+0xdfb408]  thread_native_entry(Thread*)+0x118
C  [libpthread.so.0+0x7e48]  start_thread+0xac

Comments
URL: http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9f7f714bf3e8 User: lana Date: 2016-10-05 20:01:40 +0000
05-10-2016

URL: http://hg.openjdk.java.net/jdk9/hs/hotspot/rev/9f7f714bf3e8 User: tschatzl Date: 2016-09-19 23:12:34 +0000
19-09-2016

It would work. There is no non-atomic memory copy routine though, at least none with a similar name. I will fix this with JDK-8166314, when fixing Copy::conjoint_oops_atomic for 64-bit ARM.
19-09-2016

I saw on hotspot-gc-dev that the proposed fix is conjoint_memory_atomic, which seems fine, but in this case wouldn't a non-atomic copy also work?
19-09-2016

I think I found the problem. This is the first time conjoint_oops_atomic(oop* from, oop* to, size_t count) has been used to copy oops outside the heap. There appears to be a bug in the arm64 and aarch64 ports when UseCompressedOops is true: arm64 copies narrowOops instead of full-width oops, and aarch64 asserts.
18-09-2016
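
To illustrate the width mismatch described in the comment above, here is a minimal, hypothetical C++ sketch. It does not use HotSpot code: FullOop, NarrowOop, copy_elements, and slots_copied_correctly are invented names, and the copy is a plain memmove rather than an atomic copy. It only shows that copying pointer-sized slots with a routine that assumes 32-bit elements moves half of the bytes, leaving the remaining destination slots untouched.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-ins for HotSpot's reference types; not the real definitions.
using FullOop   = std::uint64_t;  // an uncompressed, pointer-sized reference
using NarrowOop = std::uint32_t;  // a compressed, 32-bit reference

// Copies `count` elements of type ElemT from `from` to `to`.
// memmove tolerates overlapping ("conjoint") regions. This sketch is NOT
// atomic; only the number of bytes moved matters for the illustration.
template <typename ElemT>
void copy_elements(const void* from, void* to, std::size_t count) {
    std::memmove(to, from, count * sizeof(ElemT));
}

// Fills `count` pointer-sized source slots with distinct non-zero values,
// copies them with a routine that assumes elements of type ElemT, and
// returns how many leading destination slots match the source.
template <typename ElemT>
std::size_t slots_copied_correctly(std::size_t count) {
    std::vector<FullOop> src(count);
    std::vector<FullOop> dst(count, 0);
    for (std::size_t i = 0; i < count; ++i) {
        src[i] = 0x0000007f00000000ULL + 8 * (i + 1);
    }
    copy_elements<ElemT>(src.data(), dst.data(), count);

    std::size_t ok = 0;
    while (ok < count && dst[ok] == src[ok]) {
        ++ok;
    }
    return ok;
}
```

With eight slots, assert(slots_copied_correctly<FullOop>(8) == 8) holds, while the narrow-element variant satisfies assert(slots_copied_correctly<NarrowOop>(8) == 4): the last four slots keep their initial zero value. Whether the crash in this report follows exactly this pattern is an assumption; the sketch only models the element-size confusion.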

Could this be related to JDK-8159422 or JDK-8165808?
16-09-2016

Reassigning to compilers. The failure occurs during a full collection that uses the serial Mark-Sweep. 8158927 also involves the serial Mark-Sweep GC: there, the compiler-generated oop maps are being scanned and a bad oop is encountered. This looks like a similar situation, where GC is the victim of a bad oop map. 8158927 was closed as a duplicate of https://bugs.openjdk.java.net/browse/INTJDK-7623978, which indicates a hardware problem on some ARM machines. This failure is on ARM, but not necessarily on an Odroid board (I don't know how to tell). It occurred on emb-sca-apm-xgene-28: Unknown AArch64 Processor rev 0 (aarch64), 0 MHz, 8 cores, 16G, Linux / Ubuntu 14.04, aarch64. Though the root cause of 8158927 was a hardware problem and the root cause here may be different, I still think that the problem is bad oop maps and not a GC bug. Since the oop maps are generated by the compilers, I am assigning it to compilers.
16-09-2016