JDK-8214285 : ZGC: SoftReference problem in SPECjbb2015
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: 12
  • Priority: P2
  • Status: Resolved
  • Resolution: Duplicate
  • OS: linux
  • CPU: x86
  • Submitted: 2018-11-26
  • Updated: 2019-01-14
  • Resolved: 2019-01-14
JDK 12: 12 (Resolved)
Related Reports
Duplicate :  
Relates :  
Description
SPECjbb2015 occasionally hangs when shutting down. This looks like the same issue that was encountered 1-2 years ago: soft references appear to be cleared incorrectly.

Pure build of JDK 12 using flags:

-XX:CICompilerCount=8 -XX:+UnlockExperimentalVMOptions -XX:+UseZGC -Xmx16G -Xms16G -XX:ParallelGCThreads=8 -XX:ConcGCThreads=5 -XX:-CreateCoredumpOnCrash -XX:+PrintCompilation

This has only been observed on a Skylake machine so far.

Dump of problematic thread:

"Group1.Backend.CompositeBackend{Tier1}.6" #2941 daemon prio=10 os_prio=0 cpu=10472318,86ms elapsed=11949,97s tid=0x00007f667c29a7c0 nid=0x3269b runnable  [0x00007f65dbffd000]
   java.lang.Thread.State: RUNNABLE
        at java.util.concurrent.ConcurrentHashMap.putVal(java.base/ConcurrentHashMap.java:1012)
        at java.util.concurrent.ConcurrentHashMap.putIfAbsent(java.base/ConcurrentHashMap.java:1541)
        at org.spec.jbb.core.locks.DelegateLockManager.getLock(DelegateLockManager.java:61)
        at org.spec.jbb.hq.db.Storage.getCustomerBalance(Storage.java:343)
        at org.spec.jbb.hq.HQ.reserveCredit(HQ.java:311)
        at org.spec.jbb.hq.tx.CheckAndReserveCreditTransaction.execute(CheckAndReserveCreditTransaction.java:33)
        at org.spec.jbb.core.tx.SimpleTransactionExecutor.execute(SimpleTransactionExecutor.java:37)
        at org.spec.jbb.hq.HQ.execute(HQ.java:147)
        at org.spec.jbb.core.threadpools.AbstractPool.processOne(AbstractPool.java:71)
        at org.spec.jbb.core.threadpools.AbstractPool.process(AbstractPool.java:59)
        at org.spec.jbb.core.threadpools.AbstractPool.processLocally(AbstractPool.java:135)
        at org.spec.jbb.core.threadpools.EmptyMultiPool.processLocally(EmptyMultiPool.java:60)
        at org.spec.jbb.core.executor.AbstractBatchExecutor.handle(AbstractBatchExecutor.java:66)
        at org.spec.jbb.core.comm.Interconnect$Downlink.acceptRequest(Interconnect.java:1152)
        at org.spec.jbb.core.comm.Interconnect$Downlink.accept(Interconnect.java:1104)
        at org.spec.jbb.core.comm.Interconnect.forward(Interconnect.java:314)
        at org.spec.jbb.core.comm.Interconnect.access$700(Interconnect.java:96)
        at org.spec.jbb.core.comm.Interconnect$UplinkImpl.packetSend(Interconnect.java:951)
        at org.spec.jbb.core.comm.Interconnect$UplinkImpl.sendRequest(Interconnect.java:722)
        at org.spec.jbb.core.comm.Interconnect$UplinkImpl.sendRequest(Interconnect.java:816)
        at org.spec.jbb.core.tx.TransactionContext.sendRequest(TransactionContext.java:30)
        at org.spec.jbb.sm.tx.InStorePurchaseTransaction.doCheckout(InStorePurchaseTransaction.java:136)
        at org.spec.jbb.sm.tx.AbstractPurchaseTransaction.execute(AbstractPurchaseTransaction.java:63)
        at org.spec.jbb.core.tx.SimpleTransactionExecutor.execute(SimpleTransactionExecutor.java:37)
        at org.spec.jbb.sm.SM.execute(SM.java:303)
        at org.spec.jbb.core.threadpools.AbstractPool.processOne(AbstractPool.java:71)
        at org.spec.jbb.core.threadpools.ForkJoinBatchTask.exec(ForkJoinBatchTask.java:87)
        at java.util.concurrent.ForkJoinTask.doExec(java.base/ForkJoinTask.java:290)
        at java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(java.base/ForkJoinPool.java:1020)
        at java.util.concurrent.ForkJoinPool.scan(java.base/ForkJoinPool.java:1656)
        at java.util.concurrent.ForkJoinPool.runWorker(java.base/ForkJoinPool.java:1594)
        at java.util.concurrent.ForkJoinWorkerThread.run(java.base/ForkJoinWorkerThread.java:177)

Comments
I have run almost 200 iterations without a failure. The repro rate was ~1/15, so I feel confident that the problem was fixed by JDK-8215708 "ZGC: Add missing LoadBarrierNode::size_of()".
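
As a rough check on that confidence, the chance of seeing N clean runs in a row if the bug were still present can be sketched as below. This assumes the runs are independent and that the failure rate stays constant at about 1/15. Neither assumption is stated in this report, and the class and method names are hypothetical.

```java
// Sketch only: assumes independent runs and a constant per-run failure rate.
class ReproConfidence {
    /** Probability that `runs` consecutive runs all pass given a per-run failure rate. */
    static double probAllPass(double failureRate, int runs) {
        return Math.pow(1.0 - failureRate, runs);
    }

    public static void main(String[] args) {
        double rate = 1.0 / 15.0; // observed repro rate from the comment above
        // 200 clean runs: on the order of one in a million if the bug were still present
        System.out.printf("200 runs: %.3e%n", probAllPass(rate, 200));
        // 50 clean runs: roughly 3%, noticeably weaker evidence
        System.out.printf(" 50 runs: %.3e%n", probAllPass(rate, 50));
    }
}
```

Under these assumptions, 200 clean runs would be very unlikely with the bug still present, while 50 clean runs leave a few percent of doubt.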
14-01-2019

We don't have any other way of reproducing it at this time. This has only ever been seen running SPECjbb2015 on a specific machine. We've done hundreds of runs on other machines (including a machine with a spec identical to the one this has failed on) without being able to reproduce it, which is why we've started to think it might be a hardware problem. When catching this in a debugger, the app is stuck polling a ReferenceQueue, expecting a specific SoftReference to appear. However, when looking at the SoftReference in question, we can see that it has already passed through the ReferenceQueue (i.e. it has already been polled and is now inactive), but was somehow "missed" by the thread polling the queue.
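
For readers unfamiliar with the pattern, here is a minimal sketch of a thread waiting for one specific SoftReference on a ReferenceQueue. It is not the SPECjbb2015 code. The class and method names are hypothetical, and `ref.enqueue()` stands in for the GC enqueueing the reference. It only illustrates why a reference that has already been removed from the queue never shows up again for a later waiter.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;

// Sketch only: hypothetical names, not taken from SPECjbb2015 or HotSpot.
class SoftRefQueuePolling {
    /** Waits up to timeoutMillis for `expected` to be removed from `queue`. */
    static boolean awaitReference(ReferenceQueue<Object> queue,
                                  Reference<?> expected,
                                  long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        try {
            while (true) {
                long remaining = (deadline - System.nanoTime()) / 1_000_000L;
                if (remaining <= 0) {
                    return false; // gave up: the reference never appeared
                }
                // remove(timeout) blocks until a reference is available or the timeout expires
                Reference<?> polled = queue.remove(remaining);
                if (polled == expected) {
                    return true;
                }
                // null (timeout) or a different reference: keep waiting
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        SoftReference<Object> ref = new SoftReference<>(new Object(), queue);

        ref.enqueue(); // stand-in for the GC enqueueing the reference

        // First waiter sees the reference and removes it from the queue.
        System.out.println(awaitReference(queue, ref, 100));
        // A later waiter never sees it again: it has already passed through the queue.
        System.out.println(awaitReference(queue, ref, 100));
    }
}
```

A waiter without a timeout would spin or block indefinitely in the second case, which matches the kind of hang described in the comment above.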
15-12-2018

Is it possible to get a minimal testcase for this issue, one that doesn't require SPECjbb2015? As SPECjbb2015 appears to be a purchase-only product, it is difficult for people to pitch in on this.
14-12-2018

I have tried to reproduce on a second machine of the same type but haven't succeeded. This indicates that the first machine might be the problem.
04-12-2018

I have now reproduced it with the JDK 11 GA build too, so this is not a regression in 12.
29-11-2018

50 iterations of SPECjbb2015 passed on a JDK 11 build. This indicates that the problem is in 12 but not in JDK 11 GA. Considering the low repro rate, the confidence is low to medium.
28-11-2018