JDK-8202273 : [AOT] Graal does not support the CMS collector
Type:Bug
Component:hotspot
Sub-Component:compiler
Affected Version:11
Priority:P2
Status:Closed
Resolution:Fixed
Submitted:2018-04-25
Updated:2018-08-16
Resolved:2018-04-27
About 60 tests are failing in the nightly with: Exception: org.graalvm.compiler.debug.GraalError: Graal does not support the CMS collector
Comments
Yes, we could.
02-05-2018
[~kvn] maybe it would make sense to run hs-tier3 for the "Update Graal" issues?
02-05-2018
We do our usual hs-tier, hs-tier2 testing and hs-precheckin-comp.js (-Xcomp).
We do not run Tier3 or later.
27-04-2018
Added link to Graal update (JDK-8199755) that Vladimir identified as introducing the test failures.
Update: I would normally add the 'regression' label because we went
from passing tests to failing tests, but it seems that this might simply
be a case of mismatched expectations.
[~jwilhelm] - I'll leave adding the 'regression' label up to you.
27-04-2018
ILW = Tests fail because AOT/Graal does not support CMS collector, AOT and JVMCI tests, no workaround = MHH = P2
27-04-2018
open part:
http://cr.openjdk.java.net/~kvn/8202273/webrev.00/
27-04-2018
In hs-tier3 and hs-tier6, which run with CMS, all 53 AOT tests failed. There are a few unrelated failures in jdk_management.
26-04-2018
This bug fix only excludes the compiler/aot and compiler/jvmci tests when run with CMS.
26-04-2018
The Graal issue will be fixed by the JDK-8184349 changes, which switch off JVMCI for GCs that it does not support.
26-04-2018
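The idea of switching a feature off when an unsupported collector is active can be illustrated from the Java side. The following is only a minimal sketch and is not how JDK-8184349 implements it (that change lives inside the VM). The class name `CmsGuardSketch` and the method `graalAllowed` are hypothetical. The sketch relies on the standard `java.lang.management` API and on the fact that HotSpot names the CMS old-generation collector bean "ConcurrentMarkSweep".

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, for illustration only.
final class CmsGuardSketch {
    // Name HotSpot reports for the CMS old-generation collector bean.
    static final String CMS_BEAN_NAME = "ConcurrentMarkSweep";

    // True if the given collector names include the CMS old-generation collector.
    static boolean usesCms(List<String> collectorNames) {
        return collectorNames.contains(CMS_BEAN_NAME);
    }

    // Hypothetical policy: allow the feature only when CMS is not in use.
    static boolean graalAllowed(List<String> collectorNames) {
        return !usesCms(collectorNames);
    }

    // Names of the collectors active in the running VM.
    static List<String> activeCollectorNames() {
        return ManagementFactory.getGarbageCollectorMXBeans().stream()
                .map(GarbageCollectorMXBean::getName)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = activeCollectorNames();
        System.out.println("Collectors: " + names);
        System.out.println("Feature allowed: " + graalAllowed(names));
    }
}
```

Passing the collector names in as a list keeps the policy check independent of the VM it runs on, so it can be exercised with names such as "ParNew" and "ConcurrentMarkSweep" even on a JDK that no longer ships CMS.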
Please problemlist the tests or fix this issue today, or back out JDK-8199755. These failures pollute the nightlies and make it difficult to see if there are other "real" failures.
26-04-2018
I wasn't quite sure what problems we still had outstanding with CMS when it was disabled. We identified a problem with vector operations missing card marks that wasn't showing up in the default collector a while back and I kind of assumed that it was the cause of the last CMS failures we were seeing. Do we actually have outstanding CMS issues that haven't been fixed?
25-04-2018
I don't think normal allocations from a TLAB are any different. CMS uses ParNew as a young gen collector, which is fairly standard. What's tricky about CMS is how the objects are formed in the old gen. If marking is in progress, the objects have to be concurrently parsable. Maybe that was the problem?
G1 doesn't have anything like that, since it won't mark through objects that were allocated after the marking has started (it's SATB). ZGC is substantially trickier in how it deals with references in general and is also single-generation.
25-04-2018
Yes, that would be preferable.
25-04-2018
Okay, maybe it is simpler to change our testing definitions to not run AOT and Graal with CMS.
Note, AOT has never supported CMS and there are no plans for it to do so.
25-04-2018
I cannot reverse the change without investigating and fixing Graal support for CMS. From [~never]:
"If I had to guess what's wrong I'd suspect that our group allocation doesn't play well with concurrent scanning. Doesn't our group allocation code assume that the space we're filling in comes from eden so it doesn't need any card marks? I think CMS will perform free list allocation which would violate this assumption. We probably need to write dirty cards for the whole allocation to make sure it's rescanned properly. I also suspect that our carving up of a single allocation into multiple objects also doesn't play well with concurrent scanning. Long ago I'd discussed group allocation with a CMS guy who is no longer here and the conclusion was that you'd need to make it look like a primitive array first, then fill everything in apart from the klass of the first object and then write the first klass and dirty all the cards to get it rescanned. Is that something we'll need to worry about with G1 or ZGC?"
[~never], [~gdub], if the above assumption about group/bulk allocation playing badly with CMS is true, then could we simply disable this optimization if CMS is enabled?
25-04-2018
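The suggestion in the quoted comment to "write dirty cards for the whole allocation" can be pictured with a toy card table. This is a simplified model and not HotSpot or Graal code. The class `CardTableSketch` and its methods are hypothetical, and the 512-byte card size is an assumption taken from HotSpot's usual default. Addresses are plain integer offsets into an imaginary heap.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a card table; for illustration only.
final class CardTableSketch {
    // Assumed card size of 512 bytes (shift of 9).
    static final int CARD_SHIFT = 9;

    private final boolean[] dirty;

    CardTableSketch(int heapBytes) {
        // One card per 512-byte chunk, rounding up.
        dirty = new boolean[(heapBytes + (1 << CARD_SHIFT) - 1) >>> CARD_SHIFT];
    }

    // Dirty every card that overlaps the allocation [start, start + size).
    void dirtyRange(int start, int size) {
        if (size <= 0) {
            return;
        }
        int first = start >>> CARD_SHIFT;
        int last = (start + size - 1) >>> CARD_SHIFT;
        for (int card = first; card <= last; card++) {
            dirty[card] = true;
        }
    }

    // Indices of all dirty cards, in ascending order.
    List<Integer> dirtyCards() {
        List<Integer> result = new ArrayList<>();
        for (int card = 0; card < dirty.length; card++) {
            if (dirty[card]) {
                result.add(card);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        CardTableSketch table = new CardTableSketch(4096);
        // A group allocation of 1100 bytes starting at offset 1000.
        table.dirtyRange(1000, 1100);
        System.out.println(table.dirtyCards()); // prints [1, 2, 3, 4]
    }
}
```

In this model a single 1100-byte group allocation spans four cards, so all four would need to be dirtied for a concurrent collector to rescan every object carved out of it.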
[~dnsimon] Doug, can you revert the change that disables CMS support?
25-04-2018
The problem is caused by the recent Graal update JDK-8199755, which included the following change:
[GR-6791] Prevent use of Graal with CMS collector.
25-04-2018
Please note that Oracle does still support CMS.
25-04-2018
The failed tests are AOT tests, which use Graal. We should exclude AOT tests and Graal as JIT when running with CMS.
25-04-2018
Note, these are regular hotspot compiler tests which started to fail with CMS.
Are there any recent changes which caused this?
25-04-2018
Since Oracle does not support CMS, why are we testing with it?