JDK-6920977 : G1: guarantee(k == probe->klass(),"klass should be in dictionary") fails
  • Type: Bug
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: hs17
  • Priority: P3
  • Status: Closed
  • Resolution: Fixed
  • OS: generic
  • CPU: generic
  • Submitted: 2010-01-28
  • Updated: 2012-10-08
  • Resolved: 2010-03-02
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 6 JDK 7 Other
6u21Fixed 7Fixed hs17Fixed
Description
While testing G1 I hit the following guarantee failure:

Internal Error at loaderConstraints.cpp:500, pid=21053, tid=15 
Error: guarantee(k == probe->klass(),"klass should be in dictionary")

I can reproduce it reasonably easily (within maybe an hour) on my OpenSolaris workstation (intel quadcore) with a solaris / 32-bit / fastdebug build.

Comments
EVALUATION ChangeSet=http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/38836cf1d8d2,ChangeRequest=6920977
10-02-2010

SUGGESTED FIX So, it turns out that if the entry is in the placeholders table, the instanceKlass on the placeholder entry might not have been installed yet. So, we'll split the check into two: - if the look up for the class in the system dictionary is successful, we'll leave the k == probe->klass() guarantee as is - if the look up for the class in the system dictionary fails, we'll only check that the class appears in the placeholders table (as we cannot k from the placeholde entry).
05-02-2010

EVALUATION Also, an e-mail from Karen: So - to the best of my understanding you were actually correct - the guarantee should be changed from find_class to find_class_or_placeholder. Sorry I didn't have this clearly thought out before. Here is the logic: In define_instance_class the order of operations is tricky and deliberate: 1. check_constraints: which both checks the constraints, and updates the constraint table with the new instanceKlass 2. javaCall to loader_addClass_method 3. update_dictionary: which adds the new instanceKlass with the defining loader 4. remove from placeholder table There is a comment that check_constraints and update_dictionary need to be "atomic" with respect to each other. You have to check_constraints before you update the java class loader table. You can't hold a lock around the call out to java. You can't update the dictionary until all checks are done, because once an instanceKlass is in the dictionary it is globally visible. So if you have a gc between the check_constraints and update_dictionary - you will see the situation you have today. And you've caught it in define_instance_class because you have a placeholder entry with a definer set. I suspect that I should think through a bit more how to recover from an OutOfMemoryError in the call to loader_addClass_method. Right now we crash so it is not critical. If MVM lets you keep running we will end up with an entry in the loader_constraint table referring to an instanceKlass that is not in the dictionary, and a left over placeholder table entry, so I should add a comment to the code that when we recover from OutOfMemoryErrors that should be cleaned up.
01-02-2010

WORK AROUND Well, there's no "easy" workaround. Given that this is a bug in the verification code, not running with verification turned on is the only way not to trip over it.
01-02-2010

SUGGESTED FIX Change the guarantee to also check the placeholders table, if find_class() returns NULL. I'll also add an explicit check to check that k != NULL, to catch that condition separately.
01-02-2010

EVALUATION The issue seems to be that the klass oop can be installed in the placeholders table, while a class is being defined, and before it is added to the system dictionary. So, find_class() on the dictionary is not guaranteed to find the klassOop in the system dictionary, when said klassOop is well formed and appearing in the loader constraints table. So, this guarantee: klassOop k = dictionary->find_class(d_index, d_hash, name, loader); guarantee(k == probe->klass(), "klass should be in dictionary"); is too strict, as the klassOop might not yet be appearing in the system dictionary.
01-02-2010

EVALUATION I've been looking at this with help from Karen (thanks!) and I've got it to fail with a lot of extra instrumentation. This is what we know so far. The guarantee fails while the loader constraints table is being verified. Verification is initiated from Universe::verify(), which is called because +VerifyBeforeGC was set. So, the issue is found at the beginning of a G1 pause. In all the failures I've seen, the class in question always seems to be 'java/lang/SecurityManager'. The reason the guarantee fails is that when we do a lookup in the system dictionary to find the class that's in the probe, we don't find it and we get back NULL: k = 0x00000000 whereas what we're looking for clearly is not: probe->klass() = 0xf4f93b00 probe->name() = 0xf4a0fad8 = java/lang/SecurityManager probe->loaders() sun/misc/Launcher$AppClassLoader <bootloader> The class loader of ik (ik = instanceKlass::cast(probe->klass())) seems to be the boot loader: ik->class_loader() = <bootloader> The first two times I saw this it happend at the beginning of the first pause after a cleanup pause, so I naturally thought they would be related. However, in the last two failures, it happens two pauses after a cleanup pause. So, maybe, thinking that the cleanup pause is somehow to blame might be a red herring and it could simply be due to timing. However, I had added instrumentation to dump the contents of the system dictionary before and after the cleanup pause and enabled verification during GC to also verify that cleanup didn't mess anything up. There are a few interesting observations that we can make by looking at the contents of the system dictionary at that point: - 'java/lang/Security' does not seem to be in the system dictionary - 'java/lang/Security' does appear in the loader constraints table: 3: Symbol: 'java/lang/SecurityManager' , loaders:a 'sun/misc/Launcher$AppClassLoader', NULL, - verification passes I print the contents of the system dictionary when we reach the point of the failure too. The scenario is as above, with a minor difference: - 'java/land/Security' does not seem to be in the system dictionary - 'java/lang/Security' does appear in the loader constraints table: 3: Symbol: 'java/lang/SecurityManager' , loaders:a 'sun/misc/Launcher$AppClassLoader', NULL, - 'java/lang/SecurityManager' also now appears in the placeholders table: place holder 'java/lang/SecurityManager', supername 'java/lang/Object', definer AllocatedObj(0x099f1c00) - verification now fails
28-01-2010