JDK-8149727 : intermittent NPE in Nashorn nightly builds
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 9
  • Priority: P2
  • Status: Closed
  • Resolution: Duplicate
  • Submitted: 2016-02-12
  • Updated: 2016-02-24
  • Resolved: 2016-02-24
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 9
9Resolved
Related Reports
Duplicate :  
Description
Since Feb 8, 2016, the Nashorn nightly builds suffer from an intermittent NPE problem occurring in test262parallel execution. The error manifests as follows:

     [java] java.lang.AssertionError: Failed generating bytecode for test/script/external/test262/test/harness/sta.js:-1
     [java]     at jdk.nashorn.internal.codegen.CompilationPhase$BytecodeGenerationPhase.transform(CompilationPhase.java:436)
     [java]     at jdk.nashorn.internal.codegen.CompilationPhase.apply(CompilationPhase.java:630)
     [java]     at jdk.nashorn.internal.codegen.Compiler.compile(Compiler.java:656)
     [java]     at jdk.nashorn.internal.runtime.RecompilableScriptFunctionData.compileTypeSpecialization(RecompilableScriptFunctionData.java:717)
     [java]     at jdk.nashorn.internal.runtime.RecompilableScriptFunctionData.getBest(RecompilableScriptFunctionData.java:897)
     [java]     at jdk.nashorn.internal.runtime.ScriptFunctionData.getBest(ScriptFunctionData.java:392)
     [java]     at jdk.nashorn.internal.runtime.ScriptFunctionData.getBestInvoker(ScriptFunctionData.java:254)
     [java]     at jdk.nashorn.internal.runtime.ScriptFunction.findCallMethod(ScriptFunction.java:915)
     [java]     at jdk.nashorn.internal.runtime.ScriptObject.lookup(ScriptObject.java:1862)
     [java]     at jdk.nashorn.internal.runtime.linker.NashornLinker.getGuardedInvocation(NashornLinker.java:102)
     [java]     at jdk.nashorn.internal.runtime.linker.NashornLinker.getGuardedInvocation(NashornLinker.java:94)
     [java]     at jdk.dynalink.linker.support.CompositeTypeBasedGuardingDynamicLinker.getGuardedInvocation(CompositeTypeBasedGuardingDynamicLinker.java:184)
     [java]     at jdk.dynalink.linker.support.CompositeGuardingDynamicLinker.getGuardedInvocation(CompositeGuardingDynamicLinker.java:132)
     [java]     at jdk.dynalink.LinkerServicesImpl.getGuardedInvocation(LinkerServicesImpl.java:146)
     [java]     at jdk.dynalink.DynamicLinker.relink(DynamicLinker.java:262)
     [java]     at jdk.nashorn.internal.scripts.Script$Recompilation$466030$18490D$sta.DaylightSavingTA(test/script/external/test262/test/harness/sta.js:651)
     [java]     at jdk.nashorn.internal.scripts.Script$Recompilation$465985$sta$cu1$restOf.:program(test/script/external/test262/test/harness/sta.js:566)
     [java]     at jdk.nashorn.internal.runtime.ScriptFunctionData.invoke(ScriptFunctionData.java:654)
     [java]     at jdk.nashorn.internal.runtime.ScriptFunction.invoke(ScriptFunction.java:511)
     [java]     at jdk.nashorn.internal.runtime.ScriptRuntime.apply(ScriptRuntime.java:393)
     [java]     at jdk.nashorn.internal.test.framework.SharedContextEvaluator.run(SharedContextEvaluator.java:135)
     [java]     at jdk.nashorn.internal.test.framework.AbstractScriptRunnable.evaluateScript(AbstractScriptRunnable.java:200)
     [java]     at jdk.nashorn.internal.test.framework.ParallelTestRunner$ScriptRunnable.execute(ParallelTestRunner.java:202)
     [java]     at jdk.nashorn.internal.test.framework.AbstractScriptRunnable.runTest(AbstractScriptRunnable.java:132)
     [java]     at jdk.nashorn.internal.test.framework.ParallelTestRunner$ScriptRunnable.call(ParallelTestRunner.java:281)
     [java]     at jdk.nashorn.internal.test.framework.ParallelTestRunner$ScriptRunnable.call(ParallelTestRunner.java:124)
     [java]     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
     [java]     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1158)
     [java]     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:632)
     [java]     at java.lang.Thread.run(Thread.java:804)
     [java] Caused by: java.lang.NullPointerException
     [java]     at jdk.internal.org.objectweb.asm.Type.getMethodDescriptor(Type.java:678)
     [java]     at jdk.nashorn.internal.codegen.types.Type.getMethodDescriptor(Type.java:207)
     [java]     at jdk.nashorn.internal.codegen.FunctionSignature.<init>(FunctionSignature.java:137)
     [java]     at jdk.nashorn.internal.codegen.FunctionSignature.<init>(FunctionSignature.java:67)
     [java]     at jdk.nashorn.internal.codegen.FunctionSignature.<init>(FunctionSignature.java:154)
     [java]     at jdk.nashorn.internal.codegen.ClassEmitter.method(ClassEmitter.java:551)
     [java]     at jdk.nashorn.internal.codegen.CodeGenerator.enterFunctionNode(CodeGenerator.java:2100)
     [java]     at jdk.nashorn.internal.ir.FunctionNode.accept(FunctionNode.java:342)
     [java]     at jdk.nashorn.internal.ir.LexicalContextNode$Acceptor.accept(LexicalContextNode.java:57)
     [java]     at jdk.nashorn.internal.ir.LexicalContextExpression.accept(LexicalContextExpression.java:47)
     [java]     at jdk.nashorn.internal.ir.FunctionNode.accept(FunctionNode.java:56)
     [java]     at jdk.nashorn.internal.codegen.CompilationPhase.transformFunction(CompilationPhase.java:634)
     [java]     at jdk.nashorn.internal.codegen.CompilationPhase.access$100(CompilationPhase.java:55)
     [java]     at jdk.nashorn.internal.codegen.CompilationPhase$BytecodeGenerationPhase.transform(CompilationPhase.java:423)
     [java]     ... 29 more

One of the entries in the itypes argument passed to ASM's getMethodDescriptor is null when it shouldn't be. Indeed, the entries in the array are initialised either from "new Type(...)" calls, or from static final fields.

Changing the implementation of Nashorn's Type.getMethodDescriptor to include an additional check (see below) leads to the same stack trace (with the line number changing from 207 to 209), so the value is not null in the check, but appears to be null in the called method.

    public static String getMethodDescriptor(final Type returnType, final Type... types) {
        final jdk.internal.org.objectweb.asm.Type[] itypes = new jdk.internal.org.objectweb.asm.Type[types.length];
        for (int i = 0; i < types.length; i++) {
            if ((itypes[i] = types[i].getInternalType()) == null) {
                throw new AssertionError("EEK " + itypes[i]);
            }
        }
        return jdk.internal.org.objectweb.asm.Type.getMethodDescriptor(returnType.getInternalType(), itypes);
    }

A full log of a failing test run is attached. Note that the test run either succeeds, or fails with many instances of the same error.
Comments
I executed 1000 runs with the fix for JDK-8148786 and the issue does not reproduce anymore. Closing this as duplicate. Please re-open if the failure shows up again after the fix for JDK-8148786 propagated to the jdk9-dev repo.
24-02-2016

Here is a status update: - the problem only reproduces with a product build and is very intermittent - explicit null checks in the array initialization loop are not triggered, the loop may not be executed at all. The following code fails with "IS NULL2": for (int i = 0; i < types.length; i++) { if (types[i].getInternalType() == null) { throw new AssertionError("IS NULL1"); } itypes[i] = types[i].getInternalType(); } for (int i = 0; i < itypes.length; ++i) { if (itypes[i] == null) { throw new AssertionError("IS NULL2"); } } - the problem is not in jdk.internal.org.objectweb.asm.Type::getMethodDescriptor() (also reproduces with interpreter) - the failures seems to be in the C2 compiled method jdk.nashorn.internal.codegen.ClassEmitter.method() inlining the following methods: [java] at jdk.nashorn.internal.codegen.types.Type.getMethodDescriptor(Type.java:207) [java] at jdk.nashorn.internal.codegen.FunctionSignature.<init>(FunctionSignature.java:137) [java] at jdk.nashorn.internal.codegen.FunctionSignature.<init>(FunctionSignature.java:67) [java] at jdk.nashorn.internal.codegen.FunctionSignature.<init>(FunctionSignature.java:154) [java] at jdk.nashorn.internal.codegen.ClassEmitter.method(ClassEmitter.java:551) - the graph is *huge* (~18.000 nodes) and therefore hard to analyze - analysis of the assembly code suggests that the array initialization loop is not executed -> I suspect this to be a duplicate of JDK-8148786 where the loop's Phi ends up with an incorrect type. This was triggered (but not introduced by JDK-8145322) which was pushed to jdk9/dev on Feb 2: http://mail.openjdk.java.net/pipermail/jdk9-dev-changes/2016-February/005157.html I'm currently trying to verify.
19-02-2016

Reproduced with -XX:AbortVMOnException=java.lang.NullPointerException (see attached 'hs_err_pid72133.log'). Looks like we hit a null check uncommon trap in getMethodDescriptor(): Event: 112.337 Thread 0x00007fdd2480e000 Uncommon trap: reason=null_check action=maybe_recompile pc=0x00007fdd19515f60 method=jdk.internal.org.objectweb.asm.Type.getMethodDescriptor(Ljdk/internal/org/objectweb/asm/Type;[Ljdk/internal/org/objectweb/asm/Type;)Ljava/lang/String; @ 27
15-02-2016

I'm able to reproduce this with the latest jdk9/dev in 1/10 full runs of test262parallel. Investigating.
15-02-2016

I was able to reproduce it on a Linux system against an all-tip version of 9-dev. Here are the steps that worked for me after "make images" completed: * make JAVA_HOME point to the built JDK image * prepend JAVA_HOME/bin to PATH * in the nashorn/make directory, run "ant externals" to load the required tests * still in that directory, do an "ant clean" * still in that directory, do as many "ant test262parallel" as necessary to reproduce the problem I used a shell script that repeatedly runs test262parallel and writes the output to a file whilst printing $? after each run. In one experiment this morning, 2 out of 7 runs failed. Hope that helps.
12-02-2016

Thx, Michael. I have a look.
12-02-2016

NB. This is reproducible on Linux.
12-02-2016

[~mhaupt], could you provide instructions on how to reproduce this?
12-02-2016