JDK-8096749 : OutOfMemoryError with FadeTransition under certain circumstances
  • Type: Bug
  • Component: javafx
  • Sub-Component: graphics
  • Affected Version: 8, 8u20
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2014-10-08
  • Updated: 2016-04-29
  • Resolved: 2014-10-31
JDK 7: 7u111 (Fixed)
JDK 8: 8u40 (Fixed)
Description
Take the following simple example, which gradually fades a rectangle in on a transparent stage:

import javafx.animation.FadeTransition;
import javafx.application.Application;
import javafx.scene.Scene;
import javafx.scene.layout.StackPane;
import javafx.scene.paint.Color;
import javafx.scene.shape.Rectangle;
import javafx.stage.Stage;
import javafx.stage.StageStyle;
import javafx.util.Duration;

public class Test extends Application {

    @Override
    public void start(Stage primaryStage) {
        primaryStage.initStyle(StageStyle.TRANSPARENT); // transparent stage is key: the bug does not occur without it
        int width = 1920;
        int height = 1080;

        Rectangle rect = new Rectangle(width, height);
        rect.setFill(Color.SALMON);
        rect.setOpacity(0);
        StackPane scenePane = new StackPane();
        scenePane.getChildren().add(rect);
        primaryStage.setScene(new Scene(scenePane));
        primaryStage.setWidth(width);
        primaryStage.setHeight(height);
        primaryStage.show();
        FadeTransition ft = new FadeTransition(Duration.millis(10000), rect);
        ft.setToValue(1);
        ft.play();
    }

    public static void main(String[] args) {
        launch(args);
    }

}


When run with the VM args -Xms100m -Xmx100m, this works with no problem at all. However, when I give the VM substantially more memory (such as -Xms1000m -Xmx1000m), it very quickly falls over:

java.lang.OutOfMemoryError
        at sun.misc.Unsafe.allocateMemory(Native Method)
        at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:127)
        at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:311)
        at com.sun.prism.impl.BufferUtil.newByteBuffer(BufferUtil.java:90)
        at com.sun.prism.impl.BufferUtil.newIntBuffer(BufferUtil.java:121)
        at com.sun.javafx.tk.quantum.UploadingPainter.run(UploadingPainter.java:148)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at com.sun.javafx.tk.RenderJob.run(RenderJob.java:58)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at com.sun.javafx.tk.quantum.QuantumRenderer$PipelineRunnable.run(QuantumRenderer.java:125)
        at java.lang.Thread.run(Thread.java:745)

More than one profiler I've tried shows the heap as taking up barely any of the allocated space - but the process view in Task Manager shows it shooting through all the available memory in a couple of seconds.

The oddities don't end there, however - it only seems to fall over for some values of width and height (with a width of 1921, for instance, the application executes fine, no errors in sight).

This only seems to occur with a transparent stage. Without the transparent style set on the stage in the first line, all seems well (with all configurations I've tried, anyway). Likewise, it only occurs on Java 8 (I'm using 8u20) - all is fine with Java 7 / JFX 2.x. I'm running Windows 7 x64.
Comments
Fixed in the 8u-dev repo with the following changeset:
changeset: 8307:02a97cf9fb93
date: Fri Oct 31 15:30:10 2014 -0700
summary: Fix RT-38923: OOM with fast animations of transparent stages
http://hg.openjdk.java.net/openjfx/8u-dev/rt/rev/02a97cf9fb93
31-10-2014

That case is more asserting that it did the clear right. I suppose it might be possible to trick the object into doing the wrong thing, but I think an assert is fine.
31-10-2014

+1 on this change for this bug. One question, though. Should the following also throw ISE for consistency?

+        assert(p.getWidthUnsafe() == pixelW &&
+               p.getHeightUnsafe() == pixelH &&
+               p.getScaleUnsafe() == pixelScale);

We can track the resize problem with RT-34467 (and probably leave it for JDK 9). Of the possible solutions, I think either switching to heap buffers or not recreating the Pixels on downsizing seems promising.
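For readers following along, here is a hypothetical illustration of the assert-vs-ISE distinction being discussed; none of the names below appear in the actual patch. An assert is only checked when the VM runs with -ea, so a caller-contract violation such as a mismatched buffer size is more reliably reported with an explicit IllegalStateException that fires in every run:

final class DimensionCheckExample {
    static void checkMatches(int actualW, int actualH, int expectedW, int expectedH) {
        // assert actualW == expectedW && actualH == expectedH;  // silently skipped without -ea
        if (actualW != expectedW || actualH != expectedH) {
            throw new IllegalStateException(
                    "buffer is " + actualW + "x" + actualH
                    + ", expected " + expectedW + "x" + expectedH);
        }
    }

    public static void main(String[] args) {
        checkMatches(1920, 1080, 1920, 1080); // OK
        checkMatches(1920, 1080, 1921, 1080); // throws IllegalStateException
    }
}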
31-10-2014

New fix: http://cr.openjdk.java.net/~flar/RT-38923/webrev.01/

I decided to try out the "don't throw out the Source object when sizes change" approach. Unfortunately, it didn't help with the resize test case from RT-34467, but it does just as well as the previous fix for the animating cases. It does consolidate some buffer-related tracking information into the source object, though, which feels a little more modular even if it didn't make any progress on the other test cases. I'm not sure what else we can try without complicating things too much. Some thoughts:
- If the old buffers are large enough, reuse the buffers in new Pixels objects so we don't discard them on downsizes. New buffers will still get created whenever GC claims a weak saved ref.
- Find some internal mechanism in the JDK to manually dispose the direct buffers we are discarding in flush() (though we don't want to discard them if they are enqueued).
- Switch to heap buffers.
31-10-2014

One issue is that my solution basically devolves into the unending queue if you change sizes, since I create a whole new source object on all size changes. Another solution might be to make the getUnusedPixels() method more intelligent so that it flushes the saved objects on a size change, but continues to track the enqueued objects. That would still involve some turnover of the buffers, but at least if you created them faster than they were being consumed you would reduce the size of the queue, since you could track the previously enqueued (different sized) buffer and replace it when the new sized buffer gets enqueued.

Right now, when I replace the entire source object, we lose our handle on any buffers that are "in flight" and so they have to be processed. This means we have to wait for them to be blitted to the screen before those buffers drop on the floor. Another solution is that I could neuter the existing source object before I replace it (so that getLatestPixels() returned null and wouldn't waste time in processing), but then you'd have missing frames until you get a source object to last all the way through the queue before it gets obsoleted. So I'd prefer to leave the existing queued object until we have a new one to replace it, and then replace it in the source object where it will keep its place in the event queue.

So, maybe replacing the source objects is not the ideal solution.
30-10-2014

My testing indicates that the second of the two test cases in RT-34467 is fixed by your fix (unsurprising, since that test is basically doing the same sort of thing as the test case from this bug). The first test case, which dynamically modifies the size of the stage, still fails, so that bug cannot be closed as a dup of this one. I don't think that continually animating the size of a stage is nearly as common as animating content within a fixed-size stage, so I am OK with just fixing this issue and then working on the other issue as a follow-on (it is currently targeted to JDK 9).
30-10-2014

#1 - it's meant to check that the caller did the right thing. I guess a regular exception check would be a better match for that (InvalidState or something).
#2 - good point. That is the intention.
I'll fix those and roll another webrev for testing, checking in particular that the new exception checks never trigger...
30-10-2014

I note that RT-34467 (which was assigned to Glass by mistake and was sitting in Anthony's queue until recently) is likely the same bug. The two test cases from that bug should be run with your fix.
30-10-2014

I'm running some tests now. The fix looks quite sound to me, and I expect no problems in testing. The code that you replaced was implemented to fix RT-24070, and did so with a potentially unbounded number of pixel buffers in flight. I only have two minor comments, and they won't hold up my approving this:
1) Are the assertions all indicative of a problem that should never happen by design, as opposed to a bad call by the caller? If not, this could lead to differences when running with / without -ea.
2) In getUnusedPixels():
    96         return null;
Since the caller is expected to allocate pixels if this returns null, it might be worth a comment explaining this.
30-10-2014

webrev: http://cr.openjdk.java.net/~flar/RT-38923/webrev.00/

I took a slightly different approach to throttle the outstanding Pixels memory by creating a wrapper to intelligently manage the in-flight frames as they are transferred between the threads. A new object keeps all references in one location and the state of whether they are being uploaded, waiting on the queue, or free. This allows us to:
- reuse objects rather than having them fall on the floor at the end of the queue
- replace frames already in the queue with newer versions rather than have all of them waiting to be consumed serially
- maintain a guaranteed maximum of 3 outstanding Pixels buffers at a time
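For illustration, here is a minimal sketch of the kind of bookkeeping described above. The class and method names are hypothetical and deliberately simplified - this is not the actual quantum toolkit code from the webrev:

import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative only: recycle frames instead of dropping them, and let a newer
// frame replace one still waiting in the queue. In steady state at most three
// frames exist: one being filled by the render thread, one enqueued, and one
// being uploaded; anything displaced returns to the free pool.
final class BoundedFrameSource<T> {
    private final Deque<T> free = new ArrayDeque<>();
    private T enqueued;   // newest finished frame, waiting to be uploaded
    private T inUse;      // frame currently being (or last) uploaded

    /** Render thread: a recycled frame to fill, or null if the caller must allocate one. */
    synchronized T getUnusedFrame() {
        return free.poll();
    }

    /** Render thread: publish a finished frame; a stale queued frame is recycled, not dropped. */
    synchronized void enqueueFrame(T frame) {
        if (enqueued != null) {
            free.add(enqueued);
        }
        enqueued = frame;
    }

    /** Upload thread: take the newest frame; the previously uploaded one becomes reusable. */
    synchronized T getLatestFrame() {
        if (enqueued != null) {
            if (inUse != null) {
                free.add(inUse);
            }
            inUse = enqueued;
            enqueued = null;
        }
        return inUse;   // may re-present the previous frame if nothing new arrived
    }
}

The cap on outstanding buffers falls out naturally from the replace-in-queue and recycle behaviour rather than from any explicit blocking.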
30-10-2014

I found that the easiest way to get it to reproduce is to use the -Djavafx.animation.fullspeed=true flag combined with upping the VM heap parameters both to 1000m. It reproduces quickly and easily with that combination.
30-10-2014

Noting that the original reason for creating the NIO direct buffers was blocking IO calls. Those calls cannot use Get*Critical because they can block indefinitely, so they were having to copy the data to the stack for every IO operation. The direct buffers allowed IO calls to just use the original buffer without having to worry about GC or pinning while they blocked forever. None of those conditions should apply to our use for pixel uploads unless the platform upload() mechanism can block. It might do so for very short periods (vsync?), but not for hours or days like the IO methods. And I think the current VM supports individual pinning of buffers, whereas it once locked out all data consolidation if anything was locked.
09-10-2014

Is there any evidence that a direct buffer affects performance at all here? As far as I can see, it all boils down to the native code that sends the pixels to the screen issuing a GetPrimitiveArrayCritical if the pixels are in a regular buffer vs. using the NIO getDirectBufferAddress. Neither should involve a data copy on most runtimes, and the timings of the locks in Get*Critical are on the order of "you don't want to do this per pixel" - but once per frame it is nothing that should ever even be noticed...?
09-10-2014

Googling nio direct buffer garbage collection turns up a number of other developers struggling with the same issue - that GC is not aggressive enough for these non-heap buffers due to the need to free them via queues. Here is an example that includes code to manually free the buffers: http://stackoverflow.com/questions/1854398/how-to-garbage-collect-a-direct-buffer-java They note that the internal APIs that they are invoking via reflection may change between runtimes, but since FX is within the scope of the VM, it may be less inappropriate for our own internal code to use such measures. Another option is to recycle the direct buffers manually in the uploading painter -> glass cycle similar to the way that Canvas render buffers are recycled and reused.
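As a concrete sketch of the technique from that Stack Overflow link, roughly as it applies to a JDK 8 runtime: the code below reaches into the DirectByteBuffer internals via reflection and runs its Cleaner immediately instead of waiting for GC. The class name here is illustrative, and, as noted above, the sun.misc internals it relies on can change between runtimes:

import java.lang.reflect.Method;
import java.nio.ByteBuffer;

final class DirectBufferDisposer {
    static void free(ByteBuffer buffer) {
        if (!buffer.isDirect()) {
            return; // heap buffers are reclaimed by normal GC
        }
        try {
            Method cleanerMethod = buffer.getClass().getMethod("cleaner");
            cleanerMethod.setAccessible(true);
            Object cleaner = cleanerMethod.invoke(buffer);   // sun.misc.Cleaner on JDK 8
            if (cleaner != null) {
                Method cleanMethod = cleaner.getClass().getMethod("clean");
                cleanMethod.setAccessible(true);
                cleanMethod.invoke(cleaner);                 // releases the native memory now
            }
        } catch (ReflectiveOperationException e) {
            // Internal API not available on this runtime; fall back to letting GC reclaim it.
        }
    }
}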
08-10-2014

No, the relevant change between FX 2 and FX 8 is likely the fact that we now run the render thread in parallel with the FX application thread. This can lead to more than one outstanding buffer in some cases.
08-10-2014

Thanks for the comment - was this allocated from the heap (rather than from native memory) in JavaFX 2, then? If so, I certainly didn't notice any performance implications with this setup, though admittedly I haven't tested it thoroughly. If it is for performance reasons (and is worth keeping that way), perhaps a VM switch could be provided to swap this behaviour to the heap?
08-10-2014

This is a graphics issue rather than an animation issue. For transparent stages we use the UploadingPainter on Linux and Windows platforms to send the final pixels to Glass for presenting on the screen. The failure is in allocating the pixels using an NIO direct byte buffer (via Prism BufferUtil) rather than from the Java heap. One possible solution might be to use a Java array-backed ByteBuffer for these pixels rather than a direct buffer. We would need to evaluate the performance implications of doing this.
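For concreteness, here is a minimal sketch of the two allocation strategies being compared. The class name and buffer size below are illustrative only - the real allocation goes through com.sun.prism.impl.BufferUtil, as the stack trace above shows:

import java.nio.ByteBuffer;

public class BufferAllocationExample {
    public static void main(String[] args) {
        int width = 1920, height = 1080;
        int bytes = width * height * 4;   // 4 bytes per pixel, matching the int-pixel upload path

        // What the upload path effectively does today: native (off-heap) memory,
        // reclaimed only when the owning buffer object is garbage collected.
        ByteBuffer direct = ByteBuffer.allocateDirect(bytes);

        // The proposed alternative: an array-backed buffer on the Java heap,
        // managed by the garbage collector like any other object.
        ByteBuffer heap = ByteBuffer.allocate(bytes);

        System.out.println("direct buffer: isDirect=" + direct.isDirect() + ", hasArray=" + direct.hasArray());
        System.out.println("heap buffer:   isDirect=" + heap.isDirect() + ", hasArray=" + heap.hasArray());
    }
}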
08-10-2014