Bug ID: JDK-6893655 StrikeCache causes OOM on hotspot-server

Type: Bug
Component: client-libs
Sub-Component: 2d
Affected Version: 6u10

Priority: P4
Status: Closed
Resolution: Duplicate
OS: linux_ubuntu
CPU: x86

Submitted: 2009-10-21
Updated: 2020-04-07
Resolved: 2020-04-07

Other
tbd Resolved

FULL PRODUCT VERSION :
JDK6+

A DESCRIPTION OF THE PROBLEM :
Following a discussion I had on the 2d-dev mailing list on 25 April 2009
(see http://mail.openjdk.java.net/pipermail/2d-dev/2009-April/000769.html),
I am uploading this small sample to show how easy it is to make StrikeCache consume huge amounts of memory.
Running the sample with java -server -Xmx1024m exhausts the native heap after some time and ends in an OOM.

The problem is that StrikeCache uses SoftReferences, which the HotSpot server VM clears only very rarely.
Some kind of "soft limit" would probably be a better idea, but I don't have an implementation.

In the xrender pipeline this is solved using a hard limit.
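
As an illustration of what a hard limit can look like, here is a minimal sketch of an LRU cache bounded by total bytes rather than by entry count. It is not the xrender pipeline's code and not StrikeCache's API: the class name ByteLimitedCache and its methods are hypothetical, and a byte[] on the Java heap stands in for the native glyph images of one strike.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: an LRU cache limited by total bytes held. */
final class ByteLimitedCache<K> {
    private final long maxBytes;
    private long usedBytes;
    // accessOrder=true: iteration runs from least to most recently used.
    private final LinkedHashMap<K, byte[]> entries =
            new LinkedHashMap<>(16, 0.75f, true);

    ByteLimitedCache(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    synchronized byte[] get(K key) {
        return entries.get(key); // counts as a use for LRU ordering
    }

    synchronized void put(K key, byte[] glyphImages) {
        byte[] old = entries.put(key, glyphImages);
        if (old != null) {
            usedBytes -= old.length;
        }
        usedBytes += glyphImages.length;
        // Evict least recently used entries until we are under the limit,
        // but never the entry that was just added.
        Iterator<Map.Entry<K, byte[]>> it = entries.entrySet().iterator();
        while (usedBytes > maxBytes && it.hasNext()) {
            Map.Entry<K, byte[]> eldest = it.next();
            if (eldest.getKey().equals(key)) {
                continue;
            }
            usedBytes -= eldest.getValue().length;
            it.remove();
        }
    }

    synchronized long usedBytes() {
        return usedBytes;
    }

    synchronized int size() {
        return entries.size();
    }

    public static void main(String[] args) {
        ByteLimitedCache<String> cache = new ByteLimitedCache<>(3000);
        cache.put("Dialog-12", new byte[1250]);
        cache.put("Dialog-50", new byte[1250]);
        cache.put("Dialog-99", new byte[1250]); // pushes the total over 3000
        System.out.println(cache.size() + " entries, " + cache.usedBytes() + " bytes");
    }
}
```

The sketch simply drops evicted entries. A real strike cache would also have to free the native memory behind an entry, and could do so only once no caller is still using it.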

EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
StrikeCache should limit the amount of native memory used.
ACTUAL -
The JVM OOMs as soon as the process hits the 2GB barrier on a 32-bit system, and consumes extremely large amounts of memory on 64-bit systems.

REPRODUCIBILITY :
This bug can be reproduced always.

---------- BEGIN SOURCE ----------
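import java.awt.Font;
import java.awt.Graphics2D;
import java.awt.Image;
import java.awt.image.BufferedImage;
import java.util.Random;

public class FontOOMTest {
    public static void main(String[] args) {
        Image bImg = new BufferedImage(1024, 1024, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = (Graphics2D) bImg.getGraphics();
        Random rand = new Random();

        // Loop forever: each new size/rotation combination needs a new strike.
        while (true) {
            // Random font size from 0 to 99.
            g.setFont(new Font("Dialog", Font.PLAIN, rand.nextInt(100)));
            // nextDouble() is already in [0, 1), so the modulo has no effect.
            g.rotate(rand.nextDouble() % 100);
            g.drawString("qwertzuiopasdfghjklyxcvbnnnnmQWERTZUIOPASDFGHJKKKLLYXCVBNNM12345678990", 512, 512);
        }
    }
}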

---------- END SOURCE ----------

I think this case should have been fixed by 6927458: font system should cache transient strikes with weak references.

07-04-2020

EVALUATION Since we have these fixes: 6927458 (font system should cache transient strikes with weak references) and 6891551 (Font rasterisation uses more heap than needed for some strikes), I believe this will be more difficult to reproduce, at least in real apps as opposed to a micro-stress test.

14-01-2011

EVALUATION The test uses font sizes up to 100 pixels, which makes for really big glyph images. Above 100 pixels we don't even use glyph images because it takes so much memory to store them. Perhaps we could alleviate this by reducing the size at which we switch to outlines, but that wouldn't change the fundamental issue.
It's not really a garbage problem, in that relatively little Java heap is being used, although the server VM's tendency to grow the heap rather than free memory makes it a problem. Lots of native heap is being used to store images that are on average 50 pixels high. Let's say 50x25 pixels per glyph, which is around 1250 bytes for each glyph. At least 60 unique glyphs is 150Kbytes per strike. But at the Java heap level it's going to be less than 1500 bytes! A ratio of 100:1 native:Java heap. So the GC's role in the problem is that it doesn't get involved at all.
References are used in part because they are a more prompt substitute for finalization, and there are native resources to be freed. And since the font library code is not the garbage collector, it can't know when that is.
The problem with a cache system that imposes limits is that there could be any number of strong references (current users) to that data, and you can't just free it out from under them. So there would need to be something like a check-out/check-in reference counting system. This would bring its own performance problems and a greater likelihood of concurrent use/free bugs that could cause crashes. A cache system that imposed limits should probably also be aware that not all glyphs are equal: a cache that simply counts glyphs won't handle this test well either. You'd perhaps want to count bytes in use.
The safest solution, and the one needing the least code, would be to use weak references. But whereas soft refs are freed too late, weak refs are freed too soon. It's almost like having no cache at all, but at least you get the native resource cleanup.
So to be at all useful it needs to be limited in some way, perhaps adaptively, i.e. some references could be soft and some weak, depending on the strike and memory usage. E.g. whenever we have > N strikes for a font, use weak references for new ones, or use weak ones for rotations, or for glyphs above a certain size. And if we can determine that the server VM is running, we would use these weak references in such cases, but never for the client VM. At the same time we'd perhaps want to bump up the minimum number of strikes we keep around at the same time, so as to minimise thrashing.

23-10-2009
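
The adaptive choice between soft and weak references suggested in the 23-10-2009 evaluation could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code that went into any fix: the class StrikeRefPolicy, both thresholds, and the idea of detecting the server VM from the java.vm.name system property are hypothetical.

```java
import java.lang.ref.Reference;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

/** Hypothetical sketch: pick a weak or soft reference per strike. */
final class StrikeRefPolicy {
    // Assumed thresholds, chosen only for illustration.
    static final int MAX_SOFT_STRIKES_PER_FONT = 8;
    static final int MAX_SOFT_PIXEL_SIZE = 48;

    /** Heuristic guess: HotSpot server VMs report "Server" in java.vm.name. */
    static boolean isServerVM() {
        return System.getProperty("java.vm.name", "").contains("Server");
    }

    /** Weak references only on the server VM, and only for costly strikes. */
    static boolean useWeak(boolean serverVM, int strikesForFont,
                           boolean rotated, int pixelSize) {
        if (!serverVM) {
            return false; // client VM: keep the existing soft-reference behaviour
        }
        return strikesForFont > MAX_SOFT_STRIKES_PER_FONT
                || rotated
                || pixelSize > MAX_SOFT_PIXEL_SIZE;
    }

    static <T> Reference<T> wrap(T strike, boolean serverVM, int strikesForFont,
                                 boolean rotated, int pixelSize) {
        if (useWeak(serverVM, strikesForFont, rotated, pixelSize)) {
            return new WeakReference<T>(strike);
        }
        return new SoftReference<T>(strike);
    }

    public static void main(String[] args) {
        Object strike = new Object(); // stand-in for a real strike object
        Reference<Object> ref = wrap(strike, isServerVM(), 1, true, 12);
        System.out.println(ref.getClass().getSimpleName());
    }
}
```

The minimum number of strikes kept strongly reachable, which the evaluation suggests raising to limit thrashing, is left out of this sketch.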