JDK-8124072 : Mac: 8.0-graphics-scrum-792: up to 30% performance regression on MacOS
  • Type: Bug
  • Component: javafx
  • Sub-Component: window-toolkit
  • Affected Version: 8
  • Priority: P2
  • Status: Resolved
  • Resolution: Duplicate
  • Submitted: 2013-03-01
  • Updated: 2015-06-17
  • Resolved: 2013-07-10
JDK 8: 8 (Resolved)
Description
The following regressions are observed in build 8.0-graphics-scrum-792 on the Mac-Mid-Range machine:
 Charts.Bubble: -19% (-3.29 fps) 
 Controls.CheckBox: -33% (-17.92 fps)
 Controls.RadioButton: -32% (-15.03 fps) 
 Controls.TableView-XmasTree: -27% (-9.58 fps) 
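
Each entry above pairs a relative change with an absolute change in fps, so the baseline frame rate can be recovered from the pair. The following is a minimal sketch of that arithmetic using the Controls.CheckBox figures; the class and method names are hypothetical and are not part of the benchmark suite. Because the reported percentages are rounded, the results are approximate.

```java
import java.util.Locale;

// Hypothetical helper (not part of FXBenchmark): recovers the baseline and
// current frame rates from a reported pair such as "-33% (-17.92 fps)".
final class RegressionMath {

    // baseline = deltaFps / (percentChange / 100).
    // Both inputs are negative for a regression, so the result is positive.
    static double baselineFps(double percentChange, double deltaFps) {
        return deltaFps / (percentChange / 100.0);
    }

    // current = baseline + deltaFps.
    static double currentFps(double percentChange, double deltaFps) {
        return baselineFps(percentChange, deltaFps) + deltaFps;
    }

    public static void main(String[] args) {
        // Controls.CheckBox figures taken from the description.
        double pct = -33.0;
        double delta = -17.92;
        System.out.printf(Locale.ROOT, "baseline ~ %.1f fps, current ~ %.1f fps%n",
                baselineFps(pct, delta), currentFps(pct, delta));
    }
}
```

For Controls.CheckBox this gives a baseline of roughly 54 fps in build #791 and roughly 36 fps in build #792.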


Steps to run Controls.CheckBox benchmark:
> cd JFX_WS/tests/performance/Controls/
> ant
> java -Djavafx.animation.fullspeed=true
       -cp "JFX_HOME/rt/lib/ext/jfxrt.jar:./dist/Controls.jar:../FXBenchmark/dist/FXBenchmark.jar:../../../import/benchmarks-2.1.1/benchmarks-2.1.1.jar"
       jrockit.bm.Main controls.bm.CheckboxBenchmark -i 1 -wt 0 -tr 60 -toggleStep 300


The regression is caused by changes done as part of
 RT-19342: SceneBuilder is out of sync with the native OS window size while resizing. 

In particular, the changes made in glass/glass-lib-macosx/src/GlassView.m are the cause
of the observed regression on MacOS. Note that there are no regressions on Windows.

To prove this, copy rt/lib/libglass.dylib from build #791 into the new build: performance returns to its previous level.

Comments
1) Could this be due to the fact that fonts are now drawn differently on OS X? 2) Do you have other JIRA issues that are tracking performance regressions on OS X, or is this the only one? Could you enable 2tk fonts on the Mac to see whether the performance comes back? My thinking is that it took so long to fix this bug that the performance drop caused by the original fix has actually been fixed but is being masked by other changes.
30-07-2013

Well, I would keep this bug open. RT-26702 has been fixed and integrated into build 8.0-graphics-scrum-1476. The results show that performance improved in build #1476. In particular, Charts.Bubble is back to the level of build #791. However, Controls.CheckBox, Controls.RadioButton and Controls.TableView-XmasTree are still slower than in build #791. Here are the results for builds #1474 (RT-26702 is not fixed) and #1476 (RT-26702 is fixed) compared to #791:
 Controls.CheckBox-adhoc-items300-toggle30: #1474: -45% (-29.02), #1476: -44% (-28.38)
 Controls.CheckBox-adhoc-items300-toggleAll: #1474: -53% (-28.50), #1476: -38% (-20.51)
 Controls.RadioButton: #1474: -49% (-23.17), #1476: -37% (-17.27)
 Controls.TableView-XmasTree: #1474: -20% (-7.31), #1476: -13% (-4.54)
30-07-2013

Ekaterina, this should now be fixed. Can you confirm?
30-07-2013

DUP of https://javafx-jira.kenai.com/browse/RT-26702
10-07-2013

Fixing RT-26702 will fix this.
26-04-2013

The OpenGL profiler gives errors on my machine. I may need to reinstall Xcode or get a newer version to fix this. Richard, which tests did you run? I am running CheckboxBenchmark. It is interesting that the low level Mac tools report reasonable FPS while our Java code does not. Who is right?
04-03-2013

I can confirm that putting a usleep(1000) in place of lockFocus/unlockFocus "fixes" the problem.
04-03-2013

I agree the render lock is irrelevant. I meant "FYI". I will test the 1ms theory. Also, perhaps calling lockFocus/unlockFocus from a background thread is used by OS X to indicate a clean point to trigger a drawInCGLContext().
02-03-2013

The render lock is irrelevant here. The one that the FX thread is blocked on is GlassOffscreen->_lock (the FX thread tries to acquire it in GlassLayer3D->drawInCGLContext()) and it's held by the render thread in GlassView3D->begin, which is called right after the commented out [view lockFocus]. Removing an extra lock allows the render thread to execute begin() faster, and that reduces the window during which the FX thread could respond to drawInCGLContext with the previous frame. If that short window is missed, the FX thread has to wait for the new frame that has just been started. If you insert an artificial delay of 1 ms between [view lockFocus] and [view begin], I believe FPS will increase even compared to the original state (before commenting out lockFocus).
02-03-2013
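
The timing window described in the comment above can be illustrated with a small deterministic model. This is only a sketch under simplifying assumptions: the class, method and parameter names are hypothetical, the timings are made up, and it ignores the time the FX thread itself holds GlassOffscreen->_lock. It models how long drawInCGLContext would block depending on how quickly the render thread reaches begin().

```java
// Hypothetical model, not Glass code. All times are in milliseconds,
// measured from the moment the render thread picks up a render job.
final class LockWindowModel {

    // preLockDelayMs: time the render thread spends before begin() takes the
    //                 offscreen lock (for example inside lockFocus, or in an
    //                 artificial delay).
    // renderMs:       how long the render thread then holds the lock.
    // drawRequestMs:  when the draw callback arrives on the FX thread.
    static double fxWaitMs(double preLockDelayMs, double renderMs, double drawRequestMs) {
        if (drawRequestMs < preLockDelayMs) {
            // The lock is still free: the previous frame is served at once.
            return 0.0;
        }
        // Otherwise the FX thread blocks until the render job releases the lock.
        double releaseMs = preLockDelayMs + renderMs;
        return Math.max(0.0, releaseMs - drawRequestMs);
    }

    public static void main(String[] args) {
        double renderMs = 12.0;     // assumed render job duration
        double drawRequestMs = 0.5; // assumed arrival time of the draw callback
        System.out.println("wait with 1 ms pre-lock delay:   "
                + fxWaitMs(1.0, renderMs, drawRequestMs));
        System.out.println("wait with 0.1 ms pre-lock delay: "
                + fxWaitMs(0.1, renderMs, drawRequestMs));
    }
}
```

In this model a longer pre-lock delay widens the window in which the draw callback finds the lock free, which is consistent with the observation that adding a 1 ms delay appears to restore the frame rate.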

I can confirm that we need the lockFocus/unlockFocus in order for live resize on the Mac to avoid deadlock. Running Ensemble with the lock and resizing the bottom right corner caused deadlock right away.
02-03-2013

So how does waiting in the operating system around lockFocus/unlockFocus fix this? The operating system lock is held inside the render lock.
01-03-2013

I ran these tests on my mac. I gathered both the FPS reported by FX and by the Mac OpenGL Profiler tool. I ran glass using both the "lock" and "nolock" versions, and also ran with the SWT glass implementation. The numbers reported are interesting.
OpenGL Profiler:
 libglass-lock.dylib: 70fps
 libglass-nolock.dylib: 56fps
 swt: 62fps
JavaFX Reports:
 libglass-lock.dylib: 36fps
 libglass-nolock.dylib: 31fps
 swt: 60fps
01-03-2013

I now believe that what I described as a possibility is exactly what happens. It's another manifestation of an incomplete fix for RT-26702. The condition that prevents a new pulse from starting is the state of the rendering target texture: if the render thread starts a new rendering job and locks the texture the FX thread currently waits for the job to finish so that it could respond to drawInCGLContext with a newly prepared frame. I'm reopening RT-26702.
01-03-2013

I can confirm that resize is not being called repeatedly for some reason. I can recreate the problem on my machine.
01-03-2013

There is an OpenGL ES Driver instrument in Xcode -> Open Developer Tool -> Instruments. If there is some condition for not starting a new pulse that is set in the render thread, then the faster the render thread gets to setting this condition, the less parallelism we'll have. That's just a speculative possibility, of course.
01-03-2013

Also, how do I measure low level FPS using the Mac OpenGL tools?
01-03-2013

Ummmm ... any suggestion as to why removing a lock causes less parallelism?
01-03-2013

It's pretty obvious from JPA profiles that the drop in performance is due to less parallelism after the change: in many more cases a new pulse starts only AFTER the render thread finishes its job for the previous pulse. Parallel processing still occurs (a new pulse on the FX thread starts right after a render task is passed to the render thread) but the number of such cases is much lower after the change.
01-03-2013
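
To see why fewer overlapping pulses lowers the frame rate, consider a rough throughput model. This is a sketch only: the class and method names are hypothetical, the stage durations are invented for illustration, and real pulses and render jobs vary from frame to frame. When a pulse overlaps the previous render job, the frame period is bounded by the slower stage; when it starts only after the render job finishes, the frame period is the sum of both stages.

```java
// Hypothetical model of FX-thread pulse and render-thread job timing.
final class PulsePipelineModel {

    // Every pulse waits for the previous render job: stages run back to back.
    static double serializedFps(double pulseMs, double renderMs) {
        return 1000.0 / (pulseMs + renderMs);
    }

    // Every pulse overlaps the previous render job: the slower stage dominates.
    static double overlappedFps(double pulseMs, double renderMs) {
        return 1000.0 / Math.max(pulseMs, renderMs);
    }

    // Only a fraction of frames overlap; the rest are serialized.
    static double mixedFps(double pulseMs, double renderMs, double overlappedFraction) {
        double averagePeriodMs = overlappedFraction * Math.max(pulseMs, renderMs)
                + (1.0 - overlappedFraction) * (pulseMs + renderMs);
        return 1000.0 / averagePeriodMs;
    }

    public static void main(String[] args) {
        double pulseMs = 10.0;  // assumed pulse duration
        double renderMs = 12.0; // assumed render job duration
        System.out.println("fully overlapped: " + overlappedFps(pulseMs, renderMs));
        System.out.println("half overlapped:  " + mixedFps(pulseMs, renderMs, 0.5));
        System.out.println("fully serialized: " + serializedFps(pulseMs, renderMs));
    }
}
```

With these assumed durations, dropping from full overlap to half overlap already costs a large share of the frame rate, even though no individual stage became slower.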

Assign to Steve who made the aforementioned change to GlassView.m
01-03-2013

That is really interesting. The only change that was made in the C code was to get rid of lockFocus/unlockFocus. This *should* make things faster, not slower.
01-03-2013