JDK-6193066 : REGRESSION: High context switch rate on 5.0 under linux (Fedora Core 2)
  • Type: Bug
  • Component: client-libs
  • Sub-Component: java.awt
  • Affected Version: 5.0
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • OS: linux
  • CPU: x86
  • Submitted: 2004-11-09
  • Updated: 2010-08-06
  • Resolved: 2005-09-07
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availabitlity Release.

To download the current JDK release, click here.
JDK 6
6 b51Fixed
Description
FULL PRODUCT VERSION :
java version "1.5.0"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0-b64)
Java HotSpot(TM) Client VM (build 1.5.0-b64, mixed mode, sharing)


FULL OS VERSION :
Linux www.my2do.com 2.6.8-1.521 #1 Mon Aug 16 09:01:18 EDT 2004 i686 athlon i386 GNU/Linux

EXTRA RELEVANT SYSTEM CONFIGURATION :
Fedora core 2, Kernel 2.6.8-1

A DESCRIPTION OF THE PROBLEM :

Very high context switch rate observed under Fedora Core 2 when running any Java application (even when the applicaiton is idle, and doing nothing). Can be demonstrated by running SwingSet2 demo.

Under previous JVM (tested on 1.4.2_01 and 1.4.2_06), context switch rate (as reported by vmstat) is approx 600/per second.

Under JDK 5.0, context switch rate is approx. 1500/second .

-Xint makes no difference
-server make no difference

This makes the system very slow - even though the cpu utilization is very low ( < 5%) and there is plenty of memory (512 MB).

STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :

In a shell window run vmstat 5, watch the cs (context switch) rate

Run any Swing application (e.g. java -jar SwingSet2.jar)


EXPECTED VERSUS ACTUAL BEHAVIOR :

Expect cs to be <600
Actual is >1500

System feels slow/unresponsive

REPRODUCIBILITY :
This bug can be reproduced always.

CUSTOMER SUBMITTED WORKAROUND :
Run jdk1.4.2

Release Regression From : 1.4.2
The above release value was the last known release where this 
bug was known to work. Since then there has been a regression.
###@###.### 2004-11-09 23:32:02 GMT

Comments
EVALUATION Small correction to the plan above - flushes happen by request. If the X request queue is idle (Java minimized or user is off for some coffee) no flushes happen. This means XToolkit thread will sleep for as long as MAXIMUM_POLL_TIMEOUT (usually 500ms).
2005-08-16

EVALUATION I reproduced the problem on Fedora Core 2. When SwingSet2 is idle, the context switch rate with JDK 1.5 is approx 1100/sec while with 1.4.2 it's about 100/sec. I also tested on a SuSE 9.1 box, same story, 1.5 has about 1000 more context switches per second than 1.4.2. It looks like there's a 1 ms timer somewhere inside JDK. It's not JVM, as I copied 1.5 libjvm.so into 1.4.2 JDK and the test result looks normal (~100 cs/sec). ###@###.### 2004-11-11 18:50:14 GMT ----------------------------------------- With strace, it showed a poll() loop with 1ms timeout. I tracked it down to performPoll() in XToolKit.c. It used to have an aging mechanism to slow down polling frequency, but now it's a constant 1ms timeout. Reassign to AWT. Note the problem is less severe on 2.4 Linux kernel (on a RH7.3 box, the overhead of JDK 1.5 is about 200 cs/sec), because 2.4 kernel timer runs at 100Hz frequency, while in 2.6 kernel it's changed to 1000Hz. It means on 2.4 kernel, the 1ms poll() loop can only wakes up roughly once every 10ms, while on 2.6 kernel, it truly becomes a 1ms timer. ###@###.### 2004-11-11 19:08:18 GMT I'd like some real number measurements to be done to prove that high-context switch is indeed a cause of any system performance degradation. I am running Linux 2.6 for a half a year, and I never noticed that Java causes any performace impact. I tried changing 1ms to 10ms to 50ms. I see performance degradation in Java (less than 1%, but it is constant) in SwingMark. We might investigate returning back to aging strategy or to the less frequent constant rate, but it is risky to do so in an update release since the consequences are not predictable, and I don't see a strong justification in such change. ###@###.### 2004-11-12 07:59:17 GMT I was able to reproduce the problem as reported, here are the measurements: (no background applications) ============================================================================ /export/dom/work/tools/refworkload/log.startup-mustang-1.5.0-nobackground Benchmark Samples Mean Stdev Geomean Weight startup3 1 3.73 Framer 1 0.54 JEdit 1 2.08 LimeWire 1 4.99 NetBeans 1 10.60 Noop 1 0.13 XFramer 1 0.71 ============================================================================ (5 1.5.0 AWT applications) ============================================================================ /export/dom/work/tools/refworkload/log.startup-mustang-1.5.0-5-background Benchmark Samples Mean Stdev Geomean Weight startup3 1 3.89 Framer 1 0.61 JEdit 1 2.14 LimeWire 1 5.30 NetBeans 1 10.77 Noop 1 0.13 XFramer 1 0.82 ============================================================================ (5 1.4.2 AWT applications) ============================================================================ /export/dom/work/tools/refworkload/log.startup-mustang-1.5.0-5-1.4.2-background Benchmark Samples Mean Stdev Geomean Weight startup3 1 3.73 Framer 1 0.55 JEdit 1 2.05 LimeWire 1 5.05 NetBeans 1 10.60 Noop 1 0.14 XFramer 1 0.70 ============================================================================ ###@###.### 2004-11-25 14:45:14 GMT Measurements with the fix: ============================================================================ log.startup-mustang-1.5.0-nobackground Benchmark Samples Mean Stdev Geomean Weight startup3 1 3.73 ============================================================================ log.startup-mustang-1.5.0-5-1.4.2-background Benchmark Samples Mean Stdev %Diff P Significant startup3 1 3.73 0.06 ============================================================================ log.startup-mustang-1.5.0-5-background-with-fix Benchmark Samples Mean Stdev %Diff P Significant startup3 1 3.89 -4.23 ============================================================================ log.startup-mustang-1.5.0-5-background-with-50ms Benchmark Samples Mean Stdev %Diff P Significant startup3 1 3.79 -1.52 ============================================================================ log.startup-mustang-1.5.0-5-background Benchmark Samples Mean Stdev %Diff P Significant startup3 1 3.89 -4.23 ============================================================================ log.startup-mustang-1.5.0-15-background-with-fix Benchmark Samples Mean Stdev %Diff P Significant startup3 1 3.79 -1.65 ============================================================================ log.startup-mustang-1.5.0-15-background-with-50ms Benchmark Samples Mean Stdev %Diff P Significant startup3 1 3.82 -2.42 ============================================================================ Indeed, 1.4.2 doesn't affect the performance of foreground process. However, 1.5.0 affects it no matter of what fix is in use (1ms, 50ms or aging code). ###@###.### 2004-11-26 13:21:58 GMT The Motif 1.5 toolkit looks like this: ============================================================================ log.startup-mustang-1.5.0-15-background-motif Benchmark Samples Mean Stdev %Diff P Significant startup3 1 3.78 -1.27 ============================================================================ Which shows, that 1.5 performance impact comes not only from AWT. Notice from the numbers above that XAWT with aging code doesn't work much better (3.89 and 3.79) ###@###.### 2004-11-26 13:35:59 GMT With 1ms timeout value context-switch rate is indeed noticeable. ###@###.### 2004-11-26 13:45:27 GMT I performed some additional experiments. I noticed that aging algorithm makes Swing look "slow" - it responds to user requests (e.g. Mouse) with the significant, noticeable, delay. Swing on Motif looks similarly - it looks slow. This doesn't happen when timeout is 1. I am inclined to leave timeout a small value, since it allows desktop applications work "faster". Notice, that timeout almost doesn't affect performance of the running application. For the cases when Java GUI is run on the server (I guess it might only be SunRay), and many GUIs are run simultaneously, I implemented an env variable _AWT_STATIC_POLL_TIMEOUT - set it to 0 to enable aging algorithm, to some value to make timeout be statically assigned this value, -1 for infinity. The default value is 11. Right now this property is For Internal Use Only, when (if) the customer demand for this property to be public appears, we should consider publishing it or re-visit polling mechanism. ###@###.### 2005-2-18 12:37:09 GMT Based on the feedback from bug parade and forums, we decided to re-open this issue and find the answers to the remainig questions. I made several experiments investigating the particular issue of very slow perceived performance when aging algorithm is used. I checked the time between ButtonPress event arrival and Swing button repainting and noticed that while visually I don't see button being repainted for significant amount of time, there is no delay in code - repainting happens almost immediately (1-2ms). So, the hypothesis is that graphical data themselves are getting transfered into screen with some delay. On X, XSync/XFlush are responsible for doing such updates. On Motif toolkit, we had several places from which XFlush was called: - AWT_UNLOCK, which would schedule next XFlush in 100ms. This allows for batching. - performPoll would sleep for the amount of time not bigger than auto-flush timeout or scheduled flush timeout This logic is broken in XAWT - we flush only after we have dispatched some event. Experiments confirm this - if you press on a Swing button, it repaints slowly. If you press/release it repaints immediately, since ButtonRelease causes flush of repaints generated by ButtonPress. It has been noticed that Motif flush also doesn't work nicely - the delay between press and repaint is noticeable. I suppose that in addition to implicit flushes in AWT_UNLOCK it would be nice to have explicit flushes at the end of Swing repainting loop. We'll investigate how to do this efficiently, the current plan is to use Graphics object state. The current plan is to: - restore flushing logic similar to Motif. I suppose that ideal would be to make sure that flush happens at least every xxx ms, regardles of the event queue state - add explicit flushes at the appropriate places - restore aging algorithm ###@###.### 2005-07-20 08:51:19 GMT
2004-11-11