The customer found that GZIPOutputStream (java.util.zip) does not scale
beyond 2-way Windows servers.
On 4-way servers, CPU utilization stays at or below ~50% and throughput does
not increase.
The customer provided a test program.
The test program is attached to the call.
The test program creates a number of threads (8 in the current setting).
These 8 threads compress typical HTTP data and sleep after each compression.
The sleep time goes down over time.
The reduced sleep time leads to a higher system load and a higher throughput.
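
The customer's actual test program is attached to the call and is not
reproduced here. The following is only a minimal sketch of the load pattern
described above, written against a current JDK rather than 1.4.2. The class
name, the sample payload, the sleep schedule and the phase length are all
assumptions made for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicLong;
import java.util.zip.GZIPOutputStream;

// Hypothetical stand-in for the customer's test program (not the original).
public class GzipScalingSketch {

    // Assumed payload: repetitive HTML-like text, roughly what an HTTP response body looks like.
    static byte[] samplePayload() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 200; i++) {
            sb.append("<tr><td>row ").append(i).append("</td><td>value</td></tr>\n");
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    // Compress one buffer with a fresh GZIPOutputStream, as a servlet filter would per response.
    static byte[] compress(byte[] data) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(data);
        }
        return bos.toByteArray();
    }

    // Run `threads` workers for `phaseMillis`; each compresses, then sleeps `sleepMillis`.
    // Returns the total number of compressions completed in the phase.
    static long runPhase(int threads, long sleepMillis, long phaseMillis)
            throws InterruptedException {
        final byte[] payload = samplePayload();
        final AtomicLong count = new AtomicLong();
        final long deadline = System.nanoTime() + phaseMillis * 1_000_000L;
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    while (deadline - System.nanoTime() > 0) {
                        compress(payload);
                        count.incrementAndGet();
                        Thread.sleep(sleepMillis);
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();
        }
        return count.get();
    }

    public static void main(String[] args) throws Exception {
        final long phaseMillis = 2000; // assumed phase length
        // Halve the sleep time each phase so the offered load rises over time.
        for (long sleep = 16; sleep >= 1; sleep /= 2) {
            long n = runPhase(8, sleep, phaseMillis);
            System.out.println("sleep=" + sleep + " ms: "
                    + (n * 1000.0 / phaseMillis) + " compressions/s");
        }
    }
}
```

If the library scaled, the reported compressions per second would keep rising
as the sleep time shrinks until all CPUs are busy. A plateau well below full
CPU utilization would correspond to the behavior the customer reports.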
The benchmark shows that compression throughput does not scale on a 4-way
Windows system.
Tests on SPARC Solaris systems showed that it scales, more or less, up to 8 CPUs.
The problem also occurs on 1.4.2 and the Tiger beta.
The customer provided an alternative pure Java implementation (jazzlib), which
performs much better.