JDK-6395149 : EscapeAnalysis slows down Algorithmic code
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 6
  • Priority: P4
  • Status: Closed
  • Resolution: Duplicate
  • OS: linux
  • CPU: x86
  • Submitted: 2006-03-07
  • Updated: 2010-04-02
  • Resolved: 2006-04-06
Related Reports
Duplicate :  
Description
FULL PRODUCT VERSION :
java version "1.6.0-beta2"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.6.0-beta2-b72)
Java HotSpot(TM) Client VM (build 1.6.0-beta2-b72, mixed mod

ADDITIONAL OS VERSION INFORMATION :
Linux cehost 2.6.15-1.1831_FC4 #1 Tue Feb 7 13:37:42 EST 2006 i686 i686 i386 GNU/Linux


EXTRA RELEVANT SYSTEM CONFIGURATION :
Pentium 4 Northwood, 2.6ghz

A DESCRIPTION OF THE PROBLEM :
The provided sample code runs about 10% slower when EscapeAnalysis is enabled.

I have already submitted another bug report about a performance regression when running this code on 1.3.1 compared to 6.0, but thought it would be better to report this issue separately.

EA enabled: 1.3749484394335212mb/s
EA disabled: 1.5244832001951338mb/s

I expected EA to have no effect on this code at all, or at most a slightly positive one.
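
The "about 10%" figure can be checked against the two throughput numbers above. The following is a minimal sketch, not part of the original benchmark; the class and method names are made up for illustration.

```java
import java.util.Locale;

class SlowdownCheck {
    /** Relative slowdown of `slower` against `baseline`, as a fraction of the baseline. */
    static double relativeSlowdown(double baseline, double slower) {
        return (baseline - slower) / baseline;
    }

    public static void main(String[] args) {
        double eaDisabled = 1.5244832001951338; // throughput reported with EA disabled
        double eaEnabled = 1.3749484394335212;  // throughput reported with EA enabled
        double slowdown = relativeSlowdown(eaDisabled, eaEnabled);
        // Locale.ROOT keeps the decimal separator independent of the host locale.
        System.out.println(String.format(Locale.ROOT, "EA slowdown: %.1f%%", slowdown * 100.0));
    }
}
```

With the reported values this comes out just under 10%, which is consistent with the description.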

STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Run the provided code once with and once without EscapeAnalysis enabled.
I always used the server compiler for my benchmarks.
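
The two runs might be scripted as below. This is a sketch under assumptions: the report does not give the exact command lines, so the flags are taken from what the report mentions elsewhere (-XX:+DoEscapeAnalysis and the server compiler). The class name, the javaBinary parameter, and the "run" argument are hypothetical, and the launcher itself needs a newer JDK than the one under test.

```java
import java.util.ArrayList;
import java.util.List;

class EaComparison {
    /**
     * Builds the command line for one benchmark run.
     * javaBinary is a hypothetical parameter: the path to the JVM under test.
     */
    static List<String> command(String javaBinary, boolean enableEa) {
        List<String> cmd = new ArrayList<>();
        cmd.add(javaBinary);
        cmd.add("-server"); // the report states the server compiler was used
        cmd.add(enableEa ? "-XX:+DoEscapeAnalysis" : "-XX:-DoEscapeAnalysis");
        cmd.add("PC1_Stream"); // assumes PC1_Stream.class is in the working directory
        return cmd;
    }

    public static void main(String[] args) throws Exception {
        // Only launch child JVMs when explicitly asked; otherwise just print the commands.
        boolean launch = args.length > 0 && args[0].equals("run");
        for (boolean enableEa : new boolean[] {false, true}) {
            List<String> cmd = command("java", enableEa);
            System.out.println(String.join(" ", cmd));
            if (launch) {
                // Inherit stdio so the benchmark's own throughput lines are visible.
                int exit = new ProcessBuilder(cmd).inheritIO().start().waitFor();
                System.out.println("exit code: " + exit);
            }
        }
    }
}
```

Running the two configurations back to back on the same machine keeps the comparison fair; the first warm-up iterations in the benchmark's own main method are left untouched.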

EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
EA should have no negative effect for this type of code.
ACTUAL -
EA slows down generated code by about 10%.
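
For context, the evaluation in the Comments section states that the only EA optimization implemented at the time was removing locks on objects that do not escape. The sketch below is illustrative only and uses made-up names: it contrasts a method where such a lock exists with the kind of lock-free integer arithmetic found in PC1_Stream.code(). Whether a given JVM actually removes the lock is not something this example can show.

```java
class LockElisionSketch {
    /** The lock object never leaves this method, so no other thread can contend on it. */
    static int sumWithLocalLock(int[] values) {
        Object lock = new Object(); // hypothetical non-escaping object
        int sum = 0;
        for (int v : values) {
            synchronized (lock) { // candidate for lock elimination
                sum += v;
            }
        }
        return sum;
    }

    /** Plain integer arithmetic in the style of code(): there is no lock to remove. */
    static int mix(int ax, int bx) {
        return (ax * bx) + 1;
    }

    public static void main(String[] args) {
        System.out.println(sumWithLocalLock(new int[] {1, 2, 3}));
        System.out.println(mix(3, 0x4e35));
    }
}
```

Since the benchmark only contains code of the second kind, lock elimination has nothing to work on, which is why no change in either direction was expected.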

REPRODUCIBILITY :
This bug can be reproduced always.

---------- BEGIN SOURCE ----------
/* PORTED TO JAVA BY ROBERT NEILD November 1999 */

/* PC1 Cipher Algorithm ( Pukall Cipher 1 ) */
/* By Alexander PUKALL 1991 */
/* free code no restriction to use */
/* please include the name of the Author in the final software */
/* the Key is 128 bits */

/* Only the K zone change in the two routines */
/* You can create a single routine with the two parts in it */

//package com.agosys.unicom.encryption;

public class PC1_Stream
{
	int ax, bx, cx, dx, si, tmp, x1a2, res, i, inter, cfc, cfd, compte;
	int x1a0[] = new int[8];
	byte cle[] = new byte[17]; // Hold key
	private boolean usedAsEncrypter = false;
	private boolean alreadyUsed = false;

	/**
	 * Creates a PC1_Stream.
	 * @param password encryption key; at most the first 16 bytes are used
	 */

	public PC1_Stream(byte[] password)
	{
		System.arraycopy(password, 0, cle, 0, Math.min(16, password.length));
	}

	private void assemble()
	{
		x1a0[0] = ((cle[0] * 256) + cle[1]);

		code();
		inter = res;

		x1a0[1] = (x1a0[0] ^ ((cle[2] * 256) + cle[3]));
		code();
		inter = (inter ^ res);

		x1a0[2] = (x1a0[1] ^ ((cle[4] * 256) + cle[5]));
		code();
		inter = (inter ^ res);

		x1a0[3] = (x1a0[2] ^ ((cle[6] * 256) + cle[7]));
		code();
		inter = (inter ^ res);

		x1a0[4] = (x1a0[3] ^ ((cle[8] * 256) + cle[9]));
		code();
		inter = (inter ^ res);

		x1a0[5] = (x1a0[4] ^ ((cle[10] * 256) + cle[11]));
		code();
		inter = (inter ^ res);

		x1a0[6] = (x1a0[5] ^ ((cle[12] * 256) + cle[13]));
		code();
		inter = (inter ^ res);

		x1a0[7] = (x1a0[6] ^ ((cle[14] * 256) + cle[15]));
		code();
		inter = (inter ^ res);

		i = 0;
	}

	void code()
	{
		dx = (x1a2 + i);
		ax = x1a0[i];

		cx = 0x015a;
		bx = 0x4e35;

		tmp = ax;
		ax = si;
		si = tmp;

		tmp = ax;
		ax = dx;
		dx = tmp;

		if (ax != 0)
		{
			ax = (ax * bx);
		}

		tmp = ax;
		ax = cx;
		cx = tmp;

		if (ax != 0)
		{
			ax = (ax * si);
			cx = (ax + cx);
		}

		tmp = ax;
		ax = si;
		si = tmp;
		ax = (ax * bx);
		dx = (cx + dx);

		ax = (ax + 1);

		x1a2 = dx;
		x1a0[i] = ax;

		res = (ax ^ dx);
		i = (i + 1);
	}

	/**
	 * Decrypts the given data in place and returns the same array.
	 * @param encryptedData bytes to decrypt; overwritten with the plaintext
	 */

	public byte[] decrypt(byte[] encryptedData)
	{
		checkUsage(false);

		for (int i = 0; i < encryptedData.length; i++)
		{
			int c = encryptedData[i];

			assemble();
			cfc = (inter >> 8);
			cfd = (inter & 255);

			c = c ^ (cfc ^ cfd);

			for (compte = 0; compte <= 15; compte++)
			{
				/* we mix the plaintext byte with the key */
				cle[compte] = (byte) (cle[compte] ^ c);
			}

			encryptedData[i] = (byte) c;
		}

		return encryptedData;
	}

	public byte[] encrypt(byte[] data)
	{
		checkUsage(true);

		for (int i = 0; i < data.length; i++)
		{
			int c = data[i];

			assemble();
			cfc = (inter >> 8);
			cfd = (inter & 255);

			for (compte = 0; compte <= 15; compte++)
			{
				/* we mix the plaintext byte with the key */
				cle[compte] = (byte) (cle[compte] ^ c);
			}

			c = c ^ (cfc ^ cfd);
			data[i] = (byte) c;
		}

		return data;
	}

	private void checkUsage(boolean isEncrypter)
	{
		if (alreadyUsed)
		{
			if (usedAsEncrypter != isEncrypter)
			{
				throw new IllegalArgumentException("You may either use this class as encrypter or decrypter, not both!");
			}
		} else
		{
			alreadyUsed = true;
			usedAsEncrypter = isEncrypter;
		}
	}

	public static void main(String[] args)
	{
		byte[] testData = new byte[1024 * 1024];
		for (int i = 0; i < testData.length; i++)
		{
			testData[i] = (byte) (i % 127);
		}

		PC1_Stream dec = new PC1_Stream(testData);
		PC1_Stream enc = new PC1_Stream(testData);

		for (int i = 0; i < 5; i++)
		{
			enc.encrypt(testData);
			dec.decrypt(testData);
		}

		for (int m = 0; m < 10; m++)
		{
			int encCount = 50;
			System.out.println("Starte Verschlüsselung");
			long start = System.currentTimeMillis();
			for (int i = 0; i < encCount; i++)
			{
				enc.encrypt(testData);
				dec.decrypt(testData);
			}
			long end = System.currentTimeMillis();
			long duration = end - start;
			System.out.println("Encryption took: " + duration + " with " + ((double) encCount / ((double) duration / (double) 1000)) + "mb/s ");
		}
	}
}

---------- END SOURCE ----------

CUSTOMER SUBMITTED WORKAROUND :
don't enable escape analysis

Comments
EVALUATION

There does seem to be a real performance degradation with -XX:+DoEscapeAnalysis. I can easily reproduce a 3.5% degradation running on machine hsdev-5 (a dual 800 MHz Pentium III system) with both the product and fastdebug builds. The degradation increases to around 10% when run with the following options: "-Xbatch -XX:CICompilerCount=1 -XX:LoopUnrollLimit=0". There is no difference in performance on SPARC.

The only optimization currently implemented with escape analysis (EA) is eliminating the locking of unescaped objects. Since this benchmark does not do any locking, escape analysis should have no effect on it. What appears to be happening is that with escape analysis enabled, nodes are passed to Iterative Global Value Numbering (IGVN) in a slightly different order. This changes the code enough to cause a performance difference. The benchmark contains loops with a lot of integer computations. I suspect that the changed code causes the register allocator to produce a less optimal allocation.

As an experiment, I changed the code so that the nodes were passed to IGVN in the order required for EA whether or not EA is enabled. I saw the same performance degradation whether or not EA is turned on. I then investigated changing the order of nodes in the worklist passed to IGVN. I tried reversing the order of the list, and sorting the list by node number so that lower numbered nodes are processed first. I observed the following:

Performance change for different IGVN worklist ordering (relative to default ordering)

                          Intel                       SPARC
                  encryption                  encryption
                  benchmark    SPECjvm98      benchmark    SPECjvm98
EA order             -4%          +3%            +1%          +2%
reverse list         +5%          +2%            +1%          +2%
sort on node #       +8%          +2%            +1%          -1%

There is another bug, 6396979, which reports a performance regression in this same encryption benchmark.
After the fix for 6396979 was integrated, I re-ran my tests and found no significant performance difference from changing the ordering of the IGVN worklist, except that sorting on node # was 7% slower than the other cases. The default ordering is now 13% faster than the best ordering before the bug fix. It seems that the performance degradation reported in this bug report was due to the same problem as 6396979, and the fix for that bug eliminates the degradation caused by escape analysis, so I am closing this bug as a duplicate of 6396979.
06-04-2006
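
The worklist orderings compared in the evaluation (default, reversed, sorted by node number) can be pictured with a small sketch. This is not HotSpot code, which is written in C++; the class name, method names, and node numbers are invented purely to show what each reordering does to the sequence in which nodes would be visited.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class WorklistOrderSketch {
    /** Returns a copy of the worklist with the visiting order reversed. */
    static List<Integer> reversed(List<Integer> worklist) {
        List<Integer> copy = new ArrayList<>(worklist);
        Collections.reverse(copy);
        return copy;
    }

    /** Returns a copy sorted so that lower numbered nodes are processed first. */
    static List<Integer> sortedByNodeNumber(List<Integer> worklist) {
        List<Integer> copy = new ArrayList<>(worklist);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        // Hypothetical node numbers in the order they were first queued.
        List<Integer> defaultOrder = Arrays.asList(7, 3, 12, 5);
        System.out.println("default:        " + defaultOrder);
        System.out.println("reverse list:   " + reversed(defaultOrder));
        System.out.println("sort on node #: " + sortedByNodeNumber(defaultOrder));
    }
}
```

All three orderings contain the same nodes; only the visiting order differs, which according to the evaluation was enough to change the generated code and its measured performance.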