JDK-6644493 : [Fmt-Da] SimpleDateFormat parsing RFC822 time offset is slow
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.text
  • Affected Version: 6
  • Priority: P4
  • Status: Closed
  • Resolution: Fixed
  • OS: windows_xp
  • CPU: x86
  • Submitted: 2007-12-20
  • Updated: 2011-05-18
  • Resolved: 2011-05-18
JDK 7: 7 b126 (Fixed)
Description
FULL PRODUCT VERSION :
java version "1.6.0_03"
Java(TM) SE Runtime Environment (build 1.6.0_03-b05)
Java HotSpot(TM) Client VM (build 1.6.0_03-b05, mixed mode)

ADDITIONAL OS VERSION INFORMATION :
Microsoft Windows XP [Version 5.1.2600]

A DESCRIPTION OF THE PROBLEM :
Formatting a Date value using SimpleDateFormat with the pattern "yyyy-MM-dd'T'HH:mm:ss.SSSZ" produces an output string with an RFC822-style time zone offset. For example: 2007-11-21T11:35:31.576-0700

Using the same SimpleDateFormat instance to parse this string is slow. It appears that the implementation is attempting to match a timezone identifier string (e.g. GMT or MST) even though the format specifier is Z, not z. This results in a large performance penalty.
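For reference, the round trip described above can be sketched as follows. This is only an illustration, not part of the original report: the class name Rfc822RoundTrip, the fixed instant, and the explicit "GMT-07:00" zone are assumptions chosen so the output matches the example string regardless of the machine's default time zone.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

public class Rfc822RoundTrip {
    // Same pattern as in the report; Z formats the offset as +/-HHMM.
    static final String PATTERN = "yyyy-MM-dd'T'HH:mm:ss.SSSZ";

    /** Formats the instant in the given zone using the Z specifier. */
    public static String format(Date date, TimeZone zone) {
        SimpleDateFormat sdf = new SimpleDateFormat(PATTERN, Locale.US);
        sdf.setTimeZone(zone);
        return sdf.format(date);
    }

    /** Parses a string produced by format(); this is the slow path in the report. */
    public static Date parse(String text) {
        try {
            return new SimpleDateFormat(PATTERN, Locale.US).parse(text);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable date: " + text, e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical fixed instant (2007-11-21T18:35:31.576 UTC) for a stable result.
        Date instant = new Date(1195670131576L);
        String text = format(instant, TimeZone.getTimeZone("GMT-07:00"));
        System.out.println(text);
        // The offset in the string identifies the instant, so the round trip is exact.
        System.out.println(parse(text).equals(instant));
    }
}
```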

STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Please see test case below.

EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
Expected no significant performance penalty for parsing the date string with the Z format specifier.
ACTUAL -
Current date: 2007-11-21T11:35:31.576-0700
Elapsed time (slow): 1.625
Elapsed time (fast): 0.094

These results show that parsing the string with the Z specifier (1.625 s for 10000 iterations) is roughly 17x slower than parsing the same string with a pattern that omits the Z specifier (0.094 s).

REPRODUCIBILITY :
This bug can be reproduced always.

---------- BEGIN SOURCE ----------
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

/**
 * This class demonstrates a performance issue when using the Z format specifier to
 * parse an RFC822-style (+/-HHMM) timezone offset.
 *
 * @author Dan Rabe
 */
public class SDFPerformance
{
	public static void main( String[] args ) throws ParseException
	{
		long	start;
		long	finish;
		final int N = 10000;	// Number of iterations
		SimpleDateFormat	slow = new SimpleDateFormat( "yyyy-MM-dd'T'HH:mm:ss.SSSZ" );
		SimpleDateFormat	fast = new SimpleDateFormat( "yyyy-MM-dd'T'HH:mm:ss.SSS" );
		Date				now = new Date();

		String				s = slow.format( now );
		System.out.println( "Current date: " + s );

		// Time to parse using Z format

		start = System.currentTimeMillis();
		for ( int i = 0; i < N; i++ )
		{
			Date d = slow.parse( s );
		}
		finish = System.currentTimeMillis();
		System.out.println( "Elapsed time (slow): " + (finish-start)/1000.0 );

		// Time to parse without Z format

		start = System.currentTimeMillis();
		for ( int i = 0; i < N; i++ )
		{
			Date d = fast.parse( s );
		}
		finish = System.currentTimeMillis();
		System.out.println( "Elapsed time (fast): " + (finish-start)/1000.0 );
	}
}

---------- END SOURCE ----------

CUSTOMER SUBMITTED WORKAROUND :
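One option is to parse the string without the Z format specifier. The result still appears to be correct, although the offset in the string is then ignored and the date is interpreted in the formatter's own time zone.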

Another option is to make some assumptions about what time zone the date is in.

Another option is to use a literal 'Z' at the end of the format specifier to indicate that this is a Zulu (GMT) time, as suggested by the ISO 8601 standard and allowed by the W3C date/time note (http://www.w3.org/TR/NOTE-datetime).
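A variant of the first workaround that does not discard the offset might look like the sketch below. It is not from the original report: the class name ManualOffsetParser and its error handling are hypothetical. It assumes the input always ends in a +/-HHMM offset, parses the leading part with a pattern that has no Z specifier, and applies the offset by hand.

```java
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

public class ManualOffsetParser {
    // Pattern without the trailing Z specifier; the wall-clock fields are read as UTC.
    private final SimpleDateFormat local =
            new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSS", Locale.US);

    public ManualOffsetParser() {
        local.setTimeZone(TimeZone.getTimeZone("UTC"));
        local.setLenient(false);
    }

    /** Parses text such as 2007-11-21T11:35:31.576-0700 into the instant it denotes. */
    public Date parse(String text) {
        int split = text.length() - 5; // assumed: last five chars are +/-HHMM
        if (split <= 0) {
            throw new IllegalArgumentException("Too short: " + text);
        }
        char sign = text.charAt(split);
        if (sign != '+' && sign != '-') {
            throw new IllegalArgumentException("Missing offset sign: " + text);
        }
        int hours = twoDigits(text, split + 1);
        int minutes = twoDigits(text, split + 3);

        ParsePosition pos = new ParsePosition(0);
        Date wallClock = local.parse(text.substring(0, split), pos);
        if (wallClock == null || pos.getIndex() != split) {
            throw new IllegalArgumentException("Unparseable date: " + text);
        }
        long offsetMillis = (hours * 60L + minutes) * 60000L;
        if (sign == '-') {
            offsetMillis = -offsetMillis;
        }
        // Local time = UTC + offset, so UTC = local time - offset.
        return new Date(wallClock.getTime() - offsetMillis);
    }

    private static int twoDigits(String text, int index) {
        char a = text.charAt(index);
        char b = text.charAt(index + 1);
        if (a < '0' || a > '9' || b < '0' || b > '9') {
            throw new IllegalArgumentException("Bad offset digits: " + text);
        }
        return (a - '0') * 10 + (b - '0');
    }
}
```

Like SimpleDateFormat itself, an instance of this sketch is not safe for concurrent use by multiple threads.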

Comments
EVALUATION http://hg.openjdk.java.net/jdk7/build/jdk/rev/7c7e4a0330bc
17-01-2011

EVALUATION Looking up the first char didn't improve performance much. This fix ended up being a code cleanup of the numeric time zone offset parsing.
16-12-2010

EVALUATION The fix for 6594712 improved performance. However, the parser should look at the first char to see if it's a sign and, if so, assume the text is in the Z format. (Note that the Z specifier parses text in the z format as well.)
26-12-2007
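
The first-char check suggested in the 26-12-2007 evaluation can be illustrated with the sketch below. It is not the JDK implementation and not the committed fix; the class OffsetFirstCharCheck, the method parseOffsetMinutes, and the NOT_NUMERIC sentinel are hypothetical names. The idea is that a leading sign selects the numeric +/-HHMM path, and anything else is left for the caller to match against zone names.

```java
public class OffsetFirstCharCheck {
    /** Sentinel: the text at this position is not a numeric offset. */
    public static final int NOT_NUMERIC = Integer.MIN_VALUE;

    /**
     * Returns the offset in minutes if text at start looks like +/-HHMM,
     * otherwise NOT_NUMERIC so the caller can fall back to zone-name matching.
     */
    public static int parseOffsetMinutes(String text, int start) {
        if (start < 0 || start >= text.length()) {
            return NOT_NUMERIC;
        }
        char first = text.charAt(start);
        if (first != '+' && first != '-') {
            return NOT_NUMERIC; // e.g. "GMT" or "MST": use the name lookup instead
        }
        if (start + 5 > text.length()) {
            return NOT_NUMERIC; // sign present but fewer than four digits follow
        }
        int[] digits = new int[4];
        for (int i = 0; i < 4; i++) {
            char c = text.charAt(start + 1 + i);
            if (c < '0' || c > '9') {
                return NOT_NUMERIC;
            }
            digits[i] = c - '0';
        }
        int hours = digits[0] * 10 + digits[1];
        int minutes = digits[2] * 10 + digits[3];
        if (hours > 23 || minutes > 59) {
            return NOT_NUMERIC;
        }
        int total = hours * 60 + minutes;
        return first == '-' ? -total : total;
    }

    public static void main(String[] args) {
        String s = "2007-11-21T11:35:31.576-0700";
        // The offset starts at index 23 in the example string from the report.
        System.out.println(parseOffsetMinutes(s, 23));
        System.out.println(parseOffsetMinutes("MST", 0) == NOT_NUMERIC);
    }
}
```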