JDK-4227284 : [Fmt-Da] [Doc] Inconsistent results when parsing Strings with SimpleDateFormat
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.text
  • Affected Version: 1.1.7,1.2.2
  • Priority: P4
  • Status: Open
  • Resolution: Unresolved
  • OS: generic
  • CPU: generic
  • Submitted: 1999-04-06
  • Updated: 2019-04-11
Description

Name: dbT83986			Date: 04/06/99


My JDK info:
java version "1.1.7B"
java full version "JDK1.1.7U"

My issue is that SimpleDateFormat parse() does not throw a ParseException in those cases where it should.

There are two situations: parsing the String "4F" with the parse pattern "HH" and parsing the String "4:2" with the parse pattern "hh:mm". (BTW, I know that this appears to be two issues, but they are closely related. Spank me if I should have submitted these separately.)

The problem in both cases is that the developer has to do a certain amount of prep work on the String before passing it to SDF.parse(). In the first case, I need to check for alpha characters mixed in with numeric characters.  Note that checking for alpha chars is not something I need to do when using other patterns (e.g. hh:mm a). And in the second case, I need to check that the value for minutes is entered as two digits. Shouldn't 4:2 always be treated as invalid since it is ambiguously 4:20 or 4:02? (At the very least, it should be parsed as 4:20).

Here is sample code for both problems:

// parsing alpha chars with HH pattern:

import java.util.Date;
import java.text.SimpleDateFormat;
import java.text.ParseException;

public class SDFIssueOne
{
	public static void main( String[] args )
	{
		String parsePattern = "HH";
		SimpleDateFormat sdf = new SimpleDateFormat(parsePattern);

		String[] inputs = {
			"1",    // works correctly
			"15",   // works correctly
			"1A",   // should fail but doesn't
			"4F",   // should fail but doesn't
			"A3",   // fails correctly
			"junk"  // fails correctly
		};
		Date retDate = null;

		for( int i = 0; i < inputs.length; i++ ) {
			try
			{
				sdf.setLenient(false);
				retDate = sdf.parse(inputs[i]);			
				System.out.println( "Parsed date " + inputs[i] + " is " + retDate );
			}
			catch (ParseException e)
			{	System.out.println( "Parse exception for " + inputs[i]);
				System.out.println( e.getMessage() );
			}
		}
	}
}

// parsing ambiguous minutes with "hh:mm":

import java.util.Date;
import java.text.SimpleDateFormat;
import java.text.ParseException;

public class SDFIssueTwo
{
	
	public static void main( String[] args )
	{
		String parsePattern = "hh:mm";
		SimpleDateFormat sdf = new SimpleDateFormat(parsePattern);

		// All inputs except the first should fail, but only the last two do:
		String[] inputs = {
			"3:45",   // works as expected
			"4:2",    // DOESN'T throw exception (it should) - produces 4:02.
			"5:234",  // throws exception (correctly)
			"11:"     // throws exception (correctly)
		};
		Date retDate = null;

		for( int i = 0; i < inputs.length; i++ ) {
			try
			{
				sdf.setLenient(false);
				retDate = sdf.parse(inputs[i]);			
				System.out.println( "Parsed date " + inputs[i] + " is " + retDate );
			}
			catch (ParseException e)
			{	System.out.println( "Parse exception for " + inputs[i]);
				System.out.println( e.getMessage() );
			}
		}
	}
}
(Review ID: 56588) 
======================================================================

Comments
EVALUATION

The documentation does not emphasize that SimpleDateFormat stops parsing as soon as the pattern is satisfied. This can be detected by checking the ParsePosition on return from SimpleDateFormat.parse(String input, ParsePosition p).

SDFIssueOne exhibits this by accepting "1A" and "4F" for the pattern "HH". "H" falls under Number presentation, so "HH" and "H" are equivalent during parsing (because of zero padding and the requirement that the number of formatting characters is the _minimum_ number of digits accepted). The "A" and the "F" simply cause the parser to stop parsing. This is a useful feature, as it allows dates of unspecified length to be parsed from the middle of a string.

SDFIssueTwo exhibits the same parsing semantics. Note that 4:0002 will be parsed as the date Thu Jan 01 04:02:00 PST 1970 due to the interpretation of zero-padding.

This indicates that this is not a bug; however, the documentation should perhaps describe the parsing algorithm in more detail. On the other hand, it could be argued that SimpleDateFormat.parse(String text) should throw an exception if the entire string is not consumed.
###@###.### 1999-10-14

Unlike format(), parse() does not follow the number of pattern characters in the way the specification says. The only exception is year (y). We have to clarify the meaning of lenient and of the number of symbols for parse() as well.
koushi.takahashi@japan 1999-10-26

I decided not to modify parse(String text) to throw ParseException for an "invalid" text (e.g. "1A") in order to keep compatibility. Adding a new method that lets developers control parse()'s flexibility (in other words, whether ParseException is thrown or not) may be a good idea, but the same detection can be done easily with ParsePosition. Please use parse(String, ParsePosition) instead of parse(String). You can find out whether the whole text was parsed by comparing the current parse position with the actual length of the given string.

The API doc should be updated so that developers can easily learn how to use ParsePosition.
###@###.### 2005-1-27 00:49:26 GMT
27-01-2005
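
The ParsePosition check recommended in the evaluation can be sketched roughly as follows. This is only an illustration under stated assumptions: the class StrictParseSketch and the helper parseFully are hypothetical names (they are not part of java.text), Locale.US is an arbitrary choice, and the for-each loop assumes a JDK newer than the 1.1.7 in the report.

```java
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

public class StrictParseSketch {

	// Hypothetical helper, not a JDK method: returns the parsed Date only
	// when the pattern consumed the whole input, otherwise returns null.
	static Date parseFully(String pattern, String input) {
		SimpleDateFormat sdf = new SimpleDateFormat(pattern, Locale.US);
		sdf.setLenient(false);

		ParsePosition pos = new ParsePosition(0);
		// parse(String, ParsePosition) returns null on failure instead of throwing.
		Date result = sdf.parse(input, pos);

		// Reject the input if parsing failed or stopped before the end of the text.
		if (result == null || pos.getIndex() != input.length()) {
			return null;
		}
		return result;
	}

	public static void main(String[] args) {
		// Same inputs as SDFIssueOne in the report.
		String[] inputs = { "1", "15", "1A", "4F", "A3", "junk" };
		for (String input : inputs) {
			String verdict = (parseFully("HH", input) != null) ? "accepted" : "rejected";
			System.out.println(input + " -> " + verdict);
		}
	}
}
```

With this check, "1A" and "4F" are rejected because the parse position stops at index 1 while the input has length 2. It does not reject "4:2" under "hh:mm", since that input is consumed completely.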

WORK AROUND

Name: dbT83986 Date: 04/06/99

While there is a work-around (prepping the String before passing it to parse()), it makes using SimpleDateFormat much more cumbersome.
======================================================================
06-08-2004
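
One possible form of the "prepping" work-around for the "hh:mm" case is to validate the shape of the input before handing it to SimpleDateFormat. The sketch below is an assumption about what such a check could look like, not something taken from the report: the class TwoDigitMinutesSketch, the helpers parseHoursMinutes and accepts, and the regular expression are all hypothetical, and java.util.regex requires JDK 1.4 or later.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.regex.Pattern;

public class TwoDigitMinutesSketch {

	// Assumed input shape: one or two hour digits, a colon, exactly two minute digits.
	private static final Pattern HOURS_MINUTES = Pattern.compile("\\d{1,2}:\\d{2}");

	// Hypothetical helper: throws ParseException when the text does not have
	// the assumed shape, then delegates range checking to a non-lenient parse.
	static Date parseHoursMinutes(String input) throws ParseException {
		if (!HOURS_MINUTES.matcher(input).matches()) {
			throw new ParseException("Expected h:mm or hh:mm but got \"" + input + "\"", 0);
		}
		SimpleDateFormat sdf = new SimpleDateFormat("hh:mm", Locale.US);
		sdf.setLenient(false);
		return sdf.parse(input);
	}

	// Convenience wrapper that reports acceptance as a boolean.
	static boolean accepts(String input) {
		try {
			parseHoursMinutes(input);
			return true;
		} catch (ParseException e) {
			return false;
		}
	}

	public static void main(String[] args) {
		// Same inputs as SDFIssueTwo in the report.
		String[] inputs = { "3:45", "4:2", "5:234", "11:" };
		for (String input : inputs) {
			System.out.println(input + " -> " + (accepts(input) ? "accepted" : "rejected"));
		}
	}
}
```

The shape check is what rejects "4:2" here; SimpleDateFormat on its own accepts it as 4:02, as the report describes.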