JDK-6256805 : LTP: XMLEncoder emits invalid XML

Details
Type:
Bug
Submit Date:
2005-04-18
Status:
Resolved
Updated Date:
2011-03-14
Project Name:
JDK
Resolved Date:
2006-03-30
Component:
client-libs
OS:
linux
Sub-Component:
java.beans
CPU:
x86
Priority:
P3
Resolution:
Fixed
Affected Versions:
5.0
Fixed Versions:

Related Reports
Relates:

Sub Tasks

Description
FULL PRODUCT VERSION :
java version "1.5.0_02"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_02-b09)
Java HotSpot(TM) Client VM (build 1.5.0_02-b09, mixed mode, sharing)


ADDITIONAL OS VERSION INFORMATION :
True for Windows and Linux (all versions)

A DESCRIPTION OF THE PROBLEM :
XMLEncoder will produce invalid XML under a range of circumstances.  Valid characters in XML (from http://www.w3.org/TR/REC-xml/) are:

"Character Range
[2]  Char  ::=  #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
     /* any Unicode character, excluding the surrogate blocks, FFFE, and FFFF. */
"

No character that is outside this range may be put in an XML file.
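
The Char production quoted above can be sketched as a small check. This is only an illustration of the rule from the XML 1.0 recommendation; the class and method names (XmlChars, isValid, firstInvalidIndex) are hypothetical and are not part of java.beans.

```java
final class XmlChars {
    private XmlChars() {}

    /** Returns true if the code point matches production [2] Char of XML 1.0. */
    static boolean isValid(int codePoint) {
        return codePoint == 0x9
            || codePoint == 0xA
            || codePoint == 0xD
            || (codePoint >= 0x20 && codePoint <= 0xD7FF)
            || (codePoint >= 0xE000 && codePoint <= 0xFFFD)
            || (codePoint >= 0x10000 && codePoint <= 0x10FFFF);
    }

    /**
     * Returns the index of the first char that starts a code point
     * not permitted in XML, or -1 if the whole string is permitted.
     * An unpaired surrogate is reported as invalid.
     */
    static int firstInvalidIndex(String text) {
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (!isValid(cp)) {
                return i;
            }
            i += Character.charCount(cp);
        }
        return -1;
    }
}
```

For example, XmlChars.firstInvalidIndex applied to a string whose third character is U+0007 (BEL) returns 2, since control characters #x0-#x8 fall outside the permitted range.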

STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Use XMLEncoder to encode any object that has a method returning a string in which ANY character is not permitted by the XML specification, or a method returning a char that is not in the permitted range.
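
A minimal reproduction sketch, assuming that writing a bare String through XMLEncoder is enough to exercise the problem. The class name EncoderRepro and the sample text are made up for illustration, and StandardCharsets requires Java 7 or later, which is newer than the affected 5.0 release. The program prints the generated XML and reports whether a raw U+0007 appears in it; on an affected release the raw character would be expected in the output.

```java
import java.beans.XMLEncoder;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class EncoderRepro {
    private EncoderRepro() {}

    /** Encodes a single object with XMLEncoder and returns the XML text. */
    static String encode(Object value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // XMLEncoder(OutputStream) writes the document as UTF-8.
        XMLEncoder encoder = new XMLEncoder(out);
        encoder.writeObject(value);
        encoder.close(); // flushes and writes the closing </java> tag
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // U+0007 (BEL) is not permitted by the Char production.
        String xml = encode("bell" + (char) 7 + "here");
        System.out.println(xml);
        System.out.println("raw U+0007 present: " + (xml.indexOf(7) >= 0));
    }
}
```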

EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
Strings should be encoded (Base64) before being saved to the file. Characters could be saved as ints or as encoded strings.
ACTUAL -
Invalid characters are encoded directly into the file.

REPRODUCIBILITY :
This bug can be reproduced always.
###@###.### 2005-04-18 19:36:08 GMT

                                    

Comments
EVALUATION

We should conform to the recommendation.
###@###.### 2005-06-06 15:52:12 GMT
                                     
2005-06-06
EVALUATION

If the characters are not legal in XML (control chars #0-8, for example), they will have to be replaced with something that is legal. It is often suggested that you convert such characters to processing instructions like <?char 7?> or elements like <char num="7"/>.

I think that processing instructions are preferable to elements because they are never used in ObjectHandler.

Base64 encoding is not a good solution because we cannot tell encoded strings from unencoded ones. We would therefore have to encode every string and character in the XML document, even where it is not necessary.
                                     
2005-10-19
EVALUATION

I introduced the attribute "code" for the element <char>. The attribute holds a hexadecimal value if it starts with '#'; otherwise it holds a decimal value.
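
The rule for the code attribute can be sketched as follows. This is not the JDK's implementation, only an illustration under stated assumptions: the class CharCode and its methods are hypothetical, and the choice to always write the hexadecimal form when escaping is an assumption. The escape method also ignores ordinary markup escaping such as '&' and '<'.

```java
final class CharCode {
    private CharCode() {}

    /** Parses a code attribute: hexadecimal after a leading '#', otherwise decimal. */
    static int parse(String code) {
        if (code.startsWith("#")) {
            return Integer.parseInt(code.substring(1), 16);
        }
        return Integer.parseInt(code);
    }

    /** Same ranges as the Char production of XML 1.0. */
    static boolean isValid(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /**
     * Replaces each code point that XML does not permit with a
     * <char code="#hex"/> element and copies all others unchanged.
     * Assumption: markup characters are escaped elsewhere.
     */
    static String escape(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (isValid(cp)) {
                sb.appendCodePoint(cp);
            } else {
                sb.append("<char code=\"#")
                  .append(Integer.toHexString(cp))
                  .append("\"/>");
            }
            i += Character.charCount(cp);
        }
        return sb.toString();
    }
}
```

With this sketch, CharCode.parse("#1f") and CharCode.parse("31") both yield 31, so a reader can accept either notation while a writer only needs to produce one of them.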
                                     
2005-11-21


