JDK-8183582 : Rationalize doclet -docencoding and -charset options
  • Type: Bug
  • Component: tools
  • Sub-Component: javadoc(tool)
  • Affected Version: 9
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2017-07-05
  • Updated: 2018-01-25
  • Resolved: 2017-08-02
JDK 10: 10 b18 (Fixed)
Description
The standard doclet provides two options and two internal settings for what ought to be a single concept. 

-docencoding is defined to be the actual encoding used to write an output file. 
See BaseConfiguration lines 201 and 491
See StandardDocFileFactory line 208

-charset is defined to be the declared charset written into an HTML META tag. 
See HtmlConfiguration lines 147 and 547
See uses of "charset" in HtmlDocWriter, HtmlDocletWriter, IndexRedirectWriter

The actual value for the charset written into HTML files should directly correspond to the docencoding. It is wrong to allow them to be specified independently and differently.
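
To illustrate why the two values must agree, here is a minimal sketch of what a reader sees when the bytes are written in one encoding and the declared charset names another. The class and method names are hypothetical and are not part of the doclet.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not doclet code.
class CharsetMismatchDemo {
    // Encodes text as UTF-8 (standing in for -docencoding) and decodes it as
    // ISO-8859-1 (standing in for a conflicting -charset declaration),
    // which is what a browser does when it trusts the META tag.
    static String misread(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // U+00E9 becomes the two characters U+00C3 U+00A9 after the round trip.
        String garbled = misread("caf\u00e9");
        System.out.println(garbled.equals("caf\u00c3\u00a9"));
    }
}
```

Any non-ASCII character in the generated documentation is affected the same way, so a mismatch is never harmless once the content goes beyond ASCII.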

I suggest the following:
* keep docencoding option and variable
* keep the charset option for compatibility, but ensure it is not set differently from docencoding. (This raises a minor design question: what should happen if only one is specified? That needs discussion.)
* ensure that in HTML files, we always write a META tag for the charset, and that its value is the name of the docencoding being used.
* ensure that the default file encoding, if no options are specified, is UTF-8.
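
The suggestions above could be sketched roughly as follows. This is not the doclet's implementation: `EncodingOptions`, `resolve`, and `metaTag` are hypothetical names, the exact form of the META tag is illustrative, and the handling of the "only one option given" case (use whichever was given) is an assumption, since that point is still open for discussion.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the standard doclet.
class EncodingOptions {
    // Reconciles the two option values; null means "option not given".
    static Charset resolve(String docencoding, String charset) {
        // Charset.forName throws a subclass of IllegalArgumentException
        // for illegal or unsupported names.
        Charset fromDocencoding = docencoding == null ? null : Charset.forName(docencoding);
        Charset fromCharset = charset == null ? null : Charset.forName(charset);

        // Charset.equals compares canonical names, so differently
        // spelled names for the same charset are treated as the same value.
        if (fromDocencoding != null && fromCharset != null
                && !fromDocencoding.equals(fromCharset)) {
            throw new IllegalArgumentException(
                    "-docencoding (" + docencoding + ") and -charset (" + charset + ") differ");
        }
        // Assumption: if only one option is given, it determines both values.
        if (fromDocencoding != null) return fromDocencoding;
        if (fromCharset != null) return fromCharset;
        // Default when no options are specified.
        return StandardCharsets.UTF_8;
    }

    // The declared charset is always derived from the encoding actually used.
    static String metaTag(Charset encoding) {
        return "<meta http-equiv=\"Content-Type\" content=\"text/html; charset="
                + encoding.name() + "\">";
    }
}
```

Deriving the META tag from the resolved `Charset`, rather than from a separately stored string, makes it impossible for the declared and actual encodings to drift apart.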

Proposed test cases:
No options given: output file should be in UTF-8, and should contain a meta tag declaring charset UTF-8
One option given: output should be in the specified encoding (ISO-8859-1 is a good alternative to UTF-8)
Two options given with same value: output should be in the specified encoding (ISO-8859-1 is a good alternative to UTF-8)
Two options given with different values: error
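
A test along these lines might check both the bytes and the declared charset of a generated page. The sketch below is only an illustration of that check: `EncodingCheck`, `writePage`, and `isConsistent` are hypothetical names, and the page is a stand-in for real doclet output.

```java
import java.nio.charset.Charset;

// Hypothetical test helper, not part of the javadoc test suite.
class EncodingCheck {
    // Produces a tiny page whose bytes and META tag both use the same encoding.
    static byte[] writePage(Charset encoding, String body) {
        String page = "<html><head><meta charset=\"" + encoding.name() + "\"></head>"
                + "<body>" + body + "</body></html>";
        return page.getBytes(encoding);
    }

    // Decodes the bytes with the expected encoding, then checks that the page
    // declares that encoding and that the body text survived intact.
    static boolean isConsistent(byte[] bytes, Charset expected, String body) {
        String text = new String(bytes, expected);
        return text.contains("charset=\"" + expected.name() + "\"")
                && text.contains(body);
    }
}
```

Using a body with a non-ASCII character, such as "caf\u00e9", matters here: ASCII-only content is byte-identical in UTF-8 and ISO-8859-1, so it could not tell the two encodings apart.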