JDK-8259896 : Base64 MIME decoder should allow unrecognised characters within padding.
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.util
  • Affected Version: 8,11,16
  • Priority: P4
  • Status: Open
  • Resolution: Unresolved
  • Submitted: 2021-01-14
  • Updated: 2021-02-17
Other: tbd (Unresolved)
Description
A DESCRIPTION OF THE PROBLEM :
The MIME RFC (RFC 2045) requires that any character outside of the base64 alphabet be ignored by the decoder.

The MIME decoder does this *except* within padding.

STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
java.util.Base64.getMimeDecoder().decode("AA=!=");

When two padding characters are appropriate, it should be legal to have characters outside the base64 alphabet between them, as such characters are always ignored.

EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
All characters outside of the base64 alphabet should be ignored, wherever they are, so the above should decode without issue.
ACTUAL -
Exception java.lang.IllegalArgumentException: Input byte array has wrong 4-byte ending unit
        at Base64$Decoder.decode0 (Base64.java:733)
        at Base64$Decoder.decode (Base64.java:535)
        at Base64$Decoder.decode (Base64.java:558)

---------- BEGIN SOURCE ----------
java.util.Base64.getMimeDecoder().decode("AA=!=");
---------- END SOURCE ----------
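The contrast between the two cases can be sketched as a small standalone program. This is only an illustration: the class name MimePaddingRepro and the helper decodedLength are hypothetical, and the rejection of the second input is what the report describes for the affected versions listed above; other releases may behave differently.

```java
import java.util.Base64;

public class MimePaddingRepro {
    /**
     * Hypothetical helper: returns the number of decoded bytes,
     * or -1 if the MIME decoder rejects the input.
     */
    static int decodedLength(String input) {
        try {
            return Base64.getMimeDecoder().decode(input).length;
        } catch (IllegalArgumentException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        // '!' inside the data portion is skipped by the MIME decoder.
        System.out.println("A!A== -> " + decodedLength("A!A=="));
        // '!' between the two padding characters is the case reported here.
        System.out.println("AA=!= -> " + decodedLength("AA=!="));
    }
}
```

The first input decodes to a single byte, since the stray '!' sits in the data. On the affected versions the second input yields -1 here, because the decoder throws the IllegalArgumentException shown above.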

CUSTOMER SUBMITTED WORKAROUND :
Input can be sanitised to remove characters outside of the base64 alphabet prior to processing. For example:

input = input.replaceAll("[^A-Za-z0-9+/=]","");
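A minimal sketch of the workaround wrapped in a helper, assuming the input is available as a String. The class name MimeSanitisingDecoder and the method decodeSanitised are hypothetical names chosen for illustration; only the regular expression comes from the report.

```java
import java.util.Base64;

public class MimeSanitisingDecoder {
    /**
     * Hypothetical helper: strips every character outside the base64
     * alphabet (including '=') before handing the text to the MIME decoder.
     */
    static byte[] decodeSanitised(String input) {
        String cleaned = input.replaceAll("[^A-Za-z0-9+/=]", "");
        return Base64.getMimeDecoder().decode(cleaned);
    }

    public static void main(String[] args) {
        // "AA=!=" becomes "AA==" after sanitising, which decodes to one zero byte.
        byte[] out = decodeSanitised("AA=!=");
        System.out.println(out.length + " byte(s), first = " + out[0]);
    }
}
```

Sanitising costs an extra pass over the input and a temporary copy, so it is probably best reserved for inputs where stray characters in the padding are actually expected.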

FREQUENCY : always



Comments
Pasting below, the comment from Raffaello Giulietti, from a mail[1] on core-libs-dev mailing list.
=====
Hi,

in my opinion, the reporter of [1] is right in requiring that extraneous characters be discarded, even inside the padding.

Indeed, the first full paragraph on [2] reads:
"Any characters outside of the base64 alphabet are to be ignored in base64-encoded data."
where "the base64 alphabet" also includes the padding character '=' and "base64-encoded data" extends to padding as well, because padding is an essential part of encoding.

The legitimate doubt expressed in comment [3] should thus be solved in favor of a bug fix.

My 2 cents
Greetings
Raffaello

----
[1] https://bugs.openjdk.java.net/browse/JDK-8259896
[2] https://tools.ietf.org/html/rfc2045#page-26
[3] https://bugs.openjdk.java.net/browse/JDK-8259896?focusedCommentId=14395485&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14395485
=====
[1] https://mail.openjdk.java.net/pipermail/core-libs-dev/2021-February/073698.html
17-02-2021

Extraneous characters should indeed be ignored in the body of the data, but it's unclear whether they should also be ignored if they occur within the trailing padding.
21-01-2021

The observations on Windows 10:
JDK 8: Failed, IllegalArgumentException thrown
JDK 11: Failed
JDK 16: Failed
18-01-2021