JDK-8252091 : UI Incorrect shaping of chars in the Unicode Mongolian range (0x1800-0x18FF)
  • Type: Enhancement
  • Component: client-libs
  • Sub-Component: 2d
  • Affected Version: 6,7,8,11,14.0.2
  • Priority: P4
  • Status: Open
  • Resolution: Unresolved
  • OS: windows_10
  • CPU: x86_64
  • Submitted: 2020-08-19
  • Updated: 2022-11-21
Other: tbd, Unresolved
Description
ADDITIONAL SYSTEM INFORMATION :
I observed the issue in
- version 14.0.2 under Windows 10
- version 11.0.8 under Linux Mint 10.3 (=> Ubuntu 18.04)

A DESCRIPTION OF THE PROBLEM :
Background:
The Mongolian script (Unicode range 0x1800 - 0x18FF) is a "complex" script: proper shaping depends on the context of each character (for instance, in OpenType, features like "isol", "init", "medi", "fina" and "rlig" are required, as with Arabic).

When Mongolian text is set on an AWT / Swing control, it is not shaped correctly unless the same string also contains text in some other "complex" script, such as Arabic or Devanāgarī.

I am convinced the problem is in the `src/java.desktop/share/classes/sun/font/FontUtilities.java` class; in its method `public static boolean isComplexCharCode(int code)`, around line 290, there is the following code:
```
        else if (code < 0x1780) {
            return false;
        }
        else if (code <= 0x17ff) { // 1780 - 17FF Khmer
            return true;
        }
        else if (code < 0x200c) {
            return false;
        }
```
which returns `false` for the Mongolian range. I propose changing it to:
```
        else if (code < 0x1780) {
            return false;
        }
        else if (code <= 0x18ff) { // 1780 - 17FF Khmer
            return true;           // 1800 - 18FF Mongolian
        }
        else if (code < 0x200c) {
            return false;
        }
```
which returns true if the passed-in character code is a Khmer or a Mongolian character.
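
The effect of the proposed change can be checked in isolation. The following is a minimal sketch, not the JDK source: the class `MongolianRangeSketch` and its helper methods are hypothetical and model only the branches quoted above (code points from 0x1780 up to 0x200b). The sample string is assumed to be the escaped form of the Mongolian text used in this report.

```java
public class MongolianRangeSketch {

    // Hypothetical stand-in for the quoted branches as they are today:
    // within 0x1780..0x200b only Khmer (0x1780..0x17ff) counts as complex.
    static boolean currentCheck(int code) {
        return code >= 0x1780 && code <= 0x17ff;
    }

    // Hypothetical stand-in for the proposed branches:
    // the upper bound moves from 0x17ff to 0x18ff.
    static boolean proposedCheck(int code) {
        return code >= 0x1780 && code <= 0x18ff;
    }

    // True if any UTF-16 unit of the text passes the chosen variant of the check.
    static boolean anyComplex(String text, boolean proposed) {
        for (int i = 0; i < text.length(); i++) {
            int code = text.charAt(i);
            if (proposed ? proposedCheck(code) : currentCheck(code)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Assumed escaped form of the sample text from this report.
        String mongolian = "\u182A\u1820\u1822\u1828\u180E\u1820";
        System.out.println("current:  " + anyComplex(mongolian, false)); // false
        System.out.println("proposed: " + anyComplex(mongolian, true));  // true
    }
}
```

Note that the Unicode Mongolian block itself ends at U+18AF, so an upper bound of 0x18ff also covers U+18B0 to U+18FF (Unified Canadian Aboriginal Syllabics Extended).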

STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Set a Mongolian-only text (for instance "ᠪᠠᠢᠨ᠎ᠠ") on one AWT / Swing control, and a mixed Mongolian + Devanāgarī or Devanāgarī + Mongolian text on another (for instance "ᠪᠠᠢᠨ᠎ᠠ अनुच्छेद" or "अनुच्छेद ᠪᠠᠢᠨ᠎ᠠ").

EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
The Mongolian text is correctly shaped in both cases.
ACTUAL -
The Mongolian text is correctly shaped in the second case, as the other "complex" text triggers the proper shaping, but it is incorrectly shaped in the first case (something like "ᠪ ᠠ ᠢ ᠨ ᠠ").

The example source attached below shows wrong and correct shaping cases for Swing labels; other controls exhibit the same behaviour.

---------- BEGIN SOURCE ----------
package com.vistamaresoft.swingtest;
import javax.swing.BoxLayout;
import javax.swing.JFrame;
import javax.swing.JLabel;
import javax.swing.JPanel;
public class Main
{
    private         JFrame      frame;
    private static  Main        window;

    public Main()
    {
        frame   = new JFrame();
        frame.setDefaultCloseOperation(JFrame.DISPOSE_ON_CLOSE);
        frame.setTitle("Mongolian Swing Test");
        frame.setBounds(100, 100, 300, 200);
        //Set up the GUI.
        JPanel      panel       = new JPanel();
        panel.setLayout(new BoxLayout(panel, BoxLayout.Y_AXIS));
        JLabel      label       = new JLabel("ᠪᠠᠢᠨ᠎ᠠ");         // NOT CORRECT
        panel.add(label);
        label                   = new JLabel("Abc ᠪᠠᠢᠨ᠎ᠠ");     // NOT CORRECT
        panel.add(label);
        label                   = new JLabel("ᠪᠠᠢᠨ᠎ᠠ  كتاب ");  // CORRECT
        panel.add(label);
        label                   = new JLabel("كتاب ᠪᠠᠢᠨ᠎ᠠ");        // CORRECT
        panel.add(label);
        label                   = new JLabel("ᠪᠠᠢᠨ᠎ᠠ अनुच्छेद");        // CORRECT
        panel.add(label);
        label                   = new JLabel("अनुच्छेद ᠪᠠᠢᠨ᠎ᠠ");        // CORRECT
        panel.add(label);
        frame.setContentPane(panel);
    }

    public static void main(String[] args)
    {
        window = new Main();
        window.frame.setVisible(true);
    }
}
---------- END SOURCE ----------

CUSTOMER SUBMITTED WORKAROUND :
Adding arbitrary strings to UI elements to trigger the correct shaping is in general not feasible, and I am not aware of any other workaround.

FREQUENCY : always



Comments
A link: https://en.wikipedia.org/wiki/Mongolian_script#Font_issues
21-09-2020

Of course. It is an RFE really. This has never ever been supported. PS I initially accidentally added 8 and 11 to the labels instead of version field.
21-09-2020

> Or am I missing something?

It was not an oversight that this is missing. It was deliberate. We add scripts to the list as we can support them. This is a vertical script. We have never supported a vertical script. We don't have code for it. If some fonts have a horizontal way of supporting it, then that might need special treatment.
21-09-2020

[~prr] does it affect any releases other than the reported 14.0.2?
21-09-2020

Additional Information from submitter:
===========================
Sample code with non-ASCII characters not mangled:
---------- BEGIN SOURCE ----------
<FIND><IN><ATTACHMENTS><MAIN.JAVA>
---------- END SOURCE ----------
As for JDK supporting Mongolian: it seems to me it actually already does, as Mongolian is rendered properly once the "complex script" machinery is activated by another "officially complex" script in the same string. Of course, the font used must support it, but I assume this to be true for any script with context-dependent shapes. Or am I missing something?
16-09-2020

Might not be so simple. I have to look to see if any of the fonts supporting it do so in a horizontal layout. We do not support vertical layout *at all*. So that would be a major new feature, requiring a lot more work than implied above.
16-09-2020

The attached test case (Main.java) is not mangled. Discussion is here: https://stackoverflow.com/questions/63440094/java-errors-in-shaping-mongolian-glyphs
27-08-2020

Can you get a test case where the characters are not completely mangled and assign it back to me? Unicode escapes "\uNNNN" are the safest way to do this.
24-08-2020
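
The escaping suggested above can be automated. This is a minimal sketch under stated assumptions: the class `UnicodeEscapeSketch` and its method `toUnicodeEscapes` are hypothetical names, and the helper only rewrites characters above 0x7E, leaving ASCII (including quotes and backslashes) untouched.

```java
public class UnicodeEscapeSketch {

    // Rewrites every UTF-16 unit above 0x7E as a \\uNNNN escape so the text
    // survives tools that mangle non-ASCII characters. ASCII is copied as-is.
    static String toUnicodeEscapes(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c > 0x7E) {
                sb.append(String.format("\\u%04X", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Prints: Abc \u182A\u1820
        System.out.println(toUnicodeEscapes("Abc \u182A\u1820"));
    }
}
```

The output can be pasted into a Java string literal, for example in the `JLabel` constructors of the test case.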

Having said that, we already added a Mongolian font to the Windows fontconfig file, so maybe it is time to support it.
20-08-2020

It isn't an oversight so much as we add ranges there only when we decide we can support it at least for rendering. If it isn't there, JDK doesn't support it.
20-08-2020

Executing the provided testcase on Windows 10 OS
Test Result:
============
11.0.8: Fail
14.0.2: Fail
15ea35: Fail
20-08-2020