JDK-4480282 : Arabic native-to-Swing cut & paste is broken.
  • Type: Bug
  • Component: client-libs
  • Sub-Component: java.awt
  • Affected Version: 1.4.0
  • Priority: P4
  • Status: Closed
  • Resolution: Cannot Reproduce
  • OS: solaris_8
  • CPU: sparc
  • Submitted: 2001-07-16
  • Updated: 2002-02-12
  • Resolved: 2002-02-06
Description
Run the attached program.  Click on the JTabbedPane tabs until all the panes are
visible.  Finish preparation by clicking on the tab "Swing JTextFields".
Twenty different JTextFields are in this pane.  They are arranged in 4 columns
and 5 rows.

Copy the language String from the command line (or any other Arabic from
an X-terminal).  Paste it into one of the JTextFields.  Nothing happens.
Copy and paste English text and all is well.

algol% uname -a
SunOS algol 5.8 Generic_108528-06 sun4u sparc SUNW,Ultra-2
algol% echo $LANG
ar
algol% java -version
java version "1.4.0-beta_refresh"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.0-beta_refresh-b71)
Java HotSpot(TM) Client VM (build 1.4.0-beta_refresh-b71, mixed mode)

allan.jacobs@Eng 2001-07-16

Attached a small, simple test program, JT.java.  It is much smaller than
FontTest.java.
(1) Give JT's JFrame focus.  Type Ctrl-Space so that the Arabic input method
    is activated.
(2) Type Arabic into JT's JTextField.
(3) Select the new Arabic text.  Hit the sparc "Copy" button.
(4) Go to a native X-window, give it focus.  Type Ctrl-Space so that the Arabic
    input method is activated. Hit the sparc "Paste" button.
    The Arabic will be copied.  But there will be a Latin character at the
    end of the paste.
(5) Return to the JTextField.  Select and delete the Arabic.  Type some
    new Arabic in.  Select the new Arabic using the left mouse button.
(6) Go to the native X-window, give it focus, and paste the Arabic using
    the middle mouse button.  The Arabic will be copied.  But there will be
    a Latin character at the end of the paste.
(7) Return to JT.  Delete the text there.
(8) Go to the native X-window, give it focus, and enter some Arabic text.
    Use the left mouse button to select.  Hit the sparc "Copy" button.
(9) Return to JT.  Hit the sparc "Paste" button.  Nothing will happen.
(10) Delete the text in JT.
(11) Return to the X-window.  Enter some Arabic text. Use the left mouse
    button to select.
(12) Return to JT.  Hit the middle mouse button.  Nothing happens.

###@###.### 2001-09-18

Danila Sinopalnikov (###@###.###) sent a utility dumpsel.c that dumps
the contents of the clipboard on demand.  It has been attached.  The results
are attached to this bug report, too -- in the file arabic_8.sel.

java version "1.4.0-beta3"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.0-beta3-b80)
Java HotSpot(TM) Client VM (build 1.4.0-beta3-b80, mixed mode)

###@###.### 2001-09-24

Comments
EVALUATION

Commit to fix in merlin (redbug).
eric.hawkes@eng 2001-07-22

Name: dsR10078
Date: 07/24/2001

XTerm exports the primary selection contents in several native formats.
JDK supports only three of them: STRING, TEXT, COMPOUND_TEXT.
COMPOUND_TEXT is preferred, so we choose it to perform the text transfer.

If the language settings are as follows:

LANG=ru_RU.ANSI1251
LC_CTYPE=ru_RU.ANSI1251
LC_NUMERIC=ru_RU.ANSI1251
LC_TIME=ru_RU.ANSI1251
LC_COLLATE=ru_RU.ANSI1251
LC_MONETARY=ru_RU.ANSI1251
LC_MESSAGES=C
LC_ALL=

and the language string is selected in the XTerm, the text data exported by
XTerm in COMPOUND_TEXT format will contain only the selected string without
any prefixes.
According to ISO 2022 the initial state for the COMPOUND_TEXT format
corresponds to the Latin-1 charset, so the Russian text is decoded with the
ISO8859_1 decoder, which produces "gibberish".

So the cause is that the text data exported by XTerm in COMPOUND_TEXT is
incorrect. However, this behavior seems to be common for most native
applications.

The bug is not reproducible if the mapping for COMPOUND_TEXT in
flavormap.properties is commented out:

--- flavormap.properties        Tue Jul 24 20:28:15 2001
***************
*** 60,66 ****
  # See java.awt.datatransfer.DataFlavor.selectBestTextFlavor for a list of
  # text flavors which support the charset parameter.
! COMPOUND_TEXT=text/plain;charset=x-compound-text;eoln="\n";terminators=0
  TEXT=text/plain;eoln="\n";terminators=0
  STRING=text/plain;charset=iso8859-1;eoln="\n";terminators=0
  FILE_NAME=application/x-java-file-list;class=java.util.List
--- 60,66 ----
  # See java.awt.datatransfer.DataFlavor.selectBestTextFlavor for a list of
  # text flavors which support the charset parameter.
! # COMPOUND_TEXT=text/plain;charset=x-compound-text;eoln="\n";terminators=0
  TEXT=text/plain;eoln="\n";terminators=0
  STRING=text/plain;charset=iso8859-1;eoln="\n";terminators=0
  FILE_NAME=application/x-java-file-list;class=java.util.List

The bug is also not reproducible if we follow the behavior of native
applications and use the default charset decoder in the initial state.
The corresponding patch is attached in ByteToCharCOMPOUND_TEXT.java.pch.

This bug can be reproduced with the following minimal test case:
-------------------------------------------------------------------------------
import java.awt.Frame;
import java.awt.datatransfer.Clipboard;
import java.awt.datatransfer.DataFlavor;
import java.awt.datatransfer.Transferable;
import java.awt.datatransfer.UnsupportedFlavorException;
import java.awt.event.MouseAdapter;
import java.awt.event.MouseEvent;
import java.awt.event.WindowAdapter;
import java.awt.event.WindowEvent;
import java.io.IOException;
import java.util.Locale;

public class Test {
    public static void main(String[] args) {
        // The string the tester selects in the terminal, e.g. "ru_RU/<language>".
        final String localeString = Locale.getDefault().toString() + "/" +
            Locale.getDefault().getDisplayLanguage();
        final Frame frame = new Frame();
        // The X11 primary selection (text highlighted with the mouse).
        final Clipboard selection = frame.getToolkit().getSystemSelection();

        System.out.println(localeString);

        frame.setBounds(200, 200, 200, 200);
        frame.addMouseListener(new MouseAdapter() {
            // On each click, read the selection as a String and print it.
            public void mouseClicked(MouseEvent e) {
                System.out.println(e);
                final Transferable t = selection.getContents(null);
                if (t.isDataFlavorSupported(DataFlavor.stringFlavor)) {
                    try {
                        String string =
                            (String)t.getTransferData(DataFlavor.stringFlavor);
                        System.out.println("Contents: " + string);
                    } catch (UnsupportedFlavorException ufe) {
                        ufe.printStackTrace();
                    } catch (IOException ioe) {
                        ioe.printStackTrace();
                    }
                } else {
                    System.out.println("No string data");
                }
            }
        });
        frame.addWindowListener(new WindowAdapter() {
            public void windowClosing(WindowEvent e) {
                System.out.println(e);
                frame.dispose();
            }
        });
        frame.setVisible(true);
    }
}
-------------------------------------------------------------------------------

Run the test case. A language string will be written to the console.
A frame will appear. Select the language string and then click in the frame.
With the latest merlin build the output is as follows:

<das@aldebaran(pts/9).307> /export2/das/jdk/bin/java Test
ru_RU/��������������
java.awt.event.MouseEvent[MOUSE_CLICKED,(17,85),mods=9232,clickCount=1] on frame0
Contents: ru_RU/��������������

###@###.### 2001-07-24
======================================================================

Still broken in 1.4.0-beta3-b79.  There is a new, small, simple test code
JT.java that is attached.  See the Bugtraq description for a test protocol.
###@###.### 2001-09-18

Danila Sinopalnikov (###@###.###) sent a utility dumpsel.c that dumps
the contents of the clipboard on demand.  It has been attached.  The results
on Solaris 8 are attached to this bug report, too -- in the file arabic_8.sel.

java version "1.4.0-beta3"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.0-beta3-b80)
Java HotSpot(TM) Client VM (build 1.4.0-beta3-b80, mixed mode)
###@###.### 2001-09-24

I reran dumpsel using Arabic entered into dtterm and dtpad.  xterms are not
i18n ready, so they were ignored.
###@###.### 2001-09-25

The results from the investigation dated 2001-09-25 are moved to attachments
in the file dumpsel_out_ar.txt.
Patch to ByteToCharCOMPOUND_TEXT.java is moved to attachments in the file
ByteToCharCOMPOUND_TEXT.java.pch.
###@###.### 2001-11-22

Name: dsR10078
Date: 11/28/2001

The problems documented in this bug report are manifestations of numerous
bugs in the Solaris implementation of COMPOUND_TEXT support. Some of these
bugs are documented under bug ids 4502606, 4506640, 4521799, 4528767, 4528768.
Because of these bugs, a native application that acts as a data source and
uses the conversion mechanisms provided by Solaris libraries exports
COMPOUND_TEXT data that does not comply with the Compound Text Encoding
standard. As a result, the application that acts as a target (in this case a
Java application) cannot decode this data properly and the whole data
transfer fails.

There is no way to modify our data transfer code to take these bugs into
account and enable data transfer between Java and broken native applications
in a way that works consistently in all environments:

1. dtterm exports COMPOUND_TEXT data that includes unregistered encoding
   names (4528767, 4528768).
   A possible workaround for this problem would be to get the conversion
   tables for these encodings from the Solaris L10N team and to implement
   the corresponding converters. However, this workaround would not work
   consistently, at least for Arabic text transfer.
   The current Solaris implementation uses the unregistered encoding
   SUN-ISO8859-6 to convert Arabic text to COMPOUND_TEXT format. However,
   there are at least two different interpretations of the name
   SUN-ISO8859-6 within the Solaris implementation itself: Unicode locales
   (e.g. en_US.UTF-8) use double-byte Unicode character codes to encode
   Arabic text in SUN-ISO8859-6 encoding, while the Arabic locale (ar) uses
   single-byte ISO8859-6 character codes to encode the same Arabic text in
   the same SUN-ISO8859-6 encoding.
   As a result, we cannot provide a converter for this encoding.

2. xterm exports COMPOUND_TEXT data without the designator (4502606).
   In this case a possible workaround would be to assume that the initial
   state for the COMPOUND_TEXT format corresponds to the default encoding.
   However, this workaround will fail if the default encodings for xterm
   and the Java application are different.
###@###.### 2001-11-28
======================================================================

It appears that all the problems that contribute to this bug have been traced
to bugs in other Solaris libraries.  Since we cannot fix this for Merlin, I am
dropping the priority.  I will commit it to Hopper/Tiger so we continue to
track this issue.
###@###.### 2001-11-29

Copy and paste of Arabic text works on Solaris 9 (b53) with JDK 1.4.0-rc-b89.
Arabic/English mixed text works too.
###@###.### 2002-01-28

It appears that this was fixed by 4494565, which was integrated into Solaris 9
build s81_53.  Closing as no longer reproducible.
###@###.### 2002-02-05
28-01-2002
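
The decoding failure described in the 2001-07-24 evaluation can be illustrated
outside of AWT.  The sketch below is not the JDK's COMPOUND_TEXT decoder; it
only assumes that a source application emits raw bytes in its locale charset
with no designator, and that the target decodes them under some assumed
initial charset.  The class and method names are hypothetical, and
"windows-1251" is used here as a stand-in for the ru_RU.ANSI1251 locale
encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class InitialStateDemo {

    /** Hypothetical source side: raw locale-charset bytes, no designator prefix. */
    static byte[] exportWithoutDesignator(String text, Charset sourceCharset) {
        return text.getBytes(sourceCharset);
    }

    /** Hypothetical target side: decode undesignated bytes under an assumed charset. */
    static String decodeAssuming(byte[] data, Charset assumedInitialCharset) {
        return new String(data, assumedInitialCharset);
    }

    public static void main(String[] args) {
        // Assumption: windows-1251 approximates the ANSI1251 locale encoding.
        Charset cp1251 = Charset.forName("windows-1251");

        // The Russian word for "Russian", written with Unicode escapes so the
        // example does not depend on the source file encoding.
        String original = "\u0440\u0443\u0441\u0441\u043a\u0438\u0439";

        byte[] exported = exportWithoutDesignator(original, cp1251);

        // Latin-1 as the initial state: every byte maps to some character,
        // so decoding "succeeds" but yields the wrong text.
        String asLatin1 = decodeAssuming(exported, StandardCharsets.ISO_8859_1);

        // The "assume the default encoding" idea only helps when the target's
        // assumption matches the source's charset.
        String asSource = decodeAssuming(exported, cp1251);

        System.out.println("Latin-1 assumption recovers text: "
                + asLatin1.equals(original));
        System.out.println("Matching assumption recovers text: "
                + asSource.equals(original));
    }
}
```

The second decode succeeds only because both sides agree on the charset, which
is the limitation noted above for the case where xterm and the Java
application run with different default encodings.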

WORK AROUND

Name: dsR10078
Date: 11/28/2001

Remove the mapping for COMPOUND_TEXT format from flavormap.properties file in
your JRE installation:

--- flavormap.properties        Tue Jul 24 20:28:15 2001
***************
*** 60,66 ****
  # See java.awt.datatransfer.DataFlavor.selectBestTextFlavor for a list of
  # text flavors which support the charset parameter.
! COMPOUND_TEXT=text/plain;charset=x-compound-text;eoln="\n";terminators=0
  TEXT=text/plain;eoln="\n";terminators=0
  STRING=text/plain;charset=iso8859-1;eoln="\n";terminators=0
  FILE_NAME=application/x-java-file-list;class=java.util.List
--- 60,66 ----
  # See java.awt.datatransfer.DataFlavor.selectBestTextFlavor for a list of
  # text flavors which support the charset parameter.
! # COMPOUND_TEXT=text/plain;charset=x-compound-text;eoln="\n";terminators=0
  TEXT=text/plain;eoln="\n";terminators=0
  STRING=text/plain;charset=iso8859-1;eoln="\n";terminators=0
  FILE_NAME=application/x-java-file-list;class=java.util.List

###@###.### 2001-11-28
======================================================================
28-11-2001
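
The effect of the workaround on the mapping file can be sketched as follows.
This is only an illustration: it uses java.util.Properties as a stand-in
parser and is not how the JRE itself loads flavormap.properties.  The class
name, the two constants, and the parse helper are hypothetical; the mapping
lines are copied from the diff above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class FlavorMapWorkaroundSketch {

    // The three text mappings as shipped (from the diff in this report).
    static final String ORIGINAL =
        "COMPOUND_TEXT=text/plain;charset=x-compound-text;eoln=\"\\n\";terminators=0\n"
      + "TEXT=text/plain;eoln=\"\\n\";terminators=0\n"
      + "STRING=text/plain;charset=iso8859-1;eoln=\"\\n\";terminators=0\n";

    // The same lines with the COMPOUND_TEXT mapping commented out.
    static final String WORKAROUND =
        "# COMPOUND_TEXT=text/plain;charset=x-compound-text;eoln=\"\\n\";terminators=0\n"
      + "TEXT=text/plain;eoln=\"\\n\";terminators=0\n"
      + "STRING=text/plain;charset=iso8859-1;eoln=\"\\n\";terminators=0\n";

    /** Parses properties-style text; lines starting with '#' are comments. */
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        Properties before = parse(ORIGINAL);
        Properties after = parse(WORKAROUND);

        // With the line commented out, COMPOUND_TEXT is no longer a known
        // native format, so only TEXT and STRING remain as text mappings.
        System.out.println("before has COMPOUND_TEXT: "
                + before.containsKey("COMPOUND_TEXT"));
        System.out.println("after has COMPOUND_TEXT: "
                + after.containsKey("COMPOUND_TEXT"));
        System.out.println("after still has STRING: "
                + after.containsKey("STRING"));
    }
}
```

Note that the remaining STRING mapping is declared as iso8859-1, so falling
back to it would presumably not carry Arabic text by itself; the sketch only
shows which native format names stay mapped.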