JDK-4765240 : Cannot paste from MS-Word into HTMLEditorKit
  • Type: Bug
  • Component: client-libs
  • Sub-Component: javax.swing
  • Affected Version: 1.4.0
  • Priority: P4
  • Status: Open
  • Resolution: Unresolved
  • OS: windows_xp
  • CPU: x86
  • Submitted: 2002-10-18
  • Updated: 2008-11-19
Description

Name: sv35042			Date: 10/18/2002


FULL PRODUCT VERSION :
C:\Downloads\sunTest\scjd\starting>java -version
java version "1.4.0"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.0-b92)
Java HotSpot(TM) Client VM (build 1.4.0-b92, mixed mode)

FULL OPERATING SYSTEM VERSION :
Microsoft Windows XP [Version 5.1.2600]

EXTRA RELEVANT SYSTEM CONFIGURATION :
Using MS-Word 2000.

A DESCRIPTION OF THE PROBLEM :
When pasting the contents of the clipboard which contains
MS-Word text onto a JEditorPane that uses the
HTMLEditorKit, the information on the clipboard is printed
to stdout, but is never pasted into the JEditorPane.

When pasting the contents of the clipboard which contains
Netscape Navigator text onto the same pane, the
information is also printed to stdout AND is pasted
properly.  I also noticed that a successful paste fires an
UndoableEditEvent; this event is NOT fired when the paste
is performed from an MS-Word document.

An example output of the MS-Word clipboard information
that is printed to stdout is:

<pre>
<html xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:w="urn:schemas-microsoft-com:office:word"
xmlns="http://www.w3.org/TR/REC-html40">

<head>
<meta http-equiv=Content-Type content="text/html;
charset=windows-1252">
<meta name=ProgId content=Word.Document>
<meta name=Generator content="Microsoft Word 9">
<meta name=Originator content="Microsoft Word 9">
<link rel=File-List href="./nelly_files/filelist.xml">
<title>This is the story of Nelly from Guam</title>
<!--[if gte mso 9]><xml>
 <o:DocumentProperties>
  <o:Author>xocnibor</o:Author>
  <o:LastAuthor>xocnibor</o:LastAuthor>
  <o:Revision>1</o:Revision>
  <o:TotalTime>0</o:TotalTime>
  <o:Created>2002-05-22T12:37:00Z</o:Created>
  <o:LastSaved>2002-05-22T12:37:00Z</o:LastSaved>
  <o:Pages>1</o:Pages>
  <o:Company> </o:Company>
  <o:Lines>1</o:Lines>
  <o:Paragraphs>1</o:Paragraphs>
  <o:Version>9.3821</o:Version>
 </o:DocumentProperties>
</xml><![endif]-->
<style>
<!--
 /* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
	{mso-style-parent:"";
	margin:0in;
	margin-bottom:.0001pt;
	mso-pagination:widow-orphan;
	font-size:12.0pt;
	font-family:"Times New Roman";
	mso-fareast-font-family:"Times New Roman";}
h1
	{mso-style-next:Normal;
	margin:0in;
	margin-bottom:.0001pt;
	mso-pagination:widow-orphan;
	page-break-after:avoid;
	mso-outline-level:1;
	font-size:12.0pt;
	font-family:"Times New Roman";
	mso-font-kerning:0pt;}
@page Section1
	{size:8.5in 11.0in;
	margin:1.0in 1.25in 1.0in 1.25in;
	mso-header-margin:.5in;
	mso-footer-margin:.5in;
	mso-paper-source:0;}
div.Section1
	{page:Section1;}
-->
</style>
</head>

<body lang=EN-US style='tab-interval:.5in'>

<div class=Section1>

<h1>This is the story of <i>Nelly</i> from <u>Guam</u>
</h1>

<p class=MsoNormal><![if !supportEmptyParas]>&nbsp;<![endif]><o:p></o:p></p>

</div>

</body>

</html>
</pre>


Oddly enough, the above clipboard contents can be pasted
when an RTFEditorKit is used.  Also, when the HTMLEditorKit
is used and the above clipboard contents are saved to a
file and loaded with setPage( ), the file is read
correctly.  So it appears to be an HTMLEditorKit paste
problem.
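
The report does not say why loading from a file succeeds where
pasting fails.  One way a byte-stream load can cope with a charset
directive is to catch the ChangedCharSetException, re-decode the
bytes with the charset it names, and parse again with the directive
ignored.  The following is only a sketch of that retry pattern,
not a statement of what setPage( ) does; the class name, the helper
names, and the sample markup are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.swing.text.ChangedCharSetException;
import javax.swing.text.Document;
import javax.swing.text.html.HTMLEditorKit;

public class CharsetRetrySketch {
    // Hypothetical sample; U+2019 is a character that windows-1252
    // and ISO-8859-1 decode differently.
    static final String SAMPLE =
        "<html><head>"
        + "<meta http-equiv=Content-Type content=\"text/html; charset=windows-1252\">"
        + "</head><body><p>Nelly\u2019s story</p></body></html>";

    // Pull the charset name out of the spec carried by the exception.
    static String charsetName(ChangedCharSetException e) {
        String spec = e.getCharSetSpec();
        if (e.keyEqualsCharSet()) {
            return spec.trim();               // spec is already the charset name
        }
        int i = spec.toLowerCase().indexOf("charset=");
        if (i < 0) {
            throw new IllegalArgumentException("no charset in: " + spec);
        }
        String name = spec.substring(i + "charset=".length());
        int end = name.indexOf(';');
        if (end >= 0) {
            name = name.substring(0, end);
        }
        return name.trim();
    }

    // First pass guesses ISO-8859-1; on a charset directive, start over
    // with a fresh document, the named charset, and the directive ignored.
    static String load(byte[] bytes) throws Exception {
        HTMLEditorKit kit = new HTMLEditorKit();
        Document doc = kit.createDefaultDocument();
        try {
            kit.read(new InputStreamReader(new ByteArrayInputStream(bytes),
                                           StandardCharsets.ISO_8859_1), doc, 0);
        } catch (ChangedCharSetException e) {
            Charset cs = Charset.forName(charsetName(e));
            doc = kit.createDefaultDocument();
            doc.putProperty("IgnoreCharsetDirective", Boolean.TRUE);
            kit.read(new InputStreamReader(new ByteArrayInputStream(bytes), cs), doc, 0);
        }
        return doc.getText(0, doc.getLength());
    }

    public static void main(String[] args) throws Exception {
        System.out.println(load(SAMPLE.getBytes("windows-1252")));
    }
}
```

A paste has no byte stream to rewind, which may be why the same
exception ends the operation there instead of triggering a retry.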

STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
1. Create a JEditorPane and set its editor type to the
HTMLEditorKit.

2.  Run the program.

3.  Fire up MS-Word and create a new document.  Enter some
simple text, e.g. "This is from MSWord".

4.  In MS-Word, copy the text to the clipboard.

5.  Go back to the java program with the JEditorPane.

6.  Click in the JEditorPane and press Ctrl-V to attempt
to paste the text into the JEditorPane.

7.  Observe that stdout prints out the contents of the
clipboard (wrapped in busy html) but does not paste to the
editor pane.

EXPECTED VERSUS ACTUAL BEHAVIOR :
Expected Results:
The text copied from MS-Word should be pasted into the
JEditorPane.

Actual Results:
The contents of the clipboard are sent to stdout, but not
pasted into the JEditorPane.

ERROR MESSAGES/STACK TRACES THAT OCCUR :
No error messages are displayed -- the clipboard contents are sent to stdout,
though.

This bug can be reproduced always.

---------- BEGIN SOURCE ----------

package com;

import javax.swing.*;
import javax.swing.text.*;
import javax.swing.text.html.*;
import javax.swing.text.rtf.*;
import java.awt.*;

public class PasteHorkTest extends JFrame
{
   private JEditorPane myEditorPane;

   public static void main(String args[])
   {
      // Build the UI on the event dispatch thread.
      SwingUtilities.invokeLater(new Runnable() {
         public void run() { new PasteHorkTest(); }
      });
   }

   public PasteHorkTest()
   {
      // Exit the JVM when the test window is closed.
      setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
      myEditorPane = new JEditorPane();
      // Pasting MS-Word clipboard contents fails with this kit.
      myEditorPane.setEditorKit(new HTMLEditorKit());
      // note rtfEditor kit will accept ms-word clipboard contents
      //myEditorPane.setEditorKit( new RTFEditorKit());
      myEditorPane.setEditable(true);
      myEditorPane.setText("");
      getContentPane().setLayout(new BorderLayout());
      getContentPane().add(myEditorPane, BorderLayout.CENTER);
      pack();
      setVisible(true);
   }

}



---------- END SOURCE ----------

CUSTOMER WORKAROUND :
It appears that the RTFEditorKit allows the pasting
properly, but our code relies heavily on the
HTMLEditorKit.  Our program is based on having the ability
to copy HTML-formats into a JEditorPane.
(Review ID: 147021) 
======================================================================

Comments
WORK AROUND

Name: pzR10082 Date: 12/03/2002

For the HTMLDocument you're pasting into, set the
"IgnoreCharsetDirective" property, e.g.

myEditorPane.getDocument().putProperty("IgnoreCharsetDirective", Boolean.TRUE);

###@###.###
======================================================================
25-09-2004
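
The workaround above can be exercised without a GUI or a clipboard
by feeding Word-style markup straight to HTMLEditorKit.read( ).
This is a minimal sketch: the class name, method name, and the
trimmed-down sample markup are hypothetical, and reading from a
string only approximates what the paste action does.

```java
import java.io.StringReader;
import javax.swing.text.Document;
import javax.swing.text.html.HTMLEditorKit;

public class IgnoreCharsetWorkaround {
    // Hypothetical sample modelled on the Word markup quoted in the report.
    static final String WORD_HTML =
        "<html><head>"
        + "<meta http-equiv=Content-Type content=\"text/html; charset=windows-1252\">"
        + "<title>sample</title></head>"
        + "<body><p class=MsoNormal>This is from MSWord</p></body></html>";

    // Parse the markup into a fresh HTMLDocument with the charset
    // directive ignored, and return the document's plain text.
    static String readIgnoringCharset(String html) throws Exception {
        HTMLEditorKit kit = new HTMLEditorKit();
        Document doc = kit.createDefaultDocument();
        doc.putProperty("IgnoreCharsetDirective", Boolean.TRUE);
        kit.read(new StringReader(html), doc, 0);
        return doc.getText(0, doc.getLength());
    }

    public static void main(String[] args) throws Exception {
        System.out.println(readIgnoringCharset(WORD_HTML));
    }
}
```

With the property set, the <meta> charset declaration is skipped
and the body text reaches the document.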

EVALUATION

Name: pzR10082 Date: 10/30/2002

HTML exported by MS Word contains a <meta> tag with
content-type and charset attributes defined.
DocumentParser.handleEmptyTag() throws
ChangedCharSetException when it encounters this <meta> tag.

###@###.### 2002-10-30
======================================================================
30-10-2002
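
The evaluation above can be checked in isolation by parsing a
small piece of Word-style markup and catching the exception.
This is a sketch under assumptions: the class name, method name,
and sample markup are hypothetical, and the parse is driven
through HTMLEditorKit.read( ) instead of the paste action.

```java
import java.io.StringReader;
import javax.swing.text.ChangedCharSetException;
import javax.swing.text.Document;
import javax.swing.text.html.HTMLEditorKit;

public class MetaCharsetProbe {
    // Hypothetical sample modelled on the Word markup quoted in the report.
    static final String WORD_HTML =
        "<html><head>"
        + "<meta http-equiv=Content-Type content=\"text/html; charset=windows-1252\">"
        + "<title>sample</title></head>"
        + "<body><p class=MsoNormal>This is from MSWord</p></body></html>";

    // Same markup without the <meta> tag, for comparison.
    static final String PLAIN_HTML =
        "<html><head><title>sample</title></head>"
        + "<body><p>This is from MSWord</p></body></html>";

    // Returns the charset spec carried by ChangedCharSetException,
    // or null if the markup parsed without one being thrown.
    static String probe(String html) throws Exception {
        HTMLEditorKit kit = new HTMLEditorKit();
        Document doc = kit.createDefaultDocument();
        try {
            kit.read(new StringReader(html), doc, 0);
            return null;
        } catch (ChangedCharSetException e) {
            return e.getCharSetSpec();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("with <meta>:    " + probe(WORD_HTML));
        System.out.println("without <meta>: " + probe(PLAIN_HTML));
    }
}
```

ChangedCharSetException is an IOException, so a caller that only
handles IOException generically will not distinguish it from a
real read failure.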