Unreadable Characters in Word After Export

I have an ASP.NET page that exports some data to Microsoft Word 2003. The source of the data is what users have typed into an Ajax Control Toolkit HtmlEditor on an input page. All works well unless the user has pasted text from a Word document into the HtmlEditor.

The HTML that is copied from Word looks like this:

<p class="MsoBodyText" style="margin: 0in 0in 0pt"><font color="#000000"><br />\r\nThe Blah Blah Blah of Southern California’s blah blah qualify for a blah of “Rating” with a “hold” status.&nbsp;</font></p>

When the content is rendered in Word, it looks like this:

The Blah Blah Blah of Southern California’s blah blah qualify for a blah of “Rating†with a “hold†status.

Any help on this? I have no problem when I force the HTML into a div and show it on the page. It's only on the export to Word that it gets messed up. This happens whether I paste the Word text right into the HtmlEditor or use the Paste From MS Word (with cleanup) button.

Thanks. Andrew.


This is a text encoding problem, and your "html that is copied from Word" is wrong. You've used straight single and double quotes (ASCII characters 39 and 34, or hex 0x27 and 0x22 respectively), while Word is using smart quotes. They're getting garbled during the copy and paste between Word and the HtmlEditor, and then being read with the wrong character encoding when the text is exported back to Word.

If you save the text from the HTMLEditor and look at it with a hex viewer, you'll see the problem immediately.
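To give a rough idea of what the hex view would show, here is a small Python sketch. It assumes the text is stored as UTF-8 and read back as Windows-1252, which is a common cause of this kind of garbling but is not confirmed by anything in the question. The sample string and the helper names `hex_dump` and `misread_as_cp1252` are made up for illustration.

```python
# Illustrative only: the sample string and the UTF-8 / Windows-1252 pairing
# are assumptions, not details confirmed by the original post.
SAMPLE = "\u201chold\u201d"  # the word "hold" wrapped in Word-style smart quotes


def hex_dump(data: bytes) -> str:
    """Return the bytes as space-separated hex pairs, like one hex viewer row."""
    return " ".join(f"{b:02x}" for b in data)


def misread_as_cp1252(text: str) -> str:
    """Encode the text as UTF-8, then decode those bytes as Windows-1252."""
    # errors="replace" is required: byte 0x9d has no mapping in Python's
    # cp1252 codec, so a strict decode would raise UnicodeDecodeError.
    return text.encode("utf-8").decode("cp1252", errors="replace")


if __name__ == "__main__":
    print(hex_dump(SAMPLE.encode("utf-8")))    # each smart quote is 3 bytes
    print(hex_dump(SAMPLE.encode("cp1252")))   # each smart quote is 1 byte
    print(hex_dump('"hold"'.encode("ascii")))  # straight quotes are 0x22
    print(misread_as_cp1252(SAMPLE))           # garbled text from the mismatch
```

If the saved text shows three-byte sequences starting with `e2 80` around the quoted words, the smart quotes were written as UTF-8, and whatever reads them back needs to be told so.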

I can't help you with the "ajax control HTMLEditor" and reconfiguring it to fix this, as I'm not familiar with it.


I never thought I would ever read the phrase "exports some data to Microsoft Word". Fail.

Your program is creating the Word doc programmatically, correct? It looks like the single and double quotes are being corrupted at the byte level. How are you creating the Word doc? With the Interop library?
