How to paste text from Word as plain text while preserving defined styles?
I want to let the user paste text into an editor (currently CKEditor). When the text is pasted, all styles and elements that are not white-listed must be removed, including images, tables, etc. So about 90% should be converted to plain text or removed, while some simple styles like bold, italic, or underline should be preserved.
I didn't think it would be so complicated. But all I can find in the CKEditor documentation and samples is about pasting complete plain text, or pasting cleaned-up content from Word without the ability to configure a white-list (and even if I remove all table-related plugins, it is still possible to paste a table from MS Word).
I would really, really appreciate any hint.
Thanks.
You can't without writing your own parser. Another issue is character encoding: Word content often carries Windows-1252 characters (smart quotes, dashes), while most of the web uses UTF-8, so if you paste from Word and transmit this data via AJAX, it can come out garbled unless both ends agree on the encoding.
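To illustrate the kind of garbling meant here, a minimal Python sketch, assuming the text was stored as Windows-1252 bytes somewhere along the way and then read back as UTF-8 (the sample string is made up for the example):

```python
# Word-style "smart quotes" around a word, as a Unicode string.
text = "\u201cquoted\u201d"

# Stored as Windows-1252, each smart quote becomes a single byte (0x93, 0x94).
cp1252_bytes = text.encode("cp1252")

# Reading those bytes as UTF-8 fails on 0x93/0x94, so they turn into
# replacement characters (U+FFFD) instead of quotes.
garbled = cp1252_bytes.decode("utf-8", errors="replace")

# Decoding with the matching codec recovers the original text.
restored = cp1252_bytes.decode("cp1252")

print(repr(garbled))
print(repr(restored))
```

If both the page and the server consistently use UTF-8, this mismatch should not occur; the sketch only shows what happens when they disagree.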
While Dreamweaver has a pretty good "paste from Word" feature, it's unlikely you'll find an online equivalent. This is a huge and complex problem that would be an application in itself. Even Word's "Save as HTML" doesn't do a decent job of it.
Sadly, what most people have to do is strip it all down to plain text (paste into Notepad), put it in the editor, and mark it back up.
You can add a listener for the 'paste' event on the editor instance: http://docs.cksource.com/ckeditor_api/symbols/CKEDITOR.editor.html#event:paste
That way you get the HTML that is about to be pasted, and you can perform whatever cleanup you need (for example, by inserting that HTML into a div and working with the DOM, or by using regexps on the string).
Found a solution:
- Listening to the paste event, as AlfonsoML wrote.
- Sending the pasted Word content to the server.
- Parsing it with the HTML Agility Pack.
- Sending it back to the client.
- Inserting it into the editor.
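The server-side parsing step could look roughly like the sketch below. It is not the poster's code: the HTML Agility Pack is a .NET library, and this is an analogous white-list filter written with Python's standard `html.parser` module. The names `ALLOWED_TAGS`, `DROP_WITH_CONTENT`, `WhitelistFilter`, and `clean_pasted_html` are hypothetical, and the choice of which tags to keep or drop entirely is an assumption.

```python
from html import escape
from html.parser import HTMLParser

# Assumed white-list: only these inline tags survive, without attributes.
ALLOWED_TAGS = {"b", "i", "u", "strong", "em"}
# Assumed: these elements are removed together with everything inside them.
DROP_WITH_CONTENT = {"table", "style", "script", "head", "title"}


class WhitelistFilter(HTMLParser):
    """Keeps text and white-listed tags; drops every other tag and all attributes."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside an element from DROP_WITH_CONTENT

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth += 1
        elif self.skip_depth == 0 and tag in ALLOWED_TAGS:
            self.out.append(f"<{tag}>")  # attributes (style, class, ...) are discarded

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            if self.skip_depth > 0:
                self.skip_depth -= 1
        elif self.skip_depth == 0 and tag in ALLOWED_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_depth == 0:
            # Re-escape text so characters like "<" and "&" stay harmless.
            self.out.append(escape(data, quote=False))


def clean_pasted_html(html):
    parser = WhitelistFilter()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


pasted = (
    '<p class="MsoNormal">Hello <b style="color:red">bold</b> and '
    "<span>plain</span></p><table><tr><td>cell</td></tr></table>"
    '<img src="x.png">'
)
print(clean_pasted_html(pasted))
```

Note that this sketch simply drops block-level tags such as `<p>`, so paragraphs run together; mapping them to line breaks would be a natural extension. It also does not check that the kept tags are balanced.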