Parse Word document
The Word documents I want to parse will have a known format, defined by a Word template. Users will create each document from that template. I need to parse data from the document, including values from drop-downs, using C#. This will be done on a SharePoint 2010 server. What is the recommended way to do this? I've seen people mention Open XML SDK 2.0; should I use that? If so, do I need to convert the .docx to XML and then parse it? In some cases I will also have to write to the Word document; how should this be done?
Preferably, the solution will support both Word 2010 and Word 2007, but if the tools for 2010 are significantly better I'd like to know that as well. Thanks.
The file extension for an Office Open XML Word document is .docx. A .docx file is an archive (a ZIP package) of several separate files, which define the fonts, styles, and objects that exist in the Word document. Those files are themselves XML, so there is no separate step of converting the .docx to XML; the XML is read directly from inside the package.
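To make the "archive of XML files" point concrete, here is a minimal sketch in Python, chosen only because its standard library can open ZIP files with no extra packages; the structure is the same whatever language reads it. The sample package is built in memory and is a stripped-down stand-in, not a file Word would open: the `[Content_Types].xml` body is a placeholder, and the helper names `build_sample_docx`, `list_parts`, and `body_text` are made up for this illustration.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by WordprocessingML elements such as w:p, w:r and w:t.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def build_sample_docx() -> bytes:
    """Build a tiny stand-in package in memory (hypothetical sample data)."""
    document_xml = (
        f'<w:document xmlns:w="{W_NS}"><w:body>'
        "<w:p><w:r><w:t>Hello from the body</w:t></w:r></w:p>"
        "</w:body></w:document>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as package:
        # Placeholder content; a real package declares its part types here.
        package.writestr("[Content_Types].xml", "<Types/>")
        package.writestr("word/document.xml", document_xml)
        package.writestr("word/styles.xml", f'<w:styles xmlns:w="{W_NS}"/>')
    return buf.getvalue()


def list_parts(docx_bytes: bytes) -> list:
    """Return the names of the files stored inside the package."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as package:
        return package.namelist()


def body_text(docx_bytes: bytes) -> list:
    """Return the text of every w:t element in the main document part."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as package:
        root = ET.fromstring(package.read("word/document.xml"))
    return [node.text for node in root.iter(f"{{{W_NS}}}t")]


if __name__ == "__main__":
    sample = build_sample_docx()
    print(list_parts(sample))
    print(body_text(sample))
```

A real .docx saved from Word has more parts than this sample, but the main body text lives in `word/document.xml` in the same way.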