
How to convert a System.IO.Packaging.Package to HTML?

Microsoft Word interoperability classes will let you get at a property called WordOpenXML. This represents a package that will be stored - zipped up - in a .docx file and can be opened by Microsoft Word. However, is there a way to convert this Package to other formats, notably HTML?

I read in an answer to an old question that "Word 2007 has an API that you can use to convert to HTML. [...] You can find documentation around the API, but I remember that there is a convert to HTML function in the API." I'm not 100% sure which API that guy is talking about but perhaps it's System.IO.Packaging.Package or something similar. I can't seem to find any "convert to HTML function"; does anyone know how you can convert a Package format Word document into HTML?


The API in question is probably the SaveAs method on the document; when you pass an HTML file format (for example WdSaveFormat.wdFormatHTML or WdSaveFormat.wdFormatFilteredHTML), Word transforms the document into HTML and applies the appropriate styling.

Chances are, given that the docx format is XML, there is an XSLT transformation of some sort going on; this is just speculation, but it's not far-fetched, as XSLT is commonly used to create HTML from XML.

That said, what you are looking for probably does not reside in the Package class, nor should it. The Package class is for creating and reading packages of content, not for transforming that content.

However, there's nothing stopping you from providing the transformation of that content; you can get the XML that is the basis of the Word document and then apply your own XSLT which would produce the HTML that you want.
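
As a rough illustration of the "get the XML and transform it yourself" idea, here is a minimal sketch in Python. It assumes the package has been saved as ordinary .docx bytes, that the main part lives at word/document.xml, and it uses a plain element-tree walk in place of XSLT (the Python standard library has no XSLT processor). The function names read_main_part and document_xml_to_html are made up for this example, and only paragraphs, runs, and bold are handled.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from html import escape

# Main WordprocessingML namespace used inside word/document.xml
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W_NS}


def read_main_part(package_bytes: bytes) -> bytes:
    """Return the raw XML of word/document.xml from a .docx (zip) package.

    Hypothetical helper: assumes the conventional part name for the main
    document, which a real package could override in its relationships.
    """
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as pkg:
        return pkg.read("word/document.xml")


def document_xml_to_html(document_xml: bytes) -> str:
    """Map each w:p paragraph to <p> and wrap bold runs in <b>.

    Everything else (tables, lists, styles, images, w:b with val="0")
    is ignored in this sketch.
    """
    root = ET.fromstring(document_xml)
    paragraphs = []
    for p in root.iterfind("w:body/w:p", NS):
        pieces = []
        for run in p.iterfind("w:r", NS):
            # A run may hold several w:t text nodes; join them in order.
            text = "".join(t.text or "" for t in run.iterfind("w:t", NS))
            if run.find("w:rPr/w:b", NS) is not None:
                pieces.append("<b>" + escape(text) + "</b>")
            else:
                pieces.append(escape(text))
        paragraphs.append("<p>" + "".join(pieces) + "</p>")
    return "<html><body>" + "".join(paragraphs) + "</body></html>"
```

Calling document_xml_to_html(read_main_part(docx_bytes)) returns an HTML string. In .NET the same shape would apply: open the part through Package, load it into an XML reader, and run an XSLT stylesheet of your own over it. The output will be far plainer than what Word's own HTML export produces.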
