Import .doc and .docx files in .NET and C#
I'm writing a text editor and I want to add the ability to import .doc and .docx files. I know that I could use OLE Automation, but if I use a recent OLE library, it won't work for people with an older version of Word, and if I use an older one, it won't be able to read .docx files. Any ideas? Thanks.
EDIT: Another solution, since my application already works with HTML and RTF, would be to convert .doc and .docx files to one of those formats from the command line, something like this: http://www.snee.com/bobdc.blog/2007/09/using-word-for-command-line-co.html
It works with the Office 2003 PIA; tested on my computer running Office 2010:
using System;
using System.IO;
using System.Reflection;
using System.Text;
using Microsoft.Office.Interop.Word;

public string GetHtmlFromDoc(string path)
{
    var wordApp = new Application { Visible = false };
    try
    {
        // Load the document
        object srcPath = path;
        var wordDoc = wordApp.Documents.Open(ref srcPath);
        if (wordDoc == null)
        {
            return null;
        }

        // Save it as HTML in the temp folder
        string destPath = Path.Combine(Path.GetTempPath(), "word" + new Random().Next() + ".html");
        object oDestPath = destPath;
        object exportFormat = WdSaveFormat.wdFormatHTML;
        wordDoc.SaveAs(ref oDestPath, ref exportFormat);

        // Close the document
        wordDoc.Close();

        // Check that the file exists
        if (File.Exists(destPath))
        {
            return File.ReadAllText(destPath, Encoding.Default);
        }
        return null;
    }
    finally
    {
        // Always shut Word down, even if opening or saving failed
        wordApp.Quit();
    }
}
Why don't you use the Office Primary Interop Assemblies (PIAs)?
I think you will have to decide which versions of Word you want to support. I suggest you settle on Word 2003 as the lowest. That will allow you to use the Office 2003 PIAs and program against them. Installing the PIAs on a machine installs binding redirects as well, so they work with newer versions of Word. There should be no problem opening .docx files with Word 2007 or 2010 through the Office 2003 PIAs, although I haven't tried this myself.
You should be able to use the Open XML libraries or XPath in .NET to read and import the contents of a .docx file, without needing Word installed.
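To make that concrete: a .docx file is a ZIP package whose main body lives in word/document.xml, so the text can be pulled out with the framework's own ZIP and XML classes. The following is only a minimal sketch, assuming .NET 4.5 or later (for ZipArchive). The class and method names (DocxTextReader, ExtractText) are made up for illustration, and it uses LINQ to XML rather than XPath. It returns plain text only and ignores formatting, tabs, line breaks, text boxes, headers and footers.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of any library.
public static class DocxTextReader
{
    // Main WordprocessingML namespace used by word/document.xml.
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Returns the text of the document body, one line per paragraph.
    public static string ExtractText(Stream docx)
    {
        using (var zip = new ZipArchive(docx, ZipArchiveMode.Read, true))
        {
            ZipArchiveEntry entry = zip.GetEntry("word/document.xml");
            if (entry == null)
            {
                throw new InvalidDataException("word/document.xml not found; not a .docx file?");
            }

            using (Stream xml = entry.Open())
            {
                // Preserve whitespace so runs that contain only spaces are kept.
                XDocument doc = XDocument.Load(xml, LoadOptions.PreserveWhitespace);

                // Each w:p is a paragraph; its text is the concatenation of its w:t runs.
                var paragraphs = doc.Descendants(W + "p")
                    .Select(p => string.Concat(p.Descendants(W + "t").Select(t => t.Value)));

                return string.Join(Environment.NewLine, paragraphs);
            }
        }
    }

    // Convenience overload for a file on disk.
    public static string ExtractText(string path)
    {
        using (FileStream fs = File.OpenRead(path))
        {
            return ExtractText(fs);
        }
    }
}
```

Calling DocxTextReader.ExtractText(path) gives you a string you can load into the editor. This approach only covers .docx; the binary .doc format would still need Word (via the PIAs) or another converter.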