How do you get subject & title from a Word document (without opening it)?
I would like to read the title and subject fields from a Word document, but would rather not have the overhead of firing up Word to do it.
If I display the Title and Subject columns in Windows Explorer and then navigate to a folder that contains Word documents, this information is shown. What mechanism does Explorer use to do this (aside from shell extensions)? It is fast, so I'm guessing it's not firing up Word and opening each document, although I don't know whether Word actually needs to be installed for it to work.
I've found a link to Dsofile.dll, which I presume I could use, but does it work for both .doc and .docx files, and is it the only way?
Well... assuming the era of the ".doc" file is coming to an end, here is one way to get the subject and title from a ".docx" file (or an ".xlsx" file, for that matter).
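using System;
using System.IO;
using System.IO.Packaging; // Assembly WindowsBase.dll

namespace ConsoleApplication16
{
    class Program
    {
        static void Main(string[] args)
        {
            // Example location only: Doc1.docx in the user's ApplicationData folder.
            String path = Environment.GetFolderPath(Environment.SpecialFolder.ApplicationData);
            String file = Path.Combine(path, "Doc1.docx");

            // Open the OPC package read-only; "using" closes it even if an exception is thrown.
            using (Package docx = Package.Open(file, FileMode.Open, FileAccess.Read))
            {
                // Core properties are read from the package itself, so Word is never started.
                String subject = docx.PackageProperties.Subject;
                String title = docx.PackageProperties.Title;

                Console.WriteLine("Title:   " + title);
                Console.WriteLine("Subject: " + subject);
            }
        }
    }
}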
I hope this is useful to someone.
You can read it via XML, too: "How to extract information from Office files by using Office file formats and schemas".
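As a rough illustration of the XML route, here is a minimal sketch that treats a .docx file as the ZIP archive it is and reads the Dublin Core title and subject elements from the core properties part. It assumes .NET 4.5 or later (for System.IO.Compression.ZipFile) and assumes the part lives at the conventional name docProps/core.xml; strictly speaking the location is given by the package relationships, so this may not hold for every file. The class and method names (CorePropertiesReader, ReadTitleAndSubject) are made up for the example, and it does not apply to the older binary .doc format.

```csharp
using System;
using System.IO;
using System.IO.Compression; // ZipFile, ZipArchive (.NET 4.5+; reference System.IO.Compression.FileSystem on .NET Framework)
using System.Xml.Linq;

// Hypothetical helper, named for this example only.
static class CorePropertiesReader
{
    // Dublin Core namespace used for <dc:title> and <dc:subject> in core.xml.
    static readonly XNamespace Dc = "http://purl.org/dc/elements/1.1/";

    // Returns (title, subject); either value is null when the element is absent.
    public static Tuple<string, string> ReadTitleAndSubject(string file)
    {
        using (ZipArchive zip = ZipFile.OpenRead(file))
        {
            // Assumption: the core properties part uses its conventional name.
            ZipArchiveEntry entry = zip.GetEntry("docProps/core.xml");
            if (entry == null)
                return Tuple.Create<string, string>(null, null);

            using (Stream stream = entry.Open())
            {
                XDocument xml = XDocument.Load(stream);
                // Casting a missing (null) XElement to string yields null.
                string title = (string)xml.Root.Element(Dc + "title");
                string subject = (string)xml.Root.Element(Dc + "subject");
                return Tuple.Create(title, subject);
            }
        }
    }
}
```

Calling CorePropertiesReader.ReadTitleAndSubject(file) with the path from the earlier example should give the same two values as PackageProperties, without needing a reference to WindowsBase.dll.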
Here is another example on how to read a Word doc programmatically.
One way or the other you'll have to look inside the file at some point!