How do I extract embedded equations with the Java Apache POI library?
I am trying to use Apache POI to extract embedded equations and text from a .doc MS Word file into a .ppt MS PowerPoint file. I have successfully extracted the text, but how do I extract the embedded equations?
The embedded equations come out like this if I extract them only as text:
!!EMBED Equation.3
This may not help you with the binary .doc format, but for the newer .docx format, I was able to get to the equation, which is embedded as an OLE document, using the following code:
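    import java.io.FileInputStream;
    import java.io.InputStream;
    import org.apache.poi.openxml4j.opc.PackagePart;
    import org.apache.poi.poifs.filesystem.POIFSFileSystem;
    import org.apache.poi.util.IOUtils;
    import org.apache.poi.xwpf.usermodel.XWPFDocument;
    InputStream in = new FileInputStream(f);
    XWPFDocument doc = new XWPFDocument(in);
    // getAllEmbedds() is the older name; newer POI releases call it getAllEmbeddedParts()
    for (PackagePart p : doc.getAllEmbedds()) {
        POIFSFileSystem poifs = new POIFSFileSystem(p.getInputStream());
        byte[] oleData = IOUtils.toByteArray(
            poifs.createDocumentInputStream("Equation Native"));
    }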
You can then extract the MathType data from oleData and hand it to an MTEF parser.
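As a rough sketch of that extraction step: as far as I know, the "Equation Native" stream starts with a small header (EQNOLEFILEHDR in the MathType SDK documentation), and the MTEF bytes follow it. The code below assumes that header is little-endian, with a 2-byte header length at offset 0 (normally 28) and a 4-byte MTEF length at offset 8. The class name EquationNativeHeader and the method stripHeader are made up for illustration and are not part of POI.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

public class EquationNativeHeader {

    /**
     * Returns the bytes that follow the header of an "Equation Native" stream.
     * Assumed layout (little-endian): offset 0 = header length (2 bytes),
     * offset 8 = length of the MTEF data (4 bytes).
     */
    public static byte[] stripHeader(byte[] oleData) {
        if (oleData == null || oleData.length < 12) {
            throw new IllegalArgumentException("stream too short for a header");
        }
        ByteBuffer buf = ByteBuffer.wrap(oleData).order(ByteOrder.LITTLE_ENDIAN);
        int headerLength = buf.getShort(0) & 0xFFFF; // unsigned 16-bit value
        int mtefLength = buf.getInt(8);
        if (headerLength > oleData.length || mtefLength < 0) {
            throw new IllegalArgumentException("header fields are inconsistent");
        }
        // Never read past the end of the stream, even if the header claims more.
        long end = Math.min((long) oleData.length, (long) headerLength + mtefLength);
        return Arrays.copyOfRange(oleData, headerLength, (int) end);
    }
}
```

With the oleData array from the loop above, byte[] mtef = EquationNativeHeader.stripHeader(oleData); would give you the part to pass to the MTEF parser. If your files use a different header layout, the offsets need adjusting.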
If you don't need the MathType data, there is also a placeholder image (in WMF format) that just renders the equation.