Number of pages in a Word doc in Java
Is there an easy way to count the number of pages in a Word document, either .doc or .docx?
Thanks
You could try Apache POI, the Apache API for Word documents:
http://poi.apache.org/
Its SummaryInformation class has a method for getting the page count:
public int getPageCount()
Returns: The page count or 0 if the SummaryInformation does not contain a page count.
I found a really useful class that counts pages for Word, Excel, and PowerPoint files with the help of Apache POI. It handles both the old formats (such as .doc) and the new ones (such as .docx).
// Match the extension case-insensitively, but open the file by its original path
// (the lower-cased path may not exist on a case-sensitive file system).
String lowerFilePath = filePath.toLowerCase();
if (lowerFilePath.endsWith(".xls")) {
    HSSFWorkbook workbook = new HSSFWorkbook(new FileInputStream(filePath));
    int sheetNums = workbook.getNumberOfSheets();
    if (sheetNums > 0) {
        // First sheet only: number of manual row breaks + 1
        return workbook.getSheetAt(0).getRowBreaks().length + 1;
    }
} else if (lowerFilePath.endsWith(".xlsx")) {
    XSSFWorkbook xwb = new XSSFWorkbook(filePath);
    int sheetNums = xwb.getNumberOfSheets();
    if (sheetNums > 0) {
        // First sheet only: number of manual row breaks + 1
        return xwb.getSheetAt(0).getRowBreaks().length + 1;
    }
} else if (lowerFilePath.endsWith(".docx")) {
    // Page count stored in the document's extended properties
    XWPFDocument docx = new XWPFDocument(POIXMLDocument.openPackage(filePath));
    return docx.getProperties().getExtendedProperties().getUnderlyingProperties().getPages();
} else if (lowerFilePath.endsWith(".doc")) {
    // Page count stored in the SummaryInformation (0 if it is missing)
    HWPFDocument wordDoc = new HWPFDocument(new FileInputStream(filePath));
    return wordDoc.getSummaryInformation().getPageCount();
} else if (lowerFilePath.endsWith(".ppt")) {
    // For presentations, the "page count" is the number of slides
    HSLFSlideShow document = new HSLFSlideShow(new FileInputStream(filePath));
    SlideShow slideShow = new SlideShow(document);
    return slideShow.getSlides().length;
} else if (lowerFilePath.endsWith(".pptx")) {
    XSLFSlideShow xdocument = new XSLFSlideShow(filePath);
    XMLSlideShow xslideShow = new XMLSlideShow(xdocument);
    return xslideShow.getSlides().length;
}
source: OfficeTools.getPageCount()
Use Apache POI's SummaryInformation to fetch the total page count of an MS Word document.
// Library: Aspose.Words for Java (package com.aspose.words)
// Open the Word document
Document doc = new Document("C:\\Temp\\file.doc");
// Get page count
int pageCount = doc.getPageCount();
Document doc = new Document("C:\\Data\\abc.doc");
//Get page count
int pageCount = doc.getPageCount();
//Print Page Count
System.out.println(pageCount);
If you want to use Aspose.Words for Java, the document.getPageCount() API will give you the number of pages. See http://www.aspose.com/docs/display/wordsjava/com.aspose.words.Document.getPageCount+property
Or you may also use the docx4j API:
http://www.docx4java.org/trac/docx4j/browser/trunk/docx4j/src/main/java/org/docx4j/samples/DocProps.java
docx4j can get the total page count as follows:
org.docx4j.openpackaging.parts.DocPropsExtendedPart docPropsExtendedPart = wordMLPkg.getDocPropsExtendedPart();
org.docx4j.docProps.extended.Properties extendedProps = (org.docx4j.docProps.extended.Properties)docPropsExtendedPart.getJaxbElement();
int numPages = extendedProps.getPages();
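For .docx files, the POI and docx4j snippets above both read a page count that is stored in the package's extended properties part, docProps/app.xml, rather than computing the layout. As a rough sketch of what that lookup amounts to, the same value can be read with only the JDK, assuming a standard .docx (zip) package. The class and method names here are hypothetical and not part of any of the libraries mentioned, and the number is only as current as the last save by an application that writes it.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical helper for illustration; not part of POI, Aspose, or docx4j.
final class DocxPageCount {

    // Returns the <Pages> value stored in docProps/app.xml, or 0 if it is absent.
    public static int storedPageCount(InputStream docx) throws Exception {
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!entry.getName().equals("docProps/app.xml")) {
                    continue;
                }
                // Copy the entry first so the XML parser cannot close the zip stream (Java 9+).
                byte[] xmlBytes = zip.readAllBytes();
                DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
                factory.setNamespaceAware(true);
                Document xml = factory.newDocumentBuilder()
                        .parse(new ByteArrayInputStream(xmlBytes));
                // Match <Pages> regardless of the namespace prefix used in the file.
                NodeList pages = xml.getElementsByTagNameNS("*", "Pages");
                if (pages.getLength() == 0) {
                    return 0;
                }
                return Integer.parseInt(pages.item(0).getTextContent().trim());
            }
        }
        return 0; // no extended properties part in the package
    }
}
```

For a file on disk, the call would look something like DocxPageCount.storedPageCount(new FileInputStream(filePath)). A result of 0 means no count was stored, which is the same convention the POI getPageCount() method quoted earlier uses for .doc files.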