
Number of pages in a Word document in Java

Is there an easy way to count the number of pages in a Word document, either .doc or .docx?

Thanks


You could try Apache POI, the Apache API for Word documents:

http://poi.apache.org/

Its SummaryInformation class has a method for getting the page count:

public int getPageCount()

Returns: The page count or 0 if the SummaryInformation does not contain a page count.


I found a really useful class that counts pages for Word, Excel and PowerPoint files with the help of Apache POI. It handles both the old .doc format and the new .docx format.

// Use the lowercased path only to match the extension. Open the file with the
// original path, because file names are case-sensitive on many file systems.
String lowerFilePath = filePath.toLowerCase();
if (lowerFilePath.endsWith(".xls")) {
    HSSFWorkbook workbook = new HSSFWorkbook(new FileInputStream(filePath));
    int sheetNums = workbook.getNumberOfSheets();
    if (sheetNums > 0) {
        // Counts the first sheet only: manual row breaks + 1
        return workbook.getSheetAt(0).getRowBreaks().length + 1;
    }
} else if (lowerFilePath.endsWith(".xlsx")) {
    XSSFWorkbook xwb = new XSSFWorkbook(filePath);
    int sheetNums = xwb.getNumberOfSheets();
    if (sheetNums > 0) {
        return xwb.getSheetAt(0).getRowBreaks().length + 1;
    }
} else if (lowerFilePath.endsWith(".docx")) {
    XWPFDocument docx = new XWPFDocument(POIXMLDocument.openPackage(filePath));
    // Page count stored in the document's extended properties
    return docx.getProperties().getExtendedProperties().getUnderlyingProperties().getPages();
} else if (lowerFilePath.endsWith(".doc")) {
    HWPFDocument wordDoc = new HWPFDocument(new FileInputStream(filePath));
    // Returns 0 if the SummaryInformation does not contain a page count
    return wordDoc.getSummaryInformation().getPageCount();
} else if (lowerFilePath.endsWith(".ppt")) {
    // The slide show classes below come from older POI releases;
    // their names differ in newer versions.
    HSLFSlideShow document = new HSLFSlideShow(new FileInputStream(filePath));
    SlideShow slideShow = new SlideShow(document);
    return slideShow.getSlides().length;
} else if (lowerFilePath.endsWith(".pptx")) {
    XSLFSlideShow xdocument = new XSLFSlideShow(filePath);
    XMLSlideShow xslideShow = new XMLSlideShow(xdocument);
    return xslideShow.getSlides().length;
}

source: OfficeTools.getPageCount()


Use Apache POI's SummaryInformation to fetch the total page count of an MS Word document.


// Library: Aspose.Words for Java
// Package: com.aspose.words.*

// Open the Word document
Document doc = new Document("C:\\Temp\\file.doc");

// Get the page count
int pageCount = doc.getPageCount();


Document doc = new Document("C:\\Data\\abc.doc");

// Get the page count
int pageCount = doc.getPageCount();

// Print the page count
System.out.println(pageCount);

If you want to use Aspose.Words for Java, the document.getPageCount() API gives you the number of pages. See http://www.aspose.com/docs/display/wordsjava/com.aspose.words.Document.getPageCount+property

Alternatively, you can use the docx4j API:

http://www.docx4java.org/trac/docx4j/browser/trunk/docx4j/src/main/java/org/docx4j/samples/DocProps.java


docx4j can get the total number of pages as follows:

org.docx4j.openpackaging.parts.DocPropsExtendedPart docPropsExtendedPart = wordMLPkg.getDocPropsExtendedPart();
org.docx4j.docProps.extended.Properties extendedProps = (org.docx4j.docProps.extended.Properties)docPropsExtendedPart.getJaxbElement();
int numPages = extendedProps.getPages();
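
For .docx files, the extended properties read above are stored in the docProps/app.xml part of the package, in a Pages element. A dependency-free sketch of reading that value with only the JDK might look like the following. The class name DocxPageCount and the method storedPageCount are made up for illustration, and returning -1 for a missing value is an assumption of this sketch, not the behaviour of any library. Note that the number is metadata written when the file was last saved, so it can be missing or out of date.

```java
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of POI, Aspose or docx4j.
public class DocxPageCount {

    /**
     * Returns the page count stored in docProps/app.xml of a .docx file,
     * or -1 if the part or its Pages element is absent (sketch convention).
     */
    public static int storedPageCount(String docxPath) throws Exception {
        // A .docx file is a ZIP archive of XML parts.
        try (ZipFile zip = new ZipFile(docxPath)) {
            ZipEntry entry = zip.getEntry("docProps/app.xml");
            if (entry == null) {
                return -1;
            }
            try (InputStream in = zip.getInputStream(entry)) {
                DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
                factory.setNamespaceAware(true);
                Document xml = factory.newDocumentBuilder().parse(in);
                // Match the Pages element regardless of its namespace.
                NodeList pages = xml.getElementsByTagNameNS("*", "Pages");
                if (pages.getLength() == 0) {
                    return -1;
                }
                return Integer.parseInt(pages.item(0).getTextContent().trim());
            }
        }
    }
}
```

Calling DocxPageCount.storedPageCount("C:\\Temp\\file.docx") would then return the saved count. This only works for .docx; the older binary .doc format is not a ZIP archive and still needs a library such as POI.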