Returning things to PHP from a Word Macro

The objective is to get an accurate word count for a Microsoft Word file. We have a Windows server that runs Apache and PHP. A web service on that machine extracts the document's content and runs it through preg_match_all("/\S+/", $string, $matches); return count($matches[0]);. That gets us into the right range, but the count is not accurate. So we wrote the following macro:

Sub GetWordCountBreakdown()

    Dim x As Long
    Dim TotalWords As Long
    Dim FieldWords As Long
    Dim ResultWords As Long

    ' Word count for the whole document
    TotalWords = ActiveDocument.ComputeStatistics(wdStatisticWords)

    ' Add up the words in every field result longer than 25 words
    For x = 1 To ActiveDocument.Fields.Count
        ResultWords = ActiveDocument.Fields.Item(x).Result.ComputeStatistics(wdStatisticWords)
        If ResultWords > 25 Then
            FieldWords = FieldWords + ResultWords
        End If
    Next x

    MsgBox TotalWords & " - " & FieldWords & " = " & (TotalWords - FieldWords)

End Sub

When I run this macro in Word, it gives me a neat little alert box with the total word count, the words in references (fields), and the difference. I'm not sure how to return those values to PHP so that my web service can pass them back to me.

Update: I was able to rewrite this macro in PHP and get the correct word count. Basically:

$word = new COM("Word.Application");
$word->Documents->Open($file); // $file: full path to the document
$wdStatisticWords = 0; // value of Word's wdStatisticWords constant
$wordcount = $word->ActiveDocument->ComputeStatistics($wdStatisticWords);

etc.
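
The "etc." presumably covers the field loop from the macro. Here is a minimal sketch of what a fuller port might look like, assuming the COM extension is enabled and Word is installed on the server. The function names countWordsExcludingLongFields and wordCountForFile are hypothetical and not from the original post, as are the array keys.

```php
<?php
// Value of Word's wdStatisticWords enumeration member.
const WD_STATISTIC_WORDS = 0;

/**
 * Port of the GetWordCountBreakdown macro (hypothetical helper).
 * $doc is expected to be a Word Document COM object, or any object
 * exposing the same members.
 */
function countWordsExcludingLongFields($doc, int $threshold = 25): array
{
    $total = (int) $doc->ComputeStatistics(WD_STATISTIC_WORDS);
    $fieldWords = 0;

    // Word collections are 1-based, as in the VBA loop.
    $fieldCount = (int) $doc->Fields->Count;
    for ($i = 1; $i <= $fieldCount; $i++) {
        $words = (int) $doc->Fields->Item($i)->Result->ComputeStatistics(WD_STATISTIC_WORDS);
        if ($words > $threshold) {
            $fieldWords += $words;
        }
    }

    return ['total' => $total, 'fields' => $fieldWords, 'net' => $total - $fieldWords];
}

/** Hypothetical wrapper: opens $path in Word, counts, then quits Word. */
function wordCountForFile(string $path): array
{
    $word = new COM("Word.Application"); // needs Windows and the COM extension
    try {
        $doc = $word->Documents->Open($path);
        $result = countWordsExcludingLongFields($doc);
        $doc->Close(false); // close without saving changes
        return $result;
    } finally {
        $word->Quit();
    }
}
```

Keeping the counting logic separate from the COM setup means it can be exercised with stub objects on a machine without Word. The web service would call wordCountForFile() and return the array, for example as JSON.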


If you can read the OLE streams of the .doc file, an accurate word count for the document should be stored in either the SummaryInformation or the DocumentSummaryInformation stream. I don't have a script that reads the properties from .doc files, but I do have code for reading the meta properties of Excel .xls files that could be adapted fairly easily.

EDIT

I've just checked, and it's property id 0x0F in the SummaryInformation stream.
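
As a rough sketch of that lookup: assuming the raw bytes of the SummaryInformation stream have already been extracted from the compound file by some OLE reader (not shown here), the property-set layout can be walked with unpack(). The function name readSummaryInt is hypothetical. The code only inspects the first property set and only handles 32-bit integer values, which is an assumption about how the word count is stored.

```php
<?php
/**
 * Returns the integer value of one property from a raw OLE property-set
 * stream (such as the contents of SummaryInformation), or null if the
 * property is missing or not a 32-bit integer. Hypothetical helper.
 */
function readSummaryInt(string $stream, int $propertyId = 0x0F): ?int
{
    $len = strlen($stream);
    // Header: byte order (2), version (2), system id (4), CLSID (16), set count (4).
    if ($len < 48 || unpack('v', $stream)[1] !== 0xFFFE) {
        return null;
    }
    // First property set: FMTID at byte 28 (16 bytes), then its offset at byte 44.
    $setOffset = unpack('V', $stream, 44)[1];
    if ($setOffset + 8 > $len) {
        return null;
    }
    $header = unpack('Vsize/Vcount', $stream, $setOffset);

    for ($i = 0; $i < $header['count']; $i++) {
        $entryPos = $setOffset + 8 + 8 * $i;
        if ($entryPos + 8 > $len) {
            return null;
        }
        $entry = unpack('Vid/Voffset', $stream, $entryPos);
        if ($entry['id'] !== $propertyId) {
            continue;
        }
        // Property offsets are relative to the start of the property set.
        $valuePos = $setOffset + $entry['offset'];
        if ($valuePos + 8 > $len) {
            return null;
        }
        $value = unpack('vtype/vpad/Vdata', $stream, $valuePos);
        // Type 0x0003 is VT_I4, a 32-bit integer.
        return $value['type'] === 0x0003 ? $value['data'] : null;
    }
    return null;
}
```

Note that this returns whatever count Word last saved into the file, so it may be stale if the document was written by another tool, and it does not subtract field words the way the macro does.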


Why not simply count the number of spaces in the doc string? Or am I missing something?
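
For comparison, here is a small sketch of how counting spaces differs from the \S+ approach already used in the question. The sample string is made up for illustration.

```php
<?php
// Sample text with a double space, a tab, and a newline between words.
$text = "one  two\tthree\nfour";

// Counting literal spaces only sees the two space characters.
$spaces = substr_count($text, ' ');

// The question's approach counts runs of non-whitespace characters instead.
$tokens = preg_match_all('/\S+/', $text, $matches);
```

Here $spaces is 2 while $tokens is 4, so a space count drifts as soon as the text contains repeated spaces, tabs, or line breaks. The question already reports that the \S+ count does not match Word's own figure either.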
