Extracting all text from a PowerPoint file in VBA
I have a huge set of PowerPoint files from which I want to extract all the text and just lump it all into one big text file. Each source (PPT) file has multiple pages (slides). I do not care about formatting, only the words.
I could do this manually for one file by doing ^A ^C in PowerPoint, followed by ^V in Notepad, then paging down in PowerPoint and repeating for each slide. (Too bad I can't just do a ^A that would grab EVERYTHING ... then I could use SendKeys to copy / paste.)
But there are many hundreds of these PPTs with different numbers of slides.
It seems like this would be a common thing to want to do, but I can't find an example anywhere.
Does anyone have sample code to do this?
Here's some code to get you started. This dumps all text in slides to the debug window. It doesn't try to format, group or do anything other than just dump.
Sub GetAllText()
    Dim p As Presentation: Set p = ActivePresentation
    Dim s As Slide
    Dim sh As Shape
    For Each s In p.Slides
        For Each sh In s.Shapes
            If sh.HasTextFrame Then
                If sh.TextFrame.HasText Then
                    Debug.Print sh.TextFrame.TextRange.Text
                End If
            End If
        Next
    Next
End Sub
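One thing to watch for: as far as I know, a group shape and a table both report HasTextFrame as False, so the loop above skips text inside them. Below is a minimal sketch of one way to handle that by recursing into groups and walking table cells. ShapeText and GetAllTextDeep are hypothetical names I made up for illustration; the properties used (Type, GroupItems, HasTable, Table.Cell) come from the PowerPoint object model. It does not look at speaker notes, and I have not checked how it behaves with merged table cells.

```vba
' Sketch only: ShapeText is a hypothetical helper, not part of PowerPoint.
' Returns the text of a shape, descending into groups and tables.
Function ShapeText(ByVal sh As Shape) As String
    Dim result As String
    Dim i As Long, r As Long, c As Long

    If sh.Type = msoGroup Then
        ' A group has no text frame of its own; visit each member instead
        For i = 1 To sh.GroupItems.Count
            result = result & ShapeText(sh.GroupItems(i))
        Next i
    ElseIf sh.HasTable Then
        ' Each table cell wraps its own shape with a text frame
        For r = 1 To sh.Table.Rows.Count
            For c = 1 To sh.Table.Columns.Count
                result = result & sh.Table.Cell(r, c).Shape.TextFrame.TextRange.Text & vbCrLf
            Next c
        Next r
    ElseIf sh.HasTextFrame Then
        If sh.TextFrame.HasText Then
            result = sh.TextFrame.TextRange.Text & vbCrLf
        End If
    End If

    ShapeText = result
End Function

Sub GetAllTextDeep()
    Dim s As Slide
    Dim sh As Shape
    For Each s In ActivePresentation.Slides
        For Each sh In s.Shapes
            ' Trailing semicolon: ShapeText already ends each piece with a line break
            Debug.Print ShapeText(sh);
        Next sh
    Next s
End Sub
```

Run GetAllTextDeep from inside PowerPoint with the presentation open, the same way as GetAllText.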
The following example shows code to loop through a list of files based on Otaku's code given above:
Sub test_click2()
    ' Run from a host such as Excel, with a reference to the PowerPoint object library set
    Dim thePath As String
    Dim src As String
    Dim dst As String
    Dim PPT As PowerPoint.Application
    Dim p As PowerPoint.Presentation
    Dim s As PowerPoint.Slide
    Dim sh As PowerPoint.Shape
    Dim i As Integer
    Dim fnum As Integer
    Dim f(10) As String

    f(1) = "abc.pptx"
    f(2) = "def.pptx"
    f(3) = "ghi.pptx"
    thePath = "C:\Work\Text parsing PPT\"

    ' Start PowerPoint once and reuse it for every file
    Set PPT = CreateObject("PowerPoint.Application")
    PPT.Visible = True
    PPT.Activate
    'PPT.WindowState = ppWindowMinimized

    For i = 1 To 3
        src = thePath & f(i)
        dst = thePath & f(i) & ".txt"

        ' Open ... For Output replaces an existing file, so no Kill is needed
        fnum = FreeFile
        Open dst For Output As #fnum

        Set p = PPT.Presentations.Open(FileName:=src, ReadOnly:=True)
        For Each s In p.Slides
            For Each sh In s.Shapes
                If sh.HasTextFrame Then
                    If sh.TextFrame.HasText Then
                        ' Write to the text file rather than the debug window
                        Print #fnum, sh.TextFrame.TextRange.Text
                    End If
                End If
            Next sh
        Next s
        p.Close
        Close #fnum
    Next i

    Set PPT = Nothing
End Sub
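Since the question mentions many hundreds of files and one big output file, here is a rough sketch that uses Dir to find every .pptx in a folder and writes everything to a single text file. It is only an outline under some assumptions: AllTextToOneFile and all_text.txt are hypothetical names, the folder is the one from the example above, and it uses late binding (As Object) so no library reference is needed. It has the same limitation as the code above in that it only reads shapes that have a text frame.

```vba
' Sketch only: AllTextToOneFile and all_text.txt are hypothetical names.
Sub AllTextToOneFile()
    Dim thePath As String, fname As String
    Dim fnum As Integer
    Dim PPT As Object, p As Object, s As Object, sh As Object

    thePath = "C:\Work\Text parsing PPT\"   ' assumed folder, taken from the example above

    ' One output file for all presentations
    fnum = FreeFile
    Open thePath & "all_text.txt" For Output As #fnum

    Set PPT = CreateObject("PowerPoint.Application")

    fname = Dir(thePath & "*.pptx")         ' first match, or "" if there is none
    Do While fname <> ""
        ' WithWindow:=False opens the file without showing a document window
        Set p = PPT.Presentations.Open(FileName:=thePath & fname, _
                                       ReadOnly:=True, WithWindow:=False)
        Print #fnum, "==== " & fname & " ===="
        For Each s In p.Slides
            For Each sh In s.Shapes
                If sh.HasTextFrame Then
                    If sh.TextFrame.HasText Then
                        Print #fnum, sh.TextFrame.TextRange.Text
                    End If
                End If
            Next sh
        Next s
        p.Close
        fname = Dir()                       ' next match
    Loop

    Close #fnum
    Set PPT = Nothing
End Sub
```

The header line written before each presentation is just there so you can tell where one file ends and the next begins; drop it if you only want the words.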