Merge multiple DOC or RTF files into a single DOC or RTF file using a PHP script
I would like to merge multiple DOC or RTF files into a single file in the same format as the inputs. That is, if a user selects several RTF template files from a list box and clicks a button on a web page, the output should be a single RTF file that combines the selected templates. I have to use PHP for this.
I haven't decided on the format of the template files yet, but it will be either RTF or DOC, and I assume the template files will contain some images as well.
I have spent many hours searching for a library that does this, but still haven't found one.
Please help me out here!! :(
Thanks in advance.
If you are searching for a solution for handling RTF documents only, you can find a PHP package to merge multiple RTF documents here :
www.rtftools.com
Here is a short example on how to merge multiple documents together :
include ( 'path/to/RtfMerger.phpclass' ) ;
$merger = new RtfMerger ( 'sample1.rtf', 'sample2.rtf' ) ; // You can specify docs to be merged to the class constructor...
$merger -> Add ( 'sample3.rtf' ) ; // or by using the Add() method
$merger [] = 'sample4.rtf' ; // or by using the array access methods
$merger -> SaveTo ( 'output.rtf' ) ; // Will save files 'sample1' to 'sample4' into 'output.rtf'
This package allows you to handle documents that are bigger than the available memory.
I've been working on a similar project and haven't managed to find any PHP (or other open source) libraries for manipulating MSWord files. The way I approach it is kind of complicated, but it works. Here's how I would do it (assuming you have a Linux server):
Setup:
- Install JODConverter and OpenOffice
- Start OpenOffice as a server (see http://www.artofsolving.com/node/10)
Approach (i.e. what to do in your PHP code):
- Convert your MSWord or RTF files into ODT format by calling JODConverter via backticks or exec()
- Unzip each file into a temporary directory of its own
- Read the content.xml file from each unzipped document using a DOM parser
- Extract the <office:text> contents from each, and concatenate
- Put this concatenated XML back into the right spot in one of the content.xml files
- Re-zip the contents of that temporary directory and give it an .odt extension
- Use JODConverter to convert this file back to MSWord again
As I said, it's not pretty, but it does the job.
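To make the steps above more concrete, here is a rough sketch of the ODT part in PHP. It is only an illustration under some assumptions, not a tested production solution: the function names (convertWithJod, mergeOfficeText, mergeOdtFiles) and the jar path are made up for this example, the conversion step assumes the JODConverter 2.x command-line jar that takes the input and output paths as positional arguments, and instead of unzipping to a temporary directory it reads and rewrites content.xml in place with PHP's ZipArchive. Note that only the body text is concatenated; images and automatic styles from the second and later documents are not carried over, so documents with pictures would need extra work (copying the Pictures/ entries and updating the manifest).

```php
<?php
// Namespace URI of the "office:" prefix in OpenDocument files.
const OFFICE_NS = 'urn:oasis:names:tc:opendocument:xmlns:office:1.0';

/**
 * Convert a file with JODConverter (assumed: 2.x CLI jar, OpenOffice already
 * running as a server). $jarPath is a hypothetical location on your server.
 */
function convertWithJod(string $jarPath, string $input, string $output): void
{
    $cmd = 'java -jar ' . escapeshellarg($jarPath) . ' '
         . escapeshellarg($input) . ' ' . escapeshellarg($output);
    exec($cmd, $lines, $exitCode);
    if ($exitCode !== 0) {
        throw new RuntimeException("Conversion failed for $input");
    }
}

/**
 * Append the children of <office:text> from every later document to the
 * <office:text> element of the first one and return the merged XML.
 *
 * @param string[] $contentXmls content.xml strings, in output order
 */
function mergeOfficeText(array $contentXmls): string
{
    if (count($contentXmls) === 0) {
        throw new InvalidArgumentException('No documents given');
    }
    $target = new DOMDocument();
    $target->loadXML(array_shift($contentXmls));
    $targetText = $target->getElementsByTagNameNS(OFFICE_NS, 'text')->item(0);
    if ($targetText === null) {
        throw new RuntimeException('First document has no <office:text>');
    }
    foreach ($contentXmls as $xml) {
        $source = new DOMDocument();
        $source->loadXML($xml);
        $sourceText = $source->getElementsByTagNameNS(OFFICE_NS, 'text')->item(0);
        if ($sourceText === null) {
            continue; // nothing to append from this document
        }
        foreach ($sourceText->childNodes as $child) {
            // importNode() copies the node into the target document.
            $targetText->appendChild($target->importNode($child, true));
        }
    }
    return $target->saveXML();
}

/** Merge several .odt files into $outputPath (requires the zip extension). */
function mergeOdtFiles(array $odtPaths, string $outputPath): void
{
    $xmls = [];
    foreach ($odtPaths as $path) {
        $zip = new ZipArchive();
        if ($zip->open($path) !== true) {
            throw new RuntimeException("Cannot open $path");
        }
        $xml = $zip->getFromName('content.xml');
        $zip->close();
        if ($xml === false) {
            throw new RuntimeException("No content.xml in $path");
        }
        $xmls[] = $xml;
    }
    // Start from a copy of the first document so that its styles, manifest
    // and mimetype entry are kept, then swap in the merged content.xml.
    if (!copy($odtPaths[0], $outputPath)) {
        throw new RuntimeException("Cannot write $outputPath");
    }
    $out = new ZipArchive();
    if ($out->open($outputPath) !== true) {
        throw new RuntimeException("Cannot open $outputPath");
    }
    $out->addFromString('content.xml', mergeOfficeText($xmls));
    $out->close();
}
```

With these helpers, the whole flow would be roughly: call convertWithJod() on each selected template to get .odt files, pass those to mergeOdtFiles(), then call convertWithJod() once more on the merged .odt to get the final MSWord or RTF file.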
If you're looking to go down the RTF route, this question may also help: Concatenate RTF files in PHP (REGEX)