Split file with PHP and generate contents

How do I split the content below into separate files without the placeholder tags? I'd also like to take the text inside the placeholder tags and place it in a new contents file.

<div class='placeholder'>The First Chapter</div>

This is some text.

<div class='placeholder'>The Second Chapter</div>

This is some more text.

<div class='placeholder'>Last Chapter</div>

The last chapter.

Thanks.

UPDATE:

I've tried a modified version of MartinodF's code, but can't get it to work.

$text=file_get_contents("t.txt");


$parts = preg_split('/\n?<div class=\'placeholder\'>(.+?)<\/div>\n/im', $text, -1, PREG_SPLIT_DELIM_CAPTURE);
$parts_num = count($parts) / 2;

$titles = $files = array();
for($x = 0; $x < $parts_num - 1; $x++) {
    $titles[] = $parts[$x * 2 + 1];
    $files[] = $parts[$x * 2 + 1] . "\n" . $parts[$x * 2 + 2];
}


var_dump($titles);
var_dump($files);

echo $titles[1];

UPDATE 2: The code no longer relies on a separate txt file, but it still doesn't work.

$text="<div class='placeholder'>The First Chapter</div>
This is some text.
<div class='placeholder'>The Second Chapter</div>
This is some more text.
<div class='placeholder'>Last Chapter</div>
The last chapter.
";


$parts = preg_split('/\n?<div class=\'placeholder\'>(.+?)<\/div>\n/im', $text, -1, PREG_SPLIT_DELIM_CAPTURE);
$parts_num = count($parts) / 2;

$titles = $files = array();
for($x = 0; $x < $parts_num - 1; $x++) {
    $titles[] = $parts[$x * 2 + 1];
    $files[] = $parts[$x * 2 + 1] . "\n" . $parts[$x * 2 + 2];
}


var_dump($titles);
var_dump($files);

echo $titles[1];


Use an XML/HTML parser to walk over the DOM and pull out what you need. SimpleXML and DOMDocument are built directly into PHP. Or you could use something like Zend_Dom_Query or SimpleHTML.
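
As a rough illustration of the parser route, here is a minimal sketch using DOMDocument. The function name splitChapters and the wrapper element with id "root" are assumptions made for this example, not part of any library. It also assumes the chapter bodies are plain text (any markup inside them is reduced to its text) and that the input is ASCII or has its encoding handled separately.

```php
<?php
// Hypothetical helper: returns a list of ['title' => ..., 'body' => ...] pairs.
function splitChapters(string $html): array
{
    $doc = new DOMDocument();
    // Wrap the fragment in one root div so the loose text nodes share a parent.
    $doc->loadHTML('<div id="root">' . $html . '</div>', LIBXML_NOERROR | LIBXML_NOWARNING);

    $xpath = new DOMXPath($doc);
    $root = $xpath->query('//div[@id="root"]')->item(0);

    $chapters = [];
    $current = null; // index of the chapter currently being filled
    foreach ($root->childNodes as $node) {
        $isPlaceholder = $node instanceof DOMElement
            && $node->tagName === 'div'
            && $node->getAttribute('class') === 'placeholder';

        if ($isPlaceholder) {
            // A placeholder div starts a new chapter; its text is the title.
            $chapters[] = ['title' => trim($node->textContent), 'body' => ''];
            $current = count($chapters) - 1;
        } elseif ($current !== null) {
            // Everything up to the next placeholder belongs to the current chapter.
            $chapters[$current]['body'] .= $node->textContent;
        }
    }

    foreach ($chapters as $i => $chapter) {
        $chapters[$i]['body'] = trim($chapter['body']);
    }
    return $chapters;
}

$text = "<div class='placeholder'>The First Chapter</div>\nThis is some text.\n"
      . "<div class='placeholder'>The Second Chapter</div>\nThis is some more text.\n"
      . "<div class='placeholder'>Last Chapter</div>\nThe last chapter.\n";

$chapters = splitChapters($text);
print_r(array_column($chapters, 'title'));
```

Text that appears before the first placeholder is ignored in this sketch, since it belongs to no chapter.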


It seems to me that you can simply use regular expressions...

http://www.roscripts.com/PHP_regular_expressions_examples-136.html -- see the end of the document; there are a few regular expressions for HTML.

... but maybe you presented only a part of your task.
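
If only the titles are needed, a single preg_match_all call may be enough. This is a small sketch, assuming the placeholder tags always look exactly like the ones in the question (single-quoted class attribute, no other attributes); the sample $text is shortened for the example.

```php
<?php
$text = "<div class='placeholder'>The First Chapter</div>\nThis is some text.\n"
      . "<div class='placeholder'>The Second Chapter</div>\nThis is some more text.\n";

// Capture group 1 holds the text between the opening and closing tags.
preg_match_all("~<div class='placeholder'>(.+?)</div>~i", $text, $matches);
$titles = $matches[1];

print_r($titles);
```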


If I understand correctly what you're doing (like extracting the title and contents of each chapter from a script of some sort), MartyIX is right; you can use regular expressions:

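// Split on each placeholder div; PREG_SPLIT_DELIM_CAPTURE keeps the captured titles in the result.
$parts = preg_split('/\n?<div class=\'placeholder\'>(.+?)<\/div>\n/im', $text, -1, PREG_SPLIT_DELIM_CAPTURE);
// $parts[0] is the text before the first placeholder; title/body pairs follow it.
$parts_num = (int) ((count($parts) - 1) / 2);

$titles = $files = array();
for($x = 0; $x < $parts_num; $x++) {
    $titles[] = $parts[$x * 2 + 1];
    $files[] = $parts[$x * 2 + 1] . "\n" . $parts[$x * 2 + 2];
}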

var_dump($titles);
var_dump($files);

$titles will be an array containing all the "titles". Write one on each line and you have your "contents" file (which will work like an index).

$files, on the other hand, will contain each chapter (the title without the tag around it, a newline, and then the text). Write each entry out to a different file to have your text split into chapters.
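
Writing the results to disk could look roughly like the sketch below. The helper name writeChapters and the file names chapter-N.txt and contents.txt are assumptions chosen for illustration. The pattern uses \R instead of \n so that Windows line endings in the input do not stop the split from matching, which is a guess at one possible cause of trouble rather than a confirmed diagnosis.

```php
<?php
// Hypothetical helper: writes one file per chapter plus a contents file,
// and returns the list of titles. $dir must already exist.
function writeChapters(string $text, string $dir): array
{
    // \R matches \n, \r\n or \r.
    $parts = preg_split(
        '~\R?<div class=\'placeholder\'>(.+?)</div>\R?~i',
        $text,
        -1,
        PREG_SPLIT_DELIM_CAPTURE
    );

    $titles = [];
    // $parts[0] precedes the first placeholder; after that the array
    // alternates title, body, title, body, ...
    for ($i = 1; $i + 1 < count($parts); $i += 2) {
        $title = trim($parts[$i]);
        $body = trim($parts[$i + 1]);
        $titles[] = $title;

        $file = sprintf('%s/chapter-%d.txt', $dir, count($titles));
        file_put_contents($file, $title . "\n\n" . $body . "\n");
    }

    // One title per line forms the contents (index) file.
    file_put_contents($dir . '/contents.txt', implode("\n", $titles) . "\n");
    return $titles;
}

$text = "<div class='placeholder'>The First Chapter</div>\nThis is some text.\n"
      . "<div class='placeholder'>The Second Chapter</div>\nThis is some more text.\n"
      . "<div class='placeholder'>Last Chapter</div>\nThe last chapter.\n";

// Write into a fresh temporary directory for the demo.
$dir = sys_get_temp_dir() . '/chapters_' . uniqid();
mkdir($dir);
$titles = writeChapters($text, $dir);
```

To leave the title out of each chapter file, write only $body in the first file_put_contents call.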
