PHP: preg_match_all for the following structure: <h6>my headline</h6> some text ... <h6>another headline</h6> more text
I'm desperately looking for a way to get this text string
<h6>First pane</h6>
... pane content ...
<h6>Second pane</h6>
Hi, this is a comment.
To delete a comment, just log in and view the post's comments.
There you will have the option to edit
or delete them.
<h6>Last pane</h6>
... last pane content ...
parsed into a PHP array.
I need to separate it into:
1.
1.0=> First pane
1.1=> ... pane content ...
2.
2.0=> Second pane
2.1=> Hi, this is a comment.
To delete a comment, just log in and view the post's comments.
There you will have the option to edit
or delete them.
3.
3.0=> Last pane
3.1=> ... last pane content ...
Your regex should look like this:
/<h6>([^<]+)<\/h6>([^<]+)/im
If you run the following script, you'll see that the values you're looking for are in $matches[1] and $matches[2].
$s = "<h6>First pane</h6>
... pane content ...
<h6>Second pane</h6>
Hi, this is a comment.
To delete a comment, just log in and view the post's comments.
There you will have the option to edit
or delete them.
<h6>Last pane</h6>
... last pane content ..";
$r = "/<h6>([^<]+)<\/h6>([^<]+)/im";
$matches = array();
preg_match_all($r,$s,$matches);
print_r($matches);
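To get from those two parallel lists to the nested structure asked for in the question, one option is to zip them together. The following is only a minimal sketch: the input is a shortened copy of the sample text, and $panes is a name chosen here for illustration.

```php
<?php
// Shortened copy of the sample text from the question.
$s = "<h6>First pane</h6>
... pane content ...
<h6>Second pane</h6>
Hi, this is a comment.
<h6>Last pane</h6>
... last pane content ...";

preg_match_all('/<h6>([^<]+)<\/h6>([^<]+)/i', $s, $matches);

// $matches[1] holds the headlines, $matches[2] the text after each one.
// trim() drops the line breaks surrounding each text block, and
// array_map() with a null callback zips the two lists into pairs.
$panes = array_map(null, $matches[1], array_map('trim', $matches[2]));

print_r($panes);
```

Each entry of $panes is then a two-element array: index 0 is the headline and index 1 is the pane text.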
You shouldn't be attempting to parse HTML with a regex. This is doomed to cause much pain and unhappiness for all but the very simplest HTML, and will instantly break if anything in your doc structure changes. Use a proper HTML or DOM parser instead, such as PHP's DOMDocument:
http://php.net/manual/en/class.domdocument.php
For example, you can use getElementsByTagName (http://www.php.net/manual/en/domdocument.getelementsbytagname.php) to get all the h6 elements.
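A rough sketch of how that could look for the text in the question. It assumes the fragment can be wrapped in a <div> (added here only so the loose text stays next to the headings as sibling nodes), and $panes is an illustrative name, not something from the question.

```php
<?php
// Shortened copy of the sample text from the question.
$html = "<h6>First pane</h6>\n... pane content ...\n"
      . "<h6>Second pane</h6>\nHi, this is a comment.\n"
      . "<h6>Last pane</h6>\n... last pane content ...";

// Collect parser warnings internally instead of emitting them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
// Assumption: wrapping in a <div> keeps the headings and the loose
// text between them as direct siblings under one parent.
$doc->loadHTML('<div>' . $html . '</div>');
libxml_clear_errors();

$panes = [];
foreach ($doc->getElementsByTagName('h6') as $h6) {
    $content = '';
    // Walk the siblings after this heading until the next <h6>.
    for ($node = $h6->nextSibling; $node !== null; $node = $node->nextSibling) {
        if ($node->nodeName === 'h6') {
            break;
        }
        $content .= $node->textContent;
    }
    $panes[] = [trim($h6->textContent), trim($content)];
}

print_r($panes);
```

Unlike the regex approaches, this keeps working if a pane contains inline markup such as <b> or <a>, because textContent flattens nested elements into text.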
I believe the PREG_SET_ORDER flag is what you're looking for.
$regex = '~<h6>([^<]+)</h6>\s*([^<]+)~i';
preg_match_all($regex, $source, $matches, PREG_SET_ORDER);
This way, each element in the $matches array is an array containing the overall match followed by all of the group captures for a single match attempt. The result up to the first match looks like this:
Array ( [0] => Array ( [0] => <h6>First pane</h6> ... pane content ... [1] => First pane [2] => ... pane content ... )
see it in action on ideone
EDIT: Notice the \s* I added, too. Without that, the captured content always starts with a line separator.
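Put together as a runnable sketch, using a shortened copy of the sample text. The $panes array and the rtrim() call are additions for illustration; rtrim() only strips the trailing line break that the second group still captures.

```php
<?php
// Shortened copy of the sample text from the question.
$source = "<h6>First pane</h6>\n... pane content ...\n"
        . "<h6>Second pane</h6>\nHi, this is a comment.\n"
        . "<h6>Last pane</h6>\n... last pane content ...";

$regex = '~<h6>([^<]+)</h6>\s*([^<]+)~i';
preg_match_all($regex, $source, $matches, PREG_SET_ORDER);

$panes = [];
foreach ($matches as $m) {
    // $m[0] is the whole match, $m[1] the headline, $m[2] the pane text.
    $panes[] = [$m[1], rtrim($m[2])];
}

print_r($panes);
```

With PREG_SET_ORDER there is one $matches entry per pane, so the loop maps directly onto the numbered structure in the question.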