Is it possible to replace div contents with a regular expression?
Is it possible to write a regular expression to replace everything between <div id="somevalue123" class="text-block"> and </div>? I can do this, but the problem I am having is that there are other div nodes within the string.
Here is the current regular expression that I am using:
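public static function replaceStringBetween($start, $end, $new, $source, $limit = 1)
{
    // Reinitialize the replacement count
    self::$replacement_count = 0;

    // Try to perform the replacement, treating $start and $end as literal text.
    // preg_quote() is given the '#' delimiter so that it is escaped as well, and
    // ${1}/${3} keep the group references unambiguous if $new begins with a digit.
    // Note that (.*) is greedy, so it runs to the LAST occurrence of $end.
    $result = preg_replace('#(' . preg_quote($start, '#') . ')(.*)(' . preg_quote($end, '#')
        . ')#is', '${1}' . $new . '${3}', $source, $limit, $count);
    if ($count > 0)
    {
        self::$replacement_count++;
        return $result;
    }

    // As a fallback, try again with a different method
    // ($start and $end are used as raw regex patterns here)
    $result = preg_replace("#{$start}(.*){$end}#is", $new, $source, $limit, $count);
    if ($count > 0)
    {
        self::$replacement_count++;
        return $result;
    }

    // Return the original
    return $source;
}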
I am passing an HTML file as the source, of course. Thanks
A simple-to-use PHP parser which I have used to do exactly this in the past is the Simple HTML DOM Parser. You would use the selector div#somevalue123.
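To give a rough idea of what the parser route looks like, here is a minimal sketch. Note that it uses PHP's bundled DOMDocument class rather than Simple HTML DOM Parser, only so that it runs without a third-party include. The function name replaceInnerHtmlById is made up for illustration, and the sketch assumes the DOM extension is enabled and that the replacement markup is well-formed XML (a requirement of appendXML()).

```php
<?php
// Hypothetical helper (not from the original post): replace the inner HTML of
// the <div> carrying the given id, using PHP's built-in DOM extension.
function replaceInnerHtmlById(string $source, string $id, string $newHtml): string
{
    $doc = new DOMDocument();

    // Real-world HTML is rarely perfect; collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($source);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('div') as $div) {
        if ($div->getAttribute('id') !== $id) {
            continue;
        }

        // Build the replacement first; appendXML() returns false on malformed input.
        $fragment = $doc->createDocumentFragment();
        if (!$fragment->appendXML($newHtml)) {
            return $source;
        }

        // Drop every existing child, nested divs included.
        while ($div->firstChild !== null) {
            $div->removeChild($div->firstChild);
        }
        $div->appendChild($fragment);

        return $doc->saveHTML();
    }

    // No div with that id: return the input unchanged.
    return $source;
}
```

Because the document is parsed into a tree, nested div nodes inside the target are removed along with the rest of its children, and sibling divs are left alone. Keep in mind that saveHTML() re-serializes the whole document, so whitespace and the doctype may come back slightly different from the input.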
Regular expressions are not capable of supporting arbitrary nesting. You may want to consider a push-down automaton (parser) for arbitrary nesting.
In practice, you could design a series of regular expressions to parse a fixed number of these. However, once you start getting into handling error conditions and (parse) errors, you are really trying to shoehorn a regular expression into the place of a parser.
It seems you may want to reconsider your approach and design in the modularity you seek from the start, rather than putting it in after the fact by using a regular expression bait-and-switch.
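To illustrate the nesting point, here is a small sketch of the kind of bookkeeping a single greedy or lazy pattern cannot do by itself. It uses a regular expression only to locate div tags and keeps a depth counter (effectively a one-symbol stack) to find the closing tag that matches the opening one. The function name replaceDivContents is hypothetical, and the sketch assumes reasonably clean markup: it does not account for self-closing divs, or for "<div" appearing inside comments, scripts, or attribute values.

```php
<?php
// Hypothetical helper (not from the original post): replace everything between
// a literal opening <div ...> tag and its MATCHING </div>, counting nested divs.
function replaceDivContents(string $source, string $openTag, string $new): string
{
    $start = strpos($source, $openTag);
    if ($start === false) {
        return $source; // opening tag not found
    }
    $contentStart = $start + strlen($openTag);

    // Collect every <div ...> or </div> tag after the opening tag, with offsets
    // measured from the beginning of $source.
    preg_match_all(
        '#<div\b[^>]*>|</div\s*>#i',
        $source,
        $matches,
        PREG_OFFSET_CAPTURE,
        $contentStart
    );

    $depth = 1; // we are already inside the target div
    foreach ($matches[0] as [$tag, $offset]) {
        // A closing tag has '/' as its second character.
        $depth += ($tag[1] === '/') ? -1 : 1;
        if ($depth === 0) {
            // $offset is where the matching </div> begins.
            return substr($source, 0, $contentStart) . $new . substr($source, $offset);
        }
    }

    return $source; // unbalanced markup: leave the input untouched
}
```

Once you need to handle the cases this sketch ignores, you are writing a parser by hand, which is the point at which an existing HTML parser is the better tool.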