Is it possible to replace div contents with a regular expression?

Is it possible to write a regular expression to replace everything between <div id="somevalue123" class="text-block"> and </div>? I can do this, but the problem I am having is that there are other div nodes within the string.

Here is the current regular expression that I am using:

public static function replaceStringBetween($start, $end, $new, $source, $limit = 1)
{
    // Reinitialize the replacement count
    self::$replacement_count = 0;

    // Try to perform the replacement
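    // Pass '#' to preg_quote() so a '#' inside $start or $end cannot end the pattern early
    $result = preg_replace('#(' . preg_quote($start, '#') . ')(.*)(' . preg_quote($end, '#')
        . ')#is', '$1' . $new . '$3', $source, $limit, $count);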
    if ($count > 0)
    {
        self::$replacement_count++;
        return $result;
    }

    // As a fallback, try again with a different method
    $result = preg_replace("#{$start}(.*){$end}#is", $new, $source, $limit, $count);
    if ($count > 0)
    {
        self::$replacement_count++;
        return $result;
    }

    // Return the original
    return $source;
}

I am passing an HTML file as the source, of course. Thanks


A simple-to-use PHP parser that I have used to do exactly this in the past is the Simple HTML DOM Parser. You would use the selector div#somevalue123.
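To illustrate the parser-based approach without a third-party dependency, here is a minimal sketch that uses PHP's bundled DOM extension (DOMDocument and DOMXPath) instead of the Simple HTML DOM Parser. The helper name replaceDivContents() is hypothetical, and the sketch assumes the id contains no quote characters and that returning a full HTML document from saveHTML() is acceptable.

```php
<?php
// Hypothetical helper: replace everything inside <div id="$id"> with plain text.
// Uses the bundled DOM extension, not the Simple HTML DOM Parser mentioned above.
function replaceDivContents(string $source, string $id, string $newText): string
{
    $doc = new DOMDocument();

    // Real-world HTML is rarely valid; collect libxml warnings instead of emitting them.
    libxml_use_internal_errors(true);
    $doc->loadHTML($source);
    libxml_clear_errors();

    // Assumption: $id contains no double quotes, so it can be embedded in the XPath literal.
    $xpath = new DOMXPath($doc);
    $div = $xpath->query('//div[@id="' . $id . '"]')->item(0);
    if ($div === null) {
        return $source; // no such div: return the input unchanged
    }

    // Remove every child node, nested divs included.
    while ($div->firstChild !== null) {
        $div->removeChild($div->firstChild);
    }
    $div->appendChild($doc->createTextNode($newText));

    // Note: this serializes the whole document, including the wrapper elements libxml adds.
    return $doc->saveHTML();
}
```

Because the parser builds a tree, the nested div elements are simply children of the target node, so no special handling of nesting is needed.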


Regular expressions, in the formal sense, cannot match arbitrarily nested structures. For arbitrary nesting, you may want to consider a push-down automaton, which is what a parser gives you.
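The core of what a push-down automaton adds is a memory of how deep you are. As a rough sketch of that idea (not a real HTML parser), the code below uses a regular expression only to find individual div tags and keeps a depth counter to locate the matching </div>. The helper names findMatchingDivClose() and replaceDivBody() are hypothetical, and the sketch assumes well-formed markup with no "<div" text inside comments, scripts, or attribute values.

```php
<?php
// Hypothetical helper: given the offset of an opening <div ...> tag,
// return the offset of its matching </div>, or null if the tags are unbalanced.
function findMatchingDivClose(string $source, int $openPos): ?int
{
    // The regex only tokenizes: it lists every opening or closing div tag from $openPos on.
    preg_match_all(
        '#<(/?)div\b[^>]*>#i',
        $source,
        $tags,
        PREG_SET_ORDER | PREG_OFFSET_CAPTURE,
        $openPos
    );

    $depth = 0; // the counter is the "stack" a plain regex lacks
    foreach ($tags as $tag) {
        $isClose = ($tag[1][0] === '/');
        $depth += $isClose ? -1 : 1;
        if ($depth === 0) {
            return $tag[0][1]; // offset of the </div> that closes the first tag
        }
    }
    return null;
}

// Hypothetical helper: replace everything between $openTag and its matching </div>.
function replaceDivBody(string $source, string $openTag, string $new): string
{
    $openPos = strpos($source, $openTag);
    if ($openPos === false) {
        return $source; // opening tag not found
    }
    $closePos = findMatchingDivClose($source, $openPos);
    if ($closePos === null) {
        return $source; // unbalanced markup: leave the input alone
    }
    $bodyStart = $openPos + strlen($openTag);
    return substr($source, 0, $bodyStart) . $new . substr($source, $closePos);
}
```

For example, replaceDivBody($html, '<div id="somevalue123" class="text-block">', 'new text') would skip over any inner div pairs before stopping. Every edge case this sketch ignores is one that a real parser already handles.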

In practice, you could design a series of regular expressions to handle a fixed number of nesting levels. However, once you start handling error conditions and parse errors, you are really trying to shoehorn a regular expression into the place of a parser.

It seems you may want to reconsider your approach and design in the modularity you seek, rather than bolting it on after the fact with a regular-expression bait-and-switch.
