
PHP: preg_replace (x) occurrence?

I asked a similar question recently, but didn't get a clear answer because I was too specific. This one is broader.

Does anyone know how to replace the (x)th occurrence of a regex pattern?

Example: Let's say I wanted to replace the 5th occurrence of the regex pattern in a string. How would I do that?

Here is the pattern: preg_replace('/{(.*?)\|\:(.*?)}/', 'replacement', $this->source);

@anubhava REQUESTED SAMPLE CODE (last function doesn't work):


$sample = 'blah asada asdas  {load|:title} steve jobs {load|:css} windows apple ';


$syntax = new syntax();
$syntax->parse($sample);


class syntax {

    protected $source;
    protected $i;
    protected $r;

    // parse source
    public function parse($source) {
        // set source to protected class var
        $this->source = $source;

        // match all occurrences for regex and run loop
        $output = array();
        preg_match_all('/\{(.*?)\|\:(.*?)\}/', $this->source, $output);

        // run loop
        $i = 0;
        foreach($output[0] as $key):
            // perform run function for each occurrence, send first match before |: and second match after |:
            $this->run($output[1][$i], $output[2][$i], $i);

            $i++;
        endforeach;

        echo $this->source;

    }

    // run function
    public function run($m, $p, $i) {
        // if method is load perform actions and run inject
        switch($m):

            case 'load':
                $this->inject($i, 'content');
            break;

        endswitch;

    }

    // this function should inject the modified data, but I'm still working on this.
    private function inject($i, $r) {

        $output = preg_replace('/\{(.*?)\|\:(.*?)\}/', $r, $this->source);

    }


}



You're misunderstanding regular expressions: a pattern is stateless. It has no memory and no ability to count, so it can't know that a match is the x-th match in a string. You can't do this kind of thing with a regex alone for the same reason that a classical regular expression can't check whether a string has balanced brackets: the problem requires memory, which regular expressions, by definition, do not have.

However, a regex engine can tell you all the matches, so you're better off using preg_match_all() to get a list of matches, and then modifying the string yourself using that information.
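As a rough sketch of that idea: preg_match_all() (the variant that returns every match rather than only the first) can be asked for byte offsets with PREG_OFFSET_CAPTURE, so the nth match can be picked out by index. The sample string and the variable names here are only illustrative.

```php
<?php
// Illustrative input, shortened from the question's sample.
$source = 'blah {load|:title} steve jobs {load|:css} windows apple';

// PREG_OFFSET_CAPTURE turns each entry into a [matched text, byte offset] pair.
preg_match_all('/\{(.*?)\|\:(.*?)\}/', $source, $matches, PREG_OFFSET_CAPTURE);

// $matches[0] lists the full matches in order, so index 1 is the 2nd occurrence.
list($text, $offset) = $matches[0][1];
echo "2nd match: $text at offset $offset\n";
```

With the text and offset in hand, the string can be rebuilt around that one match using substr().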

Update: is this closer to what you're thinking of?

<?php
class Parser {

    private $i;

    public function parse($source) {
        $this->i = 0;
        return preg_replace_callback('/\{(.*?)\|\:(.*?)\}/', array($this, 'on_match'), $source);
    }

    private function on_match($m) {
        $this->i++;

        // Do whatever processing you need on the match.
        print_r(array('m' => $m, 'i' => $this->i));

        // Return what you want the replacement to be.
        return $m[0] . '=>' . $this->i;
    }
}

$sample = 'blah asada asdas  {load|:title} steve jobs {load|:css} windows apple ';
$parse = new Parser();
$result = $parse->parse($sample);
echo "Result is: [$result]\n";

Which gives...

Array
(
    [m] => Array
        (
            [0] => {load|:title}
            [1] => load
            [2] => title
        )

    [i] => 1
)
Array
(
    [m] => Array
        (
            [0] => {load|:css}
            [1] => load
            [2] => css
        )

    [i] => 2
)
Result is: [blah asada asdas  {load|:title}=>1 steve jobs {load|:css}=>2 windows apple ]


A much simpler and cleaner solution, which also deals with backreferences:

function preg_replace_nth($pattern, $replacement, $subject, $nth=1) {
    return preg_replace_callback($pattern,
        function($found) use (&$pattern, &$replacement, &$nth) {
                $nth--;
                if ($nth==0) return preg_replace($pattern, $replacement, reset($found) );
                return reset($found);
        }, $subject,$nth  );
}


echo preg_replace_nth("/(\w+)\|/", '${1} is the 4th|', "|aa|b|cc|dd|e|ff|gg|kkk|", 4);   

outputs |aa|b|cc|dd is the 4th|e|ff|gg|kkk|


As has already been said, a regex has no state, so you can't pinpoint the exact match to replace just by passing an integer. You could, however, wrap the replacement in a function that finds all matches and replaces only the nth one, given as an integer:

<?php

function replace_nth_occurence ( &$haystack, $pattern, $replacement, $occurence) {

    preg_match_all($pattern, $haystack, $matches, PREG_OFFSET_CAPTURE);
    if(array_key_exists($occurence-1, $matches[0])) {
        $haystack = substr($haystack, 0, $matches[0][$occurence-1][1]).
                      $replacement.
                    substr($haystack, 
                        $matches[0][$occurence-1][1] +
                        strlen($matches[0][$occurence-1][0])
                      );
    }

}


$haystack = "test0|:test1|test2|:test3|:test4|test5|test6"; 

printf("%s \n", $haystack);

replace_nth_occurence( $haystack, '/\|:/', "<=>", 2);

printf("%s \n", $haystack);

?>


Here is an alternative approach:

$parts = preg_split('/\{((?:.*?)\|\:(?:.*?))\}/', $this->source, -1, PREG_SPLIT_DELIM_CAPTURE);

$parts will contain original string parts at even offsets [0] [2] [4] [6] [8] [10] ...

And the matched delimiters will be at [1] [3] [5] [7] [9]

To replace the 5th occurrence, for example, you could then modify element $n*2 - 1, which would be element [9] in this case:

$parts[5*2 - 1] = $replacement;

Then reassemble everything:

$output = implode($parts);


There is no literal way to match only occurrence 5 of a pattern /pat/. But you could match /^(.*?(?:pat.*?){4,4})pat/ and replace it with \1repl. Group 1 captures the first 4 occurrences and everything around them, so they are written back unchanged, and only the fifth occurrence is replaced with repl.

If /pat/ contains capture groups, you would need to use the non-capturing equivalent for the first N-1 matches. The replacement string should then reference the captured groups of the Nth match starting from \2, because \1 is taken by the prefix.

The implementation looks like:

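function replace_occurrence($pat_cap, $pat_noncap, $repl, $sample, $n)
{
    $nmin = $n - 1;
    // Group 1 holds the first n-1 matches plus the text around them.
    // "\${1}" writes it back and stays unambiguous if $repl starts with a digit.
    return preg_replace(
        "/^(.*?(?:$pat_noncap.*?){" . "$nmin,$nmin" . "})$pat_cap/",
        "\${1}$repl",
        $sample
    );
}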
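To make the capturing versus non-capturing split concrete, here is a small self-contained sketch applied to the question's pattern. The function name replace_nth_single_regex and the sample string are made up for illustration, and the "s" modifier is my own addition so that "." can also cross newlines.

```php
<?php
// Hypothetical helper using the same single-regex idea as above.
function replace_nth_single_regex($pat_cap, $pat_noncap, $repl, $subject, $n)
{
    $skip = $n - 1;
    // Group 1 swallows everything up to and including the first n-1 matches.
    $regex = "/^(.*?(?:$pat_noncap.*?){" . $skip . "})$pat_cap/s";
    // '${1}' restores the skipped prefix; $repl may refer to $2, $3, ...
    return preg_replace($regex, '${1}' . $repl, $subject, 1);
}

$sample = 'a {load|:title} b {load|:css} c {load|:js} d';
$result = replace_nth_single_regex(
    '\{(.*?)\|\:(.*?)\}',      // capturing form: its groups become $2 and $3
    '\{(?:.*?)\|\:(?:.*?)\}',  // non-capturing form for the skipped matches
    '[$3]',                    // replace the tag with its second part in brackets
    $sample,
    2
);
echo $result, "\n"; // prints "a {load|:title} b [css] c {load|:js} d"
```

If the string has fewer than n matches, the regex simply fails to match and the input comes back unchanged.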


My first idea was to use preg_replace with a callback and do the counting in the callback, as other users have (excellently) demonstrated.

Alternatively, you can use preg_split, keeping the delimiters with PREG_SPLIT_DELIM_CAPTURE, and do the actual replacement in the resulting array. PHP only captures what's between capturing parentheses, so you'll either have to adapt the regex or take care of other captures yourself. Assuming one capturing pair, the captured delimiters will always be at the odd-numbered indexes: 1, 3, 5, 7, 9, .... For the 5th occurrence you'll want index 9; replace it, then implode the array again.

This does imply you'll need a single capturing group around the whole pattern:

$sample = "blah asada asdas  {load|:title} steve jobs {load|:css} windows apple\n";
$sample .= $sample . $sample;   # at least 5 occurrences

$parts = preg_split('/(\{.*?\|\:.*?\})/', $sample, -1, PREG_SPLIT_DELIM_CAPTURE);
$parts[9] = 'replacement';
$return = implode('', $parts);
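The hard-coded index 9 can be generalized. The following is only a sketch: replace_nth_by_split is a hypothetical helper name, and it assumes the pattern passed in has no capturing groups of its own, since the helper adds the single group that preg_split() needs.

```php
<?php
// Hypothetical helper: replace only the $n-th match (1-based) of $pattern.
// $pattern is given without delimiters and must not contain capturing groups.
function replace_nth_by_split($pattern, $replacement, $subject, $n)
{
    $parts = preg_split('/(' . $pattern . ')/', $subject, -1, PREG_SPLIT_DELIM_CAPTURE);
    $index = 2 * $n - 1;            // captured delimiters sit at odd indexes
    if (isset($parts[$index])) {    // no n-th match: leave the string alone
        $parts[$index] = $replacement;
    }
    return implode('', $parts);
}

$sample = 'x {load|:title} y {load|:css} z';
$result = replace_nth_by_split('\{.*?\|\:.*?\}', 'replacement', $sample, 2);
echo $result, "\n"; // prints "x {load|:title} y replacement z"
```

Because empty pieces are kept (no PREG_SPLIT_NO_EMPTY), the odd-index rule still holds when the string starts with a match.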