
PHP regex: matching a repeating pattern

I want to match strings like those below.

abc|q:1,f:2
cba|q:1,f:awd2,t:3awd,h:gr

I am using PHP and have tried both preg_match and preg_match_all with this expression.

/^([a-z]+)\|([a-z]+:[a-z0-9]+,?)+$/iU

This only returns the part before the pipe and a single key:value pair. What am I doing wrong, why does it behave this way, and how can I make it work?


/^([a-z]+)\|((?:[a-z]+:[a-z0-9]+,?)+)$/iU

would capture:

  • the part before the pipe
  • the part after the pipe

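This is not a matter of greediness: when a capturing group is itself repeated, it only keeps the text from its last iteration, so ([a-z]+:[a-z0-9]+,?)+ ends up holding just the final key:value pair. Wrapping a repeated non-capturing group in a single capturing group, as above, captures the whole list.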
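A quick way to see the difference is to run both patterns against one of the sample strings. This is only an illustrative sketch; the variable names ($original, $fixed, $a, $b) are made up for the example.

```php
<?php
// Sample string taken from the question.
$string = 'cba|q:1,f:awd2,t:3awd,h:gr';

// Pattern from the question: the capturing group itself is repeated.
$original = '/^([a-z]+)\|([a-z]+:[a-z0-9]+,?)+$/iU';
// Pattern from this answer: a non-capturing group is repeated
// inside a single capturing group.
$fixed = '/^([a-z]+)\|((?:[a-z]+:[a-z0-9]+,?)+)$/iU';

preg_match($original, $string, $a);
preg_match($fixed, $string, $b);

echo $a[2], "\n"; // h:gr (only the last pair)
echo $b[2], "\n"; // q:1,f:awd2,t:3awd,h:gr (the full list)
```

In both cases group 1 holds "cba"; only the content of group 2 changes.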

/(?ms)^((?:[a-z]+)\|(?:[a-z]+:[a-z0-9]+,?)+)$/iU

would capture the whole line.

Note the '?:', which makes a group non-capturing.
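As a rough sketch of how that whole-line pattern might be used, the snippet below runs it over a multi-line string with preg_match_all. The $text input is hypothetical (the two sample lines from the question joined by a newline). With the m option, ^ and $ match at line boundaries, so each matching line is captured separately.

```php
<?php
// Hypothetical input: the two sample lines from the question.
$text = "abc|q:1,f:2\ncba|q:1,f:awd2,t:3awd,h:gr";

// Whole-line pattern from this answer; the only capturing group
// is the outer one, so $lines[1] holds one entry per matching line.
$pattern = '/(?ms)^((?:[a-z]+)\|(?:[a-z]+:[a-z0-9]+,?)+)$/iU';

preg_match_all($pattern, $text, $lines);

print_r($lines[1]);
```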


I just tried:

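<?php
$string = 'cba|q:1,f:awd2,t:3awd,h:gr';
// One key:value pair, e.g. "f:awd2".
$subpat = '[a-z]+:[a-z0-9]+';
// Group 1: the part before the pipe. Group 2: the comma-separated pairs.
$pat = "/^([a-z]+)\|($subpat(?:,$subpat)+)$/i";
preg_match( $pat, $string, $matches );
print_r( $matches );
?>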

which yields

Array
(
    [0] => cba|q:1,f:awd2,t:3awd,h:gr
    [1] => cba
    [2] => q:1,f:awd2,t:3awd,h:gr
)

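At this point the part before the vertical bar is in $matches[1] and the rest is in $matches[2]. Spelling out $subpat twice ensures that the pairs are properly separated by commas, with no missing or trailing comma. Note that (?:,$subpat)+ requires at least two pairs; use * instead of + if a single pair should also be accepted. After that, apply explode to $matches[2].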
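The explode step could look something like the sketch below. To keep it self-contained, $pairs is hard-coded to the value that $matches[2] has in the output above; $pairs, $data, $key and $value are names chosen for this example.

```php
<?php
// Stand-in for $matches[2] from the preg_match call above.
$pairs = 'q:1,f:awd2,t:3awd,h:gr';

$data = [];
foreach (explode(',', $pairs) as $pair) {
    // The regex guarantees each pair contains a colon; split on the first one.
    list($key, $value) = explode(':', $pair, 2);
    $data[$key] = $value;
}

print_r($data);
```

This yields an associative array mapping q, f, t and h to their values, all as strings.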


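$string = 'cba|q:1,f:awd2,t:3awd,h:gr';

// The x modifier allows whitespace inside the pattern for readability.
// First alternative: the lead before the pipe. Second: one key:value pair.
$re = '~(?:  ^(\w+)\| ) | (?: (\w+) : (\w+) (?:,|$) )~x';
preg_match_all($re, $string, $m, PREG_SET_ORDER);
var_dump($m);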

This matches the part before the pipe (the "lead") and every key:value pair in a single pass. The lead ends up in $m[0][1]; each following match $m[1], $m[2], and so on holds a key in [2] and its value in [3]. A little post-processing turns this into a usable form, for example:

$lead = $m[0][1];
$data = array();  // initialize before filling
foreach(array_slice($m, 1) as $p)
    $data[$p[2]] = $p[3];
var_dump($lead, $data);