Capturing a pattern of unknown repetition in PCRE

This may be a quick question for experienced regular expressionists, but I'm having trouble getting my match to execute correctly.

Suppose I had a string that looked like this:

http://aaa-bbbb-cc-ddddd-eee-.sub.dom

I would like to capture each of the "aaa", "bbbb", "cc", and "ddddd" substrings, but I don't know in advance how many there will be (e.g., there could be triplets all the way up through "zzz").

This is the regular expression I'm trying to use right now:

/http:\/\/(\w*?\-)+\.sub\.dom/

I wrote it this way because:

  1. I want to match substrings, but I want each to terminate when a - is parsed
  2. I want to capture one or more of these substrings

But it seems to save only the last match that it makes (in the above case, the group would only contain "eee-").

Is there a good way to capture all of the matched substrings?

More information: I'm using PHP's PCRE function preg_replace_callback. Thanks!


No, it is not possible to match an unknown number of capture groups.

If you try to repeat a capture group, it will always contain the last value captured.
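To make this concrete, here is a minimal sketch that runs the pattern from the question (with the redundant escape on the dash dropped) against the sample URL. The variable names are only illustrative.

```php
<?php
$subject = 'http://aaa-bbbb-cc-ddddd-eee-.sub.dom';

// The group (\w*?-) is repeated by +, but there is still only ONE group,
// so each iteration overwrites what the previous one captured.
preg_match('/http:\/\/(\w*?-)+\.sub\.dom/', $subject, $m);

$wholeMatch = $m[0]; // the entire URL
$lastOnly   = $m[1]; // only the final iteration of the group

echo $lastOnly, "\n";
```

The group ends up holding "eee-" because that was the last iteration before `\.sub\.dom` matched; the earlier iterations are not kept anywhere.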

Could you explain a bit more broadly what you're trying to do? Perhaps there is another simple way to do it (possibly without regular expressions).
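If a regex-only route is acceptable, one possible workaround is to stop repeating a group and instead let preg_match_all find one item per match, chaining the matches with the `\G` anchor. This is a sketch under the assumption that you only need the list of items; it does not verify the ".sub.dom" suffix.

```php
<?php
$subject = 'http://aaa-bbbb-cc-ddddd-eee-.sub.dom';

// First alternative: start at "http://".
// Second alternative: \G continues exactly where the previous match ended;
// (?!^) keeps \G from also matching at the very start of the subject.
preg_match_all('/(?:^http:\/\/|\G(?!^))(\w+)-/', $subject, $matches);

$parts = $matches[1]; // one entry per "item-" found in a row
print_r($parts);
```

Matching stops at the first position where `(\w+)-` no longer follows directly, which here is the ".sub.dom" part.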


If you want the subdomain, and then each item between the dashes, this should work:

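$string = "http://aaa-bbbb-cc-ddddd-eee-.sub.dom";

// Capture everything between "http://" and the first dot.
if (preg_match('/^http:\/\/([\w-]+?)\./i', $string, $match)) {
    // Trim the trailing dash first, otherwise explode() leaves an empty last element.
    $parts = explode('-', rtrim($match[1], '-'));
    print_r($parts);
}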

If that doesn't do it for you, you will probably have to write a small parser for the string yourself.
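Since the question mentions preg_replace_callback, the same idea can be sketched there: capture the whole dashed run as one group and split it inside the callback. The function name `rewriteHost` and the upper-casing step are hypothetical, chosen only for illustration; the question does not say what the replacement should do.

```php
<?php
// Hypothetical helper, not from the question.
function rewriteHost(string $url): string
{
    return preg_replace_callback(
        // Group 1: scheme, group 2: the whole dashed run, group 3: the fixed suffix.
        '/^(http:\/\/)((?:\w+-)+)(\.sub\.dom)$/',
        function (array $m): string {
            // $m[2] looks like "aaa-bbbb-cc-ddddd-eee-"; drop the final dash, then split.
            $parts = explode('-', rtrim($m[2], '-'));
            // Illustrative transformation only: upper-case each item.
            $parts = array_map('strtoupper', $parts);
            // Reassemble, restoring the trailing dash.
            return $m[1] . implode('-', $parts) . '-' . $m[3];
        },
        $url
    );
}

echo rewriteHost('http://aaa-bbbb-cc-ddddd-eee-.sub.dom'), "\n";
```

A URL that does not match the pattern is returned unchanged, because preg_replace_callback only rewrites the parts it matched.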
