Finding an opening and closing tag with a regexp

Is there a way to find custom tags with a regexp, i.e. match

{a}sometext{/a}  

As well as

{c=#fff}sometext{/c}  

So that it finds the entire block of inner content? The problem is that sometext could itself contain another tag, as in:

{a=http://www.google.com}{b}Hello, world{/b}{/a}  

The only solutions I can come up with would match from {a... to .../b} when I want {a... to .../a}. Is there a single-regexp solution, or would it be best to match the start, then use another method to find the end by searching from the back, and grab the content that way? I'm using PHP 5.2, so I have all the options that entails.


This works:

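$subject = 'bla bla{a=http://www.google.com}{b}Hello, world{/b}{/a} bla';
// {a} or {a=value}, then the inner content (group 1), then {/a}.
// The lazy (.*?) stops at the first {/a}, so this assumes {a} is never nested inside another {a}.
$regex = '~\\{a(?:=[^}]+)?\\}(.*?)\\{/a\\}~';
preg_match($regex, $subject, $matches);
var_dump($matches);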

gives:

array(2) {
  [0]=>
  string(48) "{a=http://www.google.com}{b}Hello, world{/b}{/a}"
  [1]=>
  string(19) "{b}Hello, world{/b}"
}

BEGIN EDIT You could make the regex more general with backreferences

$regex = '~\\{([a-z]+)(?:=[^}]+)?\\}(.*?)\\{/\\1\\}~';

but in that case, I have no idea how to match inner tags of arbitrary depth. END EDIT
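One possible route to arbitrary depth is PCRE's recursion syntax, (?R), which re-enters the whole pattern. The following is only a sketch: it assumes the text between tags never contains a stray { or }, and I have not checked it against PHP 5.2 specifically.

```php
<?php
// Sketch only: (?R) re-enters the whole pattern, so each nested block is consumed as a unit.
// \1 requires the closing tag name to equal the opening name at every nesting level.
// Assumption: the text between tags contains no stray '{' or '}'.
$regex = '~\\{([a-z]+)(?:=[^}]+)?\\}((?:[^{}]++|(?R))*)\\{/\\1\\}~';

$subject = '{a}outer {a}inner{/a} tail{/a}';
preg_match($regex, $subject, $matches);
// $matches[0] holds the whole outer block, $matches[1] the tag name,
// and $matches[2] the inner content, nested tags included.
```

Unlike the lazy (.*?) version, this does not stop at the first {/a} when an {a} block sits inside another {a} block.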

However, I strongly recommend against using a regular expression for this purpose. I suggest you iterate over the string one character at a time and use an auxiliary stack to keep track of the tags you find (use array_push and array_pop, with end to peek at the top).
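A minimal sketch of the stack idea follows. The function name find_tag_blocks and the returned array layout are made up for illustration, and as a shortcut it locates the tags with preg_match_all rather than walking the string character by character. Unmatched closing tags are simply skipped.

```php
<?php
// Hypothetical helper: pairs {tag}...{/tag} blocks using a stack.
// Returns one entry per properly closed block, innermost blocks first.
function find_tag_blocks($text)
{
    // Tokenize: every {name}, {name=value} or {/name}, with its byte offset.
    preg_match_all('~\\{(/?)([a-z]+)(?:=([^}]*))?\\}~', $text, $tokens,
        PREG_SET_ORDER | PREG_OFFSET_CAPTURE);

    $stack  = array(); // tags that are currently open
    $blocks = array(); // completed blocks

    foreach ($tokens as $t) {
        $raw     = $t[0][0];
        $offset  = $t[0][1];
        $closing = ($t[1][0] === '/');
        $name    = $t[2][0];

        if (!$closing) {
            // Group 3 is absent (or has offset -1) when the tag has no =value part.
            $hasValue = isset($t[3]) && $t[3][1] >= 0;
            array_push($stack, array(
                'name'  => $name,
                'value' => $hasValue ? $t[3][0] : null,
                'start' => $offset + strlen($raw), // where the inner content begins
            ));
            continue;
        }

        $top = end($stack); // peek; false when the stack is empty
        if ($top === false || $top['name'] !== $name) {
            continue; // unmatched closing tag: ignored in this sketch
        }
        array_pop($stack);
        $blocks[] = array(
            'name'  => $name,
            'value' => $top['value'],
            'inner' => substr($text, $top['start'], $offset - $top['start']),
        );
    }
    return $blocks;
}
```

Called on the example string from the question, this should yield the {b} block with inner text "Hello, world" first, followed by the {a} block whose inner text is "{b}Hello, world{/b}" and whose value is the URL.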


It sounds like you are trying to do what MediaWiki already does with its wiki markup language. I would suggest using their parser and their markup, or, if you choose to roll your own, you might find inspiration in seeing how they do it.

Manual for Parser.php

Source for Parser.php
