Regular expression to extract another expression from a string with delimiters

2023-03-30 05:20

This question is a little odd, and I have spent a fair while pushing my knowledge of regular expressions to get this to the point it is at. I'm stuck at the last little bit however. The problem is as follows:

I have a string (a URL pattern in a routing system I'm modifying) that may contain a regular expression to match some segment. For example:

$pattern = "/some/path/to/</[a-z]+/>regex_var1/location";

The important bits to note here are:

The regular expression is delimited within the string by </ and /> (for legacy reasons I would prefer to keep these delimiters unless there is absolutely no alternative).
The bit of text after the /> (regex_var1) is a name for the match of this parameter. I need to keep it out of the expression to stay compatible with the rest of the system; suffice it to say that it can be ignored in this context.
This URL pattern would match /some/path/to/another/location.
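To make the format concrete, here is a minimal sketch of how one such segment could be taken apart and tested against a path piece. The helper name parse_regex_segment is hypothetical and not part of the routing system, and the sketch assumes the embedded expression never contains a ~ character, since ~ is used as the delimiter when matching.

```php
<?php
// Hypothetical helper: splits one "</regex/>name" segment into its two parts.
// Returns null for a plain segment that has no </ ... /> delimiters.
function parse_regex_segment(string $segment): ?array
{
    if (preg_match('#^</(.*)/>([^/]*)$#', $segment, $m) !== 1) {
        return null;
    }
    return ['regex' => $m[1], 'name' => $m[2]];
}

$parts = parse_regex_segment('</[a-z]+/>regex_var1');

// Assumption: the embedded expression contains no "~", so it is safe as a delimiter.
$matches = preg_match('~^' . $parts['regex'] . '$~', 'another') === 1;

var_dump($parts, $matches);
```

Here the name (regex_var1) is only carried along; the match itself uses just the expression between the delimiters.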

What I want to achieve is to split a given pattern (as in the example above) into segments. These segments are used in a backtracking tree traversal to match a request URI with a controller. At present regular expressions are not supported; my intention is to allow them. In the past each segment was delimited by a /, but I now need / characters inside the embedded regular expression. If I split in the current way, the expression is spread across two segments. For example:

$pattern = "/some/</([a-z]+)(/optional)?/>regex2/location";
$segments = preg_split('/(?<!<)\/(?!>)/', $pattern);

yields 4 non-empty segments:

// print_r($segments)
Array
(
    [0] => 
    [1] => some
    [2] => </([a-z]+)(
    [3] => optional)?/>regex2
    [4] => location
)

when I actually want only 3:

// print_r($segments)
Array
(
    [0] => 
    [1] => some
    [2] => </([a-z]+)(/optional)?/>regex2
    [3] => location
)

I am not interested in matching the whole URL with a regular expression, which would defeat the whole point of the exercise. This problem might seem unwarranted in isolation, but details about why I am after this specific implementation are beyond the scope of the question.

Hm, I cannot see an easy way to do it with a regexp only. You might first parse out the embedded regexes (/<\/.*?\/>[^\/]*/), store them in an array and replace each with a simple placeholder that cannot clash with a real segment ($1), then run your split and reinsert the stored regexes.
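A rough sketch of that placeholder idea, under a few assumptions: the function name split_with_placeholders is made up for illustration, NUL bytes are assumed never to occur in a real URL pattern (so they are used to build the placeholders instead of $1), and an embedded expression is assumed not to contain /> itself.

```php
<?php
// Hypothetical helper illustrating the stash / split / restore idea.
function split_with_placeholders(string $pattern): array
{
    $stash = [];

    // Step 1: swap each "</regex/>name" block for a slash-free placeholder.
    $masked = preg_replace_callback(
        '#</.*?/>[^/]*#',
        function (array $m) use (&$stash): string {
            // Assumption: NUL bytes never appear in a real pattern.
            $key = "\0" . count($stash) . "\0";
            $stash[$key] = $m[0];
            return $key;
        },
        $pattern
    );

    // Step 2: a plain split on "/" is now safe.
    $segments = explode('/', $masked);

    // Step 3: put the stashed blocks back into the segments.
    return str_replace(array_keys($stash), array_values($stash), $segments);
}

print_r(split_with_placeholders('/some/</([a-z]+)(/optional)?/>regex2/location'));
```

Because it uses a plain explode, the leading empty segment from the question's desired output is preserved.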

Another way to do it:

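$str = "/some/</([a-z]+)(/optional)?/>regex2/location";
$out_segments = array();
$in_regex = false; // true while we are between "</" and "/>"
foreach (explode('/', $str) as $segment) {
    if ($in_regex) {
        // A piece starting with ">" closes the embedded regex ("/>name").
        if (substr($segment, 0, 1) === '>') {
            $in_regex = false;
        }
        // Glue the piece back onto the regex segment, restoring the "/".
        $out_segments[count($out_segments) - 1] .= "/$segment";
        continue;
    }
    // A piece ending in "<" opens an embedded regex ("</").
    if (substr($segment, -1, 1) === '<') {
        // Keep any text before the "<" as its own segment.
        $segment = substr($segment, 0, -1);
        if ($segment !== '') {
            $out_segments[] = $segment;
        }
        $in_regex = true;
        $segment = '<';
    }
    // Empty pieces (e.g. from the leading "/") are dropped.
    if ($segment !== '') {
        $out_segments[] = $segment;
    }
}
var_dump($out_segments);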

Edit: The earlier (incorrect) pseudocode looked much simpler. The idea is not that bad, though.

You could try splitting the string into its components first, and then processing it afterwards:

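$url = '/some/location/</([a-z]+)(/optional)?/>regex2/here/or/there';
// Group 1: everything before the delimited regex.
// Group 2: the "</regex/>name" block, up to the next "/" or the end.
// Group 3: everything after it.
$reg = '#(.*?)(</.*?/>.*?(?=/|$))(.*)?#';
if (preg_match($reg, $url, $matches)) {
    $result = array_merge(
        // The plain parts can be split on "/" safely; empty pieces are dropped.
        preg_split('#/#', $matches[1], 0, PREG_SPLIT_NO_EMPTY),
        array($matches[2]),
        preg_split('#/#', $matches[3], 0, PREG_SPLIT_NO_EMPTY)
    );
    print_r($result);
}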

Array
(
    [0] => some
    [1] => location
    [2] => </([a-z]+)(/optional)?/>regex2
    [3] => here
    [4] => or
    [5] => there
)

The regex should always be in $matches[2], so you can find it, no matter where it occurs in the URL.
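Note that the approach above isolates only the first delimited block; a second one would land in $matches[3] and be split on / like plain text. If a pattern may contain several, one possible alternative (a different technique from the answers here, so treat it as a sketch) is to tokenize in a single pass with preg_match_all. The function name tokenize_route is hypothetical, and the sketch assumes each block starts right after a / and that an embedded expression never contains /> itself.

```php
<?php
// Hypothetical helper: one-pass tokenizer for patterns with any number of
// "</regex/>name" blocks.
function tokenize_route(string $pattern): array
{
    // Alternative 1: a whole "</regex/>name" block (slashes inside are kept).
    // Alternative 2: a run of non-slash characters (a plain segment).
    preg_match_all('#</.*?/>[^/]*|[^/]+#', $pattern, $m);
    return $m[0];
}

print_r(tokenize_route('/some/</([a-z]+)(/optional)?/>regex2/location'));
print_r(tokenize_route('/a/</x(/y)?/>p1/b/</\d+/>p2'));
```

Empty segments (such as the one before the leading /) are not returned, which matches the output of the two code answers above rather than the print_r output in the question.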
