Regex: Matching quoted strings without ending the match at an escaped quote in the middle of the string

I'm extracting the conditionals from a template if statement. The problem lies in parsing out the strings before breaking the conditional statement apart into its individual conditions.

I replace the quoted strings with placeholders before splitting up the conditions, so the strings don't interfere with the pattern matching used for the split.

The code below does its job fine.

// replace quoted strings in the conditional with placeholders so they don't interfere with the tokenising below
if (preg_match_all('/([\"\'])([^\\1]*?)\\1/s', $conditions, $string_matches))
{   
    $uid = uniqid(time().'_');
    $strings = array(
            'id' => $uid, 
            'matches' => array()
        ); 
    $replacements = array();
    foreach($string_matches[0] as $key=>$match)
    {
        $match_id = '#'.$uid.md5($match);
        $replacements[$match] = $match_id;
        $strings['matches'][$match_id] = array(
                'match' => $match,
                'content' => $string_matches[2][$key],
            );
    }
    $conditions = str_replace(array_keys($replacements), array_values($replacements), $conditions);
}

It matches the following correctly:

boolean_arg1 && arg2 !== 'testing multi quotes' && arg3 === "test & yup" -or-
boolean_arg1 && arg2 !== 'testing "multi" quotes' && arg3 === "test & yup"

giving me

boolean_arg1 && arg2 !== #1292059008_4d0341809c0f74062e5ac5086fb24f8e8383a137a5a5e && arg3 === #1292059008_4d0341809c0f7d4820850f1f6e06677e741be556352e3
boolean_arg1 && arg2 !== #1292059102_4d0341de3f5196213c34e77a2cfbb11f867f9ed57c85f && arg3 === #1292059102_4d0341de3f519d4820850f1f6e06677e741be556352e3

However, introducing escaped quotes into the string breaks the pattern match at the escaped quote.

boolean_arg1 && arg2 !== 'testing "multi" \'quotes' && arg3 === "test && yup"

gives

boolean_arg1 && arg2 !== #1292059161_4d03421974c3166a7cae87ddc1002905892eff6453bd4quotes' && arg3 === #1292059161_4d03421974c31d4820850f1f6e06677e741be556352e3

(Notice the stray quotes' left over after the first replacement.)

I'm not very good with lookarounds and the like. Is there a simple way to convert the regex in the code above into one that matches complete strings with escaped quotes in them?


Use a pattern that accounts for escape sequences, like:

/"(?:[^"\\]*|\\["\\])*"|'(?:[^'\\]*|\\['\\])*'/

With this, only the escape sequences \\ and \" (or \' inside single-quoted strings) are recognised. You can support more of them by extending the character classes ["\\] and ['\\].
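
As a rough sketch of how the pattern above might be dropped into the placeholder code from the question: the function name replace_quoted_strings and its return shape are made up for illustration, and preg_replace_callback is used here in place of the original preg_match_all plus str_replace. Note that every regex backslash has to be doubled again inside a single-quoted PHP string.

```php
<?php
// Hypothetical helper (not from the original post): swaps each quoted string,
// including ones that contain escaped quotes, for a placeholder.
function replace_quoted_strings(string $conditions, string $uid): array
{
    // The pattern from the answer, written as a single-quoted PHP literal:
    // \\\\ in PHP source becomes \\ in the regex, i.e. one literal backslash.
    $pattern = '/"(?:[^"\\\\]*|\\\\["\\\\])*"|\'(?:[^\'\\\\]*|\\\\[\'\\\\])*\'/';

    $matches = array();
    $result = preg_replace_callback(
        $pattern,
        function (array $m) use ($uid, &$matches) {
            // Same placeholder scheme as the question: '#' . uid . md5(match)
            $id = '#' . $uid . md5($m[0]);
            $matches[$id] = array(
                'match'   => $m[0],
                'content' => substr($m[0], 1, -1), // text between the outer quotes
            );
            return $id;
        },
        $conditions
    );

    return array($result, $matches);
}
```

Calling it with the failing input from the question, for example replace_quoted_strings($conditions, uniqid(time().'_')), should leave no stray quotes' behind, because \' is consumed as part of the string instead of ending it. Doing the replacement inside the callback also avoids a separate str_replace pass over the whole condition.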
