
regex find content question

I'm trying to use the regex function reFind to find the content within the angle brackets in this example, using ColdFusion:

 joe smith <joesmith@domain.com>

The resulting text should be

 joesmith@domain.com

Using this

<cfset reg = refind(
 "/(?<=\<).*?(?=\>)/s","Joe <joe@domain.com>") />

Not having any luck. Any suggestions?

Maybe it's a syntax issue; it works in an online regex tester I use.


You can't use lookbehind with CF's regex engine (uses Apache Jakarta ORO).

However, you can use Java's regex engine, which does support lookbehind, and I've created a wrapper CFC that makes this even easier. Available from: http://www.hybridchill.com/projects/jre-utils.html

(Update: The wrapper CFC mentioned above has evolved into a full project. See cfregex.net for details.)

Also, the /.../s stuff isn't required/relevant here.

So, from your example, but with improved regex:

<cfset jrex = createObject('component','jre-utils').init()/>

<cfset reg = jrex.match( "(?<=<)[^<>]+(?=>)" , "Joe <joe@domain.com>" ) />


A quick note, since I've updated that regex a few times; hopefully it's at its best now...

(?<=<) # positive lookbehind - only match at a position preceded by `<`, without including it in the match.
[^<>]+ # any char except `<` or `>`, the `+` meaning one-or-more, greedy.
(?=>)  # positive lookahead - only succeed if a `>` follows, without including it in the match.
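
If you would rather not depend on the wrapper, the same pattern can be run by calling java.util.regex directly from CFML. This is only a minimal sketch, assuming the engine runs on a JVM (as Adobe ColdFusion does); the variable names pattern, matcher and email are just illustrative.

```cfml
<!--- Assumption: java.util.regex is reachable via createObject("java", ...). --->
<cfset pattern = createObject("java", "java.util.regex.Pattern").compile("(?<=<)[^<>]+(?=>)") />
<cfset matcher = pattern.matcher("Joe <joe@domain.com>") />

<!--- find() advances to the next match; group() returns the matched text. --->
<cfif matcher.find()>
    <cfset email = matcher.group() />
<cfelse>
    <cfset email = "" />
</cfif>

<cfoutput>#email#</cfoutput>
```

Calling find() again in a loop would step through any further bracketed values in the same string.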


I've never been happy with the regular expression matching functions in CF. Hence, I wrote my own:

<cfscript>
    function reFindNoSuck(string pattern, string data, numeric startPos = 1){
        var sucky = refindNoCase(pattern, data, startPos, true);
        var i = 0;
        var awesome = [];

        if (not isArray(sucky.len) or arrayLen(sucky.len) eq 0){return [];} //handle no match at all
        for(i=1; i<= arrayLen(sucky.len); i++){
            //pos 0 and len 0 mean this group did not take part in the match, so skip it
            if (sucky.len[i] gt 0 && sucky.pos[i] gt 0){
                //skip a match that is identical to the entire input string
                var matchBody = mid( data, sucky.pos[i], sucky.len[i]);
                if (matchBody neq arguments.data){
                    arrayAppend( awesome, matchBody );
                }
            }
        }
        return awesome;
    }
</cfscript>

Applied to your problem, here is my example:

<cfset origString = "joe smith <joesmith@domain.com>" />
<cfset regex = "<([^>]+)>" />
<cfset matches = reFindNoSuck(regex, origString) />

Dumping the "matches" variable shows that it is an array with 2 items. The first will be <joesmith@domain.com> (because it matches the entire regex) and the second will be joesmith@domain.com (because it matches the 1st group defined in the regular expression -- all subsequent groups would also be captured and included in the array).
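
For comparison, here is roughly what the helper wraps, written against the built-in reFind with its fourth argument (returnsubexpressions) set to true. Treat it as a sketch rather than a drop-in replacement; the variable names result and email are only for illustration.

```cfml
<cfset origString = "joe smith <joesmith@domain.com>" />

<!--- With returnsubexpressions = true, reFind returns a struct of pos and len arrays. --->
<!--- Element 1 describes the whole match, element 2 the first capture group. --->
<cfset result = reFind("<([^>]+)>", origString, 1, true) />

<cfset email = "" />
<!--- When nothing matches, pos and len each hold a single 0, so this check fails safely. --->
<cfif arrayLen(result.pos) gte 2 and result.pos[2] gt 0>
    <cfset email = mid(origString, result.pos[2], result.len[2]) />
</cfif>

<cfoutput>#email#</cfoutput>
```

Because the capture group does the extracting, no lookbehind is needed, so this stays within what CF's own regex engine supports.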


/\<([^>]+)\>$/

something like that, didn't test it though, that one's yours ;)
