
How can I shorten this regex for JavaScript?

Basically I just want it to match anything inside (). I tried the . and * but they don't seem to work. Right now my regex looks like:

\(([\\\[\]\-\d\w\s/*\.])+\)

The strings it's going to match are URL routes like:

#!/foo/bar/([a-z])/([\d\w])/(*)

In this example, my regex above matches:

  • ([a-z])

  • ([\d\w])

  • (*)

BONUS: How can I make it match only when the string starts with a ( and ends with a )? I thought I had used ^ at the front, before the \(, and $ at the end, after the \), but no luck. Disregard this bonus. I didn't realize it didn't matter...


Are you worried about nested parentheses? If not, you could set it up to match all characters that aren't a closing paren:

\(([^)]*)\)
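A quick sketch of how this pattern behaves on the route from the question. It assumes a runtime that supports String.prototype.matchAll (ES2020); the variable names are mine, not the asker's:

```javascript
// Route string copied from the question; String.raw keeps the backslashes literal.
const route = String.raw`#!/foo/bar/([a-z])/([\d\w])/(*)`;

// The g flag is required by matchAll and lets it walk every parenthesized group.
const parenGroup = /\(([^)]*)\)/g;

const fullMatches = [];
const innerTexts = [];
for (const m of route.matchAll(parenGroup)) {
  fullMatches.push(m[0]); // whole match, parentheses included
  innerTexts.push(m[1]);  // capture group 1: the text between them
}

console.log(fullMatches); // [ '([a-z])', '([\\d\\w])', '(*)' ]
console.log(innerTexts);  // [ '[a-z]', '[\\d\\w]', '*' ]
```

Because [^)]* stops at the first closing paren, each group is matched separately even without a non-greedy quantifier.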


Basically I just want it to match anything inside ().
BONUS: How can I make it so that it only matches when it starts with a ( and ends with a )?

Easy peasy.

var re1 = /^\(.*\)$/
// or
var re2 = new RegExp('^\\(.*\\)$');

Edit

Re: @Mike Samuel's comments

Does not match newlines between the parentheses which were explicitly matched by \s in the original.
...
Maybe you should use [\s\S] instead of .
...
If you're going to exclude newlines you should do so intentionally or explicitly.

Note that . matches any single character except the newline character. If you also want to match newlines as part of the "anything" between parentheses, use the [\s\S] character class:

var re3 = /^\([\s\S]*\)$/
// or
var re4 = new RegExp('^\\([\\s\\S]*\\)$');
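To see the difference in practice, here is a small sketch comparing the two patterns on made-up sample strings (the inputs are my own, chosen only for illustration). The last pattern uses the s (dotAll) flag from ES2018, which is not part of the answer above but has the same effect where it is supported:

```javascript
// Hypothetical sample inputs: one on a single line, one containing a newline.
const singleLine = '(foo bar)';
const multiLine = '(foo\nbar)';

const dotRe = /^\(.*\)$/;       // . does not match a newline
const anyRe = /^\([\s\S]*\)$/;  // [\s\S] matches any character, newline included
const dotAllRe = /^\(.*\)$/s;   // ES2018 s flag: . also matches newlines

const results = {
  dotSingle: dotRe.test(singleLine),
  dotMulti: dotRe.test(multiLine),
  anyMulti: anyRe.test(multiLine),
  dotAllMulti: dotAllRe.test(multiLine),
};

console.log(results);
// { dotSingle: true, dotMulti: false, anyMulti: true, dotAllMulti: true }
```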


To match any character except the ones you list, use a negated character class, [^...]. Thus, to match anything within parentheses, you would use:

\([^)]+\)

which says "match any string that starts with an open parenthesis, contains one or more characters that are not closing parentheses, and ends with a closing parenthesis."

To match entire lines that match the above construct, just wrap it with ^ and $:

^\([^)]+\)$
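A short sketch of what the anchored pattern accepts and rejects. The test strings are my own examples, not from the question:

```javascript
// Anchored pattern: the whole string must be a single (...) group.
const wholeRe = /^\([^)]+\)$/;

const checks = {
  plain: wholeRe.test('(abc)'),          // the entire string is one group
  prefixed: wholeRe.test('x(abc)'),      // ^ rejects the leading x
  twoGroups: wholeRe.test('(abc)/(def)'),// [^)]+ cannot cross the first )
  empty: wholeRe.test('()'),             // + needs at least one character
};

console.log(checks);
// { plain: true, prefixed: false, twoGroups: false, empty: false }
```

If empty parentheses should also match, swapping + for * would allow that.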


I'm not completely sure I understand what you're doing, but try this:

var re = /\/(\([^()]+\))(?=\/|$)/;

Matching the leading slash in addition to the opening paren ensures that the paren is indeed at the beginning. You can't do the same thing at the end because you don't know there will be a trailing slash. And if there is one, you don't want to consume it because it's also the leading slash for the next match attempt.

Instead, you use the lookahead - (?=\/|$) - to match the trailing slash without consuming it. If there is no slash, I assume no other character should be present either--hence the anchor: $.
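Here is a rough sketch of why the lookahead matters, run against the route from the question. It uses the pattern with capture group 1 closed right after the \), adds the g flag so all matches are collected, and compares it with a variant of my own that consumes the trailing slash. The helper name groupOne is hypothetical:

```javascript
const routePath = String.raw`#!/foo/bar/([a-z])/([\d\w])/(*)`;

// Lookahead version: the trailing slash is checked but not consumed.
const lookaheadRe = /\/(\([^()]+\))(?=\/|$)/g;
// Comparison only: a non-capturing group that does consume the trailing slash.
const consumingRe = /\/(\([^()]+\))(?:\/|$)/g;

// Hypothetical helper: collect capture group 1 from every match.
function groupOne(re, text) {
  return Array.from(text.matchAll(re), (m) => m[1]);
}

const withLookahead = groupOne(lookaheadRe, routePath);
const withConsuming = groupOne(consumingRe, routePath);

console.log(withLookahead); // [ '([a-z])', '([\\d\\w])', '(*)' ]
console.log(withConsuming); // [ '([a-z])', '(*)' ]
```

The consuming variant skips the middle group because the slash it needs as a leading delimiter was already eaten by the previous match.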

@patorjk brought up a good point, though: can there be more parentheses between the outermost pair? If there are, the problem is much more complicated. I won't bother trying to expand my regex to deal with nested parens; some regex flavors can handle such things, but not JavaScript. Instead I'll recommend this sloppier regex:

\/(\([\s\S]+?\))(?=\/|$)

I say "sloppy" because it relies on the assumption that the sequences /( and )/ will never appear inside a valid match. As with my first regex, the text that you're interested in (i.e., everything but the leading and trailing slashes) will be captured in group #1.

Notice the non-greedy quantifier, too. With a regular greedy quantifier it will match everything from the first ( to the last ) in one shot. In other words, it'll match ([a-z])/([\d\w])/(*) instead of ([a-z]), ([\d\w]) and (*) as you wanted.
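The greedy versus non-greedy behaviour can be checked with a small sketch like this one, again on the route from the question. The g flag and the variable names are my additions:

```javascript
const routeStr = String.raw`#!/foo/bar/([a-z])/([\d\w])/(*)`;

const lazyRe = /\/(\([\s\S]+?\))(?=\/|$)/g;   // +? stops at the earliest possible )
const greedyRe = /\/(\([\s\S]+\))(?=\/|$)/g;  // + runs to the last possible )

const lazyGroups = Array.from(routeStr.matchAll(lazyRe), (m) => m[1]);
const greedyGroups = Array.from(routeStr.matchAll(greedyRe), (m) => m[1]);

console.log(lazyGroups);   // [ '([a-z])', '([\\d\\w])', '(*)' ]
console.log(greedyGroups); // [ '([a-z])/([\\d\\w])/(*)' ]
```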
