Reg Ex question

What does the following regex mean?

'/^\w{4,20}$/'


It means that the whole string must consist of 4 to 20 word characters (letters, digits, and underscores). Here:

  • ^ (caret) matches at the start of the string the regex pattern is applied to. It matches a position rather than a character. Most regex flavors have an option to make the caret match after line breaks (i.e. at the start of a line in a file) as well.
  • $ (dollar) matches at the end of the string the regex pattern is applied to. It matches a position rather than a character. Most regex flavors have an option to make the dollar match before line breaks (i.e. at the end of a line in a file) as well. It also matches before the very last line break if the string ends with a line break.
  • \w is a shorthand character class matching word characters (letters, digits, and underscores). It can be used inside and outside character classes.
  • {n,m}, where n >= 0 and m >= n, repeats the previous item between n and m times. It is greedy, so repeating m times is tried before reducing the repetition to n times.
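
To make the four pieces above concrete, here is a minimal sketch in Python (my choice of language; the question does not say which one is in use). The helper name is_valid is hypothetical, and re.ASCII is added on the assumption that \w should mean only letters, digits, and underscore, since Python 3 otherwise also accepts Unicode word characters.

```python
import re

# re.ASCII restricts \w to [A-Za-z0-9_]; without it Python 3 also
# accepts Unicode letters and digits.
PATTERN = re.compile(r"^\w{4,20}$", re.ASCII)

def is_valid(text):
    """Hypothetical helper: True if text is 4 to 20 word characters only."""
    return PATTERN.search(text) is not None

print(is_valid("okay"))       # True
print(is_valid("foo"))        # False (only 3 characters)
print(is_valid("has space"))  # False (space is not a word character)
print(is_valid("okay\n"))     # True ($ also matches before a final line break)

# Greedy repetition: as many characters as allowed are taken first.
greedy = re.match(r"\w{4,20}", "a" * 25, re.ASCII)
print(len(greedy.group()))    # 20
```

If the trailing line break should be rejected too, Python's re.fullmatch or the \Z anchor can be used instead of $.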

Let me show you a usage example. Say we have a file with the following contents:

[spongebob@conductor /tmp]$ cat file.txt
between4and20
therearetoomanyalphanumcharacters
foo
okay

Now you want to get only those lines that match your pattern '/^\w{4,20}$/':

[spongebob@conductor /tmp]$ grep -E '^\w{4,20}$' file.txt
between4and20
okay

The output contains only the lines that satisfy your regular expression.
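
The same filtering can be sketched in Python, assuming the four sample lines are held in memory rather than read from a file. The variable names are made up for the illustration. The second half applies the multi-line option mentioned earlier to a single string, so that ^ and $ match at every line boundary.

```python
import re

lines = [
    "between4and20",
    "therearetoomanyalphanumcharacters",
    "foo",
    "okay",
]

# Line-by-line filtering, similar in spirit to the grep call.
pattern = re.compile(r"^\w{4,20}$", re.ASCII)
matching = [line for line in lines if pattern.search(line)]
print(matching)        # ['between4and20', 'okay']

# Same result on one multi-line string: re.MULTILINE makes ^ and $
# match at the start and end of each line.
text = "\n".join(lines) + "\n"
multiline_hits = re.findall(r"^\w{4,20}$", text, re.MULTILINE | re.ASCII)
print(multiline_hits)  # ['between4and20', 'okay']
```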

Also, don't confuse the ^ (caret) anchor with a ^ placed immediately after an opening [. The latter negates the character class, causing it to match a single character not listed in the class. For example, [^a-d] matches x (any character except a, b, c or d). Anywhere else inside the brackets, ^ stands for a literal caret.
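
A short Python sketch of the three roles of the caret, with sample strings invented for the illustration:

```python
import re

# ^ right after [ negates the class: any character except a, b, c, d.
negated = re.findall(r"[^a-d]", "abxcdy")
print(negated)        # ['x', 'y']

# ^ elsewhere inside the brackets is a literal caret.
literal = re.findall(r"[a^d]", "a^d")
print(literal)        # ['a', '^', 'd']

# Outside the brackets ^ is the start-of-string anchor, so this asks for
# a first character that is not a, b, c or d.
anchored = re.search(r"^[^a-d]", "xyz")
print(anchored.group())                 # x
print(re.search(r"^[^a-d]", "abc"))     # None
```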


It means:

  • ^ Between the beginning,
  • $ and the end of a given string,
  • \w{4,20} there should be only 4 to 20 word characters (letters and digits such as a, b, c, d, 1, 2, 3, and also _)

I think you'll find Wikipedia's page on Regular Expressions a big, big help while learning regexes.


And just so there is no confusion, ^ and $ don't necessarily need each other. If the regex was:

'/^\w{4,20}/'

That'd mean: The match should be at the start of the string, followed by 4-20 alphanumeric characters.

Example: in "Foobar baz", the match is "Foobar".

And if the regex pattern was:

'/\w{4,20}$/'

That'd mean: The match should be at the end of the string, preceded by 4-20 alphanumeric characters.

Example: in "Foo barbaz", the match is "barbaz".
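
Both partially anchored variants can be checked with a quick Python sketch (Python is assumed here only for illustration; the variable names are made up):

```python
import re

# Anchored at the start only: the match must begin the string,
# but anything may follow it.
start_only = re.search(r"^\w{4,20}", "Foobar baz")
print(start_only.group())   # Foobar

# Anchored at the end only: the match must end the string,
# but anything may come before it.
end_only = re.search(r"\w{4,20}$", "Foo barbaz")
print(end_only.group())     # barbaz

# With both anchors the space makes either string fail.
both = re.search(r"^\w{4,20}$", "Foobar baz")
print(both)                 # None
```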


/ opening delimiter
    ^ = start of string
    \w = word character
    {x,y} = min, max repetitions
    $ = end of string
/ closing delimiter
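
The same annotated breakdown can be written as a commented pattern. The sketch below assumes Python, where the pattern is an ordinary string and no / delimiters are used. re.VERBOSE lets whitespace and # comments sit inside the pattern, and the name ANNOTATED is made up for the example.

```python
import re

ANNOTATED = re.compile(
    r"""
    ^          # start of string
    \w{4,20}   # word character, repeated 4 (min) to 20 (max) times
    $          # end of string
    """,
    re.VERBOSE | re.ASCII,
)

print(bool(ANNOTATED.search("between4and20")))  # True
print(bool(ANNOTATED.search("foo")))            # False
```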