Reg Ex question
What does the following regex mean?
'/^\w{4,20}$/'
It means that the whole string must consist of 4 to 20 word characters (letters, digits, and underscores). Here:
- ^ (caret) matches at the start of the string the regex pattern is applied to. Matches a position rather than a character. Most regex flavors have an option to make the caret match after line breaks (i.e. at the start of a line in a file) as well.
- $ (dollar) matches at the end of the string the regex pattern is applied to. Matches a position rather than a character. Most regex flavors have an option to make the dollar match before line breaks (i.e. at the end of a line in a file) as well. Also matches before the very last line break if the string ends with a line break.
- \w is a shorthand character class matching word characters (letters, digits, and underscores). Can be used inside and outside character classes.
- {n,m}, where n >= 0 and m >= n, repeats the previous item between n and m times. Greedy, so repeating m times is tried before reducing the repetition to n times.
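If you want to check these rules yourself, here is a minimal sketch in Python. Python is my assumption, since the question doesn't say which language the regex is used in; the / delimiters are dropped because Python doesn't use them, and re.ASCII is added so that \w means exactly letters, digits, and underscore. The helper name is_valid is made up for the example.

```python
import re

# re.ASCII restricts \w to [A-Za-z0-9_]; without it Python 3 also accepts
# non-ASCII letters and digits as word characters.
pattern = re.compile(r'^\w{4,20}$', re.ASCII)

def is_valid(s):
    """Return True if the whole string is 4 to 20 word characters."""
    return pattern.search(s) is not None

print(is_valid("abc"))       # False: only 3 word characters
print(is_valid("abcd"))      # True: exactly 4
print(is_valid("a" * 20))    # True: exactly 20
print(is_valid("a" * 21))    # False: one too many
print(is_valid("ab-cd"))     # False: "-" is not a word character
print(is_valid("abcd\n"))    # True: $ also matches before a final line break

# Greedy repetition: the quantifier takes as many characters as it is allowed.
greedy = re.match(r'\w{2,4}', 'abcdef').group()
print(greedy)                # abcd
```

If the trailing line break case matters to you, Python's \Z can be used instead of $ to match only at the very end of the string.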
Let me show you a usage example. Say we have a file with the following contents:
[spongebob@conductor /tmp]$ cat file.txt
between4and20
therearetoomanyalphanumcharacters
foo
okay
Now you want to get only those lines which match your pattern '/^\w{4,20}$/':
[spongebob@conductor /tmp]$ grep -E '^\w{4,20}$' file.txt
between4and20
okay
In the output you see only those lines that satisfy your regular expression.
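The same filtering can be sketched in Python, in case you are not working in a shell. This is only an illustration: the lines are hard-coded instead of being read from file.txt, and the names PATTERN, lines, and matching are made up for the example.

```python
import re

# Same pattern as in the grep command; re.ASCII keeps \w to [A-Za-z0-9_].
PATTERN = re.compile(r'^\w{4,20}$', re.ASCII)

# The contents of file.txt from the example, one entry per line.
lines = [
    "between4and20",
    "therearetoomanyalphanumcharacters",
    "foo",
    "okay",
]

# Keep only the lines that match the pattern from start to end.
matching = [line for line in lines if PATTERN.search(line)]
print(matching)  # ['between4and20', 'okay']
```

"foo" is rejected because it has only 3 characters, and the long line because it has more than 20.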
Ah, also, don't confuse ^ (caret) used as an anchor with ^ immediately after the opening [ of a character class. The latter negates the character class, causing it to match a single character not listed in the character class. (Placed anywhere in the class except right after the opening [, it stands for a literal caret.) For example, [^a-d] matches x (any character except a, b, c or d).
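A short Python sketch of the three roles of the caret, assuming Python's re module (other flavors behave the same way for these cases, as far as I know). The variable names are made up for the example.

```python
import re

# ^ right after [ negates the class: match any character except a, b, c, d.
negated = re.findall(r'[^a-d]', 'abcxd')
print(negated)   # ['x']

# ^ elsewhere inside the class is a literal caret.
literal = re.findall(r'[a^d]', 'a^b')
print(literal)   # ['a', '^']

# ^ outside a class is the start-of-string anchor.
anchored = re.search(r'^x', 'ax')
print(anchored)  # None: "x" is not at the start of the string
```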
It means:
- ^ Between the beginning,
- $ and the end of a given string,
- \w{4,20} there should be only 4-20 Alphanumeric characters (like a,b,c,d,1,2,3...etc, and also _)
I think you'll find Wikipedia's page on Regular Expressions a big, big help while learning regexes.
And just so there is no confusion, ^ and $ don't necessarily need each other.
If the regex was:
'/^\w{4,20}/'
That'd mean: The match should be at the start of the string, followed by 4-20 alphanumeric characters.
Example: in "Foobar baz", the match is "Foobar".
And if the regex pattern was:
'/\w{4,20}$/'
That'd mean: The match should be 4-20 alphanumeric characters ending at the end of the string.
Example: in "Foo barbaz", the match is "barbaz".
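Both one-sided variants can be tried out with a small Python sketch (Python is an assumption here, and the / delimiters are dropped because Python doesn't use them):

```python
import re

# Anchored at the start only: the match must begin the string.
start_only = re.search(r'^\w{4,20}', 'Foobar baz').group()
print(start_only)  # Foobar

# Anchored at the end only: the match must finish the string.
end_only = re.search(r'\w{4,20}$', 'Foo barbaz').group()
print(end_only)    # barbaz

# With both anchors, neither string matches, because of the space.
both = re.search(r'^\w{4,20}$', 'Foobar baz')
print(both)        # None
```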
/ = opening delimiter
^ = start of string
\w = word character
{x,y} = repeat between x (min) and y (max) times
$ = end of string
/ = closing delimiter