
Stuck trying to understand this regex

I'm stuck trying to understand this regex:

((["0"-"9"])+ ("-")*)*
    (["0"-"9"])* ["a"-"z"] ( ["a"-"z","0"-"9"] )*
    ( ("-")+ (["a"-"z","0"-"9"])+)*

What I understand so far is:

 ((["0"-"9"])+ ("-")*)*

can be a number made up of the digits 0-9. The plus means the digit can be repeated (e.g. 1157). I'm not sure what the star does :S I think an example matching this first line could be 1157- ?

(["0"-"9"])* ["a"-"z"] ( ["a"-"z","0"-"9"] )*

starts with a number, followed by letters and numbers, e.g. 1test5

 ( ("-")+ (["a"-"z","0"-"9"])+)*

optionally starts with a minus, followed by letters and numbers, e.g. -54

If anyone could give me an example of a string that matches the regex, it would be helpful. I'm finding this hard to understand. :)


  1. ((["0"-"9"])+ ("-")*)*
  2. (["0"-"9"])*
  3. ["a"-"z"]
  4. ( ["a"-"z","0"-"9"] )*
  5. ( ("-")+(["a"-"z","0"-"9"])+)*

where:

* is 0 or more
+ is one or more

To better understand this regex I have broken it into segments, which are far easier to read. To build a matching string, take a piece that matches each segment in order and join the pieces together.

For segment 1:
1 or more digits followed by 0 or more hyphens '-'; this whole group can be repeated 0 or more times.

segment 2:
0 or more digits

segment 3:
exactly 1 lowercase letter ("abcdefghijklmnopqrstuvwxyz")

segment 4:
0 or more lowercase letters or digits.

segment 5:
1 or more hyphens followed by 1 or more lowercase letters or digits; this whole group can be repeated 0 or more times.

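By putting this together we can get the minimum string needed to fit this regex: a

Segments 1, 2, 4 and 5 can all be skipped because each of them allows 0 repetitions; only the single letter in segment 3 is required. A short string that also uses segments 1 and 5 is 0a-a.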

Some more examples:

9-0a0a-a0
4456---4456---890aasd-asda-a434

and so on, up to any length.
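
The segment breakdown above can be checked mechanically. Here is a minimal sketch in Python, assuming that the quoted notation (`["0"-"9"]`, `"-"`) translates to ordinary character classes and that the spaces in the original are only layout. The names `SEGMENTS`, `PATTERN` and `matches` are made up for this illustration.

```python
import re

# Each segment from the list above, rewritten in Python's regex syntax.
# Assumption: the quotes and spaces in the original notation are not
# part of the text being matched.
SEGMENTS = [
    r"(?:[0-9]+-*)*",     # 1. digits then optional hyphens, repeated 0+ times
    r"[0-9]*",            # 2. zero or more digits
    r"[a-z]",             # 3. exactly one lowercase letter
    r"[a-z0-9]*",         # 4. zero or more lowercase letters or digits
    r"(?:-+[a-z0-9]+)*",  # 5. hyphens then letters/digits, repeated 0+ times
]
PATTERN = re.compile("".join(SEGMENTS))


def matches(text: str) -> bool:
    """Return True if the whole string matches the combined pattern."""
    return PATTERN.fullmatch(text) is not None


if __name__ == "__main__":
    samples = ["a", "0a-a", "9-0a0a-a0",
               "4456---4456---890aasd-asda-a434",
               "1test5", "1157-", "-54", ""]
    for sample in samples:
        print(repr(sample), matches(sample))
```

Under these assumptions, the guesses from the question that contain no letter (1157- and -54) are rejected, because segment 3 always needs one lowercase letter.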


+ means at least one of the preceding expression

* means 0 or more of the preceding expression

So ((["0"-"9"])+ ("-")*)* (I will ignore the quotes) means:

At least 1 digit, then a space, then 0 or more hyphens (-), and all of this can be repeated 0 or more times.

that means it will match:

  • The empty string
  • 1 ------234 -
  • 1 2 3 (with a trailing space, since each run of digits must be followed by one)

Normally, a space in a regex matches a literal space, and that is how I have read it in my explanation. However, some languages offer an extended modifier under which spaces in the regex are ignored, so they can be used to make the regex more readable.
I don't know which of these your regex engine does.
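
The difference a literal space makes can be seen with Python's `re` module, where the `re.VERBOSE` flag makes unescaped whitespace in the pattern insignificant. This is only a sketch in one engine; which behaviour applies to the regex in the question depends on the tool it was written for. The names `FIRST_LINE`, `LITERAL_SPACE` and `IGNORED_SPACE` are hypothetical.

```python
import re

# First line of the regex with the quotes removed.
FIRST_LINE = r"([0-9]+ (-)*)*"

# Default mode: the space in the pattern must match a space in the input.
LITERAL_SPACE = re.compile(FIRST_LINE)

# Verbose (extended) mode: unescaped whitespace in the pattern is ignored.
IGNORED_SPACE = re.compile(FIRST_LINE, re.VERBOSE)

if __name__ == "__main__":
    for sample in ["", "1 ------234 -", "1 2 3 ", "1157-"]:
        print(repr(sample),
              "literal:", LITERAL_SPACE.fullmatch(sample) is not None,
              "ignored:", IGNORED_SPACE.fullmatch(sample) is not None)
```

With the space taken literally, a string such as 1157- is not a full match because no space follows the digits; with the space ignored, it is.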

You can see your regex here on Regexr (with the quotes removed), along with some matching examples.


Those quotes are odd, but my guess is:

  • ( 0-9 one or more times followed by "-" zero or more times ) zero or more times.
  • then 0-9 zero or more times.
  • then a-z once, followed by a-z or 0-9 any number of times
  • then ( "-" at least once followed by a-z or 0-9 at least once ) zero or more times.

So you might find matches like

0-a-a

9123-9123-312346-1412312-123223abcd992-a90898-z08333218457

123-1a92-----abcdef---------123456-a

The possibilities beyond the most basic match are almost limitless...
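
To see which part of a string each of the four bullets above consumes, one option is to wrap each part in a named group. This is a rough sketch in Python, again assuming the quotes and spaces are not matched literally. The group names (`leading`, `digits`, `word`, `tail`) and the helper `split_parts` are invented for the example.

```python
import re

# One named group per bullet above (quotes and spaces dropped).
SPLIT = re.compile(
    r"(?P<leading>(?:[0-9]+-*)*)"   # digit runs, each with optional hyphens
    r"(?P<digits>[0-9]*)"           # further digits, possibly none
    r"(?P<word>[a-z][a-z0-9]*)"     # a letter, then letters or digits
    r"(?P<tail>(?:-+[a-z0-9]+)*)"   # hyphen-separated groups, possibly none
)


def split_parts(text: str):
    """Return the four parts as a dict, or None if the string does not match."""
    m = SPLIT.fullmatch(text)
    return m.groupdict() if m else None


if __name__ == "__main__":
    for sample in ["0-a-a", "123-1a92-----abcdef---------123456-a"]:
        print(sample, "->", split_parts(sample))
```

Because the first group is greedy and may end with digits that have no hyphen after them, the digits directly before the letter end up in `leading` for these examples, and `digits` comes back empty.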
