Can I shorten this regular expression?

I need to check whether strings adhere to a particular ID format.

The format of the ID is as follows:

aBcDe-fghIj-KLmno-pQRsT-uVWxy

A sequence of five blocks of five letters upper case or lower case, separated by one dash.

I have the following regular expression that works:

string idFormat = "[a-zA-Z]{5}[-]{1}[a-zA-Z]{5}[-]{1}[a-zA-Z]{5}[-]{1}[a-zA-Z]{5}[-]{1}[a-zA-Z]{5}";

Note that there is no trailing dash, but all of the blocks within the ID follow the same format. Therefore, I would like to represent the sequence of four blocks, each followed by a dash, as a single repeated unit inside the regular expression and avoid the duplication.

I tried the following, but it doesn't work:

string idFormat = "[[a-zA-Z]{5}[-]{1}]{4}[a-zA-Z]{5}";

How do I shorten this regular expression and get rid of the duplicated parts?

What is the best way to ensure that no block contains any digits?


Edit:

Thanks for the replies, I now understand the grouping in regular expressions.

I'm running a few tests against the regular expression, the following are relevant:

Test 1: aBcDe-fghIj-KLmno-pQRsT-uVWxy

Test 2: abcde-fghij-klmno-pqrst-uvwxy

With the following regular expression, both tests pass:

^([a-zA-Z]{5}-){4}[a-zA-Z]{5}$

With the next regular expression, test 1 fails:

^([a-z]{5}-){4}[a-z]{5}$

Several answers have said that it is OK to omit the A-Z when using a-z, but in this case it doesn't seem to be working.


You can try:

([a-z]{5}-){4}[a-z]{5}

and make it case insensitive.


If you can set regex options to be case insensitive, you could replace all [a-zA-Z] with just plain [a-z]. Furthermore, [-]{1} can be written as -.

Your grouping should be done with (, ), not with [, ] (although you're correctly using the latter in specifying character sets).
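To illustrate the difference, here is a minimal JavaScript sketch. The variable names are made up for this example, and other regex engines may parse the bracketed form slightly differently. With square brackets, the attempted pattern is read as a character class that also contains a literal `[`, followed by literal `]` characters, so nothing is actually grouped or repeated:

```javascript
// Hypothetical names, for illustration only.
const validId = "aBcDe-fghIj-KLmno-pQRsT-uVWxy";

// Attempt from the question: [ ] define character classes, not groups.
// JavaScript reads this as: 5 characters from the class [[a-zA-Z],
// one dash, four literal "]" characters, then 5 letters.
const bracketed = new RegExp("^[[a-zA-Z]{5}[-]{1}]{4}[a-zA-Z]{5}$");

// Parentheses create a group, so {4} repeats the whole "letters + dash" block.
const grouped = new RegExp("^([a-zA-Z]{5}-){4}[a-zA-Z]{5}$");

console.log(bracketed.test(validId));           // false
console.log(grouped.test(validId));             // true
console.log(bracketed.test("aBcDe-]]]]fghIj")); // true, which is not what was intended
```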

Depending on context, you probably want to throw in ^...$ which matches start and end of string, respectively, to verify that the entire string is a match (i.e. that there are no extra characters).

In JavaScript, something like this:

/^([a-z]{5}-){4}[a-z]{5}$/i
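As a rough check of the points above, the following sketch runs the two test strings from the question against the pattern with and without the `i` flag, and with and without anchors. The variable names and the padded input are assumptions made for the example. Dropping `A-Z` only works when the case-insensitive option is actually set; in C# the counterpart of the `i` flag is `RegexOptions.IgnoreCase`.

```javascript
// Test strings taken from the question.
const tests = [
  "aBcDe-fghIj-KLmno-pQRsT-uVWxy",
  "abcde-fghij-klmno-pqrst-uvwxy",
];

const lowerOnly = /^([a-z]{5}-){4}[a-z]{5}$/;   // no flag: upper case is rejected
const ignoreCase = /^([a-z]{5}-){4}[a-z]{5}$/i; // i flag: case is ignored
const unanchored = /([a-z]{5}-){4}[a-z]{5}/i;   // no ^...$: may match a substring

console.log(tests.map((t) => lowerOnly.test(t)));  // [ false, true ]
console.log(tests.map((t) => ignoreCase.test(t))); // [ true, true ]

// Hypothetical input with extra characters around a valid ID.
const padded = "123-" + tests[0] + "-456";
console.log(unanchored.test(padded)); // true: the ID is found inside the string
console.log(ignoreCase.test(padded)); // false: the anchors require a full match

// A digit inside a block is rejected, because [a-z] only covers letters.
console.log(ignoreCase.test("aBcD1-fghIj-KLmno-pQRsT-uVWxy")); // false
```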


This works for me, though you might want to check it:

[a-zA-Z]{5}(-[a-zA-Z]{5}){4}

(One group of five letters, followed by [dash+group of five letters] four times)


([a-zA-Z]{5}[-]{1}){4}[a-zA-Z]{5}


Try

string idFormat = "([a-zA-Z]{5}[-]{1}){4}[a-zA-Z]{5}";

I.e. you basically replace your brackets with parentheses. Brackets are not meant for grouping but for defining a class of accepted characters.

However, be aware that with shortened versions, you can use the expression for validating the string, but not for analyzing it. If you want to process the 5 groups of characters, you will want to put them in 5 groups:

string idFormat =
    "([a-zA-Z]{5})-([a-zA-Z]{5})-([a-zA-Z]{5})-([a-zA-Z]{5})-([a-zA-Z]{5})";

so you can address each group and process it.
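A minimal sketch of that idea in JavaScript, assuming the `^...$` anchors are wanted and using made-up variable names. It also shows that, in JavaScript, the repeated group of the shortened pattern only retains its last iteration:

```javascript
const id = "aBcDe-fghIj-KLmno-pQRsT-uVWxy";

// Five separate capture groups, one per block (anchors added as an assumption).
const idWithGroups =
  /^([a-zA-Z]{5})-([a-zA-Z]{5})-([a-zA-Z]{5})-([a-zA-Z]{5})-([a-zA-Z]{5})$/;

const match = idWithGroups.exec(id);
// match[0] is the whole string; match[1]..match[5] are the blocks.
const blocks = match ? match.slice(1) : [];
console.log(blocks); // [ 'aBcDe', 'fghIj', 'KLmno', 'pQRsT', 'uVWxy' ]

// Shortened pattern: fine for validation, but the single repeated group
// keeps only the text of its final repetition.
const shortened = /^([a-zA-Z]{5}-){4}[a-zA-Z]{5}$/;
const lastRepetition = shortened.exec(id)[1];
console.log(lastRepetition); // pQRsT-
```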
