Regular expression for hyphen-separated strings

I created a regular expression for the format XX-XX-XX-XX-XX, where each X is an alphanumeric character.

The regular expression is ^[a-z0-9A-Z]{2}-[a-z0-9A-Z]{2}-[a-z0-9A-Z]{2}-[a-z0-9A-Z]{2}-[a-z0-9A-Z]{2}$. But what I really want is to match partial input as well, as in the patterns below. The string should have one hyphen (-) after every two characters.

Example 1 : XX-            OK
Example 2 : XX-X           OK
Example 3 : XX-XX-         OK
Example 4 : XX-XX-XX       OK
Example 5 : XX-XX-XX-X     OK
Example 6 : XX-XX-X        OK
Example 7 : XX-XX--        NOT OK
Example 8 : XX-XX-X-       NOT OK


This will do the trick. You basically want any number (zero or more) of XX- followed by zero, one or two X:

^([0-9A-Za-z]{2}-)*[0-9A-Za-z]{0,2}$
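A quick way to check this pattern against the eight examples from the question is a small script. This is only a sketch in JavaScript (the language of the fiddles linked in this thread); the `cases` table and the `AB`-style sample strings standing in for `XX` are assumptions made for illustration.

```javascript
// Pattern from the answer: any number of "XX-" segments,
// followed by zero, one or two trailing alphanumerics.
const pattern = /^([0-9A-Za-z]{2}-)*[0-9A-Za-z]{0,2}$/;

// Hypothetical sample strings mirroring examples 1-8 of the question,
// paired with the result the question asks for.
const cases = [
  ["AB-", true],
  ["AB-C", true],
  ["AB-CD-", true],
  ["AB-CD-EF", true],
  ["AB-CD-EF-G", true],
  ["AB-CD-E", true],
  ["AB-CD--", false],
  ["AB-CD-E-", false],
];

const results = cases.map(([input, expected]) => ({
  input,
  expected,
  actual: pattern.test(input),
}));

for (const r of results) {
  console.log(`${r.input.padEnd(12)} ${r.actual ? "OK" : "NOT OK"}`);
}
```

Because both parts of the pattern are optional, the empty string matches too; if that is not wanted, it has to be excluded separately.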


The match needs to start with a match of any number of XX- strings:

^([A-Za-z0-9]{2}-)*

Depending on the regexp engine you're using, you may be able to use the somewhat more concise [[:alnum:]] here. Note that [\w\d] as originally posted is inappropriate for a couple of reasons; see Alan Moore's comment for details.
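To illustrate one likely reason: in JavaScript, as in most regex flavors, `\w` already includes the digits and also matches the underscore, so `[\w\d]` accepts strings that are not strictly alphanumeric. (JavaScript regexes do not support POSIX classes such as `[[:alnum:]]`, so the explicit ranges are used here.) The two-character samples below are made up for the demonstration.

```javascript
// \w is shorthand for [A-Za-z0-9_], so adding \d to the class changes nothing,
// and the underscore slips through.
const loose = /^[\w\d]{2}$/;
const strict = /^[A-Za-z0-9]{2}$/;

// Hypothetical two-character samples.
const samples = ["a1", "Z9", "a_", "__"];

const comparison = samples.map((s) => ({
  sample: s,
  loose: loose.test(s),
  strict: strict.test(s),
}));

console.table(comparison);
```

The first two samples pass both patterns, while the two containing underscores pass only the `[\w\d]` version.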

Getting the last bit is surprisingly difficult, because you have to nest conditional elements. That is, the final hyphen can only match if the preceding X matches, and that X can only match if the first one does.

Note that this approach assumes that you're not limiting the number of XX- segments. In particular, note that it will match XX-XX-XX-XX-XX-. You can limit the number of XX- segments pretty easily, but getting it to not match a hyphen after the fifth XX is a little more complicated.

Anyway, back to the explanation. A following X is okay:

^([A-Za-z0-9]{2}-)*([A-Za-z0-9])?

It's also okay if it is followed by another X:

^([A-Za-z0-9]{2}-)*([A-Za-z0-9]([A-Za-z0-9])?)?

And a final - is also okay (assuming that it's preceded by XX):

^([A-Za-z0-9]{2}-)*([A-Za-z0-9]([A-Za-z0-9]-?)?)?

Finally, append $ to specify that it should take up the whole line:

^([A-Za-z0-9]{2}-)*([A-Za-z0-9]([A-Za-z0-9]-?)?)?$

I've forked SeanA's jsfiddle. Thanks, Sean!
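The step-by-step construction above can also be written out in code. The sketch below assembles the same expression from named pieces using the `RegExp` constructor; the variable names (`X`, `segments`, `tail`) and the sample strings are assumptions chosen for readability.

```javascript
// One alphanumeric character.
const X = "[A-Za-z0-9]";

// Any number of complete "XX-" segments.
const segments = `(${X}{2}-)*`;

// Optional tail: one X, optionally followed by a second X,
// which in turn may be followed by a hyphen.
const tail = `(${X}(${X}-?)?)?`;

// Anchor both ends so the whole string has to match.
const stepwise = new RegExp(`^${segments}${tail}$`);

console.log(stepwise.source);

// Hypothetical sample strings.
const accepted = ["AB-", "AB-C", "AB-CD", "AB-CD-"].map((s) => stepwise.test(s));
const rejected = ["AB--", "AB-C-"].map((s) => stepwise.test(s));
console.log(accepted, rejected);
```

Building the pattern from strings keeps each nesting level visible, which helps when the optional parts have to be adjusted later.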

Update

Thanks to Alan Moore's great job "watching the watchmen" (see the comments), I realized that you can do this quite a bit more simply with

/^([A-Za-z0-9]{2}-)*[A-Za-z0-9]{0,2}$/

An updated fiddle for that RE.

Here you are saying that there can be up to two Xs at the end of a series of XX- segments. This works because if there is a hyphen at the end, it will just become part of an additional XX- segment.
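One way to gain confidence that the nested form and the shorter form accept the same strings is to compare them by brute force. The sketch below enumerates every string of up to ten characters over the two-symbol alphabet `X` and `-`; the helper `allStrings` and the length limit are assumptions for illustration, and the check is evidence rather than a proof.

```javascript
const longForm = /^([A-Za-z0-9]{2}-)*([A-Za-z0-9]([A-Za-z0-9]-?)?)?$/;
const shortForm = /^([A-Za-z0-9]{2}-)*[A-Za-z0-9]{0,2}$/;

// Hypothetical helper: every string over `alphabet` with length 0..maxLength.
function allStrings(alphabet, maxLength) {
  let current = [""];
  const all = [""];
  for (let len = 1; len <= maxLength; len++) {
    // Extend each string of the previous length by each symbol.
    current = current.flatMap((prefix) => alphabet.map((ch) => prefix + ch));
    all.push(...current);
  }
  return all;
}

const candidates = allStrings(["X", "-"], 10);

// Collect every string on which the two patterns give different answers.
const disagreements = candidates.filter(
  (s) => longForm.test(s) !== shortForm.test(s)
);

console.log(`checked ${candidates.length} strings, ${disagreements.length} disagreements`);
```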

I've left the above info in because it solves a more general problem. For example, if each of the segments consisted of a letter and a number, you would have to take such an approach.
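As a sketch of that more general case, assume each segment is a letter followed by a digit (for example `A1-B2-C`). The nested-optional structure carries over directly. The pattern and the sample strings below are assumptions for illustration and are not part of the original answer.

```javascript
// Each complete segment is letter, digit, hyphen. The tail may stop after
// the letter, after the digit, or after the hyphen.
const letterDigit = /^([A-Za-z][0-9]-)*([A-Za-z]([0-9]-?)?)?$/;

const checks = {
  "A1-": letterDigit.test("A1-"),     // complete segment
  "A1-B": letterDigit.test("A1-B"),   // tail stops after the letter
  "A1-B2": letterDigit.test("A1-B2"), // tail stops after the digit
  "A1-2": letterDigit.test("A1-2"),   // digit where a letter is required
  "A1-B-": letterDigit.test("A1-B-"), // hyphen after an incomplete segment
};

console.log(checks);
```

The first three inputs are accepted and the last two rejected, because each position in the tail has its own character class and each later position is only allowed once the earlier one has matched.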

If you want it to match XX-XX-XX-XX-XX but not XX-XX-XX-XX-XX-, you can use

/^([A-Za-z0-9]{2}-){0,4}[A-Za-z0-9]{0,2}$/

A forked fiddle for that use case.
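The effect of the `{0,4}` bound can be checked with a few inputs around the five-segment limit. The sample strings below are assumptions chosen to sit just inside and just outside that limit.

```javascript
// At most four "XX-" segments, then up to two trailing alphanumerics,
// so a fifth pair can appear but cannot be followed by a hyphen.
const limited = /^([A-Za-z0-9]{2}-){0,4}[A-Za-z0-9]{0,2}$/;

// Hypothetical inputs paired with the expected outcome.
const boundaryCases = [
  ["AB-CD-EF-GH-I", true],     // fifth segment half typed
  ["AB-CD-EF-GH-IJ", true],    // complete five-segment value
  ["AB-CD-EF-GH-IJ-", false],  // hyphen after the fifth segment
  ["AB-CD-EF-GH-IJ-K", false], // sixth segment started
];

const boundaryResults = boundaryCases.map(([input, expected]) => ({
  input,
  expected,
  actual: limited.test(input),
}));

console.table(boundaryResults);
```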


Looks like this does the trick:

/^([\w\d]{2}-)*([\w\d]|([\w\d]{2}-?)?)$/

See it in action here: http://jsfiddle.net/sadkinson/FaQe6/6/

Explanation:

/^([\w\d]{2}-)*  -- any number of XX-
([\w\d]          -- either a single X
|([\w\d]{2}-?)?  -- or two Xs and maybe a dash to end
)$/              -- and nothing else up to the end of the string

UPDATE: I fixed the above based on a very astute observation (+1) by a commenter :)
