
Regex which ensures no character is repeated

I need to ensure that an input string follows these rules:

  • It should contain upper case characters only.
  • No character should be repeated in the string, e.g. ABCA is not valid because 'A' is repeated.

For the upper-case requirement, [A-Z] should be fine, but I am lost on how to ensure that no character is repeated.

Can someone suggest a method using regular expressions?


You can do this with .NET regular expressions although I would advise against it:

string s = "ABCD";
bool result = Regex.IsMatch(s, @"^(?:([A-Z])(?!.*\1))*$");

Instead I'd advise checking that the length of the string is the same as the number of distinct characters, and checking the A-Z requirement separately:

bool result = s.Cast<char>().Distinct().Count() == s.Length;
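
For comparison, the same two-step idea can be sketched in Java. This is only an illustration under assumptions: the class name `DistinctCount` and both method names are hypothetical, and the snippet treats the empty string as valid, which the question does not specify.

```java
final class DistinctCount {
    // True when every character in s occurs exactly once
    // (the number of distinct characters equals the length).
    static boolean hasNoRepeats(String s) {
        return s.chars().distinct().count() == s.length();
    }

    // The A-Z requirement, checked separately as suggested above.
    static boolean isUpperOnly(String s) {
        return s.matches("[A-Z]*");
    }

    // Hypothetical helper combining both checks.
    static boolean isValid(String s) {
        return isUpperOnly(s) && hasNoRepeats(s);
    }

    public static void main(String[] args) {
        System.out.println(isValid("ABCD")); // true
        System.out.println(isValid("ABCA")); // false
    }
}
```

Keeping the two checks apart makes each one easy to read and test on its own.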

Alternatively, if performance is a critical issue, iterate over the characters one by one and keep a record of which you have seen.
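
A minimal Java sketch of that single-pass approach, assuming the input is restricted to A-Z so that a 26-bit mask is enough to record the letters seen so far. The class name `SeenLetters` and the method name are made up for illustration, and the empty string is treated as valid here.

```java
final class SeenLetters {
    // Returns true only if s consists solely of 'A'..'Z' and no letter occurs twice.
    static boolean isValid(String s) {
        int seen = 0; // bit i is set once the letter 'A' + i has been encountered
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 'A' || c > 'Z') {
                return false; // not an upper-case ASCII letter
            }
            int bit = 1 << (c - 'A');
            if ((seen & bit) != 0) {
                return false; // this letter was seen before
            }
            seen |= bit;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isValid("ABCD")); // true
        System.out.println(isValid("ABCA")); // false
    }
}
```

The loop stops at the first offending character, so it does one pass at most and allocates nothing.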


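This cannot be done in any practical way with regular expressions in the formal sense, because a regular language has no memory of which characters have already been seen. (Strictly speaking, the set of valid strings over A-Z is finite and therefore regular, but a pure regular expression for it would be enormous.) You need either an engine extension such as backreferences, which goes beyond regular languages, or a function written by hand.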

See formal grammar for background theory.


Why not check instead for a character that is repeated or is not upper case? Something like ([A-Z])?.*?([^A-Z]|\1): if this pattern is found anywhere in the string, the string is invalid.


Use negative lookahead and backreference.

  string pattern = @"^(?!.*(.).*\1)[A-Z]+$";
  string s1 = "ABCDEF";
  string s2 = "ABCDAEF";
  string s3 = "ABCDEBF";
  Console.WriteLine(Regex.IsMatch(s1, pattern));//True
  Console.WriteLine(Regex.IsMatch(s2, pattern));//False
  Console.WriteLine(Regex.IsMatch(s3, pattern));//False

\1 matches whatever the first capturing group, (.), captured. The negative lookahead therefore rejects the string as soon as any character appears again later in it.
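
The same pattern should carry over to Java unchanged apart from string escaping. The sketch below assumes that; the class name `LookaheadCheck` and the constant name are hypothetical.

```java
import java.util.regex.Pattern;

final class LookaheadCheck {
    // (?!.*(.).*\1) rejects any string in which some character occurs twice;
    // [A-Z]+ then requires one or more upper-case letters and nothing else.
    private static final Pattern NO_REPEATS =
            Pattern.compile("^(?!.*(.).*\\1)[A-Z]+$");

    static boolean isValid(String s) {
        return NO_REPEATS.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("ABCDEF"));  // true
        System.out.println(isValid("ABCDAEF")); // false
        System.out.println(isValid("ABCDEBF")); // false
    }
}
```

Because the pattern uses [A-Z]+, the empty string is rejected. Change it to [A-Z]* if an empty input should count as valid.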


This isn't regex, and would be slow, but you could create an array of the characters in the string, sort it, and then iterate through the array comparing each element n to the next one (n+1).
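
One way to read this suggestion, sketched in Java: sort a copy of the characters so that any duplicates end up next to each other, then compare neighbours. The sorting step is an assumption added here to make the adjacent comparison sufficient, and the class name `SortedCheck` is hypothetical. The A-Z check is not included.

```java
import java.util.Arrays;

final class SortedCheck {
    // True when no character occurs more than once in s.
    static boolean hasNoRepeats(String s) {
        char[] chars = s.toCharArray(); // a copy, so the input is not modified
        Arrays.sort(chars);             // equal characters become adjacent
        for (int i = 0; i + 1 < chars.length; i++) {
            if (chars[i] == chars[i + 1]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(hasNoRepeats("ABCD")); // true
        System.out.println(hasNoRepeats("ABCA")); // false
    }
}
```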

=Waldo


It can be done using what is called a backreference.

I am a Java programmer, so I will show you how it is done in Java (for C#, see here).

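import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Detects a repeat: an upper-case letter, anything in between, then the same letter again.
final Pattern aPattern = Pattern.compile("([A-Z]).*\\1");
final Matcher aMatcher1 = aPattern.matcher("ABCDA");
System.out.println(aMatcher1.find()); // true: 'A' is repeated
final Matcher aMatcher2 = aPattern.matcher("ABCDE");
System.out.println(aMatcher2.find()); // false: no letter is repeated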

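The regular expression is ([A-Z]).*\\1, which translates to: any character from 'A' to 'Z' captured as group 1 (([A-Z])), followed by anything (.*), followed by whatever group 1 matched (\\1). The doubled backslash is only Java string escaping. Note that a successful find() here means the string contains a repeated letter, so it is invalid.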

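In C# the backreference inside the pattern is also written \1 (for example @"([A-Z]).*\1"); $1 is used only in replacement strings.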

Hope this helps.
