C# Simple Regex - 32 characters, containing only 0-9 and a-f (GUID)
How to test using regex in C# if:
- length of string is exactly 32
- string contains only numbers from 0-9 and small letters from a-f
How about:
Regex regex = new Regex("^[0-9a-f]{32}$");
if (regex.IsMatch(text))
...
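One detail worth knowing about this pattern: in .NET, `$` also matches just before a final newline, so a valid 32-character string followed by "\n" would still pass. If that matters, `\z` anchors at the very end of the input instead. A minimal sketch, assuming a hypothetical helper class named HexCheck (not part of the original answer):

```csharp
using System.Text.RegularExpressions;

public static class HexCheck
{
    // \z matches only at the very end of the input, whereas $ would
    // also match just before a trailing '\n'.
    private static readonly Regex Hex32 = new Regex(@"^[0-9a-f]{32}\z");

    // Hypothetical helper: true for exactly 32 chars of 0-9 / a-f.
    public static bool IsHex32(string text)
    {
        return text != null && Hex32.IsMatch(text);
    }
}
```

With this, `HexCheck.IsHex32(text)` rejects input that carries a trailing newline, which the `$` version would let through.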
In .NET 4.0 there is a Guid structure that has a static TryParse method:
http://msdn.microsoft.com/en-us/library/system.guid.aspx
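As a rough sketch of that approach (assuming .NET 4.0 or later; the class name GuidCheck is hypothetical): `Guid.TryParseExact` with the "N" format specifier targets the 32-digit form without hyphens. The parser ignores letter case, so the lowercase-only requirement from the question needs an extra check, and a length check guards against padded input.

```csharp
using System;
using System.Linq;

public static class GuidCheck
{
    // Hypothetical helper: true if text is a 32-digit, lowercase GUID string.
    public static bool IsLowercaseGuidN(string text)
    {
        if (text == null || text.Length != 32)
        {
            return false;
        }

        Guid parsed;
        // "N" = 32 hex digits, no hyphens, braces, or parentheses.
        if (!Guid.TryParseExact(text, "N", out parsed))
        {
            return false;
        }

        // Parsing is case-insensitive, so reject 'A'-'F' explicitly.
        return !text.Any(c => c >= 'A' && c <= 'F');
    }
}
```

If any GUID representation is acceptable, plain `Guid.TryParse(text, out parsed)` is enough and the extra checks can be dropped.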
If you're not dead-set on using a Regex for this purpose, this is pretty clean:
Func<char, bool> charValid = c => (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f');
bool isValid = text.Length == 32 && text.All(charValid);
OK, so one answer here shows how to check with a regex that the string is exactly 32 characters long and that each character is a digit or a lowercase letter from 'a' to 'f', and another shows how to do the same with a simple Func that scans the characters in the string. Both are good answers and technically correct, but your question title explicitly says "GUID", which opens up a new can of worms.
GUIDs can take a number of different string representations, and you might encounter any one of them. Do you need to handle all of these? Will you need to accommodate strings that begin and end with curly braces ('{' and '}') or parentheses? What about dashes ('-')? According to MSDN, creating a new GUID with
string s = ...;
Guid g = new Guid(s);
allows for strings in the following forms:
32 contiguous digits: dddddddddddddddddddddddddddddddd
-or-
Groups of 8, 4, 4, 4, and 12 digits with hyphens between the groups. The entire GUID can optionally be enclosed in matching braces or parentheses: dddddddd-dddd-dddd-dddd-dddddddddddd
-or-
{dddddddd-dddd-dddd-dddd-dddddddddddd}
-or-
(dddddddd-dddd-dddd-dddd-dddddddddddd)
-or-
Groups of 8, 4, and 4 digits, and a subset of eight groups of 2 digits, with each group prefixed by "0x" or "0X", and separated by commas. The entire GUID, as well as the subset, is enclosed in matching braces: {0xdddddddd, 0xdddd, 0xdddd,{0xdd,0xdd,0xdd,0xdd,0xdd,0xdd,0xdd,0xdd}}
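The five forms listed above correspond to the standard format specifiers "N", "D", "B", "P", and "X". A small sketch (the class name GuidFormats and its methods are hypothetical) that produces each representation of one GUID and checks that `Guid.TryParse` reads every one of them back:

```csharp
using System;
using System.Collections.Generic;

public static class GuidFormats
{
    private static readonly string[] Specifiers = { "N", "D", "B", "P", "X" };

    // Returns each string form of g, keyed by its format specifier.
    public static Dictionary<string, string> AllForms(Guid g)
    {
        var forms = new Dictionary<string, string>();
        foreach (string spec in Specifiers)
        {
            forms[spec] = g.ToString(spec);
        }
        return forms;
    }

    // True if every form parses back to the same GUID.
    public static bool AllFormsRoundTrip(Guid g)
    {
        foreach (string text in AllForms(g).Values)
        {
            Guid parsed;
            if (!Guid.TryParse(text, out parsed) || parsed != g)
            {
                return false;
            }
        }
        return true;
    }
}
```

Only the "N" form would pass the 32-character check from the question, so a validator built on that check rejects the other four.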
Do you need to handle all of these cases? Also, consider if using a regex is really the best option. As some people have already commented, regex can be confusing for some developers, and the intent isn't always clear. On top of that, regex can be slow in some cases.
I whipped up a quick performance test on three different ways of determining if a string is in fact a string representation of a GUID:
- Regex
- Checking all chars in the string
- Creating a new Guid instance with the given string (if the Guid can be created, then the string is a valid string representation)
Here's the code:
using System;
using System.Diagnostics;
using System.Linq;
using System.Text.RegularExpressions;
using NUnit.Framework;

[Test]
public void Test_IsRegex_Performance()
{
    Action withRegexMatch = () =>
    {
        // Note: the regex is constructed on every iteration.
        Regex regex = new Regex("^[0-9a-f]{32}$");
        // ToString("N") gives the 32-digit form without hyphens;
        // new Guid().ToString() would give 36 characters with hyphens.
        string s = Guid.NewGuid().ToString("N");
        regex.IsMatch(s);
    };

    Action withCharCheck = () =>
    {
        string s = Guid.NewGuid().ToString("N");
        Func<char, bool> charValid = c => (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f');
        bool isValid = s.Length == 32 && s.All(charValid);
    };

    Action withNewGuid = () =>
    {
        string s = Guid.NewGuid().ToString("N");
        try
        {
            Guid g2 = new Guid(s);
            // if no exception is thrown, this is a valid string
            // representation
        }
        catch
        {
            // if an exception was thrown, this is an invalid
            // string representation
        }
    };

    const int times = 100000;
    Console.WriteLine("Regex: {0}", TimedTask(withRegexMatch, times));
    Console.WriteLine("Checking chars: {0}", TimedTask(withCharCheck, times));
    Console.WriteLine("New Guid: {0}", TimedTask(withNewGuid, times));
    Assert.Fail();
}

// Runs the action the given number of times and returns the total elapsed time.
private static TimeSpan TimedTask(Action action, int times)
{
    Stopwatch timer = new Stopwatch();
    timer.Start();
    for (int i = 0; i < times; i++)
    {
        action();
    }
    timer.Stop();
    return timer.Elapsed;
}
And the results from a million iterations on my machine:
Regex: 00:00:10.1786901
Checking chars: 00:00:00.2504520
New Guid: 00:00:01.3129005
So, the regex solution is slow. Ask yourself if you really need a regex here. Note that you can probably eke out some extra performance by declaring the regex only once and reusing it, but I think the point in this case is that you might have better success by looking at what you're trying to accomplish, as opposed to how.
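To illustrate the point about reusing the regex, here is a rough sketch (the class and method names are hypothetical) that times constructing the Regex on every call against reusing one static instance. No timings are shown because they depend on the machine and runtime.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

public static class RegexReuse
{
    private const string Pattern = "^[0-9a-f]{32}$";

    // Built once and shared across calls.
    private static readonly Regex Shared = new Regex(Pattern, RegexOptions.Compiled);

    // Parses the pattern again on every call.
    public static bool MatchWithNewRegex(string s)
    {
        return new Regex(Pattern).IsMatch(s);
    }

    // Reuses the already constructed instance.
    public static bool MatchWithSharedRegex(string s)
    {
        return Shared.IsMatch(s);
    }

    // Runs the check the given number of times and returns the elapsed time.
    public static TimeSpan Time(Func<string, bool> check, string input, int times)
    {
        Stopwatch timer = Stopwatch.StartNew();
        for (int i = 0; i < times; i++)
        {
            check(input);
        }
        timer.Stop();
        return timer.Elapsed;
    }
}
```

For example, with `string s = Guid.NewGuid().ToString("N");`, compare `RegexReuse.Time(RegexReuse.MatchWithNewRegex, s, 100000)` against `RegexReuse.Time(RegexReuse.MatchWithSharedRegex, s, 100000)`.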
Hope that helps.