Regular expression to match valid namespace name
I thought this question had been asked before, but I searched Google and didn't find an answer. Maybe I used the wrong keywords.
Is it possible to use a regular expression to match a valid C# namespace name?
Update:
Thanks everyone for your answers and research! This question is much more complex than I expected. As Oscar Mederos and Joey pointed out, a valid namespace cannot contain C# reserved keywords, and can contain a lot more Unicode characters than Latin letters.
But my current project only needs to validate namespaces syntactically. So I accepted primfaktor's answer, but I upvoted all answers.
For me, this worked:
^using (@?[a-z_A-Z]\w*(?:\.@?[a-z_A-Z]\w*)*);$
It matches using lines in C# and returns the complete namespace in the first (and only) match group. You may want to remove ^ and $ to allow for indentation and trailing comments.
Example on RegExr.
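A minimal sketch of how a pattern like this could be applied from C#. The class and method names (UsingLineParser, TryGetNamespace) are made up for illustration. The pattern uses \w* after the leading character of each segment, so one-letter names such as A.B are accepted.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class UsingLineParser
{
    // Anchored with ^ and $, so indented lines and trailing comments do not match.
    private static readonly Regex UsingLine = new Regex(
        @"^using (@?[a-z_A-Z]\w*(?:\.@?[a-z_A-Z]\w*)*);$");

    // Returns true and sets ns to the captured namespace when the line
    // is a plain using directive; otherwise returns false and sets ns to "".
    public static bool TryGetNamespace(string line, out string ns)
    {
        Match m = UsingLine.Match(line);
        ns = m.Success ? m.Groups[1].Value : string.Empty;
        return m.Success;
    }
}
```

Calling TryGetNamespace("using System.Text;", out var ns) should succeed with ns set to System.Text, while a line such as "using System..Text;" should be rejected. Note that this only checks the shape of the line; it does not reject reserved keywords.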
I know that the question was how to validate a namespace using a regex, but another way to do it is to make the compiler do the work. I am not certain that what I have here catches 100% of all errors, but it does work pretty well. I created this ValidationRule for a project I am currently working on:
using System.CodeDom.Compiler;
using System.Windows.Controls;
using Microsoft.CSharp;
using System.Text.RegularExpressions;

namespace Com.Gmail.Birklid.Ray.CodeGeneratorTemplateDialog
{
    // WPF validation rule that checks a namespace string segment by segment
    // using the C# CodeDOM provider instead of a hand-written regex.
    public class NamespaceValidationRule : ValidationRule
    {
        public override ValidationResult Validate(object value, System.Globalization.CultureInfo cultureInfo)
        {
            var input = value as string;
            if (string.IsNullOrWhiteSpace(input))
            {
                return new ValidationResult(false, "A namespace must be provided.");
            }
            else if (this.doubleDot.IsMatch(input))
            {
                return new ValidationResult(false, "'..' is not valid.");
            }

            // Every dot-separated segment must be a valid C# identifier.
            var inputs = input.Split('.');
            foreach (var item in inputs)
            {
                if (!this.compiler.IsValidIdentifier(item))
                {
                    return new ValidationResult(false, string.Format(cultureInfo, "'{0}' is invalid.", item));
                }
            }

            return ValidationResult.ValidResult;
        }

        private readonly CodeDomProvider compiler = CSharpCodeProvider.CreateProvider("CSharp");
        private readonly Regex doubleDot = new Regex("\\.\\.");
    }
}
If you want to know if a string can be used as a namespace, you should refer to The C# Language Specifications and look at the grammar that validates the namespace.
The namespace should be a sequence of identifiers separated by a . character. Example:
identifier
identifier.identifier
identifier.identifier.identifier
...
And what is an identifier?
available_identifier or @any_identifier
An available_identifier is an any_identifier, but cannot be a keyword reserved by the language.
any_identifier is the following:
(_|letter)(_|letter|number)*
Edit:
I must say that this regex can get really complicated. Take into account that you also need to check that no reserved keywords are used. Here is the list of the reserved keywords:
abstract as base bool break byte case catch char checked class const continue decimal default delegate do double else enum event explicit extern false finally fixed float for foreach goto if implicit in int interface internal is lock long namespace new null object operator out override params private protected public readonly ref return sbyte sealed short sizeof stackalloc static string struct switch this throw true try typeof uint ulong unchecked unsafe ushort using virtual void volatile while
Can't you split the validation, maybe creating a method in C# or any other language to validate it instead of using only one regex?
To be honest, I suggest one of these two things:
- Implement a parser of that grammar (see the reference). You can do it either by hand or using tools like ANTLR.
- Implement a method that takes the string you want to validate (let's call it str), writes a file like: namespace str { class A {} } and tries to compile it :) using msbuild or any C# compiler. If it gives an error, then you know that word is not correct :)
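The idea of splitting the validation into a small method instead of a single regex could be sketched roughly as follows. This is only an illustration under some assumptions: the class name NamespaceName is made up, the identifier pattern follows my reading of the Unicode categories allowed by the identifier grammar, Unicode escape sequences inside identifiers are ignored, and only the reserved keywords listed above are rejected.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class NamespaceName
{
    // Reserved keywords, copied from the list above.
    private static readonly HashSet<string> Keywords = new HashSet<string>((
        "abstract as base bool break byte case catch char checked class const continue " +
        "decimal default delegate do double else enum event explicit extern false finally " +
        "fixed float for foreach goto if implicit in int interface internal is lock long " +
        "namespace new null object operator out override params private protected public " +
        "readonly ref return sbyte sealed short sizeof stackalloc static string struct " +
        "switch this throw true try typeof uint ulong unchecked unsafe ushort using virtual " +
        "void volatile while").Split(' '));

    // Assumed shape of any_identifier: a letter or underscore, followed by
    // letters, digits, connector punctuation, combining marks or format characters.
    private static readonly Regex AnyIdentifier = new Regex(
        @"^[\p{L}\p{Nl}_][\p{L}\p{Nl}\p{Nd}\p{Pc}\p{Mn}\p{Mc}\p{Cf}]*\z");

    public static bool IsValid(string name)
    {
        if (string.IsNullOrEmpty(name))
        {
            return false;
        }

        foreach (string segment in name.Split('.'))
        {
            // A leading @ marks a verbatim identifier, which may reuse a keyword.
            bool verbatim = segment.StartsWith("@", StringComparison.Ordinal);
            string bare = verbatim ? segment.Substring(1) : segment;

            // Empty segments (from "..", or a leading or trailing dot) fail here.
            if (!AnyIdentifier.IsMatch(bare))
            {
                return false;
            }

            if (!verbatim && Keywords.Contains(bare))
            {
                return false;
            }
        }

        return true;
    }
}
```

With this sketch, NamespaceName.IsValid("My.class.Lib") is rejected because class is a keyword, while NamespaceName.IsValid("My.@class.Lib") is accepted. Keeping the keyword check outside the regex avoids a long negative lookahead and keeps the pattern readable.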
How about this...
(?:[A-Z][a-zA-Z0-9\._]+)+[a-z0-9_]