Regex negation not working in XML string

I am trying to apply negation in a regular expression in .NET, and it does not work. When the string has a valid last name, the regex should not match; for an invalid last name, it should match. A valid name allows only letters, spaces, and single quotes, with a length between 1 and 40. Somebody suggested parsing the XML, but I don't want to do that. I know there is another way of doing this: remove the negation from the regex and invert the match condition in code. I don't want that either. I need a pure regex solution.

Here is my code. It matches the valid last name, but I don't want it to match.

        string toBevalidated = @"<FirstName>SomeName</FirstName><LastName>Some</LastName><Address1>Addre1</Address1>";
        var regex = new Regex(@"<LastName>([^a-zA-Z'\s])|(.{41,})</LastName>");
        var match = regex.Match(toBevalidated);

        // Check to see if a match was found
        if (match.Success)
        {
            Console.WriteLine("Success");
        }
        else
        {
            Console.WriteLine("Failed");
        }

EDIT: There is some confusion here, so let me give some examples of what I intended. When the last name is valid, the regex should not match. For example, the samples below should not match the regex:

Case 1

<FirstName>SomeName</FirstName><LastName>Brian</LastName><Address1>Addre1</Address1>

Case 2

<FirstName>SomeName</FirstName><LastName>O'neil</LastName><Address1>Addre1</Address1>

Case 3

<FirstName>SomeName</FirstName><LastName>Peter John</LastName><Address1>Addre1</Address1>

When the last name is invalid, the regex should match:

Case 4

<FirstName>SomeName</FirstName><LastName>Brian123</LastName><Address1>Addre1</Address1>

Case 5

<FirstName>SomeName</FirstName><LastName>#Brian</LastName><Address1>Addre1</Address1>

Case 6

<FirstName>SomeName</FirstName><LastName>BrianBrianBrianBrianBrianBrianBrianBrianBrianBrian</LastName><Address1>Addre1</Address1>

If you need more information, please let me know.


It would have been helpful if you'd given an example of this not behaving as you expected it to, but I suspect it's because you're only matching an invalid character if it's a single invalid character, e.g.

<LastName>5</LastName>

That will match (I believe; I haven't checked) but this won't:

<LastName>55</LastName>

I think you could do something like:

<LastName>(.*[^a-zA-Z'\s].*)|(.{41,})</LastName>

to ensure that there's at least one invalid character in there (or that there are 41 or more characters). But there may be corner cases here where that's inappropriate.

EDIT: Got it. The alternation operator was taking everything before it as an option, instead of just the preceding group. The final regular expression is:

<LastName>((.*[^a-zA-Z'\s].*)|(.{41,}))</LastName>

And here's some sample code:

using System;
using System.Text.RegularExpressions;

class Test
{
    static void Main()
    {
        string pattern = @"<LastName>((.*[^a-zA-Z'\s].*)|(.{41,}))</LastName>";
        Regex regex = new Regex(pattern);

        string[] samples = {
            "<FirstName>SomeName</FirstName><LastName>Brian</LastName><Address1>Addre1</Address1>",
            "<FirstName>SomeName</FirstName><LastName>O'neil</LastName><Address1>Addre1</Address1>",
            "<FirstName>SomeName</FirstName><LastName>Peter John</LastName><Address1>Addre1</Address1>",
            "<FirstName>SomeName</FirstName><LastName>Brian123</LastName><Address1>Addre1</Address1>",                
            "<FirstName>SomeName</FirstName><LastName>#Brian</LastName><Address1>Addre1</Address1>",
            "<FirstName>SomeName</FirstName><LastName>BrianBrianBrianBrianBrianBrianBrianBrianBrianBrian</LastName><Address1>Addre1</Address1>",
        };

        foreach (var sample in samples)
        {
            bool valid = !regex.IsMatch(sample);
            Console.WriteLine("Valid: {0} Text: {1}", valid, sample);
        }
    }
}
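
To make the precedence issue from the EDIT concrete, here is a minimal sketch that runs the question's original pattern and the grouped pattern against the question's valid sample. The class and constant names (AlternationDemo, Ungrouped, Grouped, ValidSample) are made up for illustration; the patterns and the sample string are taken from the thread.

```csharp
using System;
using System.Text.RegularExpressions;

static class AlternationDemo
{
    // Pattern from the question: the top-level | splits the WHOLE pattern in two,
    // so the alternatives are "<LastName>([^a-zA-Z'\s])" and "(.{41,})</LastName>".
    public const string Ungrouped = @"<LastName>([^a-zA-Z'\s])|(.{41,})</LastName>";

    // Pattern from the answer: the outer group confines | to the element content.
    public const string Grouped = @"<LastName>((.*[^a-zA-Z'\s].*)|(.{41,}))</LastName>";

    // Valid last name, as in the question's own code.
    public const string ValidSample =
        "<FirstName>SomeName</FirstName><LastName>Some</LastName><Address1>Addre1</Address1>";

    static void Main()
    {
        Match m = Regex.Match(ValidSample, Ungrouped);

        // The second alternative is not anchored to <LastName>, so it can start
        // at the beginning of the string and consume the <FirstName> element too.
        Console.WriteLine("Ungrouped: success={0}, index={1}", m.Success, m.Index);

        // With the outer group, both alternatives must sit between the tags.
        Console.WriteLine("Grouped:   success={0}", Regex.IsMatch(ValidSample, Grouped));
    }
}
```

The ungrouped pattern reports a match starting at index 0, which shows the match came from the 41-or-more-characters branch running over the preceding markup, not from anything inside the last name. The grouped pattern reports no match for the same input.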


Try rewriting the regex as <LastName>([a-zA-Z'\s]{1,40})</LastName> so that it matches valid names of 1 to 40 characters, and do the negation in code: if (!match.Success) ...
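
If the negation has to stay inside the pattern, one option (not proposed in this thread, so treat it as an untested-in-production sketch) is to wrap a positive check for a valid name in a negative lookahead. The version below assumes the 1-40 length rule from the question; the names LookaheadDemo and InvalidLastName are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

static class LookaheadDemo
{
    // Matches <LastName> only when it is NOT followed by 1-40 allowed characters
    // and the closing tag, i.e. only when the last name is invalid.
    public static readonly Regex InvalidLastName =
        new Regex(@"<LastName>(?![a-zA-Z'\s]{1,40}</LastName>)");

    static void Main()
    {
        string[] samples =
        {
            "<LastName>Brian</LastName>",     // valid
            "<LastName>O'neil</LastName>",    // valid
            "<LastName>Brian123</LastName>",  // invalid: digits
            "<LastName></LastName>",          // invalid: empty
        };

        foreach (string s in samples)
        {
            Console.WriteLine("{0} -> invalid: {1}", s, InvalidLastName.IsMatch(s));
        }
    }
}
```

Because the lookahead describes what a valid element looks like, the length and character rules live in one place, and an empty last name is rejected as well.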


OK,

I couldn't get it to work in one pass, but I think it will work in two passes: first check for invalid characters, then check the length in the second pass.

Match m = Regex.Match(@"<FirstName>SomeName</FirstName><LastName>Some</LastName><Address1>Addre1</Address1>", "<LastName>(.*[^a-zA-Z'\\s].*)</LastName>");

m = Regex.Match(@"<FirstName>SomeName</FirstName><LastName>SomeSomSomeSomeSomeSomeSomeSomeSomeSomeeSomeSomeSomeSomeSomeSomeSome</LastName><Address1>Addre1</Address1>", "<LastName>[a-zA-Z'\\s]{41,}</LastName>");

I haven't checked all the cases you provided; please try it and let me know if it works.

Thanks to Skeet for the correction: the pattern needs to be .*[^a-zA-Z'\s].* because, without .* before and after the character class, it won't match names that contain special characters.

The second part of the regex pattern, which checks the length, matches everything, even the tags, and that's why it does not work in one pass.

Good luck.
