Compare two values using RegEx

If I have two values, e.g. ABC001 and ABC100, or A0B0C1 and A1B0C0, is there a RegEx I can use to make sure the two values have the same pattern?


Well, here's my shot at it. This doesn't use regular expressions, and assumes s1 and s2 only contain letters or digits:

public static bool SamePattern(string s1, string s2)
{
   if (s1.Length == s2.Length)
   {
      char[] chars1 = s1.ToCharArray();
      char[] chars2 = s2.ToCharArray();

      for (int i = 0; i < chars1.Length; i++)
      {
         if (!Char.IsDigit(chars1[i]) && chars1[i] != chars2[i])
         {
            return false;
         }
         else if (Char.IsDigit(chars1[i]) != Char.IsDigit(chars2[i]))
         {
            return false;
         }
      }

      return true;
   }
   else
   {
      return false;
   }
}

A description of the algorithm is as follows:

  1. If the strings have different lengths, return false.
  2. Otherwise, check the characters in the same position in both strings:
    1. If they are both digits, or both the same letter, move on to the next iteration.
    2. If they aren't digits but aren't the same, return false.
    3. If one is a digit and the other is a letter, return false.
  3. If all characters in both strings were checked successfully, return true.


If you don't know the pattern in advance, but are only going to encounter two groups of characters (alpha and digits), then you could do the following:

Write some C# that parses the first string, looks at each char to determine whether it's a letter or a digit, and then generates a regex from that pattern.

You may find that there's no point writing code to generate a regex, as it could be just as simple to check the second string against the first.
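
As a rough sketch of the regex-generating idea above: the hypothetical helper below turns each digit of the first string into \d and escapes every other character so it must match literally. The names PatternRegex, FromTemplate and SamePattern are made up for illustration, and matching letters literally is an assumption; swap in a letter class if any letter should be accepted at that position.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public static class PatternRegex
{
    // Builds a regex from a template string: every digit becomes \d,
    // every other character is escaped and must match literally.
    public static Regex FromTemplate(string template)
    {
        var sb = new StringBuilder(@"\A");   // anchor at the start of the input
        foreach (char c in template)
        {
            if (char.IsDigit(c))
                sb.Append(@"\d");
            else
                sb.Append(Regex.Escape(c.ToString()));
        }
        sb.Append(@"\z");                    // anchor at the very end of the input
        return new Regex(sb.ToString());
    }

    // True when s2 matches the pattern derived from s1.
    public static bool SamePattern(string s1, string s2)
    {
        return FromTemplate(s1).IsMatch(s2);
    }
}
```

Called as PatternRegex.SamePattern("ABC001", "ABC100"), this should accept the pair from the question, while a string of a different length or with a different letter is rejected.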

Alternatively, without regex:

First check that the strings are the same length. Then loop through both strings at the same time, char by char. If the char at position x in string 1 is a letter and the char at position x in string 2 is the same letter, or both are digits, the patterns match at that position.

Try this, it should cope if a string sneaks in some symbols. Edited to compare character values ... and use Char.IsLetter and Char.IsDigit

private bool matchPattern(string string1, string string2)
{
    // Different lengths can never share a pattern. Returning early also
    // keeps the loop below from indexing past the end of string2.
    if (string1.Length != string2.Length)
    {
        return false;
    }

    char[] chars1 = string1.ToCharArray();
    char[] chars2 = string2.ToCharArray();

    for (int i = 0; i < chars1.Length; i++)
    {
        // Both characters must be letters, or neither
        if (Char.IsLetter(chars1[i]) != Char.IsLetter(chars2[i]))
        {
            return false;
        }
        // Letters must be identical
        if (Char.IsLetter(chars1[i]) && (chars1[i] != chars2[i]))
        {
            return false;
        }
        // Both characters must be digits, or neither
        if (Char.IsDigit(chars1[i]) != Char.IsDigit(chars2[i]))
        {
            return false;
        }
    }
    return true;
}


Consider using Char.GetUnicodeCategory.
You can write a helper class for this task:

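using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public class Mask
{
    public Mask(string originalString)
    {
        if (originalString == null)
            throw new ArgumentNullException("originalString");

        OriginalString = originalString;
        // One UnicodeCategory per character, in order
        CharCategories = originalString.Select(c => Char.GetUnicodeCategory(c)).ToList();
    }

    public string OriginalString { get; private set; }
    public IEnumerable<UnicodeCategory> CharCategories { get; private set; }

    public bool HasSameCharCategories(Mask other)
    {
        // A null mask is treated as "not the same pattern"
        if (other == null)
            return false;

        return CharCategories.SequenceEqual(other.CharCategories);
    }
}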

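Use it like this (for this particular pair the result is False, because the fifth and sixth characters fall into different categories):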

Mask mask1 = new Mask("ab12c3");
Mask mask2 = new Mask("ds124d");
MessageBox.Show(mask1.HasSameCharCategories(mask2).ToString());


I don't know C# syntax, but here is some pseudocode:

  • split each string on ''
  • sort the 2 arrays
  • join each array with ''
  • compare the 2 strings


A general-purpose solution with LINQ can be achieved quite easily. The idea is:

  1. Sort the two strings (reordering the characters).
  2. Compare each sorted string as a character sequence using SequenceEqual.

This scheme enables a short, graceful and configurable solution, for example:

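using System;
using System.Collections.Generic;
using System.Linq;

// We will be using this in SequenceEqual
class MyComparer : IEqualityComparer<char>
{
    public bool Equals(char x, char y)
    {
        return x.Equals(y);
    }

    public int GetHashCode(char obj)
    {
        return obj.GetHashCode();
    }
}

// and then:
var s1 = "ABC0102";
var s2 = "AC201B0";

// Order by the character itself. Ordering by char.GetNumericValue would give
// every letter the same key (-1), and because OrderBy is a stable sort the
// letters would keep their original relative order and never line up.
Func<char, char> orderFunction = c => c;
var comparer = new MyComparer();
var result = s1.OrderBy(orderFunction).SequenceEqual(s2.OrderBy(orderFunction), comparer);

Console.WriteLine("result = " + result);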

As you can see, it's all in 3 lines of code (not counting the comparer class). It's also easy to configure.

  • The code as it stands checks if s1 is a permutation of s2.
  • Do you want to check if s1 has the same number and kind of characters as s2, but not necessarily the same characters (e.g. "ABC" to be equal to "ABB")? No problem, change MyComparer.Equals to return char.GetUnicodeCategory(x).Equals(char.GetUnicodeCategory(y));.
  • By changing the values of orderFunction and comparer you can configure a multitude of other comparison options.

And finally, since I don't find it very elegant to define a MyComparer class just to enable this scenario, you can also use the technique described in this question:

Wrap a delegate in an IEqualityComparer

to define your comparer as an inline lambda. This would result in a configurable solution contained in 2-3 lines of code.
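
For illustration only, here is one way such a wrapper might look. DelegateComparer, PatternCheck and SameCategories are hypothetical names, not taken from the linked question, and the sketch assumes that ordering by the character itself is acceptable. It compares the two sorted strings by Unicode category, so the order of characters within each string is ignored.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical helper: adapts two delegates to IEqualityComparer<T>.
public sealed class DelegateComparer<T> : IEqualityComparer<T>
{
    private readonly Func<T, T, bool> equals;
    private readonly Func<T, int> getHashCode;

    public DelegateComparer(Func<T, T, bool> equals, Func<T, int> getHashCode)
    {
        this.equals = equals;
        this.getHashCode = getHashCode;
    }

    public bool Equals(T x, T y) { return equals(x, y); }
    public int GetHashCode(T obj) { return getHashCode(obj); }
}

public static class PatternCheck
{
    // True when both strings contain the same number of characters
    // of each Unicode category, regardless of position.
    public static bool SameCategories(string s1, string s2)
    {
        var byCategory = new DelegateComparer<char>(
            (x, y) => char.GetUnicodeCategory(x) == char.GetUnicodeCategory(y),
            c => (int)char.GetUnicodeCategory(c));

        return s1.OrderBy(c => c).SequenceEqual(s2.OrderBy(c => c), byCategory);
    }
}
```

Note that Unicode categories distinguish uppercase from lowercase letters, so under this sketch "abc" and "ABC" would not be considered the same.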
