
String comparison in .NET Framework 4

I will explain my problem (excuse my bad English): I have a .NET executable in which every millisecond of processing matters.

This program does a lot of string comparison (most of it is string1.IndexOf(string2, StringComparison.OrdinalIgnoreCase)).

When I switched to Framework 4, the program's running time doubled.

I searched for an explanation and found that IndexOf(s, OrdinalIgnoreCase) is much slower in Framework 4: in a simple console application, the loop took 30 ms on 3.5 and 210 ms on 4.0. Comparison using the current culture, however, is faster in Framework 4 than in 3.5.

Here is a sample of the code I used:

using System;
using System.Diagnostics;

class Program
{
    static void Main()
    {
        int iMax = 100000;
        string str = "Mozilla/5.0+(Windows;+U;+Windows+NT+5.1;+fr;+rv:1.9.0.1)+Gecko/2008070208+Firefox/3.0.1";
        StringComparison s = StringComparison.OrdinalIgnoreCase;
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iMax; i++)
        {
            str.IndexOf("windows", s); // result discarded; only the call is timed
        }
        sw.Stop();
        Console.WriteLine(sw.ElapsedMilliseconds);
        Console.Read(); // keep the console window open
    }
}

My questions are:

  1. Has anyone noticed the same problem?

  2. Does anyone have an explanation for this change?

  3. Is there a way to bypass the problem?

Thanks.


OK, I have an answer to one of my questions.

With Reflector I can see the difference between Framework 2 and 4, and it explains my performance issue.

public int IndexOf(string value, int startIndex, int count, StringComparison comparisonType)
{
    if (value == null)
    {
        throw new ArgumentNullException("value");
    }
    if ((startIndex < 0) || (startIndex > this.Length))
    {
        throw new ArgumentOutOfRangeException("startIndex", Environment.GetResourceString("ArgumentOutOfRange_Index"));
    }
    if ((count < 0) || (startIndex > (this.Length - count)))
    {
        throw new ArgumentOutOfRangeException("count", Environment.GetResourceString("ArgumentOutOfRange_Count"));
    }
    switch (comparisonType)
    {
        case StringComparison.CurrentCulture:
            return CultureInfo.CurrentCulture.CompareInfo.IndexOf(this, value, startIndex, count, CompareOptions.None);

        case StringComparison.CurrentCultureIgnoreCase:
            return CultureInfo.CurrentCulture.CompareInfo.IndexOf(this, value, startIndex, count, CompareOptions.IgnoreCase);

        case StringComparison.InvariantCulture:
            return CultureInfo.InvariantCulture.CompareInfo.IndexOf(this, value, startIndex, count, CompareOptions.None);

        case StringComparison.InvariantCultureIgnoreCase:
            return CultureInfo.InvariantCulture.CompareInfo.IndexOf(this, value, startIndex, count, CompareOptions.IgnoreCase);

        case StringComparison.Ordinal:
            return CultureInfo.InvariantCulture.CompareInfo.IndexOf(this, value, startIndex, count, CompareOptions.Ordinal);

        case StringComparison.OrdinalIgnoreCase:
            return TextInfo.IndexOfStringOrdinalIgnoreCase(this, value, startIndex, count);
    }
    throw new ArgumentException(Environment.GetResourceString("NotSupported_StringComparison"), "comparisonType");
}

This is the base code of IndexOf in both frameworks (there is no difference between 2 and 4).

But TextInfo.IndexOfStringOrdinalIgnoreCase differs between them:

Framework 2:

internal static unsafe int IndexOfStringOrdinalIgnoreCase(string source, string value, int startIndex, int count)
{
    if (source == null)
    {
        throw new ArgumentNullException("source");
    }
    return nativeIndexOfStringOrdinalIgnoreCase(InvariantNativeTextInfo, source, value, startIndex, count);
}

Framework 4:

internal static int IndexOfStringOrdinalIgnoreCase(string source, string value, int startIndex, int count)
{
    if ((source.Length == 0) && (value.Length == 0))
    {
        return 0;
    }
    int num = startIndex + count;
    int num2 = num - value.Length;
    while (startIndex <= num2)
    {
        if (CompareOrdinalIgnoreCaseEx(source, startIndex, value, 0, value.Length, value.Length) == 0)
        {
            return startIndex;
        }
        startIndex++;
    }
    return -1;
}

The main algorithm has changed: in Framework 2 the call goes to a native function, which has been removed in Framework 4 and replaced by the managed loop shown above. It's good to know.
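
One possible way around the slow path, offered only as a sketch: upper-case both strings once and search with StringComparison.Ordinal, which takes a different branch of the switch in IndexOf. This assumes the upper-cased strings can be cached and reused across many searches, since ToUpperInvariant allocates a new string on every call. It also assumes ToUpperInvariant keeps the string length unchanged, so the returned index is valid for the original string. The class and method names are hypothetical, and whether this is actually faster should be measured on the target framework.

```csharp
using System;

static class CachedUpperSearch
{
    // Hypothetical helper: BOTH arguments must already be upper-cased
    // with ToUpperInvariant by the caller.
    public static int IndexOfPreUppered(string upperSource, string upperValue)
    {
        return upperSource.IndexOf(upperValue, StringComparison.Ordinal);
    }

    static void Main()
    {
        string str = "Mozilla/5.0+(Windows;+U;+Windows+NT+5.1;+fr;+rv:1.9.0.1)+Gecko/2008070208+Firefox/3.0.1";

        // Upper-case once, outside any hot loop, and reuse the results.
        string upperStr = str.ToUpperInvariant();
        string upperNeedle = "windows".ToUpperInvariant();

        int fast = IndexOfPreUppered(upperStr, upperNeedle);
        int reference = str.IndexOf("windows", StringComparison.OrdinalIgnoreCase);

        // Both searches should report the same position.
        Console.WriteLine(fast);
        Console.WriteLine(reference);
    }
}
```

If the source strings change on every call, the extra allocation may cancel out any gain, so this is only worth trying when the same strings are searched repeatedly.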


This is a known issue in .NET 4.

Here is the MS Connect report.


I cannot answer your specific .NET 4 speed question.

However, you would probably gain much more speed by improving your algorithm. Check out the Rabin-Karp string search algorithm.
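
To make the suggestion concrete, here is a minimal sketch of a case-insensitive Rabin-Karp search. It is not the framework's implementation: the class name, the method name, and the hash base are all assumptions made for illustration, and folding case with char.ToUpperInvariant is only assumed to be close to what OrdinalIgnoreCase does. The hash is a polynomial rolling hash that wraps on uint overflow, and every hash hit is verified character by character, so collisions cannot produce a wrong result.

```csharp
using System;

static class RabinKarp
{
    const uint Base = 257; // arbitrary hash base, chosen for illustration

    public static int IndexOfIgnoreCase(string source, string value)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (value == null) throw new ArgumentNullException("value");
        int n = source.Length, m = value.Length;
        if (m == 0) return 0;
        if (m > n) return -1;

        unchecked // hashes are meant to wrap around modulo 2^32
        {
            // Base^(m-1): weight of the character that leaves the window.
            uint highPow = 1;
            for (int i = 1; i < m; i++) highPow *= Base;

            // Hash the pattern and the first window of the source.
            uint target = 0, window = 0;
            for (int i = 0; i < m; i++)
            {
                target = target * Base + char.ToUpperInvariant(value[i]);
                window = window * Base + char.ToUpperInvariant(source[i]);
            }

            for (int start = 0; ; start++)
            {
                // Verify on a hash hit to rule out collisions.
                if (window == target && Matches(source, start, value)) return start;
                if (start + m >= n) return -1;

                // Roll the window: drop source[start], append source[start + m].
                window -= char.ToUpperInvariant(source[start]) * highPow;
                window = window * Base + char.ToUpperInvariant(source[start + m]);
            }
        }
    }

    static bool Matches(string source, int start, string value)
    {
        for (int i = 0; i < value.Length; i++)
        {
            if (char.ToUpperInvariant(source[start + i]) != char.ToUpperInvariant(value[i]))
                return false;
        }
        return true;
    }

    static void Main()
    {
        string str = "Mozilla/5.0+(Windows;+U;+Windows+NT+5.1;+fr;+rv:1.9.0.1)+Gecko/2008070208+Firefox/3.0.1";
        Console.WriteLine(IndexOfIgnoreCase(str, "windows")); // prints 13
    }
}
```

For a pattern as short as "windows", the per-character hashing cost may outweigh the benefit, so it is worth timing this against the built-in IndexOf before adopting it.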
