How to efficiently overwrite parts of a string by index in .NET?

In my .NET program I allow a user to define "fields" which are values calculated by the business logic. These fields have a position and length, so that they can all be inserted into a single output string at a given index. I also allow a user to specify default content of this output string. If no field is defined to replace a given position, the default character is output instead.

My question is, how can I do this efficiently? The StringBuilder class has an Insert(int index, string value) method, but this lengthens the output string each time rather than overwriting it. Am I going to have to set each char one at a time using the StringBuilder[int index] indexer, and is this inefficient? Since I am going to be doing this a lot of times I would like it to be as fast as possible.

Thanks.


Doing it one character at a time is likely your best bet. I say this because calling Insert and Remove on a StringBuilder results in characters being shifted right/left, just as the analogous methods would in any mutable indexed collection such as a List<char>.

That said, this is an excellent candidate for an extension method to make your life a bit easier.

public static StringBuilder ReplaceSubstring(this StringBuilder stringBuilder, int index, string replacement)
{
    if (index + replacement.Length > stringBuilder.Length)
    {
        // You could throw an exception here, or you could just
        // append to the end of the StringBuilder -- up to you.
        throw new ArgumentOutOfRangeException();
    }

    for (int i = 0; i < replacement.Length; ++i)
    {
        stringBuilder[index + i] = replacement[i];
    }

    return stringBuilder;
}

Usage example:

var builder = new StringBuilder("My name is Dan.");
builder.ReplaceSubstring(11, "Bob");

Console.WriteLine(builder.ToString());

Output:

My name is Bob.
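
For comparison, here is a minimal sketch of the same overwrite done on a plain char[] buffer instead of a StringBuilder. This is a swapped-in alternative, not part of the extension method above: the class FixedWidthLine and its methods Overwrite and Render are hypothetical names, and the truncation behaviour (never writing past the field length or the end of the buffer) is an assumption about what you would want. Whether it is faster than the StringBuilder indexer in your case would need measuring.

```csharp
using System;

public static class FixedWidthLine
{
    // Hypothetical helper: copies 'value' over 'buffer' starting at 'index',
    // writing at most 'length' characters and never past the end of the buffer.
    public static void Overwrite(char[] buffer, int index, int length, string value)
    {
        if (index < 0 || index > buffer.Length)
            throw new ArgumentOutOfRangeException(nameof(index));

        int count = Math.Min(Math.Min(length, value.Length), buffer.Length - index);
        if (count > 0)
            value.CopyTo(0, buffer, index, count);
    }

    // Starts from the default content, applies one overwrite, and builds the result.
    public static string Render(string defaultContent, int index, int length, string value)
    {
        char[] buffer = defaultContent.ToCharArray();
        Overwrite(buffer, index, length, value);
        return new string(buffer);
    }
}
```

For example, FixedWidthLine.Render("My name is Dan.", 11, 3, "Bob") returns "My name is Bob.". With several fields you would call Overwrite once per field on the same buffer and create the string only once at the end.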


The StringBuilder class lets you build a mutable string. Try calling Remove before doing the Insert. Since it's randomly accessible, it should be very quick. As long as the StringBuilder keeps the same capacity, it won't spend time copying strings around in memory. If you know the string will become longer, try setting a larger capacity when you call new StringBuilder().


Because strings are immutable, every manipulation of them adds GC load, even StringBuilder Insert/Remove calls. I would cut the source string at the insertion points and then "zip" the pieces with the data that needs to be inserted. After that you can just concatenate the strings in the list to get the resulting string.

Here is some sample code that does the split/zip operations. It assumes that fields are defined as a tuple of (position, length, value).

public class Field
{
    public int pos { get; set; }
    public int len { get; set; }
    public string value { get; set; }
    public string tag { get; set; }
}

class Program
{
    static void Main(string[] args)
    {
        var source = "You'r order price [price] and qty [qty].";
        var fields = new List<Field>();
        fields.Add(new Field()
        {
            pos = 18, 
            len = 7, 
            value = "15.99$",
            tag = "price"
        });
        fields.Add(new Field()
        {
            pos = 37-3,
            len = 5,
            value = "7",
            tag = "qty"
        });
        Console.WriteLine(Zip(Split(source, fields), fields));
        Console.WriteLine(ReplaceRegex(source, fields));

    }

    static IEnumerable<string> Split(string source, IEnumerable<Field> fields)
    {
        var index = 0;
        foreach (var field in fields.OrderBy(q => q.pos))
        {
            yield return source.Substring(index, field.pos - index);
            index = field.pos + field.len;
        }
        yield return source.Substring(index, source.Length - index);
    }
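    static string Zip(IEnumerable<string> splitted, IEnumerable<Field> fields)
    {
        // Split is a lazy iterator, so materialize it once, and pair the pieces
        // with the fields in the same position order that Split uses.
        var pieces = splitted.ToList();
        var items = pieces.Zip(fields.OrderBy(q => q.pos), (l, r) => new string[] { l, r.value }).SelectMany(q => q).ToList();
        items.Add(pieces.Last());
        return string.Concat(items);
    }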
    static string ReplaceRegex(string source, IEnumerable<Field> fields)
    {
        var fieldsDict = fields.ToDictionary(q => q.tag);
        var re = new Regex(@"\[(\w+)\]");
        return re.Replace(source, new MatchEvaluator((m) => fieldsDict[m.Groups[1].Value].value));
    }
}

BTW, wouldn't it be better to replace special user markers such as [price] and [qty] using a regex?


I would recommend using the StringBuilder class. You can do it with a string, but there can be side effects. Here are a couple of blog posts that show how to manipulate strings and the possible side effects.

http://philosopherdeveloper.wordpress.com/2010/05/28/are-strings-really-immutable-in-net/

http://philosopherdeveloper.wordpress.com/2010/06/13/string-manipulation-in-net-epilogue-plus-new-theme/


If replacing substrings is going to be a big bottleneck, you may want to ditch the substrings thing altogether. Instead, break up your data into strings that can be independently modified. Something like the following:

class DataLine
{
    public string Field1;
    public string Field2;
    public string Field3;

    public string OutputDataLine()
    {
        return Field1 + Field2 + Field3;
    }
}

That's a simple static example, but I'm sure that could be made more generic so that if every user defines fields differently you could handle it. After breaking your data into fields, if you still need to modify individual characters in the fields at least you're not touching the whole set of data.

Now, this may push the bottleneck to the OutputDataLine function, depending on what you're doing with the data. But that can be handled separately if necessary.
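
As a rough sketch of what a more generic version might look like, the class below keeps the fields in an array whose size is chosen at run time. The name GenericDataLine and its members SetField and OutputDataLine are hypothetical, and it assumes fields are identified by a zero-based index and output in that order.

```csharp
using System;

// Hypothetical generalization of DataLine: the number of fields is chosen at run time.
public class GenericDataLine
{
    private readonly string[] fields;

    public GenericDataLine(int fieldCount)
    {
        fields = new string[fieldCount];
        for (int i = 0; i < fields.Length; ++i)
            fields[i] = string.Empty;
    }

    // Replaces one field without touching the others.
    public void SetField(int fieldIndex, string value)
    {
        fields[fieldIndex] = value ?? string.Empty;
    }

    // Concatenates all fields in order, like OutputDataLine in the static example.
    public string OutputDataLine()
    {
        return string.Concat(fields);
    }
}
```

Setting a field only swaps one array element, so the cost of building the full line is paid only when OutputDataLine is called.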


If your string is already preformatted to the right length, the StringBuilder class has

public StringBuilder Replace(string oldValue, string newValue, int startIndex, int count)

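Set startIndex to where the text you want to replace begins and count to the length of the range to search (for example, oldValue.Length), so that only that specific instance is replaced.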
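
A small sketch of that overload in use is below. Note that in this overload count is the number of characters in the range that is searched, not the number of occurrences to replace, so the sketch passes oldValue.Length to limit the search to a single instance. The class ReplaceInRange and the method ReplaceAt are hypothetical names used only for illustration.

```csharp
using System;
using System.Text;

public static class ReplaceInRange
{
    // Hypothetical helper: replaces oldValue only within the range
    // [startIndex, startIndex + oldValue.Length), i.e. one specific instance.
    public static string ReplaceAt(string text, string oldValue, string newValue, int startIndex)
    {
        var sb = new StringBuilder(text);
        sb.Replace(oldValue, newValue, startIndex, oldValue.Length);
        return sb.ToString();
    }
}
```

For example, ReplaceInRange.ReplaceAt("aaa-aaa", "aaa", "bbb", 4) returns "aaa-bbb": the first "aaa" lies outside the searched range and is left alone.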

Another thing you could do is use String.Format(). Convert all your predefined fields into indexes so you get a string like "This {0} is very {1}", then match up the parameters to the specific indexes and call String.Format(myString, myParams);

-Raul
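
To illustrate the String.Format suggestion, here is a minimal sketch. The class FormatExample and the method Fill are hypothetical names, and it assumes the template has already been converted to {0}, {1}, ... placeholders. For fixed-width output, the alignment component (for example {0,-5}) pads a value to a minimum width, but it does not truncate longer values, so those would still need trimming beforehand.

```csharp
using System;

public static class FormatExample
{
    // Hypothetical helper: 'template' uses {0}, {1}, ... placeholders,
    // and 'values' supplies the field values in the same order.
    public static string Fill(string template, params object[] values)
    {
        return string.Format(template, values);
    }
}
```

For example, FormatExample.Fill("This {0} is very {1}", "string", "short") returns "This string is very short", and FormatExample.Fill("[{0,-5}]", "ab") returns "[ab   ]".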


As you rightly stated, StringBuilder has an Insert method but no Overwrite method.
So I created the Overwrite extension method below for my projects.
Note that it will truncate the value if the StringBuilder does not have enough room for it. You can easily modify its logic, however.

    public static void Overwrite( this StringBuilder sb, int index, string value )
    {
        int len = Math.Min( value.Length, sb.Length - index );
        sb.Remove( index, len );
        sb.Insert( index, value.Substring( 0, len ) );
    }