String concatenation optimisation
We're currently using LINQ to generate SQL queries, with a bit of magic inside to handle case-specific queries.
Up until now, it's worked fine; very fast, hardly any issues. We've recently run into efficiency issues when querying a large amount of data from the database.
We construct the query as such:
var someIntList = new List<int> { 1,2,3,4,5 };
var query = dtx.Query.Containers.Where(c => c.ContainerID.IsIn(someIntList));
or
var someStringList = new List<int> {"a", "b", "c" };
query = dtx.Query.Containers.Where(c => c.BuildingName.IsIn(someStringList));
Which would generate (along with a bunch of other stuff which isn't related to this):
SELECT * FROM Container WHERE ContainerID IN (1,2,3,4,5)
and
SELECT * FROM Container WHERE BuildingName IN ('a','b','c')
Now in this particular situation, we need to return 50,000 rows .. which is generated through 5 seperate queries, splitting up the load. The DB returns fairly quickly (within seconds), however generating the query takes a long time.
Here's the very last function which is called to generate this particular query:
private static string GetSafeValueForItem(object item)
{
if (item == null)
return "NULL";
if (item is bool)
return ((bool)item ? "1" : "0");
if (item is string)
return string.Format("'{0}'", item.ToString().Replace("'", "''"));
if (item is IEnumerable)
return ListToDBList((IEnumerable)item);
if (item is DateTime)
return string.Format("'{0}'", ((DateTime)item).ToString("yyyy-MM-dd HH:mm:ss"));
return item.ToString();
}
private stat开发者_StackOverflowic string ListToDBList(IEnumerable list)
{
var str = list.Cast<object>().Aggregate("(", (current, item) => current + string.Format("{0},", GetSafeValueForItem(item)));
str = str.Trim(',');
str += ")";
return str;
}
Are there any obvious improvements which can be made to speed up the string concatenation in this case? Refactoring the code and using a different implementation (such as avoiding the query generating and hitting the database directly) is not preferred, but if it offered a big performance boost would be great to hear.
Your Aggregate code is basically string concatenation in a loop. Don't do that.
Options:
- Use
StringBuilder
- Use string.Join
Here's an example using String.Join that outputs the same as your ListToDBList:
String.Format("({0})", String.Join(",", list.Cast<object>().Select(item=>GetSafeValueForItem(item)).ToArray()));
See here for an explanation why concatenating in a loop using + (which is what your call to Aggregate was doing) is slow: http://www.yoda.arachsys.com/csharp/stringbuilder.html
I haven't made test cases and profiled your code, so I don't know how much improvement you can expect.
Use a StringBuilder instead of String.Format and the += operator. The += operator is known to be slow. I suspect String.Format is going to be somewhat slow, too.
You could also try string.Join instead of manually joining the array. It works on IEnumerable in newer versions of the .NET framework (4.0?).
Not sure why you're doing list.Cast when a plain IEnumerable will be of objects anyway. But your whole ListToDBList can be replaced by
string.Format("({0})", string.Join(",",list.ToArray()));
Not sure how much quicker it would be, but it's clearer to my mind.
精彩评论