
What is the performance impact of LinqToSql compared to plain ADO.Net for a very simple SQL Server database?

I am developing a number-crunching application in C#. It is going to use a database with a very simple table structure (though with many records) and no concurrent clients (perhaps a few threads of one), but it will do many thousands of sequential gets/inserts/updates. Should I use plain ADO.Net querying or LinqToSQL?

It isn't obvious to me - maybe LinqToSql is better because of buffering (plus readability), or maybe it is worse because of its overhead.

UPDATE 1:

I've got a very simple model here - 3 tables of about 10 fields each. No foreign keys (normalization is sacrificed to the KISS principle). I could use Excel tables for this, but I prefer coding C#+SQL rather than VBA and believe SQL Server is faster (and allows more records).

UPDATE 2:

I don't really need any ORM; simple SQL querying would be enough. I am considering LinqToSql for 3 reasons: 1. it allows visual, diagram-first database generation; 2. LINQ looks better than having queries as string literals; 3. logically it seems that it may (or may not) increase performance by bulk update/insert commits, cached reads and lazy loading.

UPDATE 3:

I've got 4 GB of RAM and don't mind the application using all of it while processing data.


Any time you are doing that amount of sequential data access, a few obvious questions (that have nothing to do with LINQ-to-SQL vs raw ADO.NET) are:

  • could you pull lists rather than individual gets?
  • could you do the entire thing at the database? (see the sketch just below)
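For the second point, a minimal sketch of pushing the work to the database (the table, column and variable names are hypothetical; DataContext.ExecuteCommand is just one way to issue the statement):

  // One set-based UPDATE at the server instead of thousands of per-row round trips.
  // "factor" and "batchId" are assumed local variables; {0}/{1} become SQL parameters.
  myDC.ExecuteCommand(
      "update dbo.Results set Score = Score * {0} where BatchId = {1}",
      factor, batchId);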

LINQ-to-SQL adds a lot of convenience and static checking, and generally the overheads are pretty minimal; however, if you load lots (thousands) of records into a single data-context, the identity manager has more work to do, which may impact performance slightly. We also sometimes noticed unexplained pauses in LINQ-to-SQL load times (i.e. the TSQL would take 1ms and the object would take 80ms to materialize) - this was sporadic and hard to reproduce, but for bulk work we ended up writing dapper as a drop-in replacement, allowing convenient materialization in a LINQ-to-SQL-style way without that overhead - but also without the ability to mutate it and push changes down ;p Of course, you could try Attach etc.
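For illustration, a minimal sketch of that dapper-style bulk read (the Customer type, connection string and SQL text are assumptions, not taken from the answer above):

  using System.Collections.Generic;
  using System.Data.SqlClient;
  using System.Linq;
  using Dapper;   // micro ORM, available on NuGet

  public class Customer          // hypothetical POCO matching the table's columns
  {
      public int Id { get; set; }
      public string Name { get; set; }
  }

  public static class DapperCustomerReader
  {
      // dapper materializes reader rows straight into objects, with no DataContext
      // and no identity-map bookkeeping.
      public static List<Customer> ReadCustomersAfter(string connectionString, int lastId)
      {
          using (var connection = new SqlConnection(connectionString))
          {
              connection.Open();
              return connection
                  .Query<Customer>(
                      "select Id, Name from dbo.Customers where Id > @Id",
                      new { Id = lastId })
                  .ToList();
          }
      }
  }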


Sam Saffron created a micro ORM called dapper. The project web site has some benchmarks you might be interested in; they compare a raw DataReader with various ORM implementations, including LinqToSql.


In general, given your description, you will do fine with Linq-to-Sql.

A word of warning - you write in your question:

logically it seems that it may (or may not) increase performance by bulk update/insert commits, cached reads and lazy loading.

Linq certainly will not increase your performance in the case of bulk updates/inserts. On the contrary, bulk updates with stored procedures or bulk inserts with SqlBulkCopy are an order of magnitude faster than Linq-to-Sql.
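As a rough sketch of the SqlBulkCopy route (the destination table, batch size and the idea of staging rows in a DataTable are assumptions for illustration):

  using System.Data;
  using System.Data.SqlClient;

  public static class BulkInsert
  {
      // Streams all rows of a DataTable into the target table in one operation,
      // instead of issuing one INSERT statement per row.
      public static void WriteResults(string connectionString, DataTable results)
      {
          using (var bulkCopy = new SqlBulkCopy(connectionString))
          {
              bulkCopy.DestinationTableName = "dbo.Results";
              bulkCopy.BatchSize = 5000;          // rows per round trip
              bulkCopy.WriteToServer(results);
          }
      }
  }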


Do not worry about the overhead of any technology unless you are working on a highly performance-sensitive project or you encounter measurable performance issues.

Premature optimization is the root of all evil. Have a look at http://c2.com/cgi/wiki?PrematureOptimization

In response to the comments: my original answer was mainly targeted at the last part of the question, in which the OP states concerns regarding the overhead of a higher-level technology (LINQ queries) over a lower-level technology (raw queries). Of course you should not ignore performance altogether. But as the OP describes a rather simple type of application, it should not make a great difference in terms of performance. In those cases I would always choose the higher-level technology because of readability, flexibility and maintainability.


to access it but do many thousands of sequential gets/inserts/updates.

If this is what you're doing, you should use ADO.Net.

There are five ways to query a single table's data with LinqToSql. Here they are, roughly from slowest to fastest:

Sequential gets

This is slow because an expression tree must be translated into SQL and a database roundtrip must occur for each row.

N expression trees and dynamic method generations. N database roundtrips.

from c in myDC.Customers where c.Id == myId select c;

Sequential CompiledQuery

1 expression tree and dynamic method generation. N database roundtrips.

Func<MyDataContext, int, IQueryable<Customer>> compiledQuery =
    CompiledQuery.Compile((MyDataContext dc, int i) =>
        from c in dc.Customers where c.Id == i select c);
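For illustration (idsToLoad is a hypothetical IEnumerable<int>), the compiled query is then reused, so the expression-to-SQL translation happens only once while the N roundtrips remain:

  // one translation up front, N cheap invocations
  foreach (int id in idsToLoad)
  {
      Customer customer = compiledQuery(myDC, id).SingleOrDefault();
      // ...process customer...
  }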

Ranged gets

N/PageSize expression trees and dynamic method generations. N/PageSize database roundtrips.

(from c in myDC.Customers where c.Id > myId orderby c.Id select c).Take(pageSize)

Option 2

// note: SQL Server has an upper limit of ~2100 parameters per query
from c in myDC.Customers where myIds.Contains(c.Id) select c;
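A minimal sketch of staying under that limit (assuming myIds is a List<int>; the batch size is an arbitrary illustrative value):

  // query in batches so each Contains() translates to well under ~2100 parameters
  const int batchSize = 2000;
  var customers = new List<Customer>();
  for (int offset = 0; offset < myIds.Count; offset += batchSize)
  {
      var batch = myIds.Skip(offset).Take(batchSize).ToList();
      customers.AddRange(from c in myDC.Customers where batch.Contains(c.Id) select c);
  }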

Compiled Ranged gets

1 expression tree and dynamic method generation. N/PageSize database roundtrips.

Func<MyDataContext, int, int, IQueryable<Customer>> compiledQuery =
    CompiledQuery.Compile((MyDataContext dc, int i, int pageSize) =>
        (from c in dc.Customers where c.Id > i orderby c.Id select c).Take(pageSize));

Option 2

// note: the .Contains method is not supported by compiled queries, because the parameter count varies.
// Instead, the expression has to be built from the ground up - this is hard, but not impossible (see the sketch below).
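As a rough illustration only (this is not a compiled query, just one way to construct the filter expression by hand; Customer and its Id property are assumed to match the examples above):

  using System;
  using System.Collections.Generic;
  using System.Linq.Expressions;

  public static class CustomerFilters
  {
      // Builds "c => c.Id == id1 || c.Id == id2 || ..." manually, so the shape of
      // the expression is fixed before the provider ever sees it.
      public static Expression<Func<Customer, bool>> IdFilter(IEnumerable<int> ids)
      {
          ParameterExpression c = Expression.Parameter(typeof(Customer), "c");
          Expression body = null;
          foreach (int id in ids)
          {
              Expression test = Expression.Equal(
                  Expression.Property(c, "Id"),
                  Expression.Constant(id));
              body = body == null ? test : Expression.OrElse(body, test);
          }
          // an empty id list yields "c => false"
          return Expression.Lambda<Func<Customer, bool>>(
              body ?? Expression.Constant(false), c);
      }
  }

  // usage: myDC.Customers.Where(CustomerFilters.IdFilter(myIds))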

Table Dump

1 expression tree and dynamic method generation. 1 database roundtrip. Downside - the table might be too big to hold in memory, or the query might run past the timeout.

from c in myDC.Customers select c;

With ADO.Net, you don't perform an Expression->Sql translation, and the method that turns DataReader rows into objects is written by you rather than dynamically generated. You should see performance at the CompiledQuery level by default.
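For comparison, a minimal sketch of that hand-written ADO.Net path (the connection string, SQL text and Customer type are assumptions matching the examples above):

  using System.Collections.Generic;
  using System.Data.SqlClient;

  public static class AdoCustomerReader
  {
      // No expression tree and no dynamic method generation: the SQL is a plain
      // string and the row-to-object mapping is ordinary hand-written code.
      public static List<Customer> ReadCustomersAfter(string connectionString, int lastId)
      {
          var customers = new List<Customer>();
          using (var connection = new SqlConnection(connectionString))
          using (var command = new SqlCommand(
              "select Id, Name from dbo.Customers where Id > @Id", connection))
          {
              command.Parameters.AddWithValue("@Id", lastId);
              connection.Open();
              using (SqlDataReader reader = command.ExecuteReader())
              {
                  while (reader.Read())
                  {
                      customers.Add(new Customer
                      {
                          Id = reader.GetInt32(0),
                          Name = reader.GetString(1)
                      });
                  }
              }
          }
          return customers;
      }
  }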
