performance of LINQ queries against the SQL equivalent

2023-03-17 13:43 问答作者：

开发者_如何转开发I’m currently having a debate with someone at work regarding the performance of LINQ queries against the SQL equivalent.

Has anyone done/seen any scientific testing on this?

If not, anecdotal evidence of where you’ve had to replace LINQ with a SQL query for performance reasons will help me make my case.

I have a slightly different take on this; when profiling (with our shiny profiler) we noticed that LINQ (to SQL in this case) was doing a reasonable job generating TSQL for basic queries, and the operation was running very fast at the DB server (0.5ms etc) - however, the actual query operation was taking MUCH longer (like 20ms+ for the same 0.5ms query, in some cases). So where was the time? You might think "query translation", but no; we also have a lot of ExecuteQuery<T> code (i.e. where you write the TSQL by hand) and this was doing exactly the same thing. It turned out that somewhere between the materializer and the identity map vast amounts of time was being lost.

So; we wrote our own materializer that was pretty-much a drop-in replacement for ExecuteQuery - and thus was born dapper.

On more of the LINQ side, it generally does OK at generating TSQL for simple queries, but for anything complex I usually trust hand-coded TSQL a lot more. To take a case as a sample, I had a complex query involving groups, skips and takes. It didn't perform well. When I wrote it by hand with ROW_NUMBER() etc the same results took 4% of the "stats IO" and total time.

My current opinion on LINQ is that the ORM tools make data mutation a breeze, but for query I tend to use dapper. Which is ironic since the Q in LINQ is "query".

LINQ as in LINQ2SQL or EF needs to have a generalized ruleset on how to convert the LINQ queries to SQL and with this introduces alevel of abstraction. This abstraction will sometimes result in statements that are less-than-optimal. Writing your SQL statements by hand allows you to tweak it for your specific case.
This leads to the conclusion that the speed of SQL is more likely to be faster, especially in complex scenarios. But this doesn't mean that every LINQ query is slower than its SQL equivalent.
My experience with EF 1 in combination with Oracle supports this.

One of the more obvious examples where SQL query is better is update/delete batching. There is no facility built into LINQ-To-SQL to handle multiple updates or deletes, except to process the records one by one, which generates individual queries for each record.

var myRecords = myContext.Books.Where(b => b.Author = "Bob");
foreach (var rec in myRecords)
    myContext.Books.DeleteOnSubmit(rec);
myContext.SubmitChanges()  // generates delete statement for each record.

In the general case, LINQ is optimized to generate very efficient queries, and can sometimes generate even better queries than the intuitive way to write it with SQL. However, there will always be exceptions where the LINQ-to-SQL statements will not be as efficient as a SQL query that can be written by hand.

LinqToSql is a layer on top of ADO.Net. So if you used ADO.Net with Sql queries you'll be "faster" than LinqToSql (by not doing those extra things that LinqToSql does).

This same argument (when performance is the only criteria) can be made against .Net and for Machine code. .Net IL is ultimately JIT'ed to Machine and if you just wrote Machine in the first place, you'd be faster.

Here's a couple scenarios where it matters:

If your queries are low IO with large amounts of data returned for materialization, LinqToSql will take much more time than a raw sql. On the other hand, you end up with Customer instances instead of DataRows.
If your queries are highly dynamic, such that you never issue the same sql text twice, you'll always incur query translation costs. Sql queries will be faster (esp if you don't count the construction of the sql text).

Here's a couple scenarios where it just doesn't matter:

If your queries are database-IO heavy (report style), you won't notice the extra client-side CPU cost of LinqToSql's abstraction.
If your queries are repetative (same query with different parameters), much of the cost of LinqToSql's abstraction can be mitigated by using CompiledQuery. <-- I think most systems fit this scenario.

Please check other question: LINQ-to-SQL vs stored procedures?

In my opinion, linq is good enough to use in almost every aspect. For some rare cases when you are very good at sql and really want to improve performance then go for raw sql instead.

When you run a LINQ query, LINQ generates T-SQL code to perform your query. You can run the SQL Server Profiler to get an idea of what the query looks like. I've done it in the past and the T-SQL generated looks pretty clean.

I can't speak directly of performance as I've not run any numbers, but one thing I have seen is that unless you use a rowversion column in your table for the LINQ queries, you'll end up with every column in the table being included in the WHERE clause, which is done that way as part of LINQ's built-in optimistic concurrency checking. Using the rowversion column allows LINQ to use only that column for this checking and makes for cleaner and more performant T-SQL.

Eearlier this year I submitted a bug report against the performance of Linq2Sql when inserting many records in a single transaction set. Each individual insert took progressively longer and longer to process because Linq2Sql had to check the relationships -- which the DB server was going to check anyway.

You can see the bug report at https://connect.microsoft.com/VisualStudio/feedback/details/637841/linq-to-sql-shows-progressively-worse-performance-as-the-transaction-set-grows

Click on show details and I included some performance numbers.

Interestingly enough, Microsoft's answer was that they weren't planning on improving Linq2Sql performance and that I should use ADO.Net which is what I had already decided to do.

My bottom line at this point is that I love Linq2Sql for queries but would not use it for inserts or updates. I also like the fact that Linq2Sql will populate the records from my foreign keys for me without coding separate queries.

As for complicated queries - I'm about 50% using Linq2Sql and about 50% stored procedures. I never do inline SQL because I prefer using methods that are strongly typed. So for simple queries, the simplicity of Linq2Sql beats ADO.Net 100% of the time.

Generally, in my experience the problem with EF4.1 or LINQtoSQL isn't LINQ, it's the coder.

Developers generally don't know how to code efficient queries or how to code efficient and scalable database access (i.e. they know VB or C# well, but not SQL) and this shows in how they construct a query. At that point, it doesn't matter if they're using LINQ or not.

Where this really shows up is when applications work fine in development (dev database has a few 100,000 rows of data) but perform badly in production (millions of rows of data). Developers immediately point out that it worked fine in development, so it must be something wrong with the database and not their code. This is how wars are started between developers and DBA's.

While I really like LINQ, you're often better off moving queries into Stored Procs so that you can tune and tweak your data access without having to do a full Dev-Test-QA-Release cycle on your application for every minor update.

继续阅读：.net linq performance sql

performance of LINQ queries against the SQL equivalent

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？