Which of these two SQL queries is more efficient?
I am using Entity Framework and Linq to Entities with the MySQL ADO.Net connector to access a MySQL database.
There are two tables Requests and Submissions with a one to many relationship from 开发者_开发百科Requests to Submissions. Accordingly, the Submissions table contains a RequestId column that is has a foreign key dependency on Requests.
I need to retrieve all requests where its submissions contain a certain value. In LINQ I can do it one of two ways:
var r1 = foo.Submissions.Where(s => s.FieldName == "foo" && s.FieldValue == "bar").Select(s => s.Request).Distinct();
var r2 = foo.Requests.Where(r => r.Submissions.Any(s => s.FieldName == "foo" && s.FieldValue == "bar"));
which evaluates to
SELECT `Distinct1`.*
FROM
(SELECT DISTINCT `Extent2`.*
FROM `Submissions` AS `Extent1` INNER JOIN `Requests` AS `Extent2` ON `Extent1`.`RequestId` = `Extent2`.`RequestId`
WHERE ("foo" = `Extent1`.`FieldName`) AND ("bar" = `Extent1`.`FieldValue`))
AS `Distinct1`
SELECT `Extent1`.*
FROM `Requests` AS `Extent1`
WHERE EXISTS
(SELECT 1 AS `C1`
FROM `Submissions` AS `Extent2`
WHERE (`Extent1`.`RequestId` = `Extent2`.`RequestId`) AND ((@gp1 = `Extent2`.`FieldName`) AND (@gp2 = `Extent2`.`FieldValue`)))
Now the first style of query uses an INNER JOIN...is that now less efficient than the 2nd choice?
You should be able to determine this yourself, by looking at the query plans generated for both queries in SSMS. Look specifically for any scans being done instead of seeks.
Then, you can analyze both queries in SQL Profiler to see which generates fewer overall reads, and consumes less CPU cycles.
The first approach potentially involves a sort (distinct) which suggests that the EXISTS alternative will perform better when the nr of submissions for each request is large.
What does the wall clock tell you?
精彩评论