Optimizing LINQ Any() call in Entity Framework
After profiling my Entity Framework 4.0 based database layer, I have found the major performance sinner to be a simple LINQ Any() I use to check whether an entity already exists in the database. The Any() check performs orders of magnitude slower than saving the entity. There are relatively few rows in the database and the columns being checked are indexed.
I use the following LINQ to check for the existence of a setting group:
(from sg in context.SettingGroups
 where sg.Group.Equals(settingGroup) && sg.Category.Equals(settingCategory)
 select sg).Any()
This generates the following SQL (additionally my SQL profiler claims the query is executed twice):
exec sp_executesql N'SELECT
CASE WHEN ( EXISTS (SELECT
1 AS [C1]
FROM [dbo].[SettingGroups] AS [Extent1]
WHERE ([Extent1].[Group] = @p__linq__0) AND ([Extent1].[Category] = @p__linq__1)
)) THEN cast(1 as bit) WHEN ( NOT EXISTS (SELECT
1 AS [C1]
FROM [dbo].[SettingGroups] AS [Extent2]
WHERE ([Extent2].[Group] = @p__linq__0) AND ([Extent2].[Category] = @p__linq__1)
)) THEN cast(0 as bit) END AS [C1]
FROM ( SELECT 1 AS X ) AS [SingleRowTable1]',N'@p__linq__0 nvarchar(4000),@p__linq__1 nvarchar(4000)',@p__linq__0=N'Cleanup',@p__linq__1=N'Mediator'
Right now I can only think of creating stored procedures to solve this problem, but I would of course prefer to keep the code in LINQ.
Is there a way to make such an "Exist" check run faster with EF?
I should probably mention that I also use self-tracking entities in an n-tier architecture. In some scenarios the ChangeTracker state of an entity is set to "Added" even though it already exists in the database, which is why I check for existence first and correct the ChangeTracker state, rather than letting the update fail with an insert exception.
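Roughly, the pattern looks like this (a simplified sketch; MarkAsAdded/MarkAsModified and ApplyChanges are the helpers the self-tracking-entities template generates, and incomingGroup is just a placeholder name for the entity received from the client tier):

// Sketch only: assumes the default EF4 self-tracking-entities template,
// which generates MarkAsAdded()/MarkAsModified() and ApplyChanges() helpers.
bool exists = (from sg in context.SettingGroups
               where sg.Group.Equals(settingGroup) && sg.Category.Equals(settingCategory)
               select sg).Any();

if (exists)
    incomingGroup.MarkAsModified();   // arrived marked "Added" but is already in the database
else
    incomingGroup.MarkAsAdded();

context.SettingGroups.ApplyChanges(incomingGroup);
context.SaveChanges();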
Try adding an index to the database table "SettingGroups" on Group & Category.
BTW, does this produce similar SQL?
var ok = context.SettingGroups.Any(sg => sg.Group==settingGroup && sg.Category==settingCategory);
The problem is Entity Framework (at least EF4) is generating stupid SQL. The following code seems to generate decent SQL with minimal pain.
using System;
using System.Linq;
using System.Linq.Expressions;

public static class LinqExt
{
    // Projects matching rows to a nullable constant and takes the first one;
    // EF translates FirstOrDefault() to a simple TOP(1) query instead of the
    // double-EXISTS CASE expression that Any() produces.
    public static bool BetterAny<T>(this IQueryable<T> queryable, Expression<Func<T, bool>> predicate)
    {
        return queryable.Where(predicate).Select(x => (int?)1).FirstOrDefault().HasValue;
    }

    public static bool BetterAny<T>(this IQueryable<T> queryable)
    {
        return queryable.Select(x => (int?)1).FirstOrDefault().HasValue;
    }
}
Then you can do:
(from sg in context.SettingGroups
where sg.Group.Equals(settingGroup) && sg.Category.Equals(settingCategory)
select sg).BetterAny()
or even:
context.SettingGroups.BetterAny(sg => sg.Group.Equals(settingGroup) && sg.Category.Equals(settingCategory));
I know it sounds like a miserable solution, but what happens if you use Count() instead of Any()?
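Something like this (just a sketch; note that COUNT has to tally every matching row rather than stop at the first one, so it is not obviously cheaper than an existence test):

// Count-based existence check: EF translates this to a SELECT COUNT(...) query.
bool exists = context.SettingGroups
    .Count(sg => sg.Group == settingGroup && sg.Category == settingCategory) > 0;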
Have you profiled the time to execute the generated SELECT statement against the time to execute the SELECT you would expect/like to be produced? It is possible that it is not as bad as it looks.
The section
SELECT
1 AS [C1]
FROM [dbo].[SettingGroups] AS [Extent1]
WHERE ([Extent1].[Group] = @p__linq__0) AND ([Extent1].[Category] = @p__linq__1)
is probably close to what you would expect to be produced. It is quite possible that the query optimiser will realise the second query is the same as the first and hence it may add very little time to the overall query.
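If you want to measure it, one rough way is to time the LINQ call against a hand-written EXISTS sent through ExecuteStoreQuery (a sketch, assuming context is an EF4 ObjectContext, as it is with self-tracking entities, and reusing the table/column names from the generated SQL above):

// Requires: using System; using System.Diagnostics; using System.Linq;
var sw = Stopwatch.StartNew();
bool viaLinq = context.SettingGroups
    .Any(sg => sg.Group == settingGroup && sg.Category == settingCategory);
sw.Stop();
Console.WriteLine("LINQ Any(): {0} ms", sw.ElapsedMilliseconds);

sw.Restart();
// {0}/{1} are positional parameters that ExecuteStoreQuery turns into DbParameters.
bool viaSql = context.ExecuteStoreQuery<int>(
    "SELECT CASE WHEN EXISTS (SELECT 1 FROM [dbo].[SettingGroups] " +
    "WHERE [Group] = {0} AND [Category] = {1}) THEN 1 ELSE 0 END",
    settingGroup, settingCategory).Single() == 1;
sw.Stop();
Console.WriteLine("Raw EXISTS: {0} ms", sw.ElapsedMilliseconds);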