Writing set based SQL query without set operations (EXCEPT, INTERSECT)
I have three tables: UserObjects
, UserObjectsRelations
, UserClasses
which model M:N relationship between UserObjects and UserClasses .
Now I need to select this:
All(UserObjects) - Intersect(UserObjectRelations -> UserObjects).Where(UserObjectRelation -> UserClassId IN (some list))
It means I have l开发者_运维技巧ist of UserClassIds which I use to filter UserClasses (or relations directly) and I need to find all UserObjects which are not assigned to all those UserClasses.
Example: Suppose that I have UserObjectRelations filtered by UserClassId IN (1,2):
UserClassId | UserObjectId
--------------------------
1 | 1
2 | 1
2 | 2
I also have many other user objects. Result of my query should be all UserObjects not mentioned in this result set + UserObject with Id = 2 because it is not related to all requested UserClasses.
The problem is that SQL query is generated by Entity Framework (we don't have full control over generated SQL) so our initial approach with INTERSECT
failed - with many UserClasses it creates too complex query and sometimes SQL Server raises error because of deep nesting.
It creates query like this (but much huge because EF doesn't use * notation and it really likes a lot of nested SELECTs):
SELECT Unsused.*
FROM dbo.UserObjects AS Unsused
WHERE Unsused.IsDeleted = 0
EXCEPT (
SELECT U.*
FROM dbo.UserObjects AS U
INNER JOIN dbo.UserObjectRelations AS UR ON UR.UserObjectId = U.Id
WHERE UR.UserClassId = 1
INTERSECT (
SELECT U.*
FROM dbo.UserObjects AS U
INNER JOIN dbo.UserObjectRelations AS UR ON UR.UserObjectId = U.Id
WHERE UR.UserClassId = 2
))
I'm now rewriting the query - first in SQL, later I will try to define it in Linq-To-Entities. I come up with this:
SELECT Unused.*
FROM dbo.UserObjects AS Unused
LEFT JOIN (
SELECT UsageReport.Id
FROM (
SELECT Tmp.Id, COUNT(*) AS Usage
FROM dbo.UserObjects AS Tmp
INNER JOIN dbo.UserObjectRelations AS DefiningRelations ON
Tmp.Id = DefiningRelations.UserObjectId
WHERE DefiningRelations.UserClassId IN (1, 2)
GROUP BY Tmp.Id) AS UsageReport
WHERE UsageReport.Usage = 2
) AS Used ON Used.Id = Unused.Id
WHERE Unused.IsDeleted = 0 AND Used.Id IS NULL
Query probably doesn't look very good but I'm already trying to avoid constructs which I don't know how to translate to Linq-To-Entities.
I'm still not happy with the query. I don't like this part: WHERE UsageReport.Usage = 2
which filters inner select to only user objects which are used by both user classes. This parameter must be dynamic and always represent number of Ids passed into IN clause.
Do you have idea how to write such query in better way?
Another one that also uses COUNT():
SELECT u.*
FROM UserObjects
LEFT JOIN (
SELECT UserObjectId
FROM UserObjectRelations
WHERE UserClassId IN (1, 2)
GROUP BY UserObjectId
HAVING COUNT(DISTINCT UserClassId) = 2
) r ON u.Id = r.UserObjectId
WHERE r.UserObjectId IS NULL
AND u.IsDeleted = 0
I use COUNT(DISTINCT) here, but if it is certain that duplicates are impossible there, then COUNT(*) would probably be better.
However, if you really feel strongly against using COUNT like that, I would recommend you reconsider the INTERSECT approach, only not the way you've shown it used.
Here's how I would use it:
SELECT u.*
FROM UserObjects
LEFT JOIN (
SELECT UserObjectId FROM UserObjectRelations WHERE UserClassId = 1
INTERSECT
SELECT UserObjectId FROM UserObjectRelations WHERE UserClassId = 2
) r ON u.Id = r.UserObjectId
WHERE r.UserObjectId IS NULL
AND u.IsDeleted = 0
As you can see, no COUNT here, and it doesn't look too heavy. I believe you can have however many classes included that way, and you don't need to use parentheses there.
Does this work? It still uses the count of the list, though. I'm not sure if there's a good way around that without a stored procedure...
SELECT o.* FROM UserObjects o
LEFT JOIN UserObjectsRelations r ON o.id = r.UserObjectId
WHERE r.UserClassId IN (1,2) OR r.UserClassId IS NULL
GROUP BY o.id HAVING COUNT(o.id) < 2
Update: Sorry, wasn't thinking properly before. Not sure if this is the best way to do it, but you do get rid of the number of Ids in the IN clause (and I did it with MySQL, so sorry if it's not kosher in TSQL). Here's what I came up with:
SELECT o.* FROM UserObjects o,
(SELECT o.id oid, c.id cid FROM UserObjects o, UserClasses c
WHERE c.id IN (1,2)
) sub
LEFT JOIN UserObjectsRelations r ON sub.oid = r.UserObjectId AND
sub.cid = r.UserClassId
WHERE o.id = sub.oid AND r.UserClassId IS NULL
GROUP BY o.id
精彩评论