开发者

Writing set based SQL query without set operations (EXCEPT, INTERSECT)

I have three tables: UserObjects, UserObjectsRelations, UserClasses which model M:N relationship between UserObjects and UserClasses .

Now I need to select this:

All(UserObjects) - Intersect(UserObjectRelations -> UserObjects).Where(UserObjectRelation -> UserClassId IN (some list))

It means I have l开发者_运维技巧ist of UserClassIds which I use to filter UserClasses (or relations directly) and I need to find all UserObjects which are not assigned to all those UserClasses.

Example: Suppose that I have UserObjectRelations filtered by UserClassId IN (1,2):

UserClassId |  UserObjectId 
--------------------------
     1      |        1
     2      |        1
     2      |        2

I also have many other user objects. Result of my query should be all UserObjects not mentioned in this result set + UserObject with Id = 2 because it is not related to all requested UserClasses.

The problem is that SQL query is generated by Entity Framework (we don't have full control over generated SQL) so our initial approach with INTERSECT failed - with many UserClasses it creates too complex query and sometimes SQL Server raises error because of deep nesting.

It creates query like this (but much huge because EF doesn't use * notation and it really likes a lot of nested SELECTs):

SELECT Unsused.* 
FROM dbo.UserObjects AS Unsused
WHERE Unsused.IsDeleted = 0
EXCEPT (
    SELECT U.* 
    FROM dbo.UserObjects AS U
    INNER JOIN dbo.UserObjectRelations AS UR ON UR.UserObjectId = U.Id
    WHERE UR.UserClassId = 1
    INTERSECT (
        SELECT U.* 
        FROM dbo.UserObjects AS U
        INNER JOIN dbo.UserObjectRelations AS UR ON UR.UserObjectId = U.Id
        WHERE UR.UserClassId = 2    
    ))

I'm now rewriting the query - first in SQL, later I will try to define it in Linq-To-Entities. I come up with this:

SELECT Unused.*
FROM dbo.UserObjects AS Unused
LEFT JOIN (
    SELECT UsageReport.Id
    FROM (
        SELECT Tmp.Id, COUNT(*) AS Usage
        FROM dbo.UserObjects AS Tmp 
        INNER JOIN dbo.UserObjectRelations AS DefiningRelations ON
            Tmp.Id = DefiningRelations.UserObjectId
        WHERE DefiningRelations.UserClassId IN (1, 2)
        GROUP BY Tmp.Id) AS UsageReport
    WHERE UsageReport.Usage = 2
) AS Used ON Used.Id = Unused.Id
WHERE Unused.IsDeleted = 0 AND Used.Id IS NULL

Query probably doesn't look very good but I'm already trying to avoid constructs which I don't know how to translate to Linq-To-Entities.

I'm still not happy with the query. I don't like this part: WHERE UsageReport.Usage = 2 which filters inner select to only user objects which are used by both user classes. This parameter must be dynamic and always represent number of Ids passed into IN clause.

Do you have idea how to write such query in better way?


Another one that also uses COUNT():

SELECT u.*
FROM UserObjects
  LEFT JOIN (
    SELECT UserObjectId
    FROM UserObjectRelations
    WHERE UserClassId IN (1, 2)
    GROUP BY UserObjectId
    HAVING COUNT(DISTINCT UserClassId) = 2
  ) r ON u.Id = r.UserObjectId
WHERE r.UserObjectId IS NULL
  AND u.IsDeleted = 0

I use COUNT(DISTINCT) here, but if it is certain that duplicates are impossible there, then COUNT(*) would probably be better.

However, if you really feel strongly against using COUNT like that, I would recommend you reconsider the INTERSECT approach, only not the way you've shown it used.

Here's how I would use it:

SELECT u.*
FROM UserObjects
  LEFT JOIN (
    SELECT UserObjectId FROM UserObjectRelations WHERE UserClassId = 1
    INTERSECT
    SELECT UserObjectId FROM UserObjectRelations WHERE UserClassId = 2
  ) r ON u.Id = r.UserObjectId
WHERE r.UserObjectId IS NULL
  AND u.IsDeleted = 0

As you can see, no COUNT here, and it doesn't look too heavy. I believe you can have however many classes included that way, and you don't need to use parentheses there.


Does this work? It still uses the count of the list, though. I'm not sure if there's a good way around that without a stored procedure...

SELECT o.* FROM UserObjects o
           LEFT JOIN UserObjectsRelations r ON o.id = r.UserObjectId
WHERE r.UserClassId IN (1,2) OR r.UserClassId IS NULL
GROUP BY o.id HAVING COUNT(o.id) < 2

Update: Sorry, wasn't thinking properly before. Not sure if this is the best way to do it, but you do get rid of the number of Ids in the IN clause (and I did it with MySQL, so sorry if it's not kosher in TSQL). Here's what I came up with:

SELECT o.* FROM UserObjects o, 
                (SELECT o.id oid, c.id cid FROM UserObjects o, UserClasses c
                 WHERE c.id IN (1,2)
                ) sub
           LEFT JOIN UserObjectsRelations r ON sub.oid = r.UserObjectId AND
                                               sub.cid = r.UserClassId
WHERE o.id = sub.oid AND r.UserClassId IS NULL
GROUP BY o.id
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜