开发者

How should I approach the database logic behind a "forget this" feature?

So, let's say I have a users table and a pages table. Let's say that I want to allow users to hide/forget/ignore certain pages. At this point, I can think of two possible approaches:

An outer join with null matching:

I can create a separate ignored_pages table with columns user_id and page_id, writing to it INSERT INTO ignored_pages (user_id, page_id) VALUES (1,2); when the user with ID of 1 ignores a page with ID of 2.

Then I can run something like SELECT pages.* FROM pages LEFT OUTER JOIN ignored_pages ON pages.id = ignored_pages.page_id WHERE ignored_pages.user_id = 1 AND ignored_pages.id IS NULL;

EDIT: Joe Stefanelli pointed on an error in my query. It should be SELECT pages.* FROM pages LEFT OUTER JOIN ignored_pages ON pages.id = ignored_pages.page_id AND ignored_pages.user_id = 1 WHERE ignored_pages.id IS NULL;

A subquery and NOT IN:

I can use the same many-to-many table and then run something like SELECT pages.* FROM pages WHERE pag开发者_如何学Ce_id NOT IN (SELECT page_id FROM ignored_pages WHERE user_id = 1);.


Is there a best practice or series of rules of thumb or (likely) a better approach to this problem than the one I'm taking?


Your best performance may actually be a NOT EXISTS

SELECT p.* 
    FROM pages p
    WHERE NOT EXISTS(SELECT 1 
                         FROM ignored_pages 
                         WHERE user_id = 1 
                             AND page_id = p.id);

If you decide to stick with your LEFT JOIN option, you'll need to correct that query to test user_id on the join condition rather than the WHERE clause.

SELECT pages.* 
    FROM pages 
        LEFT OUTER JOIN ignored_pages 
            ON pages.id = ignored_pages.page_id 
                AND ignored_pages.user_id = 1 
    WHERE ignored_pages.id IS NULL;


For best performance you should use NOT EXISTS

SELECT pages.* 
FROM pages
WHERE NOT EXISTS(
   SELECT NULL
   FROM ignored_pages
   WHERE user_id = 1 AND ignored_pages.page_id = pages.page_id)


Also check the link http://explainextended.com/2009/09/16/not-in-vs-not-exists-vs-left-join-is-null-postgresql/


This page has a good comparison of using LEFT OUTER JOIN vs NOT EXISTS. A related link on that page shows that NOT EXISTS is either comparable, or faster than NOT IN, at least for the example case on that blog. The first link shows that NOT EXISTS performs almost two times better (cpu cycles, and execution time) than LEFT OUTER JOIN, so long as you have an index on all of the columns being joined/matched on.

The index for ignored_pages would likely look something like this:

CREATE UNIQUE CLUSTERED INDEX IX_Ignored_Pages ON ignored_pages (user_id, page_id);

Adapting to your code, the NOT EXISTS syntax would look something like this:

SELECT p.*
FROM pages p
WHERE NOT EXISTS (
   SELECT 1
   FROM ignored_pages i
   WHERE i.user_id = @user_id
   AND i.page_id = p.page_id
);
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜