开发者

SQL: How do I make a selection based on categories?

There are two tables, categories and books and I'd like to select all books based on the given categories.

Categories table:

cat_id | book_id
-开发者_JS百科---------------
1      | 1
2      | 1
3      | 1
3      | 2

Books table:

id  | name
----------------
1   | abc
2   | def

I've tried SELECT * FROM categories WHERE cat_id IN(1,3) but then it returns books containing at least one of the given categories. What I'd like is that it only returns the books containing ALL the categories, so it should only return all (or one) rows where book_id = 1 since it's the only book with all the given categories.


Try:

select book_id
from categories
group by book_id
having sum( ( cat_id in (1,3) )::int ) = 2

Or if you intend to pass an array to postgres from language that supports passing array directly to it(like this: http://fxjr.blogspot.com/2009/05/npgsql-tips-using-in-queries-with.html), use this:

select book_id
from categories
group by book_id
having sum( ( cat_id = ANY(ARRAY[1,3]) )::int ) = 2

If you want to get the book name:

select categories.book_id, books.name
from categories
join books on books.id = categories.book_id
group by categories.book_id
    ,books.name
having sum( ( categories.cat_id in (1,3) )::int ) = 2

@Evan Carroll, amending the query:

ANSI SQL way:

select categories.book_id, books.name
from categories
join books on books.id = categories.book_id
group by categories.book_id
    ,books.name
having count(case when categories.cat_id in (1,3) then 1 end) = 2

Sans the book name:

select book_id
from categories
group by book_id
having count( case when cat_id in (1,3) then 1 end ) = 2

What's the advantage of inlining the condition and its count value in the same clause(i.e. having) as opposed to separately putting the condition in where clause and its count in having clause?...

select book_id
from categories
where category_id in (1,3)
group by book_id
having count(*) = 2

...If we inline both the condition and its count value in having clause, we can facilitate an inquiry of let's say list all books with categories of 1 and 3, or with categories of 2 and 3 and 4. Future-proofing FTW! Plus the testing of combined categories and its count are next to each other, plus factor in terms of readability.

To facilitate that kind of query:

select book_id
from categories
group by book_id
having 
    count( case when cat_id in (1,3) then 1 end ) = 2 
    or count( case when cat_id in (2,3,4) then 1 end ) = 3

To achieve performance(sometimes, achieving both performance and readability; don't mix well), must duplicate the testing of elements of having clause to where clause:

select book_id
from categories
where cat_id in (1,2,3,4)
group by book_id
having 
    count( case when cat_id in (1,3) then 1 end ) = 2 
    or count( case when cat_id in (2,3,4) then 1 end ) = 3

[EDIT]

BTW, here's the idiomatic MySQL:

select book_id
from categories
group by book_id
having sum( cat_id in (1,3) ) = 2


You actually get more than one entry per book. If n of n categories are assigned to the book, you get n entries for the book. So you could group your query and select only those who have n hits:

SELECT T.cat_id, count(*) hits FROM
(
    SELECT * FROM categories WHERE cat_id IN(1,3)
) T
GROUP BY T.cat_id
HAVING hits = 2


Yet another alternative method:

SELECT book_id FROM categories WHERE cat_id = 1 
INTERSECT 
SELECT book_id FROM categories WHERE cat_id = 3;

You can continue to chain INTERSECTs if you have more than two categories to match.


Join against each category that you require:

SELECT books.*
FROM books
     JOIN categories cat1 ON cat1.book_id = books.book_id
     JOIN categories cat3 ON cat3.book_id = books.book_id
WHERE cat1.cat_id = 1
      AND cat3.cat_id = 3

Or you do this equivalently using WHERE EXISTS (semi join) if you don't like to add inner joins.


SELECT * FROM 
(
 SELECT b.id, count(c.cat_id) as cat_count
 FROM books AS b
 JOIN cats AS c
   ON ( b.id = c.book_id )
 GROUP BY b.id
) AS t
WHERE t.cat_count = ( SELECT DISTINCT count(cat_id) FROM cat );

This assumes one book can't be in the same category twice. This selects all books in either category, counts the categories, and makes sure the category count is the max number of categories.


Try this:

SELECT * FROM books WHERE id IN 
(SELECT book_id
FROM categories
GROUP BY book_id 
HAVING COUNT(distinct cat_id)  = (select count(distinct cat_id) from categories))

Edited: I edited the query so it returns the books containing ALL categories as stated in the question

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜