Efficient way to do unions and intersections in MySQL
I have a MySQL table with two columns: name and label. If a person, "Bob", has the labels "cool", "funny", and "childish", my table would have the corresponding rows: (Bob, cool), (Bob, funny), and (Bob, childish).
Is there an efficient way to select people based on labels with a boolean query? For example, in pseudo-SQL: SELECT name WHERE person IS (COOL OR NOT FUNNY) AND NOT CHILDISH.
I think I could hack something together using UNION, JOIN, maybe some sub-queries, but I was wondering if there was an efficient way to do this.
EDIT:
As of now, I am planning to distribute AND, i.e. ((COOL OR NOT FUNNY) AND NOT CHILDISH) => (COOL AND NOT CHILDISH) OR (NOT FUNNY AND NOT CHILDISH). And then I can determine each of the parts that are OR'd together with something like:
SELECT DISTINCT a.name
FROM `tags` AS a
JOIN `tags` AS b ON (a.label='cool' AND a.name=b.name AND b.name NOT IN (
SELECT name FROM `tags` WHERE label='funny'))
JOIN `tags` AS c ON (a.name=c.name AND c.label='childish')
# for "COOL AND NOT FUNNY AND CHILDISH"
And then use UNION to join them together.
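The distribution plan above can be sketched end to end as follows. This is only an illustration under stated assumptions: the sample rows are made up, and Python's built-in sqlite3 module with an in-memory database stands in for MySQL so that the example is self-contained. The SQL itself uses only NOT IN and UNION, which MySQL also accepts.

```python
import sqlite3

# Hypothetical sample data; only the `tags` table name comes from the question.
ROWS = [
    ("Bob", "cool"), ("Bob", "funny"), ("Bob", "childish"),
    ("Alice", "cool"), ("Alice", "funny"),
    ("Carol", "quiet"),
    ("Dave", "funny"),
    ("Eve", "cool"), ("Eve", "childish"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tags (name TEXT NOT NULL, label TEXT NOT NULL)")
conn.executemany("INSERT INTO tags VALUES (?, ?)", ROWS)

# (COOL AND NOT CHILDISH) OR (NOT FUNNY AND NOT CHILDISH):
# one SELECT per AND-group, combined with UNION (which also removes duplicates).
QUERY = """
SELECT a.name FROM tags AS a
WHERE a.label = 'cool'
  AND a.name NOT IN (SELECT name FROM tags WHERE label = 'childish')
UNION
SELECT a.name FROM tags AS a
WHERE a.name NOT IN (SELECT name FROM tags WHERE label = 'funny')
  AND a.name NOT IN (SELECT name FROM tags WHERE label = 'childish')
"""

# Sort in Python so the result order does not depend on the database engine.
distributed = sorted(row[0] for row in conn.execute(QUERY))
print(distributed)
```

With this sample data, Alice qualifies through the first branch (cool, not childish) and Carol through the second (neither funny nor childish).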
For the negative checks, the most efficient way would be to use MINUS (MySQL itself has no MINUS keyword; the equivalent EXCEPT is available in MySQL 8.0.31 and later) as follows:
SELECT NAME
FROM NAME_LABEL
WHERE LABEL IN ('COOL') -- use IN for easy matching of multiple labels
UNION
SELECT NAME
FROM NAME_LABEL NL
WHERE NOT EXISTS (SELECT * FROM NAME_LABEL WHERE NAME = NL.NAME AND LABEL IN ('FUNNY'))
MINUS
SELECT NAME
FROM NAME_LABEL
WHERE LABEL IN ('CHILDISH');
The MINUS keyword returns the distinct rows from the first query that do not appear in the second query.
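A minimal runnable sketch of the set-operation query above, assuming made-up sample rows and using Python's sqlite3 module purely as a self-contained stand-in for a real server. SQLite does not accept MINUS, so the sketch uses EXCEPT, which has the same meaning. The query relies on UNION and EXCEPT being applied left to right, so it reads as (COOL UNION NOT FUNNY) EXCEPT CHILDISH.

```python
import sqlite3

# Hypothetical sample data; the NAME_LABEL table name follows the answer.
ROWS = [
    ("Bob", "cool"), ("Bob", "funny"), ("Bob", "childish"),
    ("Alice", "cool"), ("Alice", "funny"),
    ("Carol", "quiet"),
    ("Dave", "funny"),
    ("Eve", "cool"), ("Eve", "childish"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE name_label (name TEXT NOT NULL, label TEXT NOT NULL)")
conn.executemany("INSERT INTO name_label VALUES (?, ?)", ROWS)

# (COOL OR NOT FUNNY) AND NOT CHILDISH, expressed with set operations.
QUERY = """
SELECT name FROM name_label WHERE label IN ('cool')
UNION
SELECT name FROM name_label AS nl
WHERE NOT EXISTS (
    SELECT 1 FROM name_label AS inner_nl
    WHERE inner_nl.name = nl.name AND inner_nl.label IN ('funny')
)
EXCEPT
SELECT name FROM name_label WHERE label IN ('childish')
"""

# Sort in Python so the result order does not depend on the database engine.
set_ops = sorted(row[0] for row in conn.execute(QUERY))
print(set_ops)
```

Here the union yields Alice, Bob, Carol, and Eve, and the EXCEPT step then removes Bob and Eve because both carry the childish label.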
Performance would be better with an index on LABEL:
CREATE INDEX NAME_LABEL_LABEL ON NAME_LABEL(LABEL);
Unfortunately, the "NOT FUNNY" check requires a NOT EXISTS subquery. If you use a join instead, the MySQL query optimizer turns it into a subselect anyway :(
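If the subquery is a concern, one alternative (a different technique from the set-operation approach, not something the query above does) is conditional aggregation: group by name and evaluate the whole boolean expression in HAVING, where each MAX(CASE ...) acts as a per-person "has this label" flag. The sketch below assumes made-up sample rows and uses Python's sqlite3 module only to stay self-contained. The SQL uses plain GROUP BY, HAVING, and CASE, which MySQL also supports. Whether it is actually faster depends on the data and the optimizer, so it would need to be measured.

```python
import sqlite3

# Hypothetical sample data; the NAME_LABEL table name follows the answer.
ROWS = [
    ("Bob", "cool"), ("Bob", "funny"), ("Bob", "childish"),
    ("Alice", "cool"), ("Alice", "funny"),
    ("Carol", "quiet"),
    ("Dave", "funny"),
    ("Eve", "cool"), ("Eve", "childish"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE name_label (name TEXT NOT NULL, label TEXT NOT NULL)")
conn.executemany("INSERT INTO name_label VALUES (?, ?)", ROWS)

# Each MAX(CASE ...) is 1 if the person has the label and 0 otherwise,
# so the HAVING clause mirrors (COOL OR NOT FUNNY) AND NOT CHILDISH directly.
QUERY = """
SELECT name
FROM name_label
GROUP BY name
HAVING (MAX(CASE WHEN label = 'cool' THEN 1 ELSE 0 END) = 1
        OR MAX(CASE WHEN label = 'funny' THEN 1 ELSE 0 END) = 0)
   AND MAX(CASE WHEN label = 'childish' THEN 1 ELSE 0 END) = 0
"""

# Sort in Python so the result order does not depend on the database engine.
aggregated = sorted(row[0] for row in conn.execute(QUERY))
print(aggregated)
```

The design trade-off is a single pass over the table with no distribution of AND over OR, at the cost of grouping every row rather than using the label index to narrow the search first.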