How to optimize this "where IN" and "where NOT IN" query?
The following query/queries get the cities the user has visited, get the places where the user has visited; and returns the places in those cities where the user hasn't been.
// I get the city_id and object_id. Each vote has the place_id and its city_id.
SELECT DISTINCT city_id as city_id, object_id as object_id
FROM vote
WHERE object_model = 'Place'
AND user_id = 20
ORDER BY 开发者_高级运维created_at desc
// I build an array with city_ids and another with object_ids
$city_ids = array(...);
$place_ids = array(...);
I get the places where the user hasn't been in the cities he has been - 1 second
SELECT id, title
FROM place
WHERE city_id IN ($city_ids)
AND id NOT IN ($place_ids)
ORDER BY points desc
LIMIT 0,20
EXPLAIN SQL
select_type table type possible_keys key key_len ref ows Extra
-----------------------------------------------------------------------------------------------------------
SIMPLE p range PRIMARY,city_id_index city_id_index 9 NULL 33583 Using where; Using filesort
Another attempt to optimize is to do it one query using LEFT JOIN / IS NULL and a subquery, but it takes much longer (30+ seconds)
SELECT id, title
FROM place AS p
LEFT JOIN vote v ON v.object_id = p.id
AND v.object_model = 'Place'
AND v.user_id = 20
WHERE p.city_id IN (SELECT city_id
FROM vote
WHERE user_id = 20
AND city_id != 0)
AND v.id is null
ORDER BY p.points desc
LIMIT 0, 20
How would you do the query/queries thinking that we can have an array of 500 cities and 1000 places for each user? Which is the best alternative to where in and where NOT IN when there are many ids?
I am no MySQL expert, but I the query doesn't look too complex. Instead of focusing on the query, I would look at the indexes. Maybe the following indexes will help:
CREATE INDEX vote_index1 ON vote (user_id, city_id)
CREATE INDEX vote_index2 ON vote (object_id, object_model, user_id)
Do not use IN
operator, just try to solve with joining all the tables necesary. The IN
can be accomplished with normal join I believe, and the NOT IN
you accomplish by e.g.:
select *
from a left join b using (field)
where b.field is NULL
this way you get all records from table a where there is no corresponding record in table b.
When using mysql, you have to remember it is extremely dumb when handling IN() subqueries (or anything else really). So you should rewrite your second attempt as :
SELECT id, title
FROM
(SELECT DISTINCT city_id FROM vote WHERE user_id = 20) v
JOIN places p USING (city_id)
LEFT JOIN vote v2 ON (v2.object_id = p.id AND v2.user_id = 20)
WHERE v2.id IS NULL
ORDER BY p.points desc
LIMIT 0, 20
Note that "city_id != 0" is useless, since there is a foreign key from votes to cities, so vote.city_id cannot be 0. It could be NULL though.
Also, the database design is probably wrong : cities should have their own table, the "table name + id" columns are a bad idea, etc.
If you want to query for 2 attributes you need to join 2 tables and not only 1 table. Also I want to know what is object_id?
SELECT id, title
FROM place AS p
LEFT JOIN vote v ON v.object_id = p.id
AND v.object_model = 'Place'
AND v.user_id = 20
LEFT JOIN place AS P1 on V.city_id = P1.city_id
WHERE v.id is null
ORDER BY p.points desc
LIMIT 0, 20
精彩评论