SQL: "NOT IN" subquery optimization or alternatives
I have two database tables: "places" and "translations". The translations of places开发者_C百科 names are made by selecting records from "places", which don't have the translations to the specified language yet:
SELECT `id`, `name`
FROM `places`
WHERE `id` NOT IN (SELECT `place_id` FROM `translations` WHERE `lang` = 'en')
This worked fine with 7 000 records of places, but crashed when the number of translations reached 5 000. Since then, the query takes about 10 seconds and returns the error:
2006 - MySQL server has gone away
As I understand, the main problem here is the subquery returning to many results, bu how could I solve it, if I need to select all the places which are not translated yet?
My plan B is to create a new boolean field in "places" table, called "translated", and reset it to "false", each time I change language - that would prevent for having subquery. However, maybe I could just modify my current SQL statement and prevent from adding additional field?
The obvious alternative:
SELECT
`id`, `name`
FROM
`places`
WHERE
NOT EXISTS (
SELECT 1 FROM `translations` WHERE `id` = `places`.`id` AND `lang` = 'en'
)
There should be a clustered composite index over (translations.id, translations.lang)
(composite means: a single index over multiple fields, clustered means: the index governs how the table is sorted).
In this case, I believe that the best alternative is to do 2 separate queries. Storing the result of the first in a variable and using the second.
精彩评论