开发者

SQL: "NOT IN" subquery optimization or alternatives

I have two database tables: "places" and "translations". The translations of places开发者_C百科 names are made by selecting records from "places", which don't have the translations to the specified language yet:

SELECT `id`, `name`
FROM `places`
WHERE `id` NOT IN (SELECT `place_id` FROM `translations` WHERE `lang` = 'en')

This worked fine with 7 000 records of places, but crashed when the number of translations reached 5 000. Since then, the query takes about 10 seconds and returns the error:

2006 - MySQL server has gone away

As I understand, the main problem here is the subquery returning to many results, bu how could I solve it, if I need to select all the places which are not translated yet?

My plan B is to create a new boolean field in "places" table, called "translated", and reset it to "false", each time I change language - that would prevent for having subquery. However, maybe I could just modify my current SQL statement and prevent from adding additional field?


The obvious alternative:

SELECT
  `id`, `name`
FROM
  `places`
WHERE 
  NOT EXISTS (
    SELECT 1 FROM `translations` WHERE `id` = `places`.`id` AND `lang` = 'en'
  )

There should be a clustered composite index over (translations.id, translations.lang) (composite means: a single index over multiple fields, clustered means: the index governs how the table is sorted).


In this case, I believe that the best alternative is to do 2 separate queries. Storing the result of the first in a variable and using the second.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜