Querying multiple SQL tables for one type of objects

2023-03-17 22:48 问答作者：

Given the following SQL table (MySQL for instance):

CREATE TABLE `table` (
  `id` int(11) unsigned NOT NULL,
  `lang` tinyint(3) unsigned NOT NULL,
  `data` text NOT NULL,
  PRIMARY KEY (`id`,`lang`)
) ENGINE=InnoDB

This table stores some data about objects with IDs id and data can be written in several languages = lang. Typical use case for this table is: we need to get data for some object with some id and where language is lang = 1 or at least lang = 5 or any other language if no rows with lang = 1 or 5 for id = 1 is found.

In other words, I want to get information about object with id = 1 on English or at least on Deutsch, but if not - any other language will be sufficient.

This is a rather simple query:

SELECT * FROM `table` WHERE `id` = 1
ORDER BY
  CASE WHEN `lang` = 1 THEN 1
       WHE开发者_JS百科N `lang` = 2 THEN 2
       ELSE 3
  END ASC
LIMIT 1

This query is quite fast and use only PRIMARY key with in-memory sorting.

The questions appears when we'd like to get such data for several objects with one query. The only one I can think of is something like:

SELECT id, (SUBQUERY TO GET DATA AS ABOVE WHERE id = tmp.id LIMIT 1) AS data
FROM (SUBQUERY TO SELECT ids) as tmp

That query will do it's job but it looks and feels ugly :/

And this is a first question: is a good and right way to do such things? Do anybody knows a better way to resolve such problems?

Now let's think about highload and really big tables of data. For example, let's assume that we gave 1,000,000 of objects with 5 to 15 languages for each. This is really huge table for MySQL so we will split one table for several more (say 20 tables across several servers). Now we have some simple hash-function (like id % N == 0) to know where the data for particular object is stored.

So, question #2: *How to make such requests across several tables (even in one database, from table_1 to table_5) if we already know where the data is?* I guess that this question can be answered only after first one :(

Several other questions about this topic: maybe the whole situation is wrong? Should we store such data some other way? Or maybe there are some other, more efficient ways to do it?

You suggest

SELECT id, (SUBQUERY TO GET DATA AS ABOVE WHERE id = tmp.id LIMIT 1) AS data
FROM (SUBQUERY TO SELECT ids) as tmp

A small-ish change, but surely it's a bit better to join to a resultset telling you the best language for each desired row.

SELECT
    `table`.`id`,
    `table`.`lang`,
    `table`.`data`
FROM
    `table`
    JOIN (
        SELECT
            `id`,
            MAX(`evaluatelanguage`(`lang`)) AS `bestscore`
        FROM
            `table` AS `sqtable`
        GROUP BY
            `id`
    ) AS `subquery` ON
        `table`.`id` = `subquery`.`id` AND
        `evaluatelanguage`(`table`.`lang`) = `subquery`.`bestscore`

NB. This version requires you to give a separate rank to every language, because otherwise you will see multiple rows where an id falls into the ELSE clause in your CASE. I think this query can be improved (surely we need not use evaluatelanguage so many times?) but I'm unsure how best to go about it.

继续阅读：crosstab sql

Querying multiple SQL tables for one type of objects

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？