How to join two MySQL tables grouped depending on a max() value
While implementing an inline search function for our local file archive, I ran into a serious problem that I have found no answer for. We have two tables:
fild_id | file_name
---------------------
1       | this_file
2       | that_file
3       | new_file

file_archive_id | file_archive_version | file_id
--------------------------------------------------
1               | 1                    | 1
2               | 2                    | 1
3               | 1                    | 2
4               | 1                    | 3
5               | 3                    | 1
I want to join both tables via file_id, selecting only the one file_archive row with the biggest file_archive_version:
fild_id | file_name | file_archive_id | file_archive_version
--------------------------------------------------------------
1       | this_file | 5               | 3
2       | that_file | 3               | 1
3       | new_file  | 4               | 1
Is it possible to do this with a single SELECT statement?
Solution (without altering the indexes on my tables):
SELECT
  df.*,
  (
    SELECT dfa.file_archive_id
    FROM dca_file_archive dfa
    WHERE df.file_id = dfa.file_id
    ORDER BY dfa.file_archive_version DESC
    LIMIT 1
  ) AS file_archive_id,
  (
    SELECT dfa.file_archive_version
    FROM dca_file_archive dfa
    WHERE df.file_id = dfa.file_id
    ORDER BY dfa.file_archive_version DESC
    LIMIT 1
  ) AS file_archive_version
FROM dca_file df
With both tables at ~16k rows, this statement takes 0.9 seconds to run, which is 120x faster than the first join solution.
I know this is not the finest you can do with SQL.
Try this (I named your tables table1 and table2):
SELECT
t1.fild_id,
t1.file_name,
t2A.file_archive_id,
t2A.file_archive_version
FROM
table1 t1
JOIN
table2 t2A ON (t1.fild_id = t2A.file_id)
WHERE
NOT EXISTS (
SELECT
*
FROM
table2 t2B
WHERE
t2A.file_id = t2B.file_id
AND
t2B.file_archive_version > t2A.file_archive_version -- compare versions, since the question asks for the biggest file_archive_version
)
ORDER BY t1.fild_id
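To make the NOT EXISTS pattern easier to check, here is a minimal sketch that runs it against the sample data from the question. It uses Python's built-in sqlite3 module rather than MySQL, which is an assumption made only so the example is self-contained; the query text itself is plain SQL. Archive row 6 is hypothetical (not in the question) and is there to show why the inner comparison is on file_archive_version: comparing file_archive_id only gives the right answer when ids happen to grow together with versions.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL tables (assumption).
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE table1 (fild_id INTEGER PRIMARY KEY, file_name TEXT);
    CREATE TABLE table2 (
        file_archive_id INTEGER PRIMARY KEY,
        file_archive_version INTEGER,
        file_id INTEGER
    );
    INSERT INTO table1 VALUES (1, 'this_file'), (2, 'that_file'), (3, 'new_file');
    -- Rows 1-5 are the question's sample data; row 6 is hypothetical.
    INSERT INTO table2 VALUES
        (1, 1, 1), (2, 2, 1), (3, 1, 2), (4, 1, 3), (5, 3, 1), (6, 2, 1);
""")

# Keep an archive row only if no other row for the same file has a higher version.
newest_by_version = db.execute("""
    SELECT t1.fild_id, t1.file_name, t2A.file_archive_id, t2A.file_archive_version
    FROM table1 t1
    JOIN table2 t2A ON t1.fild_id = t2A.file_id
    WHERE NOT EXISTS (
        SELECT 1
        FROM table2 t2B
        WHERE t2B.file_id = t2A.file_id
          AND t2B.file_archive_version > t2A.file_archive_version
    )
    ORDER BY t1.fild_id
""").fetchall()

for row in newest_by_version:
    print(row)
```

With the hypothetical row 6 present, file 1 still resolves to archive 5 (version 3), whereas a comparison on file_archive_id would pick archive 6 (version 2).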
Try this one:
SELECT f.*, a1.file_archive_id, a1.file_archive_version FROM files f
JOIN file_archives a1
ON f.file_id = a1.file_id
JOIN (
SELECT file_id, MAX(file_archive_version) max_file_archive_version FROM file_archives GROUP BY file_id
) a2
ON a1.file_id = a2.file_id AND a1.file_archive_version = a2.max_file_archive_version;
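As a quick check, the derived-table join above can be run against the question's sample data. The sketch below again assumes Python's built-in sqlite3 module in place of MySQL, purely to keep it self-contained, and uses the table names files and file_archives from this answer.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL tables (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE files (file_id INTEGER PRIMARY KEY, file_name TEXT);
    CREATE TABLE file_archives (
        file_archive_id INTEGER PRIMARY KEY,
        file_archive_version INTEGER,
        file_id INTEGER
    );
    INSERT INTO files VALUES (1, 'this_file'), (2, 'that_file'), (3, 'new_file');
    INSERT INTO file_archives VALUES
        (1, 1, 1), (2, 2, 1), (3, 1, 2), (4, 1, 3), (5, 3, 1);
""")

# a2 holds the highest version per file; a1 is joined back to fetch the full row.
latest_rows = conn.execute("""
    SELECT f.file_id, f.file_name, a1.file_archive_id, a1.file_archive_version
    FROM files f
    JOIN file_archives a1 ON f.file_id = a1.file_id
    JOIN (
        SELECT file_id, MAX(file_archive_version) AS max_file_archive_version
        FROM file_archives
        GROUP BY file_id
    ) a2 ON a1.file_id = a2.file_id
        AND a1.file_archive_version = a2.max_file_archive_version
    ORDER BY f.file_id
""").fetchall()

for row in latest_rows:
    print(row)
```

The rows returned match the result table in the question. Note that if two archive rows of the same file shared the highest version, this join would return both of them.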
With t1 as the first table and t2 as the second table:
SELECT t1.file_id AS tx_id, t1.file_name, tx.file_archive_id, tx.file_archive_version
FROM maindb.t1 t1, maindb.t2 tx
WHERE t1.file_id = tx.file_id
-- keep only the archive row whose version is the maximum for that file
AND tx.file_archive_version = (
SELECT MAX(t2.file_archive_version)
FROM maindb.t2 t2
WHERE t2.file_id = t1.file_id
)
Hope it helps.