开发者

SQL Query Theory Question

I have a large historical transaction table (15-20 million rows MANY columns) and a table with one row one column. The table with one row contains a date (la开发者_运维技巧st processing date) which will be used to pull the data in the trasaction table ('process_date').

Question: Should I inner join the 'process_date' table to the transaction table or the transaction table to the 'process_date' table?


This is how I would do it

SELECT <<list only columns you need>> 
FROM large_historical_transaction_table t
WHERE EXISTS (SELECT 1 FROM OneRowTable o 
              WHERE o.last_processing_date = t.process_date)


An Inner join is a symmetrical, bi-directional relationship, in general it doesn't matter, but in this case I would suggest not joining at all, read the threshold date into a variable and pass it to the other select query as a parameter...


For readability I would inner join from the transaction table to explicitly indicate that the second table with the date just acts like a filter.


When joining tables, the query optimizer takes a quick sniff of both to determine the most appropriate join implementation. Logically the inner join is symmetric, but the implementation may favor one side over the other for improved performance.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜