How do I avoid repeating this subquery for the IN clause?

2023-02-20 16:24 问答作者：

I have an SQL script (currently running against SQLite, but it should probably work against any DB engine) that uses the same subquery twice, and since it might be fetching a lot of records (the table has a couple of million rows) I'd like to only call it once.

A shortened pseudo-version of the query looks like this:

SELECT * FROM
    ([the subquery, returns a column of ids]) AS sq
[a couple of joins, that fetches things from other tables based on the ids]
WHERE thisorthat NOT IN ([the subquery again])

I tried just using the name (sq) in various ways (with/without parenthesis, with/without na开发者_如何学Goming the column of sq etc) but to no avail.

Do I really have to repeat this subquery?

Clarification: I am doing this in python and sqlite as a small demo of what can be done, but I would like my solution to scale as well as possible with as little modification as possible. In the real situation, the database will have a couple of million rows, but in my example there is just 10 rows with dummy data. Thus, code that would be well optimized on for example MySQL is absolutely good enough - it doesn't have to be optimized specifically for SQLite. But as I said, the less modification needed, the better.

There is a WITH clause in standard SQL, however, I don't know if it is supported by SQLlite - though of course worth a try:

WITH mySubQuery AS
(
  [the subquery code]
)

SELECT * FROM
    mySubQuery AS sq
    [a couple of joins, that fetches things from other tables based on the ids]
WHERE thisorthat NOT IN (mySubQuery)

That said, what you do here will likely be horribly slow for any data set that is more than a few thousand rows, so I'd try to remodel it if possible - NOT IN should be avoided in general, especially if you also have a couple of joins.

Do you need a subquery? You could probably rewrite using an OUTER JOIN e.g. something like:

SELECT * 
  FROM [the subquery's FROM clause] AS sq
       RIGHT OUTER JOIN [a couple of tables based on the ids]
          ON thisorthat = sq.[a column of ids]
 WHERE sq.[a column of ids] IS NULL;

In general, I question the need to eliminate the duplication. The SQL compiler can see that two subqueries are identical and chose to only do them once if that seems optimal.

In addition, by leaving duplicates in the source, the SQL compiler and optimizer is given the opportunity to treat them differently. For example the subquery flattening optimization of SQLite might be applied to one of a pair of duplicates or applied differently to each. See section 9.0, Subquery flattening of https://www.sqlite.org/optoverview.html.

you can put the SELECT part into a View than you can filter the View results using the alias "sq"

I hope it's helpful

继续阅读：dry in-clause sql subquery

How do I avoid repeating this subquery for the IN clause?

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？