SQL select from a large number of IDs

2023-01-02 23:06

I have a table, Foo. I run a query on Foo to get the ids from a subset of Foo. I then want to run a more complicated set of queries, but only on those IDs. Is there an efficient way to do this? The best I can think of is creating a query such as:

SELECT ... --complicated stuff
WHERE ... --more stuff
  AND id IN (1, 2, 3, 9, 413, 4324, ..., 939393)

That is, I construct a huge "IN" clause. Is this efficient? Is there a more efficient way of doing this, or is the only way to JOIN with the initial query that gets the IDs? If it helps, I'm using SQLObject to connect to a PostgreSQL database, and I have access to the cursor that executed the query to get all the IDs.

UPDATE: I should mention that the more complicated queries all either rely on these IDs, or create more IDs to look up in the other queries. If I were to make one large query, I'd end up joining six tables at once or so, which might be too slow.
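For reference, the huge IN clause described above can at least be built with bound parameters instead of pasting the IDs into the SQL string. This is only a minimal sketch: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL so it runs anywhere, and the table layout, make_demo_db and fetch_by_ids are made-up names for illustration. A PostgreSQL driver such as psycopg2 would use %s placeholders rather than ?.

```python
import sqlite3


def make_demo_db():
    """Build a throwaway in-memory table foo(id, name); purely illustrative."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE foo (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO foo VALUES (?, ?)",
        [(i, f"row{i}") for i in range(1, 11)],
    )
    return conn


def fetch_by_ids(conn, ids):
    """Return (id, name) rows of foo whose id is in `ids`, using bound parameters."""
    ids = list(ids)
    if not ids:
        # PostgreSQL rejects an empty "IN ()" list, so short-circuit here.
        return []
    # One placeholder per ID; the values themselves never enter the SQL text.
    placeholders = ", ".join("?" for _ in ids)
    sql = f"SELECT id, name FROM foo WHERE id IN ({placeholders}) ORDER BY id"
    return conn.execute(sql, ids).fetchall()
```

This keeps the query safe from quoting problems, but it does not make a very long list any cheaper to send and parse, which is what the answers below try to avoid.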

One technique I've used in the past is to put the IDs into a temp table, and then use that to drive a sequence of queries. Something like:

BEGIN;
CREATE TEMP TABLE search_result ON COMMIT DROP AS
  SELECT entity_id
  FROM entity /* long complicated search joins and conditions ... */;
-- Fetch primary entities
SELECT entity_id, entity.x /*, ... */
FROM entity JOIN search_result USING (entity_id);
-- Fetch some related entities
SELECT entity_id, related_entity_id, related_entity.x /*, ... */
FROM related_entity JOIN search_result USING (entity_id);
-- And more, as required
END;

This is particularly useful where the search result entities have multiple one-to-many relationships which you want to fetch without either a) doing N*M+1 selects or b) doing a Cartesian join of related entities.
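Since the asker is working from Python, here is a rough sketch of driving the same temp-table pattern from a cursor. It assumes Python's built-in sqlite3 module in place of PostgreSQL so that it is runnable as-is. SQLite has no ON COMMIT DROP, so the temp table is dropped by hand. The sample data, the search condition and the function names are invented for the example.

```python
import sqlite3


def make_demo_db():
    """Create illustrative entity / related_entity tables in memory."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE entity (entity_id INTEGER PRIMARY KEY, x TEXT);
        CREATE TABLE related_entity (
            related_entity_id INTEGER PRIMARY KEY,
            entity_id INTEGER REFERENCES entity(entity_id),
            x TEXT
        );
        INSERT INTO entity VALUES (1, 'a'), (2, 'b'), (3, 'c');
        INSERT INTO related_entity VALUES (10, 1, 'r1'), (11, 1, 'r2'), (12, 3, 'r3');
    """)
    return conn


def search_and_fetch(conn):
    """Run the search once, then reuse its IDs for two follow-up queries."""
    cur = conn.cursor()
    # Stand-in for the "long complicated search"; it runs exactly once.
    cur.execute("""
        CREATE TEMP TABLE search_result AS
        SELECT entity_id FROM entity WHERE x IN ('a', 'b')
    """)
    # Fetch primary entities.
    entities = cur.execute("""
        SELECT entity_id, entity.x
        FROM entity JOIN search_result USING (entity_id)
        ORDER BY entity_id
    """).fetchall()
    # Fetch related entities, driven by the same ID set.
    related = cur.execute("""
        SELECT entity_id, related_entity_id, related_entity.x
        FROM related_entity JOIN search_result USING (entity_id)
        ORDER BY related_entity_id
    """).fetchall()
    # Manual cleanup, because SQLite lacks ON COMMIT DROP.
    cur.execute("DROP TABLE search_result")
    return entities, related
```

Each follow-up query is a plain join against the small ID table, so the IDs never have to travel back to the client and into an IN list.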

I would think it might be useful to use a VIEW. Simply create a view with your query for the IDs, then join to that view on ID. That will limit your results to the required subset of IDs without an expensive IN clause.
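A small sketch of the view idea, again using Python's sqlite3 module as a runnable stand-in for PostgreSQL. The table foo, its columns, the view name selected_ids and the filter are all invented for illustration.

```python
import sqlite3


def make_demo_db():
    """Create an illustrative table plus a view that holds the ID query."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE foo (id INTEGER PRIMARY KEY, category TEXT, amount INTEGER);
        INSERT INTO foo VALUES (1, 'keep', 10), (2, 'skip', 20), (3, 'keep', 30);
        -- The view stores the ID-selecting query, not its results.
        CREATE VIEW selected_ids AS
            SELECT id FROM foo WHERE category = 'keep';
    """)
    return conn


def fetch_via_view(conn):
    """Join against the view instead of listing the IDs in an IN clause."""
    return conn.execute("""
        SELECT foo.id, foo.amount
        FROM foo JOIN selected_ids ON selected_ids.id = foo.id
        ORDER BY foo.id
    """).fetchall()
```

One thing to keep in mind: an ordinary view is not materialized, so the ID query is re-evaluated in every statement that uses it. If that query is expensive, a temp table may be the better fit.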

I do know that an IN clause can be more expensive than an equivalent EXISTS clause.
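To make the comparison concrete, here is the same filter written both ways. This is a sketch on Python's sqlite3 module with invented tables foo and bar. It only shows that the two forms return the same rows and says nothing about which one is faster on a given PostgreSQL version.

```python
import sqlite3


def make_demo_db():
    """Create two illustrative tables: foo(id, flag) and bar(id, foo_id, val)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE foo (id INTEGER PRIMARY KEY, flag INTEGER);
        CREATE TABLE bar (id INTEGER PRIMARY KEY, foo_id INTEGER, val TEXT);
        INSERT INTO foo VALUES (1, 1), (2, 0), (3, 1);
        INSERT INTO bar VALUES (10, 1, 'x'), (11, 2, 'y'), (12, 3, 'z'), (13, 1, 'w');
    """)
    return conn


# Filter bar by a set of foo IDs using IN with a subquery.
IN_QUERY = """
    SELECT id, val FROM bar
    WHERE foo_id IN (SELECT id FROM foo WHERE flag = 1)
    ORDER BY id
"""

# The same filter as a correlated EXISTS.
EXISTS_QUERY = """
    SELECT id, val FROM bar
    WHERE EXISTS (
        SELECT 1 FROM foo WHERE foo.id = bar.foo_id AND foo.flag = 1
    )
    ORDER BY id
"""


def run(conn, query):
    """Execute a query and return all rows as a list of tuples."""
    return conn.execute(query).fetchall()
```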

I think a join with the criteria that select the IDs will be more efficient, because the query optimizer has more options to do the right thing. Use EXPLAIN to see how PostgreSQL will approach it.
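Checking the plan from Python could look roughly like this. The sketch assumes the built-in sqlite3 module, whose spelling is EXPLAIN QUERY PLAN. On PostgreSQL you would prefix the query with EXPLAIN (or EXPLAIN ANALYZE) instead, and the output format is different. The tables, the index and show_plan are hypothetical.

```python
import sqlite3


def make_demo_db():
    """Create two empty illustrative tables and an index on the join column."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE foo (id INTEGER PRIMARY KEY, flag INTEGER);
        CREATE TABLE bar (id INTEGER PRIMARY KEY, foo_id INTEGER, val TEXT);
        CREATE INDEX bar_foo_id ON bar (foo_id);
    """)
    return conn


# The join form of the query whose plan we want to look at.
JOIN_QUERY = """
    SELECT bar.id, bar.val
    FROM bar JOIN foo ON foo.id = bar.foo_id
    WHERE foo.flag = 1
"""


def show_plan(conn, query):
    """Return the human-readable plan steps for `query` (SQLite syntax)."""
    # The last column of each EXPLAIN QUERY PLAN row is the step description.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + query)]
```

Running show_plan on both the join form and the IN form, against realistic data, is the only reliable way to tell which one the planner handles better.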

You are almost certainly better off with a join. However, another option is to use a subselect, i.e.:

SELECT ... --complicated stuff
WHERE ... --more stuff
  AND id IN (SELECT DISTINCT id FROM Foo WHERE ...)
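One detail about the subselect form: IN only tests membership, so duplicate values in the subquery do not multiply the outer rows, whereas a plain join would. Here is a small sketch of that difference, using Python's sqlite3 module as a stand-in and invented tables items and hits.

```python
import sqlite3


def make_demo_db():
    """items holds the rows we want; hits lists item IDs, with duplicates."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE hits (item_id INTEGER);
        INSERT INTO items VALUES (1, 'a'), (2, 'b'), (3, 'c');
        INSERT INTO hits VALUES (1), (1), (3);
    """)
    return conn


# Subselect without DISTINCT.
IN_PLAIN = """
    SELECT id, name FROM items
    WHERE id IN (SELECT item_id FROM hits)
    ORDER BY id
"""

# Subselect with DISTINCT, as in the answer above.
IN_DISTINCT = """
    SELECT id, name FROM items
    WHERE id IN (SELECT DISTINCT item_id FROM hits)
    ORDER BY id
"""

# Plain inner join: each duplicate hit produces another output row.
PLAIN_JOIN = """
    SELECT items.id, items.name
    FROM items JOIN hits ON hits.item_id = items.id
    ORDER BY items.id
"""


def run(conn, query):
    """Execute a query and return all rows as a list of tuples."""
    return conn.execute(query).fetchall()
```

So if you switch from the subselect to a join, make sure the ID source is unique per ID (or deduplicate it first), otherwise the result set changes.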
