
Optimize the PostgreSQL query

I have a table like the one below.

id    session_id    start_time             answer_time
1     111           2022-12-06 13:40:50    2022-12-06 13:40:55
2     111           2022-12-06 13:40:51    Null
3     111           2022-12-06 13:40:57    Null
4     222           2022-12-06 13:40:58    Null
5     222           2022-12-06 13:41:10    Null
6     222           2022-12-06 13:41:10    Null
7     333           2022-12-06 13:46:10    2022-12-06 13:46:15
8     333           2022-12-06 13:46:18    2022-12-06 13:46:20

There are three sessions in the table, with session IDs 111, 222, and 333. Each session has multiple records, all sharing the same session_id. Whether a session is successful depends on whether answer_time is Null in the record with the smallest id for that session.

In the sample table above, the records with id 1, id 4, and id 7 determine whether each session is successful or unsuccessful.

I use the SQL below to query this, and it works correctly.

WITH t AS
(
    SELECT DISTINCT ON (session_id) start_time, answer_time
    FROM logs
    WHERE ((SELECT NOW() AT TIME ZONE 'UTC') - start_time < interval '24 HOURS')
    ORDER BY logs.session_id, id
)
SELECT
       COUNT(*) FILTER (WHERE (answer_time IS NOT NULL)) AS success_count,
       COUNT(*) FILTER (WHERE (answer_time IS NULL)) AS fail_count
FROM t;

But if the table has about 50M records, the query takes 20 seconds, which is unacceptable in our production environment. How can I optimize it? My goal is under 1 second for 50M records.
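For reference, a common starting point for this query shape (a sketch, not verified against your actual schema or data distribution) is a composite index that matches the `DISTINCT ON` ordering, an index on `start_time`, and a rewrite of the time filter so it can use that index. The index names below are made up for illustration.

```sql
-- Assumption: the table is named "logs" as in the query above.
-- Matches DISTINCT ON (session_id) ... ORDER BY session_id, id, so the
-- planner can pick the first row per session from the index; INCLUDE
-- keeps answer_time available without a heap visit.
CREATE INDEX IF NOT EXISTS logs_session_id_id_idx
    ON logs (session_id, id) INCLUDE (answer_time);

-- Supports filtering by start_time.
CREATE INDEX IF NOT EXISTS logs_start_time_idx
    ON logs (start_time);

-- The original predicate "NOW() - start_time < interval '24 HOURS'" applies
-- arithmetic to the column, so it cannot use an index on start_time.
-- This equivalent form compares the bare column against a constant:
SELECT DISTINCT ON (session_id) start_time, answer_time
FROM logs
WHERE start_time > (NOW() AT TIME ZONE 'UTC') - interval '24 HOURS'
ORDER BY session_id, id;
```

Running the rewritten query under `EXPLAIN (ANALYZE, BUFFERS)` before and after adding the indexes will show whether the planner actually uses them on your data.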

