开发者

Postgres order by foreign key performance?

I have strange (?) problem with ordering in Postgres by foreign key. It's second table & query that takes much longer with order by than without.

EXPLAIN ANALYZE SELECT "spoleczniak_zdjecia"."id", "spoleczniak_zdjecia"."postac_id", "spoleczniak_zdjecia"."zdjecie", "spoleczniak_zdjecia"."opis", "spoleczniak_zdjecia"."data", "spoleczniak_zdjecia"."avatar", "spoleczniak_zdjecia"."tagi", "postac_postacie"."id", "postac_postacie"."user_id", "postac_postacie"."avatar", "postac_postacie"."ikonka", "postac_postacie"."imie", "postac_postacie"."nazwisko", "postac_postacie"."pseudonim", "postac_postacie"."plec", "postac_postacie"."wzrost", "postac_postacie"."waga", "postac_postacie"."ur_tydz", "postac_postacie"."ur_rok", "postac_postacie"."ur_miasto_id", "postac_postacie"."akt_miasto_id", "postac_postacie"."kasa", "postac_postacie"."punkty", "postac_postacie"."zmeczenie", "postac_postacie"."zdrowie", "postac_postacie"."kariera" FROM "spoleczniak_zdjecia" INNER JOIN "taggit_taggeditem" ON ("spoleczniak_zdjecia"."id" = "taggit_taggeditem"."object_id") INNER JOIN "taggit_tag" ON ("taggit_taggeditem"."tag_id" = "taggit_tag"."id") INNER JOIN "postac_postacie" ON ("spoleczniak_zdjecia"."postac_id" = "postac_postacie"."id") WHERE ("taggit_tag"."slug" = 'ja' AND "taggit_taggeditem"."content_type_id" = 922 ) ORDER BY "spoleczniak_zdjecia"."id" DESC LIMIT 28;
                                                                                QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=27.88..27.89 rows=7 width=198) (actual time=2984.689..2984.697 rows=28 loops=1)
   ->  Sort  (cost=27.88..27.89 rows=7 width=198) (actual time=2984.688..2984.692 rows=28 loops=1)
         Sort Key: spoleczniak_zdjecia.id
         Sort Method: top-N heapsort  Memory: 32kB
         ->  Nested Loop  (cost=2.31..27.78 rows=7 width=198) (actual time=1.063..2974.901 rows=9091 loops=1)
               ->  Nested Loop  (cost=2.31..22.02 rows=7 width=109) (actual time=1.057..2899.010 rows=9091 loops=1)
                     ->  Nested Loop  (cost=2.31..19.92 rows=7 width=4) (actual time=1.046..2848.853 rows=9103 loops=1)
                           ->  Index Scan using taggit_tag_slug on taggit_tag  (cost=0.00..4.27 rows=1 width=4) (actual time=0.025..0.027 rows=1 loops=1)
                                 Index Cond: ((slug)::text = 'ja'::text)
                           ->  Bitmap Heap Scan on taggit_taggeditem  (cost=2.31..15.56 rows=7 width=8) (actual time=1.019..2847.244 rows=9103 loops=1)
                                 Recheck Cond: (tag_id = taggit_tag.id)
                                 Filter: (content_type_id = 922)
                                 ->  Bitmap Index Scan on taggit_taggeditem_tag_id  (cost=0.00..2.31 rows=7 width=0) (actual time=0.954..0.954 rows=9103 loops=1)
                                       Index Cond: (tag_id = taggit_tag.id)
                     ->  Index Scan using spoleczniak_zdjecia_pkey on spoleczniak_zdjecia  (cost=0.00..0.29 rows=1 width=109) (actual time=0.005..0.005 rows=1 loops=9103)
                           Index Cond: (id = taggit_taggeditem.object_id)
               ->  Index Scan using postac_postacie_pkey on postac_postacie  (cost=0.00..0.81 rows=1 width=89) (actual time=0.007..0.007 rows=1 loops=9091)
                     Index Cond: (id = spoleczniak_zdjecia.postac_id)
 Total runtime: 2984.760 ms

And here is without order by:

EXPLAIN ANALYZE SELECT "spoleczniak_zdjecia"."id", "spoleczniak_zdjecia"."postac_id", "spoleczniak_zdjecia"."zdjecie", "spoleczniak_zdjecia"."opis", "spoleczniak_zdjecia"."data", "spoleczniak_zdjecia"."avatar", "spoleczniak_zdjecia"."tagi", "postac_postacie"."id", "postac_postacie"."user_id", "postac_postacie"."avatar", "postac_postacie"."ikonka", "postac_postacie"."imie", "postac_postacie"."nazwisko", "postac_postacie"."pseudonim", "postac_postacie"."plec", "postac_postacie"."wzrost", "postac_postacie"."waga", "postac_postacie"."ur_tydz", "postac_postacie"."ur_rok", "postac_postacie"."ur_miasto_id", "postac_postacie"."akt_miasto_id", "postac_postacie"."kasa", "postac_postacie"."punkty", "postac_postacie"."zmeczenie", "postac_postacie"."zdrowie", "postac_postacie"."kariera" FROM "spoleczniak_zdjecia" INNER JOIN "taggit_taggeditem" ON ("spoleczniak_zdjecia"."id" = "taggit_taggeditem"."object_id") INNER JOIN "taggit_tag" ON ("taggit_taggeditem"."tag_id" = "taggit_tag"."id") INNER JOIN "postac_postacie" ON ("spoleczniak_zdjecia"."postac_id" = "postac_postacie"."id") WHERE ("taggit_tag"."slug" = 'ja' AND "taggit_taggeditem"."content_type_id" = 922 ) LIMIT 28;
                                                                            QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=2.31..27.78 rows=7 width=198) (actual time=1.113..1.482 rows=28 loops=1)
   ->  Nested Loop  (cost=2.31..27.78 rows=7 width=198) (actual time=1.112..1.477 rows=28 loops=1)
 开发者_C百科        ->  Nested Loop  (cost=2.31..22.02 rows=7 width=109) (actual time=1.102..1.292 rows=28 loops=1)
               ->  Nested Loop  (cost=2.31..19.92 rows=7 width=4) (actual time=1.092..1.145 rows=28 loops=1)
                     ->  Index Scan using taggit_tag_slug on taggit_tag  (cost=0.00..4.27 rows=1 width=4) (actual time=0.017..0.017 rows=1 loops=1)
                           Index Cond: ((slug)::text = 'ja'::text)
                     ->  Bitmap Heap Scan on taggit_taggeditem  (cost=2.31..15.56 rows=7 width=8) (actual time=1.072..1.118 rows=28 loops=1)
                           Recheck Cond: (tag_id = taggit_tag.id)
                           Filter: (content_type_id = 922)
                           ->  Bitmap Index Scan on taggit_taggeditem_tag_id  (cost=0.00..2.31 rows=7 width=0) (actual time=0.989..0.989 rows=9103 loops=1)
                                 Index Cond: (tag_id = taggit_tag.id)
               ->  Index Scan using spoleczniak_zdjecia_pkey on spoleczniak_zdjecia  (cost=0.00..0.29 rows=1 width=109) (actual time=0.004..0.005 rows=1 loops=28)
                     Index Cond: (id = taggit_taggeditem.object_id)
         ->  Index Scan using postac_postacie_pkey on postac_postacie  (cost=0.00..0.81 rows=1 width=89) (actual time=0.005..0.005 rows=1 loops=28)
               Index Cond: (id = spoleczniak_zdjecia.postac_id)
 Total runtime: 1.562 ms

What can cause problem? It's query? Config? Any particular config should I check? In my last question there was more complex query, but that query is not complex at all. Any suggestions?

And btw. that query is generated by Django (django-taggit to be precise). And btw. part II, it's not poor hardware at all (i7, 16 GB of RAM, RAID 10 3x2 for OS and data + 2 RAID1 disks for WAL, 512 MB of RAID cache + BBU)

Plain text query:

SELECT "spoleczniak_zdjecia"."id", "spoleczniak_zdjecia"."postac_id", "spoleczniak_zdjecia"."zdjecie", "spoleczniak_zdjecia"."opis", "spoleczniak_zdjecia"."data", "spoleczniak_zdjecia"."avatar", "spoleczniak_zdjecia"."tagi", "postac_postacie"."id", "postac_postacie"."user_id", "postac_postacie"."avatar", "postac_postacie"."ikonka", "postac_postacie"."imie", "postac_postacie"."nazwisko", "postac_postacie"."pseudonim", "postac_postacie"."plec", "postac_postacie"."wzrost", "postac_postacie"."waga", "postac_postacie"."ur_tydz", "postac_postacie"."ur_rok", "postac_postacie"."ur_miasto_id", "postac_postacie"."akt_miasto_id", "postac_postacie"."kasa", "postac_postacie"."punkty", "postac_postacie"."zmeczenie", "postac_postacie"."zdrowie", "postac_postacie"."kariera" FROM "spoleczniak_zdjecia" INNER JOIN "taggit_taggeditem" ON ("spoleczniak_zdjecia"."id" = "taggit_taggeditem"."object_id") INNER JOIN "taggit_tag" ON ("taggit_taggeditem"."tag_id" = "taggit_tag"."id") INNER JOIN "postac_postacie" ON ("spoleczniak_zdjecia"."postac_id" = "postac_postacie"."id") WHERE ("taggit_tag"."slug" = 'ja' AND "taggit_taggeditem"."content_type_id" = 922 ) ORDER BY "spoleczniak_zdjecia"."id" DESC LIMIT 28;


The difference is right here in the second line of the EXPLAIN output:

->  Sort  (cost=27.88..27.89 rows=7 width=198) (actual time=2984.688..2984.692 rows=28 loops=1)

Notice that the "actual time" is pretty much the entire time of the query. Sorting requires not only a bunch of comparisons (i.e. the cost of sorting anything) but also extra data management, the server needs to copy some data (rows or pointers to rows) to a temporary location so that it can be sorted without disturbing anything else.

Any query will take longer with sorting unless you get lucky and your sorting matches the order on disk and optimizer can notice that they match up.


The second one returns you the first 28 records found regardless the order.

The first you has to order the results THEN returning you the 28 first records.

If the data is not modified, the query with the ORDER BY will returns the same 28 records every time.

But the second query can returns 28 differents records each time you execute it. The result is not guaranteed.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜