开发者

How can I optimize this confusingly slow query in MySQL?

I have a table of blog posts, each with a foreign key back to it's author. There are < 15,000 entries in this table. This query scans over 19,000 rows (per EXPLAIN), requires a filesort (that might be regular MySQL behavior), and takes over 400ms to return 5 rows. possibly because of the complicated WHERE used to check if the item is actually published.

Dearest Stack Overflow, how can I wrangle this query under control?

Note: while this criteria might be up for simplification, all of the conditions are required.

SELECT      `blog_post.id`, 
            `blog_post.title`,
            `blog_post.author_id`,
            `blog_post.has_been_fact_checked`,
            `blog_post.published_date`,
            `blog_post.ordering`,
            `auth_user.username`,
            `auth_user.email`
FROM        `blog_post` 
INNER JOIN  `auth_user` 
ON          (`blog_post`.`author_id` = `auth_user`.`id`) 
WHERE       (`blog_post`.`is_approved` = True  AND 
             `blog_post`.`has_been_fact_checked` = True  AND 
             `blog_post`.`published_date` IS NOT NULL AND 
             `blog_post`.`published_date` <= '2010-10-25 22:40:05' ) 
ORDER BY    `blog_post`.`published_date` DESC, 
            `blog_post`.`ordering` ASC, 
            `blog_post`.`id` DESC 
LIMIT 5

Aside from the PKs, I have the following indexes on the table:

idx_published_blog_post -> blog_post(is_approved, has_been_fact_checked, published_date)
idx_pub_date -> blog_post(published_date)

The output from EXPLAIN looks like this:

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: blog_post
         type: ref
possible_keys: blog_post_author_id,idx_published_blog_post,idx_pub_date
          key: idx_published开发者_如何转开发_blog_post
      key_len: 4
          ref: const,const
         rows: 19856
        Extra: Using where; Using filesort
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: auth_user
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: blog.blog_post.author_id
         rows: 1
        Extra: Using index
2 rows in set (0.00 sec)

Side-note: 2010-10-25 22:40:05 is just a date generated by the code that executes this query.

Thanks so much for any & all help!


MySQL does not support ASC/DESC clauses in indexes.

You would need to create a separate column called reverse_ordering and set its value to -ordering (provided that ordering is an numeric value)

Then you could create the following index:

CREATE INDEX ix_blogpost_a_c_p_ro_id ON blog_post (is_approved, has_been_fact_checked, published_date, reverse_ordering, id)

and rewrite your query:

SELECT      `blog_post.id`, 
            `blog_post.title`,
            `blog_post.author_id`,
            `blog_post.has_been_fact_checked`,
            `blog_post.published_date`,
            `blog_post.ordering`,
            `auth_user.username`,
            `auth_user.email`
FROM        `blog_post` 
INNER JOIN  `auth_user` 
ON          `blog_post`.`author_id` = `auth_user`.`id`
WHERE       `blog_post`.`is_approved` = 1 AND 
            `blog_post`.`has_been_fact_checked` = 1 AND 
            `blog_post`.`published_date` <= '2010-10-25 22:40:05'
ORDER BY    `blog_post`.`published_date` DESC, 
            `blog_post`.`reverse_ordering` DESC, 
            `blog_post`.`id` DESC 
LIMIT 5

You may get rid of IS NULL check, since the inequality condition implies it.

Update:

You also might want to read this article:

  • Mixed ASC/DESC sorting in MySQL


Make a view of "blog_post" with all conditions you have applied in query (where clause) and do the joining of "auth_user" direct with this view.

feel free ask if not clear. :)


To me it looks like the filesort may be killing the speed. If you can get the ORDER BY fields into the index that is being used you may get a speed increase. Try changing:

idx_published_blog_post -> blog_post(is_approved, has_been_fact_checked, published_date)

to

idx_published_blog_post -> blog_post(is_approved, has_been_fact_checked, published_date DESC, ordering ASC, id DESC)

A couple of after thoughts: under what circumstances do you have a null published_date, searching in a date range may be quicker? Also published_date seems to be a date time field, so when sorting do you really have so may posts each second that you require the other sorting fields?

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜