Optimize LEFT JOIN on table with 30 000+ rows
I have a website where visitors can leave comments. I want to add the ability to answer comments (i.e. nested comments).
At first this query was fast but after I populated the table with the existing comments (about 30000) a simple query like:
SELECT c.id, c2.id
FROM (SELECT id
FROM swb_comments
WHERE pageId = 1411
ORDER BY id DESC
LIMIT 10) AS c
LEFT JOIN swb_comments AS c2 ON c.id = c2.parentId
took over 2 seconds, with no childComments(!).
How do I optimize a query like this? On possible solution would be http://www.ferdychristant.com/blog//articles/DOMM-7QJPM7 (scroll to "The Flat Table Model done right") but this makes pagination rather difficult (how do I limit to 10 parent comments within 1 query?)
The table has 3 indexes, id, pageId and ParentId.
Thanks in advance!
EDIT:
Table definition added. This is the full definition with some differences to the above SELECT query, (i.e. pageId instead of numberId to avoid confussion)
CREATE TABLE `swb_comments` (
`开发者_JAVA百科id` mediumint(9) NOT NULL auto_increment,
`userId` mediumint(9) unsigned NOT NULL default '0',
`numberId` mediumint(9) unsigned default NULL,
`orgId` mediumint(9) unsigned default NULL,
`author` varchar(100) default NULL,
`email` varchar(255) NOT NULL,
`message` text NOT NULL,
`IP` varchar(40) NOT NULL,
`timestamp` varchar(25) NOT NULL,
`editedTimestamp` varchar(25) default NULL COMMENT 'last edited timestamp',
`status` varchar(20) NOT NULL default 'publish',
`parentId` mediumint(9) unsigned NOT NULL default '0',
`locale` varchar(10) NOT NULL,
PRIMARY KEY (`id`),
KEY `userId` (`userId`),
KEY `numberId` (`numberId`),
KEY `orgId` (`orgId`),
KEY `parentId` (`parentId`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=34748 ;
The issue is that MySQL cannot apply index if it need to deal with a result from a derived query (that's why you have NULL in the possible_keys column). So I suggest to filter out ten comments that you need:
SELECT * FROM swb_comments WHERE pageId = 1411 ORDER BY id DESC LIMIT 10
And after that send separate request to get answers for each comment id:
SELECT * FROM swb_comments WHERE parentId IN ($commentId1, $commentId2, ..., $commentId10)
In this case database engine will be able to apply pageId and parentId indexes efficiently.
If Mr Fedorenko is correct and the subquery is causing the optimiser difficulties, could you not try...
SELECT c.id, c2.id
FROM swb_comments c LEFT JOIN swb_comments c2 ON c.id = c2.parentID
WHERE c.pageId = 1411
ORDER BY c.id DESC
LIMIT 10;
and see if it's any improvement?
Later - I have created a table using your definition, filled it in with 30,000 skeletal rows, and tried both the queries. They both complete in too short a time to notice. The explain plans are here...
mysql> EXPLAIN SELECT c.id, c2.id
FROM swb_comments c LEFT JOIN swb_comments c2 ON c.id = c2.parentID
WHERE c.numberId = 1411 ORDER BY c.id DESC LIMIT 10;
+----+-------------+-------+------+---------------+----------+---------+------------+------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+----------+---------+------------+------+-----------------------------+
| 1 | SIMPLE | c | ref | numberId | numberId | 4 | const | 1 | Using where; Using filesort |
| 1 | SIMPLE | c2 | ref | parentId | parentId | 3 | books.c.id | 14 | |
+----+-------------+-------+------+---------------+----------+---------+------------+------+-----------------------------+
mysql> EXPLAIN SELECT c.id, c2.id
FROM swb_comments c LEFT JOIN swb_comments c2 ON c.id = c2.parentID
WHERE c.numberId = 1411 ORDER BY c.id DESC LIMIT 10;
+----+-------------+-------+------+---------------+----------+---------+------------+------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+----------+---------+------------+------+-----------------------------+
| 1 | SIMPLE | c | ref | numberId | numberId | 4 | const | 1 | Using where; Using filesort |
| 1 | SIMPLE | c2 | ref | parentId | parentId | 3 | books.c.id | 14 | |
+----+-------------+-------+------+---------------+----------+---------+------------+------+-----------------------------+
and are exactly what I'd expect.
This is very mysterious.
I'll think about it a bit more to see if there's anything else we can try.
精彩评论