MySQL Query, Subquery optimization, SELECT, JOIN

2023-01-10 22:05 问答作者：

I have one table with some data and I want select newest data for each type...

Table:

+----+------+------+---------------------+
| ID | data | type | date                |
+----+------+------+---------------------+
|  1 | just |    2 | 2010-08-07 14:开发者_开发问答24:48 |
|  2 | some |    2 | 2010-08-07 18:07:32 |
|  3 | data |    9 | 2010-08-06 02:52:17 |
|  4 | abcd |    1 | 2010-08-08 17:23:22 |
|  5 | efg1 |    5 | 2010-07-10 21:36:55 |
|  6 | c123 |    5 | 2010-07-10 20:44:36 |
|  7 | bbey |   12 | 2010-08-09 09:01:26 |
+----+------+------+---------------------+

Currently I'm using simple subquery and looks like everything works

SELECT `data`,`type`,`date`
FROM `table1`
WHERE `date` = (
                  SELECT MAX( `date` )
                  FROM `table1` AS tbl2
                  WHERE tbl2.`type` = `table1`.`type`
                )
GROUP BY `type`
ORDER BY `type`,`date`

Result:

+------+------+---------------------+
| data | type | date                |
+------+------+---------------------+
| abcd |    1 | 2010-08-08 17:23:22 |
| some |    2 | 2010-08-07 18:07:32 |
| efg1 |    5 | 2010-07-10 21:36:55 |
| data |    9 | 2010-08-06 02:52:17 |
| bbey |   12 | 2010-08-09 09:01:26 |
+------+------+---------------------+

My question is is there a better way to do this, some optimization, improvement or maybe it's possible to make join?

You are using a correlated subquery. The subquery is dependent on the outer query, and therefore it has to be executed once for each row of the outer query.

In general, this can be improved by using your subquery as a derived table instead. Since a subquery as a derived table is not correlated to the outer query, this solution is considered more scalable:

SELECT    t1.`data`, t1.`type`, t1.`date`
FROM      `table1` t1
JOIN      (
              SELECT   MAX( `date`) `max_date`, `type`
              FROM     `table1`
              GROUP BY `type`
          ) der_t ON (der_t.`max_date` = t1.`date` AND der_t.`type` = t1.`type`)
GROUP BY  t1.`type`
ORDER BY  t1.`type`, t1.`date`;

Test case:

CREATE TABLE table1 (id int, data varchar(10), type int, date datetime); 

INSERT INTO table1 VALUES (1, 'just', 2, '2010-08-07 14:24:48');
INSERT INTO table1 VALUES (2, 'some', 2, '2010-08-07 18:07:32');
INSERT INTO table1 VALUES (3, 'data', 9, '2010-08-06 02:52:17');
INSERT INTO table1 VALUES (4, 'abcd', 1, '2010-08-08 17:23:22');
INSERT INTO table1 VALUES (5, 'efg1', 5, '2010-07-10 21:36:55');
INSERT INTO table1 VALUES (6, 'c123', 5, '2010-07-10 20:44:36');
INSERT INTO table1 VALUES (7, 'bbey', 12, '2010-08-09 09:01:26');

Result:

+------+------+---------------------+
| data | type | date                |
+------+------+---------------------+
| abcd |    1 | 2010-08-08 17:23:22 |
| some |    2 | 2010-08-07 18:07:32 |
| efg1 |    5 | 2010-07-10 21:36:55 |
| data |    9 | 2010-08-06 02:52:17 |
| bbey |   12 | 2010-08-09 09:01:26 |
+------+------+---------------------+
5 rows in set (0.00 sec)

It also looks like you can avoid subqueries altogether, by using a solution such as the follows:

SELECT     t1.`data`, t1.`type`, t1.`date`
FROM       `table1` t1
LEFT JOIN  `table1` t2 ON (t1.`date` < t2.`date` AND t1.`type` = t2.`type`)
WHERE      t2.`date` IS NULL
GROUP BY   t1.`type`
ORDER BY   t1.`type`, t1.`date`;

In general, this scales even better than the solution with the derived table, but if performance is paramount, you may want to measure both solutions. The article that @Naktibalda provided also provides a few other solutions that you may want to test.

继续阅读：select subquery

MySQL Query, Subquery optimization, SELECT, JOIN

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？