MySQL: how to index an "OR" clause

I'm executing the following query:

SELECT COUNT(*)
FROM table
WHERE field1='value' AND (field2 >= 1000 OR field3 >= 2000)

There is one index on field1 and a composite index on (field2, field3).

I see that MySQL always selects the field1 index and then checks the other two fields against the matching rows, which is quite slow because it has to go through 146,000 rows.

Suggestions on how to improve this? Thanks

(EDIT AFTER TRYING SOLUTION PROPOSED)

Based on the proposed solution, here is what I observed in MySQL when experimenting with it.

SELECT COUNT(*) FROM (SELECT * FROM table WHERE columnA = value1
UNION SELECT * FROM table WHERE columnB = value2) AS unionTable;

is a lot slower than executing:

SELECT COUNT(*)
FROM table
WHERE (columnA = value1 AND columnB = value2)
      OR (columnA = value1 AND columnC = value3)

with two composite indexes:

index1 (columnA,columnB)
index2 (columnA,columnC)

Interestingly, when I ask MySQL to EXPLAIN the query, it always picks index1 in both cases and never uses index2.

If I change the indexes to:

index1 (columnB,columnA)
index2 (columnC,columnA)

And the query to:

SELECT COUNT(*)
FROM table
WHERE (columnB = value2 AND columnA = value1)
      OR (columnC = value3 AND columnA = value1)

Then this is the fastest variant I've found in MySQL.


The typical way to break up OR predicates is with UNION.

Note that your query doesn't fit your indexes well. Even if you omitted field1 from the predicate, you'd be left with field2 >= 1000 OR field3 >= 2000, which can't use the composite (field2, field3) index, because the field3 branch doesn't constrain the index's leading column. If you had indexes on (field1, field2) and (field1, field3), or on field2 and field3 separately, you would get a reasonably fast query:

SELECT COUNT(*) FROM
(SELECT * FROM table WHERE field1 = 'value' AND field2 >= 1000
UNION
SELECT * FROM table WHERE field1 = 'value' AND field3 >= 2000) T

Note that you have to provide an alias for the derived table, which is why the subquery is aliased as T.
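
To check that the UNION rewrite counts the same rows as the original OR query, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, so it only illustrates the result, not MySQL's index selection or timing. The table name t, the id column, the index names and the sample rows are all made up for the illustration.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; table `t` and its rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER PRIMARY KEY, field1 TEXT, field2 INTEGER, field3 INTEGER)"
)
# The two composite indexes suggested in the answer.
conn.execute("CREATE INDEX idx_f1_f2 ON t (field1, field2)")
conn.execute("CREATE INDEX idx_f1_f3 ON t (field1, field3)")

rows = [
    ("value", 1500, 100),   # matches the field2 branch only
    ("value", 10, 2500),    # matches the field3 branch only
    ("value", 1500, 2500),  # matches both branches
    ("value", 10, 100),     # matches neither branch
    ("other", 1500, 2500),  # wrong field1
]
conn.executemany("INSERT INTO t (field1, field2, field3) VALUES (?, ?, ?)", rows)

# Original form: a single query with an OR predicate.
or_count = conn.execute(
    "SELECT COUNT(*) FROM t "
    "WHERE field1 = 'value' AND (field2 >= 1000 OR field3 >= 2000)"
).fetchone()[0]

# Rewritten form: one SELECT per OR branch, combined with UNION.
union_count = conn.execute(
    """
    SELECT COUNT(*) FROM (
        SELECT * FROM t WHERE field1 = 'value' AND field2 >= 1000
        UNION
        SELECT * FROM t WHERE field1 = 'value' AND field3 >= 2000
    ) AS u
    """
).fetchone()[0]

print(or_count, union_count)
```

One caveat: UNION removes duplicates by comparing whole rows, so the two counts are only guaranteed to agree when every row is distinct, for example because the table has a primary key as in this sketch.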

A real-world example. Column and table names have been anonymized!

mysql> SELECT COUNT(*) FROM table;
+----------+
| COUNT(*) |
+----------+
|  3059139 |
+----------+
1 row in set (0.00 sec)

mysql> SELECT COUNT(*) FROM table WHERE columnA = value1;
+----------+
| COUNT(*) |
+----------+
|     1068 |
+----------+
1 row in set (0.00 sec)

mysql> SELECT COUNT(*) FROM table WHERE columnB = value2;
+----------+
| COUNT(*) |
+----------+
|      947 |
+----------+
1 row in set (0.00 sec)

mysql> SELECT COUNT(*) FROM table WHERE columnA = value1 OR columnB = value2;
+----------+
| COUNT(*) |
+----------+
|     1616 |
+----------+
1 row in set (9.92 sec)

mysql> SELECT COUNT(*) FROM (SELECT * FROM table WHERE columnA = value1
UNION SELECT * FROM table WHERE columnB = value2) T;
+----------+
| COUNT(*) |
+----------+
|     1616 |
+----------+
1 row in set (0.17 sec)

mysql> SELECT COUNT(*) FROM (SELECT * FROM table WHERE columnA = value1
UNION ALL SELECT * FROM table WHERE columnB = value2) T;
+----------+
| COUNT(*) |
+----------+
|     2015 |
+----------+
1 row in set (0.12 sec)
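
The four counts above are consistent with each other, which can be checked with a little arithmetic. The snippet below only reuses the numbers printed in the session; the variable names are made up for the illustration.

```python
# Counts copied from the mysql session above.
count_a = 1068          # rows with columnA = value1
count_b = 947           # rows with columnB = value2
count_union = 1616      # UNION (same as the OR query)
count_union_all = 2015  # UNION ALL

# UNION ALL simply concatenates both result sets.
assert count_a + count_b == count_union_all

# Rows matching both conditions are counted twice by UNION ALL
# and once by UNION, so the difference is the overlap.
overlap = count_union_all - count_union
print(overlap)  # 399
```

So 399 rows satisfy both conditions, which is why the UNION ALL figure is higher than the true count.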


I'm new here, so I can't comment on other people's posts, but this is related to the posts by David M. and soulmerge.

The temporary table is not necessary. The UNION David M. suggested does not double count, because UNION implies DISTINCT (i.e. if a row appears in one half of the union, it is ignored in the other). If you used UNION ALL, a row matching both conditions would be returned twice.

The default behavior for UNION is that duplicate rows are removed from the result. The optional DISTINCT keyword has no effect other than the default because it also specifies duplicate-row removal. With the optional ALL keyword, duplicate-row removal does not occur and the result includes all matching rows from all the SELECT statements.

http://dev.mysql.com/doc/refman/5.0/en/union.html
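
A small sketch of the difference, again using Python's sqlite3 module as a stand-in for MySQL (the duplicate-removal behavior of UNION versus UNION ALL is standard SQL). The table name t, the id column, the helper count_with and the sample values are hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; table `t` and its rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, columnA INTEGER, columnB INTEGER)")
conn.executemany(
    "INSERT INTO t (columnA, columnB) VALUES (?, ?)",
    [
        (1, 0),  # matches columnA = 1 only
        (0, 2),  # matches columnB = 2 only
        (1, 2),  # matches both conditions
        (0, 0),  # matches neither
    ],
)

def count_with(set_operator):
    """Count the rows of the combined result for the given set operator."""
    sql = (
        "SELECT COUNT(*) FROM ("
        "SELECT * FROM t WHERE columnA = 1 "
        f"{set_operator} "
        "SELECT * FROM t WHERE columnB = 2"
        ") AS u"
    )
    return conn.execute(sql).fetchone()[0]

distinct_count = count_with("UNION")      # the row matching both is kept once
all_count = count_with("UNION ALL")       # the row matching both appears twice

print(distinct_count, all_count)
```

With these sample rows, UNION yields 3 and UNION ALL yields 4, the extra row being the one that satisfies both conditions.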
