SQL Optimization
How to optimize this?
SELECT e.attr_id, e.sku, a.value
FROM product_attr AS e, product_attr_text AS a
WHERE e.attr_id = a.attr_id
AND value
IN (
SELECT value
FROM product_attr_text
WHERE attribute_id = (
SELECT attribute_id
FROM eav_attr
WHERE attribute_code = 'similar_prod_id'
)
AND value != ''
GROUP BY value
HAVING (COUN开发者_开发知识库T( value ) > 1 )
)
SELECT e.attr_id, e.sku, a.value
FROM (
SELECT pat.value
FROM eav_attr ea
JOIN product_attr_text pat
ON pat.attribute_id = ea.attribute_id
WHERE ea.attribute_code = 'similar_prod_id'
AND value <> ''
GROUP BY
value
HAVING COUNT(*) > 1
) q
JOIN product_attr_text AS a
ON a.value = q.value
JOIN product_attr AS e
ON e.attr_id = a.attr_id
Create indexes:
eav_attr (attribute_code)
product_attr_text (attribute_id, value)
product_attr_text (value)
product_attr (attr_id)
Change it into a JOIN. MySQL doesn't optimize well subqueries in an IN() clause - it is recalculated per row (very inefficient for many rows)
SELECT e.attr_id, e.sku, a.value
FROM product_attr AS e
INNER JOIN product_attr_text AS a ON e.attr_id = a.attr_id
INNER JOIN (SELECT value
FROM product_attr_text
INNER JOIN eav_attr ON eav_attr.attribute_id=product_attr_text.attribute_id
WHERE attribute_code = 'similar_prod_id'
AND value != ''
GROUP BY value
HAVING COUNT( value ) > 1
) AS filter ON filter.value=a.value
After you've converted the query (you may need to make some corrections according to your schema), run EXPLAIN
on the query and index accordingly.
Difficult to answer ...
The only thing I can say is:
look at the execution plan
use IO statistics-
and take a look what part of the statement is problematic, and why. Find out if you can improve performance by adding an index, etc...
Next to that, I don't understand why you're using a group by clause ...
And, why do you use subqueries, instead of joining tables ? (Although I don't think that it will make much difference, since the SQL Server optimizer should be smart enough to generate the same execution plan).
Change the second inner select into a join:
SELECT e.attr_id, e.sku, a.value
FROM product_attr AS e, product_attr_text AS a
WHERE e.attr_id = a.attr_id
AND value
IN (
SELECT at.value
FROM product_attr_text at, eav_attr eat
WHERE at.attribute_id = eat.attribute_id
AND eat.attribute_code = 'similar_prod_id'
AND value != ''
GROUP BY value
HAVING (COUNT( value ) > 1 )
)
Then, take a look to the execution plan to see how the query is solved. You might need additional optimizations based on that execution plan (e.g. make sure the attribute_id field is indexed).
frederik answer is the right one, but i'd like to suggest a few point :
- avoid IN, use EXISTS instead
- avoid = (SELECT something from sometable), use exists instead
- useless group by
- use inner join instead of multiple table in from clause
but again, it depends on your machine/dbms/version etc... so to get the best performance you'll have to compare the different execution plan
Caution: Be wary of designs that use EAV (entity, attribute, value) table designs. They tend to become very difficult to handle, for reasons of referential integrity and query complexity.
Optimize by using:
- Joins instead of sub-selects.
- Join notation.
- Explicit table aliases on all column names.
Phase 1:
SELECT e.attr_id, e.sku, a.value
FROM product_attr AS e JOIN product_attr_text AS a
ON e.attr_id = a.attr_id
WHERE a.value IN (
SELECT p.value
FROM product_attr_text AS p
JOIN eav_attr AS v ON p.attribute_id = v.attribute_id
WHERE v.attribute_code = 'similar_prod_id'
AND p.value != ''
GROUP BY value
HAVING (COUNT( value ) > 1)
)
Phase 2:
SELECT e.attr_id, e.sku, a.value
FROM product_attr AS e
JOIN product_attr_text AS a ON e.attr_id = a.attr_id
JOIN (SELECT p.value
FROM product_attr_text AS p
JOIN eav_attr AS v ON p.attribute_id = v.attribute_id
WHERE v.attribute_code = 'similar_prod_id'
AND p.value != ''
GROUP BY value
HAVING (COUNT( value ) > 1)
) AS x ON x.value = a.value
use a derived table for the group by having... and join that back on whatever table(s) you need to pad out the result
精彩评论