SQL Optimization

2023-01-22 14:21 Q&A author:

How to optimize this?

SELECT e.attr_id, e.sku, a.value

FROM product_attr AS e, product_attr_text AS a

WHERE e.attr_id = a.attr_id
AND value
IN (
    SELECT value
    FROM product_attr_text 
    WHERE attribute_id = (
        SELECT attribute_id
        FROM eav_attr 
        WHERE attribute_code = 'similar_prod_id' 
    ) 
    AND value != ''

    GROUP BY value
    HAVING (COUNT( value ) > 1 )
)

SELECT  e.attr_id, e.sku, a.value
FROM    (
        SELECT  pat.value
        FROM    eav_attr ea
        JOIN    product_attr_text pat
        ON      pat.attribute_id = ea.attribute_id
        WHERE   ea.attribute_code = 'similar_prod_id'
                AND value <> ''
        GROUP BY
                value
        HAVING  COUNT(*) > 1
        ) q
JOIN    product_attr_text AS a
ON      a.value = q.value
JOIN    product_attr AS e
ON      e.attr_id  = a.attr_id

Create indexes:

 eav_attr (attribute_code)
 product_attr_text (attribute_id, value)
 product_attr_text (value)
 product_attr (attr_id)
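
As a rough sketch, the indexes listed above could be created like this. The index names are made up for illustration, and the statements assume the columns can be indexed as they are; adjust them to your schema.

```sql
-- Index names below are hypothetical; only the table and column lists
-- come from the answer above.
-- Assumption: value is a VARCHAR-like column. If it is TEXT/BLOB, MySQL
-- requires a key prefix length, e.g. (attribute_id, value(64)).
CREATE INDEX idx_eav_attr_code  ON eav_attr (attribute_code);
CREATE INDEX idx_pat_attr_value ON product_attr_text (attribute_id, value);
CREATE INDEX idx_pat_value      ON product_attr_text (value);
CREATE INDEX idx_pa_attr_id     ON product_attr (attr_id);
```

Some of these may already exist as primary or foreign keys, in which case there is no need to create them again.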

Change it into a JOIN. MySQL doesn't optimize subqueries in an IN() clause well: the subquery is recalculated per row, which is very inefficient when there are many rows.

SELECT e.attr_id, e.sku, a.value
FROM product_attr AS e
INNER JOIN product_attr_text AS a ON e.attr_id = a.attr_id
INNER JOIN (SELECT value
    FROM product_attr_text 
    INNER JOIN eav_attr ON eav_attr.attribute_id=product_attr_text.attribute_id
    WHERE attribute_code = 'similar_prod_id'  
    AND value != ''
    GROUP BY value
    HAVING COUNT( value ) > 1 
) AS filter ON filter.value=a.value

After you've converted the query (you may need to make some corrections according to your schema), run EXPLAIN on the query and index accordingly.
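
For example, prefixing the converted query with EXPLAIN might look like the sketch below. The derived-table alias dup is just a name picked for this example; the tables and columns are the ones from the question.

```sql
-- Sketch: ask MySQL for the execution plan instead of running the query.
EXPLAIN
SELECT e.attr_id, e.sku, a.value
FROM product_attr AS e
INNER JOIN product_attr_text AS a ON e.attr_id = a.attr_id
INNER JOIN (
    SELECT pat.value
    FROM product_attr_text AS pat
    INNER JOIN eav_attr AS ea ON ea.attribute_id = pat.attribute_id
    WHERE ea.attribute_code = 'similar_prod_id'
      AND pat.value != ''
    GROUP BY pat.value
    HAVING COUNT(*) > 1
) AS dup ON dup.value = a.value;
```

In the output, the type, key, rows and Extra columns are usually the ones to look at first: a row with type ALL and no key means a full table scan, which is a hint that an index on the join or filter column may help.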

Difficult to answer ...

The only thing I can say is:

look at the execution plan
use IO statistics

and look at which part of the statement is problematic, and why. Find out whether you can improve performance by adding an index, etc.

Besides that, I don't understand why you're using a GROUP BY clause ...

And why do you use subqueries instead of joining tables? (Although I don't think it will make much difference, since the SQL Server optimizer should be smart enough to generate the same execution plan.)

Change the second inner select into a join:

SELECT e.attr_id, e.sku, a.value
FROM product_attr AS e, product_attr_text AS a
WHERE e.attr_id = a.attr_id
AND value
IN (
    SELECT at.value
    FROM product_attr_text at, eav_attr eat 
    WHERE at.attribute_id = eat.attribute_id 
          AND eat.attribute_code = 'similar_prod_id' 
          AND value != ''
    GROUP BY value
    HAVING (COUNT( value ) > 1 )
)

Then take a look at the execution plan to see how the query is resolved. You might need additional optimizations based on that execution plan (e.g. make sure the attribute_id field is indexed).

Frederik's answer is the right one, but I'd like to suggest a few points:

avoid IN, use EXISTS instead
avoid = (SELECT something FROM sometable), use EXISTS instead
useless GROUP BY
use INNER JOIN instead of multiple tables in the FROM clause

But again, it depends on your machine, DBMS, version, etc., so to get the best performance you'll have to compare the different execution plans.
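
One possible way to write the query from the question with EXISTS and explicit joins is sketched below. The aliases p and v are picked for this example. The GROUP BY is kept here because the HAVING count is what detects duplicated values; whether this form is actually faster than the IN or derived-table versions has to be checked against the execution plan on your own data.

```sql
-- Sketch: correlated EXISTS instead of IN. For each row of a, the subquery
-- returns a row only if a.value occurs more than once among the non-empty
-- values of the 'similar_prod_id' attribute.
SELECT e.attr_id, e.sku, a.value
FROM product_attr AS e
INNER JOIN product_attr_text AS a ON e.attr_id = a.attr_id
WHERE EXISTS (
    SELECT 1
    FROM product_attr_text AS p
    INNER JOIN eav_attr AS v ON v.attribute_id = p.attribute_id
    WHERE v.attribute_code = 'similar_prod_id'
      AND p.value <> ''
      AND p.value = a.value      -- correlation with the outer row
    GROUP BY p.value
    HAVING COUNT(*) > 1
);
```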

Caution: Be wary of designs that use EAV (entity, attribute, value) table designs. They tend to become very difficult to handle, for reasons of referential integrity and query complexity.

Optimize by using:

Joins instead of sub-selects.
Join notation.
Explicit table aliases on all column names.

Phase 1:

SELECT e.attr_id, e.sku, a.value
  FROM product_attr AS e JOIN product_attr_text AS a
       ON e.attr_id = a.attr_id
 WHERE a.value IN (
          SELECT p.value
            FROM product_attr_text AS p
            JOIN eav_attr AS v ON p.attribute_id = v.attribute_id
           WHERE v.attribute_code = 'similar_prod_id'
             AND p.value != ''
           GROUP BY p.value
          HAVING (COUNT( p.value ) > 1)
          )

Phase 2:

SELECT e.attr_id, e.sku, a.value
  FROM product_attr AS e
  JOIN product_attr_text AS a ON e.attr_id = a.attr_id
  JOIN (SELECT p.value
          FROM product_attr_text AS p
          JOIN eav_attr AS v ON p.attribute_id = v.attribute_id
         WHERE v.attribute_code = 'similar_prod_id'
           AND p.value != ''
         GROUP BY p.value
        HAVING (COUNT( p.value ) > 1)
       ) AS x ON x.value = a.value

Use a derived table for the GROUP BY ... HAVING, and join that back to whatever table(s) you need to pad out the result.
