开发者

Please Help Me With Mysql Slow Query Analysis

I have this mysql query that I am trying to analyze. It is very slow, the visitor table here is about 50K entries, this query never returns. When I tried an explain statement, I found out that the index is not being used on the visitor table, In spite of the index being available. Now this is the great puzzle I need help solving. Any hints appreciated.

Query:

select distinct
  visitor0_.ID as ID130_,      

  case when visitor0_1_.id is not null then 1 when
  visitor0_.ID is not null then 0
  end as clazz_

from Visitor visitor0_ 
left outer join Operator visitor0_1_ on visitor0_.ID=visitor0_1_.id
where (visitor0_.ID not in
    (select operator1_.id 
     from Operator operator1_ 
     inner join Visitor operator1_1_ on operator1_.id=operator1_1_.ID))
  and (exists 
    (select visitorpro2_.ID 
     from VisitorProfileField visit开发者_StackOverfloworpro2_, ProfileField profilefie3_ 
     where visitorpro2_.profileFieldID=profilefie3_.ID 
       and visitorpro2_.visitorID=visitor0_.ID 
       and profilefie3_.name='subscription86' 
       and visitorpro2_.numberVal=1 
       and visitorpro2_.stringVal='Manual'))

Explain output screen shot: http://grab.by/grabs/9c3a629a25fc4e9ec0fa54355d4a092c.png


From what I infer of your query, the following should produce the same result, with no subqueries and a lot faster performance.

select v.ID as ID130_, 0 as clazz_
from Visitor v
left outer join (VisitorProfileField vpf join ProfileField pf 
                   on vpf.profileFieldID = pf.ID)
  on v.ID = vpf.visitorID and pf.name='subscription86' 
    and vpf.numberVal=1 and vpf.stringVal='Manual'
left outer join Operator o on v.ID = o.ID
where o.ID IS NULL;

Please explain if I got some of it wrong. It appears that your NOT IN predicate excludes any Visitor id's that match any id's in Operator. That is, the subquery generates a list of all id's that are in both tables, so the NOT IN condition is equivalent to an outer join to Operator and a simple test where o.ID IS NULL.

This means the CASE expression in your select-list is meaningless, since it will certainly be 0 if your conditions match only Visitor rows that don't match any rows in Operator.

I think something is seriously confused in your query.

Also, it appears that you're using the EAV antipattern in the VisitorProfileField and ProfileField tables. This is going to cause you a lot of trouble.


You're query is... big. Can you explain what it acomplishes for you? It looks like it pulls each visitor ID and whether or not they are an operator where they are not an operator and they have a specific profile setup. That doesn't make a ton of sense, so I must be missing something there.

Here's my attempt, based on my understanding of what you're trying to do:

select distinct visitor.ID, IF(operator.id IS NOT NULL, 1, 0) AS clazz
from Visitor left outer join Operator on visitor.ID = operator.id
where not exists 
    (select 'x' from Operator OperatorTwo where OperatorTwo.id = visitor.ID)
and exists
    (select 'x' from VisitorProfileField, ProfileField
        where VisitorProfileField.profileFieldID = ProfileField.ID
        and VisitorProfileField.profileFieldID.visitorID = visitor.ID
        and VisitorProfileField.profileFieldID.numberVal = 1
        and VisitorProfileField.profileFieldID.stringVal = 'Manual'
        and ProfileField .name = 'subscription86')

The joined table named "operator1_1_" doesn't appear to be used, you should be able to remove that. If you are using it just to make sure there is a record for the visitor in that table, I'd use an exists instead of a join. I dropped that.

I've switched your not in to a not exists, which I think may be easier for MySQL to optimize. I used an IF instead of a case because you only have two, and it was shorter to type. I don't know if either one is faster/easier on MySQL.

I can tell you that in my experience MySQL performance dies with subqueries in suqueries. It seems to give up optimising them and starts running them row by row. I bet that if you used a temporary table of results (just for testing purposes) you would find your query runs much faster.

Edit:

Bill went further than I did, I didn't go far enough. I like Bill's query and agree with his conclusions about the CASE statement, which was kind of throwing me.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜