开发者

How would I make this query run faster?

How would I make this query run faster...?

SELECT    account_id, 
          account_name, 
          account_update, 
          account_sold, 
          account_mds, 
          ftp_url,         
          ftp_livestatus, 
          number_digits, 
          number_cw,
          client_name, 
          ppc_status, 
          user_name 
FROM     
         Accounts, 
         FTPDetails, 
         SiteNumbers, 
         Clients, 
         PPC, 
         Users 

WHERE    Accounts.account_id = FTPDetails.ftp_accountid 
AND      Accounts.account_id = SiteNumbers.number_accountid 
AND      Accounts.account_client = Clients.client_id     
AND      Accounts.account_id = PPC.ppc_accountid 
AND      Accounts.account_designer = Users.user_id   
AND      Accounts.account_active = 'active' 
AND      FTPDetails.ftp_active = 'active' 
AND      SiteNumbers.number_active = 'active' 
AND      Clients.client_active = 'active'    
AND      PPC.ppc_active = 'active'   
AND      Users.user_active = 'active' 
ORDER BY 
         Accounts.account_update DESC

Thanks in advance :)

EXPLAIN query results:

How would I make this query run faster?

How would I make this query run faster?

I don't really have any fore开发者_JS百科ign keys set up...I was trying to avoid making alterations to the database as will have to do a complete overhaul soon.

only primary keys are the id of each table e.g. account_id, ftp_id, ppc_id ...


Indexes

  • You need - at least - an index on every field that is used in a JOIN condition.

  • Indexes on the fields that appear in WHERE or GROUP BY or ORDER BY clauses are most of the time useful, too.

  • When in a table, two or more fields are used in JOIns (or WHERE or GROUP BY or ORDER BY), a compound (combined) index of these (two or more) fields may be better than separate indexes. For example in the SiteNumbers table, possible indexes are the compound (number_accountid, number_active) or (number_active, number_accountid).

  • Condition in fields that are Boolean (ON/OFF, active/inactive) are sometimes slowing queries (as indexes are not selective and thus not very helpful). Restructuring (father normalizing) the tables is an option in that case but probably you can avoid the added complexity.


Besides the usual advice (examine the EXPLAIN plan, add indexes where needed, test variations of the query),

I notice that in your query there is a partial Cartesian Product. The table Accounts has a one-to-many relationships to three tables FTPDetails, SiteNumbers and PPC. This has the effect that if you have for example 1000 accounts, and every account is related to, say, 10 FTPDetails, 20 SiteNumbers and 3 PPCs, the query will return for every account 600 rows (the product of 10x20x3). In total 600K rows where many data are duplicated.

You could instead split the query into three plus one for base data (Account and the rest tables). That way, only 34K rows of data (having smaller length) would be transfered :

Accounts JOIN Clients JOIN Users 
  (with all fields needed from these tables)
  1K rows

Accounts JOIN FTPDetails
  (with Accounts.account_id and all fields from FTPDetails)
  10K rows

Accounts JOIN SiteNumbers
  (with Accounts.account_id and all fields from SiteNumbers)
  20K rows

Accounts JOIN PPC
  (with Accounts.account_id and all fields from PPC)
  3K rows

and then use the data from the 4 queries in the client side to show combined info.



I would add the following indexes:

Table Accounts
  index on (account_designer)
  index on (account_client)
  index on (account_active, account_id)
  index on (account_update)

Table FTPDetails
  index on (ftp_active, ftp_accountid)

Table SiteNumbers
  index on (number_active, number_accountid)

Table PPC
  index on (ppc_active, ppc_accountid)


Use EXPLAIN to find out which index could be used and which index is actually used. Create an appropriate index if necessary.

If FTPDetails.ftp_active only has the two valid entries 'active' and 'inactive', use BOOL as data type.

As a side note: I strongly suggest using explicit joins instead of implicit ones:

SELECT
  account_id, account_name, account_update, account_sold, account_mds, 
  ftp_url, ftp_livestatus, 
  number_digits, number_cw,
  client_name, 
  ppc_status, 
  user_name 
FROM Accounts 
INNER JOIN FTPDetails
  ON  Accounts.account_id = FTPDetails.ftp_accountid
  AND FTPDetails.ftp_active = 'active'
INNER JOIN SiteNumbers
  ON  Accounts.account_id = SiteNumbers.number_accountid 
  AND SiteNumbers.number_active = 'active'
INNER JOIN Clients
  ON  Accounts.account_client = Clients.client_id
  AND Clients.client_active = 'active'
INNER JOIN PPC
  ON  Accounts.account_id = PPC.ppc_accountid
  AND PPC.ppc_active = 'active'
INNER JOIN Users
  ON  Accounts.account_designer = Users.user_id
  AND Users.user_active = 'active'
WHERE Accounts.account_active = 'active' 
ORDER BY Accounts.account_update DESC

This makes the query much more readable because the join condition is close to the name of the table that is being joined.


EXPLAIN, benchmark different options. For starters, I'm sure that several queries will be faster than this monster. First, because query optimiser will spend a lot of time examining what join order is the best (5!=120 possibilities). And second, queries like SELECT ... WHERE ....active = 'active' will be cached (though it depends on an amount of data changes).


One of your main problems is here: x.y_active = 'active'

Problem: low cardinality
The active field is a boolean field with 2 possible values, as such it has very low cardinality. MySQL (or any SQL for that matter will not use an index when 30% or more of the rows have the same value).
Forcing the index is useless because it will make your query slower, not faster.

Solution: partition your tables
A solution is to partition your tables on the active columns.
This will exclude all non-active fields from consideration, and will make the select act as if you actually have a working index on the xxx-active fields.

Sidenote
Please don't ever use implicit where joins, it's much too error prone and consufing to be useful.
Use a syntax like Oswald's answer instead.

Links:
Cardinality: http://en.wikipedia.org/wiki/Cardinality_(SQL_statements)
Cardinality and indexes: http://www.bennadel.com/blog/1424-Exploring-The-Cardinality-And-Selectivity-Of-SQL-Conditions.htm
MySQL partitioning: http://dev.mysql.com/doc/refman/5.5/en/partitioning.html

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜