开发者

Which SQL query is faster and why?

Recently, I've been asked to write a query to select properties of entities from a group that contains the maximum number of such entities. So, I did it on Northwind (MSFT distributed sample) database in a couple of ways.

ONE:

SELECT cat.CategoryName, prod.ProductName 
  FROM Categories cat 
  JOIN Products prod ON cat.CategoryID = prod.CategoryID
  JOIN (SELECT TOP 1 p.CategoryID, COUNT(p.ProductId) as products
          FROM Categories c 
          JOIN Products p on c.CategoryID = p.CategoryID
      GROUP BY p.CategoryID
      ORDER BY products desc) c ON c.CategoryID = cat.CategoryID

TWO:

SELECT cat.CategoryName, prod.ProductName
  FROM Categories cat
  JOIN Products prod ON cat.CategoryID = prod.CategoryID
  JOIN (SELECT CategoryID, COUNT(ProductID) m_count
          FROM Products 
      GROUP BY CategoryID
        HAVING COUNT(ProductID) = (SELECT MAX(sub.cnt) 
                         开发者_如何学Python            FROM (SELECT CategoryId, COUNT(ProductID) cnt 
                                             FROM Products 
                                         GROUP BY CategoryId) sub)) m ON m.CategoryID = cat.CategoryID

The question is: which is faster any why? In execution plans nothing stands out, in particular. The elapsed time varies slightly, but is the same, approximately. The database is, of course, tiny.


A tiny database makes it difficult to determine which is better, but SQL Server Management Studio has functionality to compare the efficiency of statements to one another.

  1. Open Management Studio
  2. Click the "New Query" button
  3. Click to enable the "Include Actual Query Plan"
  4. Post all the queries into the active query window
  5. Click the "Execute" button
  6. Click the "Execution plan" tab (to the left of the results) when it appears

The query cost is averaged by the number of queries run. So if comparing the two queries provided as examples, if both have a cost of 50% then they are equivalent (because 100 / 2 = 50, etc). When there's a difference, you can mouseover the SELECT to review the subtree cost, besides looking at the graphical layout of the Execution path.

Why is one faster than the other?

That depends on the database -- the data types being joined (are they as narrow as they could be? "narrow" means taking less bytes to store), indexes, and what is being performed in the query. Using different syntax can make all the difference.


Execute both SQL statements with SET SHOWPLAN_ALL ON activated. The SQL server will give you query information, which should help your understanding what's going on and which one is/will be faster.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜