SQL query optimization
I have a problem with my SQL query that take time to get all records from database. Any body help me. Below is a sample of database:
order(order_id, order_nm)
customer(customer_id, customer_nm)
orderDetail(orderDetail_id, order_id, orderDate, customer_id, Comment)
I want to get latest customer and order detail information.
Here is may solution:I've created a function that GetLatestOrderByCustomer(CusID) to get lastest Customer infor开发者_JAVA百科mation.
CREATE FUNCTION [dbo].[GetLatestOrderByCustomer]
(
@cus_id int
)
RETURNS varchar(255)
AS
BEGIN
DECLARE @ResultVar varchar(255)
SELECT @ResultVar = tmp.comment
FROM
(
SELECT TOP 1 orderDate, comment
FROM orderDetail
WHERE orderDetail.customer_id = @cust_id
) tmp
-- Return the result of the function
RETURN @ResultVar
END
Below is my SQL query
SELECT
customer.customer_id
, customer.customer_nm
, dbo.GetLatestOrderByCustomer(customer.customer_id)
FROM Customer
LEFT JOIN orderDetail
ON orderDetail.customer_id = customer.customer_id
It's take time to run the function. Could anybody suggest me any solutions to make it better?
Use:
SELECT c.customer_id
, c.customer_nm
, y.comment
FROM CUSTOMER c
LEFT JOIN (SELECT od.customer_id,
MAX(od.orderdate) AS max_date
FROM ORDERDETAIL od
GROUP BY od.customer_id) x ON x.customer_id = c.customer_id
JOIN (SELECT od.customer_id,
od.comment,
od.orderdate
FROM ORDERDETAIL od) y ON y.customer_id = c.customer_id
AND y.orderdate = x.max_date
There's no need for the function - use a derived table/inline view/subquery. The reason your function performs poorly is because it is executing for every row returned.
Do you have the proper indices?
custoemr.customer_id,
orderDetail.customer_id
should both be indexed. As should possibly be oder.OrderDate. Without indices you run table scans galore - and you dont say anything about WHY your query is slow, so assuming bad indexing is something i normally do.
I find it funny that the customer_id is on the order detail, not on the order - normally orders are asssigned to one customer.
Now, fundamentally wrong:
The function GetLatestOrderByCustomer does not do that ;) There is a TOP 1 but there is no - ORDER BY. SQL Results have no determined order unless you say so, so the one element returned is esentially - RANDOM.
If the customer_id is a growing number, there is no need to have a function. Get rid of it, and merge the query part into the main SQL Query - allows the query optimizer to do better work.
Then - the id fields should not be strings. Seriously - you blow performance there, big time. Make them small, efficient. int, smallint. Do not use them for the "end user" coding (invoice number etc.) - that can be a separate string field with unique index. But joining strings - especially defined as varchar(255) is quite slow compared to joins in ints.
Depending on the size of your database, using an OUTER APPLY
might be much faster then the solution provided by @OMG Ponies:
SELECT c.customer_id
, c.customer_nm
, y.comment
FROM customer c
OUTER APPLY (
SELECT TOP 1 o.comment
FROM orderdetail o
WHERE o.customer_id = c.customer_id
ORDER BY orderDate DESC
) AS y
See query execution plans below. Note the query cost relative to batch (top plan is OUTER APPLY
, bottom plan is @OMG Ponies solution; other queries in batch are inserts into temporary tables):
精彩评论