Help improving SQL join
I have a st开发者_如何学Cored procedure that runs to update gaming points for user balances. It's an insert with 5 subqueries. I have isolated one of the subqueries as the query that slows the entire batch down. Without it, the stored procedure will run in under 2 seconds. With it, it will take as much as 8 seconds. 8 Seconds isn't the end of the world, but for the sake of scalability, I will need to have it complete faster. Here is the isolated subquery:
(SELECT IsNull(Sum(A.TransAmount) + Sum(Case When A.BetResult = 1 Then (A.BetWinAmount + (A.TransAmount * -1)) End), 0)
FROM User_T A
LEFT OUTER JOIN User_TD B on A.TID = B.TID
LEFT OUTER JOIN Lines_BL C ON B.LID = C.LID
LEFT OUTER JOIN Lines_BM D ON C.BMID = D.BMID
LEFT OUTER JOIN Event_M E ON D.EID = E.EID
LEFT OUTER JOIN Event_KB F ON A.TransReason = F.BID
LEFT OUTER JOIN Event_M G ON F.BID = G.EID
where A.UserID = U.UserID AND (A.IsSettled = 1)
AND
(
(A.TransReason = 1 AND (datediff(dd, Convert(datetime, E.EDate, 101), Convert(datetime, @EndDate, 101)) = @DaysAgo)) OR
(A.TransReason >= 3000 AND (datediff(dd, Convert(datetime, G.EDate, 101), Convert(datetime, @EndDate, 101)) = @DaysAgo)
AND [dbo].[Event_CEAFKBID](A.TransReason) = 1) OR
(A.TransReason BETWEEN 3 and 150 AND (datediff(dd, Convert(datetime, A.TransDT, 101), Convert(datetime, @EndDate, 101)) = @DaysAgo))
)
What I have done to further isolate: When I run a Select * on just the joins (without the where clauses), the performance in very good - > 100000 rows in under a second. As I add in the where clauses, I believe the great slow down is from the 'or' clause and/or the function that needs to be evaluated.
As I understand it, a function inside the where clause evaluates each row - as opposed to somehow caching the definition of the function and evaluating that way. I do have indexes on the tables, but I am wondering if some of them are not correct.
Without you knowing the full database structure, I am sure it's very difficult to pin down where the problem is, but I would like to get pointed in a direction to begin to further isolate.
I suspect your biggest performance hits are from the correlated subquery (whatever table is behind U.UserId) and from the embedded function call dbo.Event_CEAFKBID. Much of course depends upon how big the tables are (how many rows are being read). All those datetime conversions won’t help and generate a very strong “bad design” smell, but I don’t think they’d impact performance too much.
Those left outer joins are ugly, as the optimizer has to check them all for row – so if “A” is big, all the joins on all the rows have to be performed, even if there’s no data there. If they can be replaced with inner joins, do so, but I’m guessing not because of that “table E or table G” logic. Lesses, it sure looks like what you’ve got is three separate queries moshed into one; if you broke it out into three, unioned together, it’d look something like the Frankenstein query below. I’ve no idea if this would run faster or not (heck, I can’t even debug the query and make sure the panetheses balance), but if you’ve got sparse data relative to your logic this should run pretty fast. (I took out the date conversions to make the code more legible, you’d have to plug them back in.)
SELECT isnull(sum(Total), 0) FinalTotal from (
SELECT
sum(A.TransAmount + Case When A.BetResult = 1 Then A.BetWinAmount - A.TransAmount else 0 End) Total
FROM User_T A
INNER JOIN User_TD B on A.TID = B.TID
INNER JOIN Lines_BL C ON B.LID = C.LID
INNER JOIN Lines_BM D ON C.BMID = D.BMID
INNER JOIN Event_M E ON D.EID = E.EID
where A.UserID = U.UserID
AND A.IsSettled = 1
AND A.TransReason = 1
AND (datediff(dd, E.EDate, @EndDate) = @DaysAgo))
UNION ALL SELECT
sum(A.TransAmount + Case When A.BetResult = 1 Then A.BetWinAmount - A.TransAmount else 0 End) Total
FROM User_T A
INNER JOIN Event_KB F ON A.TransReason = F.BID
INNER JOIN Event_M G ON F.BID = G.EID
where A.UserID = U.UserID
AND A.IsSettled = 1
AND A.TransReason >= 3000
AND (datediff(dd, G.EDate, @EndDate) = @DaysAgo)
AND [dbo].[Event_CEAFKBID](A.TransReason) = 1
UNION ALL SELECT
sum(A.TransAmount + Case When A.BetResult = 1 Then A.BetWinAmount - A.TransAmount else 0 End) Total
FROM User_T A
where A.UserID = U.UserID
AND A.IsSettled = 1
AND A.TransReason BETWEEN 3 and 150
AND datediff(dd, A.TransDT, @EndDate) = @DaysAgo)
) ThreeWayUnion
You can put the case in the where cause, and not directly on select first line. why you need to put many join if in this statment you just user the tables A,E and G?
To performance better queries you can use execution plan on management Studio.
Correlated subqueries are a very poor programming technique which equates to using a cursor in the query. Make it a derived table instead.
And yes those functions are slowing you down. If you have to convert to datetime, your database structure needs to be fixed and the data stored correctly as datetime.
Do you need to do the conversions on the datetime for the DATEDIFF
functions? Are you storing the dates as test, or are you reconverting to get rid of the time? If you are, then you don't need to as days different will be correct including time.
You should review whether the outer joins are necessary - they are more expensive than inner joins. You have some values that come from the dominant table, tagged A. You also have an OR condition that references E, and an OR condition that references G. I'd look to restructure the query along the lines of:
SELECT SUM(x.result)
FROM (SELECT A.TransAmount + CASE WHEN A.BetResult = 1
THEN (A.BetWinAmount + (A.TransAmount * -1))
ELSE 0 END AS result
FROM A
WHERE A.TransReason BETWEEN 3 AND 150
AND datediff(dd, Convert(datetime, A.TransDT, 101),
Convert(datetime, @EndDate, 101)) = @DaysAgo
AND A.UserID = U.UserID -- Where does alias U come from?
AND A.IsSettled = 1
UNION
SELECT A.TransAmount + CASE WHEN A.BetResult = 1
THEN (A.BetWinAmount + (A.TransAmount * -1))
ELSE 0 END AS result
FROM User_T A
JOIN User_TD B ON A.TID = B.TID
JOIN Lines_BL C ON B.LID = C.LID
JOIN Lines_BM D ON C.BMID = D.BMID
JOIN Event_M E ON D.EID = E.EID
WHERE A.TransReason = 1
AND datediff(dd, Convert(datetime, E.EDate, 101),
Convert(datetime, @EndDate, 101)) = @DaysAgo
AND A.UserID = U.UserID -- Where does alias U come from?
AND A.IsSettled = 1
UNION
SELECT A.TransAmount + CASE WHEN A.BetResult = 1
THEN (A.BetWinAmount + (A.TransAmount * -1))
ELSE 0 END S result
FROM User_T A
JOIN User_TD B ON A.TID = B.TID
JOIN Lines_BL C ON B.LID = C.LID
JOIN Lines_BM D ON C.BMID = D.BMID
JOIN Event_M E ON D.EID = E.EID
JOIN Event_KB F ON A.TransReason = F.BID
JOIN Event_M G ON F.BID = G.EID
WHERE A.TransReason >= 3000
AND datediff(dd, Convert(datetime, G.EDate, 101),
Convert(datetime, @EndDate, 101)) = @DaysAgo
AND [dbo].[Event_CEAFKBID](A.TransReason) = 1
AND A.UserID = U.UserID -- Where does alias U come from?
AND A.IsSettled = 1
) AS x
The thinking here is that the inner join queries will each be quicker than the outer join queries, and summing intermediate results is not a hardship to the DBMS (it was doing that anyway). It probably also avoids the need for IFNULL
.
The alias U is, presumably, a reference to the outer query of which this is a part.
精彩评论