Need to Improve Performance of SQL Query using Aggregate Functions
I have a particular SQL query that seems to suffer from a mysterious performance issue. Here is the query:
SELECT COUNT(LengthOfTime) AS TotalTime,
       SUM(LengthOfTime) AS TotalLength,
       SUM(LengthOfTime) / COUNT(LengthOfTime) AS AverageTime,
       SUM(Pops) / COUNT(LengthOfTime) AS AveragePop
FROM ((SELECT *
       FROM (SELECT *, ID & YearRec AS ID2
             FROM MyFirstTable
             UNION ALL
             SELECT *, ID & YearRec AS ID2
             FROM Table2011) AS TEMP
       WHERE StartTime >= '8/1/2011 00:00:00'
         AND StartTime <= '8/5/2011 23:59:59') AS TEMP2
      JOIN AppleTable ON TEMP2.Reason = AppleTable.Skills)
      JOIN PeopleTable ON TEMP2.Operator = PeopleTable.Operators
WHERE AppleTable.[ON] = 1
  AND PeopleTable.[ON] = 1
  AND Rec_Type = 'SECRET AGENT'
The issue here is that this query runs very quickly (0:00 to 0:02) for a 5-day span, but very slowly (1:20 to 1:45) for a 6-day span.
There are approximately 105,000 records per day in each of the tables (MyFirstTable and Table2011).
My question: Is there an upper limit to the number of rows you can pass to an aggregate function before you see a serious performance issue in SQL Server? (currently using 2008 R2)
Short answer: No, there's no magic number of records that will cause MSSQL to start performing poorly.
That said, a query may not scale well, in which case performance can degrade dramatically (even exponentially) as the dataset grows.
A large problem you have here is that you're filtering on StartTime after the UNION. Instead, apply that predicate in both of your SELECTs before the UNION. That should make a huge difference, especially if you index both tables on StartTime (generating index seeks on those tables):
SELECT * FROM (
    SELECT *, ID & YearRec AS ID2
    FROM MyFirstTable
    WHERE StartTime >= '8/1/2011 00:00:00'
      AND StartTime <= '8/5/2011 23:59:59'
    UNION ALL
    SELECT *, ID & YearRec AS ID2
    FROM Table2011
    WHERE StartTime >= '8/1/2011 00:00:00'
      AND StartTime <= '8/5/2011 23:59:59'
) AS TEMP
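Since the rewrite above only pays off if the optimizer can seek on StartTime, indexing both tables on that column is worth trying if you haven't already. A minimal sketch; the index names and the dbo schema here are just placeholders:

CREATE NONCLUSTERED INDEX IX_MyFirstTable_StartTime
    ON dbo.MyFirstTable (StartTime);

CREATE NONCLUSTERED INDEX IX_Table2011_StartTime
    ON dbo.Table2011 (StartTime);

Note that SELECT * keeps these from being covering indexes, so the plan will still need lookups for the other columns; narrowing the column list to just what the aggregates need would help further.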
You may be able to do some additional refactoring of your code as well.
No, there is no pre-defined upper limit for aggregate functions.
The skew in performance is likely caused by one or more of the following:
- An old and/or unsuitable index structure
- A cached execution plan (see the cache-clearing sketch after this list)
- Cached data (likewise testable with the sketch below)
- Non-uniform data size (say, the first five days hold 10 rows each while the sixth holds 100 billion)
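To rule out the two cache-related causes while testing, you can flush the caches between runs so both date ranges start cold. These commands affect the whole instance, so only do this on a test server; a sketch:

CHECKPOINT;               -- write dirty pages to disk first
DBCC DROPCLEANBUFFERS;    -- clear cached data pages (forces cold-cache reads)
DBCC FREEPROCCACHE;       -- clear cached plans (forces fresh compilation)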
You can run the query in SSMS and view the actual execution plan. This will tell you the places where the cost of running the query is the highest, and that will help you determine the best course of action.
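Alongside the actual plan, it can help to compare the raw I/O and timing numbers for the 5-day and 6-day runs:

SET STATISTICS IO ON;     -- report logical/physical reads per table
SET STATISTICS TIME ON;   -- report CPU and elapsed time per statement

-- ... run the 5-day query, then the 6-day query ...

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;

A big jump in physical reads or a scan replacing a seek between the two runs points you straight at the problem table.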
Edit based on comments:
If there isn't an index on Table2011 that contains [STARTTIME], then create one. If there is an index but it is getting ignored, then you have to figure out why. If the index is fragmented, then rebuilding it will definitely help. Here is how to rebuild it:
ALTER INDEX [YourIndexName] ON [dbo].[Table2011] REBUILD WITH (STATISTICS_NORECOMPUTE = ON);
Alternately you can do this in SSMS - browse to the specific index in the object browser, right click and rebuild.
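Before rebuilding, you can check whether fragmentation is actually the problem by querying the physical stats DMV; a sketch, assuming the dbo schema:

SELECT i.name AS IndexName,
       ps.avg_fragmentation_in_percent,
       ps.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.Table2011'),
                                    NULL, NULL, 'LIMITED') AS ps
JOIN sys.indexes AS i
  ON i.object_id = ps.object_id
 AND i.index_id = ps.index_id;

A common rule of thumb is to reorganize an index at roughly 5-30% fragmentation and rebuild above that.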