
Massive table in SQL 2005 database needs better performance!

I am working on a data-driven web application that uses a SQL Server 2005 (Standard Edition) database.

One of the tables is rather large (8 million+ rows, with about 30 columns). The size of the table obviously affects the performance of the website, which selects items from the table through stored procs. The table is indexed, but performance is still poor due to the sheer number of rows. This is part of the problem: the table is read about as often as it is updated, so we can't add or remove indexes without making one of the operations worse.

The goal I have here is to increase the performance when selecting items from the table. The table has 'current' data and old, barely touched data. The most effective solution we can think of at this stage is to separate the table into two, i.e., one for old items (before a certain date, say 1 Jan 2005) and one for newer items (on or after 1 Jan 2005).

We know of things like Distributed Partitioned Views - but all of these features require Enterprise Edition, which the client will not buy (and no, throwing hardware at it isn't going to happen either).


You can always roll your own "poor man's partitioning / DPV," even if it doesn't smell like the right way to do it. This is just a broad conceptual approach:

  1. Create a new table for the current year's data - same structure, same indexes. Adjust the stored procedure that writes to the main, big table so that it (temporarily) writes to both tables. I recommend making the logic in the stored procedure say IF CURRENT_TIMESTAMP >= '[some whole date without time]' - this will make it easy to backfill the rows in this table that pre-date the change to the procedure that starts logging there (see the first sketch after this list).

  2. Create a new table for each year in your history by using SELECT INTO from the main table. You can do this in a different database on the same instance to avoid the overhead in the current database. Historical data isn't going to change, I assume, so in this other database you could even make it read-only when it is done, which will dramatically improve read performance (second sketch below).

  3. Once you have a copy of the entire table, you can create views: one that references just the current year, another that references 2005 to the current year (by using UNION ALL between the current table and those tables in the other database that are >= 2005), and another that references all three sets of tables (those mentioned, plus the tables that pre-date 2005) - see the third sketch below. Of course you can break this up even more, but I just wanted to keep the concept minimal.

  4. Change your stored procedures that read the data to be "smarter" - if the date range requested falls within the current calendar year, use the smallest view, which is local-only; if the date range is >= 2005, use the second view; otherwise use the third view (fourth sketch below). You can follow similar logic with the stored procedures that write, if you are doing more than just inserting new data relevant only to the current year.

  5. At this point you should be able to stop inserting into the massive table and, once everything is proven to be working, drop it and reclaim some disk space (and by that I mean freeing up space in the data file(s) for reuse, not performing a shrink db - since you will use that space again).
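
For step 1, a minimal sketch, assuming a hypothetical big table dbo.Items with an ItemDate column and an existing write procedure dbo.Items_Insert (every name here is a placeholder, not your actual schema):

```sql
-- New table for the current year's data: same structure and indexes as
-- dbo.Items (in practice, script them out rather than retyping them).
CREATE TABLE dbo.Items_Current
(
    ItemID   INT          NOT NULL PRIMARY KEY,
    ItemDate DATETIME     NOT NULL,
    Payload  VARCHAR(100) NOT NULL
    -- ...the remaining ~30 columns...
);
GO

-- Temporarily write to both tables. Using a fixed cut-over date makes
-- it easy to backfill the rows that pre-date this change.
ALTER PROCEDURE dbo.Items_Insert
    @ItemID INT, @ItemDate DATETIME, @Payload VARCHAR(100)
AS
BEGIN
    INSERT dbo.Items (ItemID, ItemDate, Payload)
    VALUES (@ItemID, @ItemDate, @Payload);

    IF CURRENT_TIMESTAMP >= '20100101'  -- your cut-over date: a whole date, no time portion
        INSERT dbo.Items_Current (ItemID, ItemDate, Payload)
        VALUES (@ItemID, @ItemDate, @Payload);
END
GO
```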
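
Step 2 is a SELECT INTO per historical year; a sketch, assuming an archive database named ItemsArchive on the same instance:

```sql
-- Copy one historical year into the archive database. SELECT INTO
-- creates the table but not the indexes, so add those afterwards.
SELECT *
INTO   ItemsArchive.dbo.Items_2005
FROM   dbo.Items
WHERE  ItemDate >= '20050101' AND ItemDate < '20060101';
-- ...repeat for each year, plus one Items_Pre2005 table for the rest...

-- When the history is fully loaded, make the archive read-only, so
-- SQL Server no longer takes locks on reads.
ALTER DATABASE ItemsArchive SET READ_ONLY;
```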
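
The step 3 views are plain UNION ALL views over those pieces (same hypothetical names):

```sql
-- Narrowest view: the current year only, entirely local.
CREATE VIEW dbo.Items_ThisYear AS
    SELECT * FROM dbo.Items_Current;
GO

-- 2005 to the current year.
CREATE VIEW dbo.Items_2005_On AS
    SELECT * FROM dbo.Items_Current
    UNION ALL
    SELECT * FROM ItemsArchive.dbo.Items_2009
    -- ...one branch per archived year, down to...
    UNION ALL
    SELECT * FROM ItemsArchive.dbo.Items_2005;
GO

-- Everything, including the pre-2005 data.
CREATE VIEW dbo.Items_All AS
    SELECT * FROM dbo.Items_2005_On
    UNION ALL
    SELECT * FROM ItemsArchive.dbo.Items_Pre2005;
GO
```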
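
And the "smarter" read procedure from step 4 just routes each request to the narrowest view that covers it, along these lines:

```sql
CREATE PROCEDURE dbo.Items_GetByDateRange
    @From DATETIME, @To DATETIME
AS
BEGIN
    -- Start of the current calendar year (SQL Server 2005 has no DATE
    -- type, so build the boundary as a 'yyyy0101' string).
    DECLARE @YearStart DATETIME;
    SET @YearStart = CONVERT(CHAR(4), YEAR(CURRENT_TIMESTAMP)) + '0101';

    IF @From >= @YearStart
        SELECT * FROM dbo.Items_ThisYear WHERE ItemDate BETWEEN @From AND @To;
    ELSE IF @From >= '20050101'
        SELECT * FROM dbo.Items_2005_On  WHERE ItemDate BETWEEN @From AND @To;
    ELSE
        SELECT * FROM dbo.Items_All      WHERE ItemDate BETWEEN @From AND @To;
END
GO
```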

I don't have all of the details of your situation but please follow up if you have questions or concerns. I have used this approach in several migration projects including one that is going on right now.


> performance is still poor due to the sheer number of rows

8 million rows doesn't sound all that crazy. Did you check your query plans?

> the table is read about as often as it is updated

Are you actually updating an indexed column, or is the table equally read and inserted into?

> (and no, throwing hardware at it isn't going to happen either)

That's a pity because RAM is dirt cheap.


Rebuild all your indexes; this will boost query performance. Note that rebuilding the clustered index rebuilds the table data itself, while rebuilding non-clustered indexes touches only the index pages.
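
A minimal example, assuming a hypothetical table dbo.Items:

```sql
-- Rebuild every index on the table; this removes fragmentation and
-- refreshes the statistics on those indexes with a full scan.
ALTER INDEX ALL ON dbo.Items REBUILD;
-- ONLINE = ON would let readers and writers keep working during the
-- rebuild, but that option requires Enterprise Edition.
```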

Second, defragment the drive on which the database file(s) are stored.
