SQL statistics on large databases
What is a good way to handle statistics on SQL 2008 for very large databases? Multiple tables with 100m+ rows in each.
Should auto update statistics be on? Will auto update statistics async help at all? should a job be setup to manually update statistics on some kind of schedule?
Usually data is added to the table but older data isn't changed very often.
Update: About 100k ro开发者_C百科ws inserted each hour. Mostly reporting is done on the data. Updates can happen on 1-2 columns on ~500k rows per day.
For one I would not want update stats to run in the middle of the day on a large table, so I would say no. Also you need to hit the threshold (20% I believe) before it kicks in anyway
Now if you have a job already that rebuilds the index then stats are updated automatically (this is not true with a reorg/defrag)
Also 100 million rows doesn't mean much, how many columns if the table is 12 bytes wide (per row) compared to 4100 bytes that is a big difference (especially since with the 4100 bytes per row table you can only fir 1 row per page)
What is a good way to handle statistics on SQL 2008 for very large databases? Multiple tables with 100m+ rows in each.
PLEASE dont call this very large. I give you an example of very large. We just run a sql statement on some data in our warehouse. Temp space usage tops at 180gb. For that statement. Db? two digits terabytes. 100m+ rows are not small, but not very large.
Should auto update statistics be on? Will auto update statistics async help at all? should a job be setup to manually update statistics on some kind of schedule?
Depend. On update and usage patterns.
Usually data is added to the table but older data isn't changed very often.
How often? How much in percentage? What data? Do the statistics get out of scope fast or slowly move? YOu ahve to provide a LOT more information to make sensible suggestions.
Should auto update statistics be on?
It Depends...
Will auto update statistics async help at all?
It will help prevent a stats update that takes a long time from killing a query. Basically this tells SQL Server that if a query comes in and it realizes stats are outdated instead of holding the query, updating the stats, then running the query. just run the query and update the stats behind the scenes. So that particular query that kicked off the need for a stats update won't get any benefits but it also won't sit around waiting for the stats to update first either.
should a job be setup to manually update statistics on some kind of schedule?
Yes! Stats are only updated if 20% of a tables data has been "changed". On very big tables that can basically be the same as saying stats will never be updated. If you have any large tables where new data is being added you should always have a scheduled process to update stats on them.
"It depends" is a good answer but in the absence of reproducable and measurable improvement I'd leave it at the default.
If you update stats manually overnight then you have less chance of an auto-update kicking in. And you can defer a stats update by setting AUTO_UPDATE_STATISTICS_ASYNC (See "When to Use Synchronous or Asynchronous Statistics Updates")
On balance I wouldn't disable it or change the default which is "on".
精彩评论