
How should extremely large tables in SQL Server be indexed?

We have a large table (450 million rows, 34 columns of numeric or datetime data) that is queried in around a dozen common ways. The table currently has 17 indexes. I have no authority to change the structure of this table, but I am able to propose indexing strategies.

The first problem I see is that there is no clustered index, despite the fact that the table has a unique key composed of 2 columns. I was thinking I could change that to be clustered and then deal with the other indexes. Since there are around a dozen common ways to query the table, I was thinking that adding an index for each query method would be a good thing. So if one of the common ways to query the table was by CustomerId, I would add an index on CustomerId. That would be a non-clustered index, though, and would still be fairly inefficient, right? What if I made that index contain CustomerId and the 2 columns within the clustered index? Would that make SQL Server more efficient in its execution plans, or is that a useless exercise?
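
For concreteness, here is roughly what I have in mind, with made-up table and column names (the real table has 34 columns):

    -- Make the existing 2-column unique key the clustered index:
    CREATE UNIQUE CLUSTERED INDEX CIX_BigTable_Key
        ON dbo.BigTable (KeyColA, KeyColB);

    -- Then one non-clustered index per common query path, e.g. CustomerId.
    -- Is there any point in also carrying the clustered key columns?
    CREATE NONCLUSTERED INDEX IX_BigTable_CustomerId
        ON dbo.BigTable (CustomerId)
        INCLUDE (KeyColA, KeyColB);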


I would think that the best tactic is always to start by running SQL Server Profiler against your database for a period of time. Once you have a decent trace stored to a file or a dedicated trace table, you can run the SQL Server Database Tuning Advisor to get real statistics and index recommendations based on the actual use of your database, rather than assumptions about how you perceive the lookup behaviour on your db.

It may actually be the case that certain expensive queries against your tables are completely bypassing your existing indexes without your being aware of it. The tool will help you track down the best possible combinations.

Here is an example of this in practice:

Using the Database Tuning Advisor
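
If you want a quick first pass before setting up a full trace, the missing-index DMVs can surface queries that are bypassing your existing indexes. A rough sketch (the DMV names are real; the TOP cutoff is arbitrary):

    -- Missing-index suggestions ranked by estimated impact.
    -- Treat these as leads to investigate, not indexes to create blindly.
    SELECT TOP (20)
           d.[statement] AS table_name,
           d.equality_columns,
           d.inequality_columns,
           d.included_columns,
           s.user_seeks,
           s.avg_total_user_cost * s.avg_user_impact * s.user_seeks AS est_impact
    FROM sys.dm_db_missing_index_details AS d
    JOIN sys.dm_db_missing_index_groups AS g
         ON g.index_handle = d.index_handle
    JOIN sys.dm_db_missing_index_group_stats AS s
         ON s.group_handle = g.index_group_handle
    ORDER BY est_impact DESC;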


Indexes are for efficient data retrieval.

You should review the queries that are run against the large table and figure out which columns get used most.

Here are some rules of thumb for indexes (a sketch with made-up names follows the list):

  1. Primary keys: these usually form the clustered index
  2. Foreign keys: columns that are used in joins; this can be one index per column or a composite index, depending on your needs
  3. Columns that are frequently used in WHERE clauses
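
For example (made-up names, one statement per rule above):

    -- 1. Primary key as the clustered index
    ALTER TABLE dbo.Orders
        ADD CONSTRAINT PK_Orders PRIMARY KEY CLUSTERED (OrderId);

    -- 2. Foreign key column used in joins
    CREATE NONCLUSTERED INDEX IX_Orders_CustomerId
        ON dbo.Orders (CustomerId);

    -- 3. Column frequently filtered in WHERE clauses
    CREATE NONCLUSTERED INDEX IX_Orders_Status
        ON dbo.Orders (OrderStatus);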

In a warehouse environment, datetime columns are good candidates for the clustered index because they are used very frequently in WHERE clauses.
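
For example (made-up names):

    -- Fact table clustered on its datetime column, so date-range
    -- WHERE clauses scan one contiguous range of pages:
    CREATE CLUSTERED INDEX CIX_FactSales_SaleDate
        ON dbo.FactSales (SaleDate);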

So how do you figure all this out?

Run SQL Server Profiler. This will help you find the queries that are running against your table. Then figure out which ones use the most resources in a given time period by reviewing how often each runs and what it costs (a DMV-based alternative is sketched after the list). Follow one of two paths to better indexing:

  • Review the query execution plan to determine the index needed
  • Use the Index Tuning Wizard to help out
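
If you cannot run Profiler, the plan cache gives a rough equivalent of "times run and cost". A sketch using sys.dm_exec_query_stats (a real DMV; the TOP cutoff is arbitrary):

    -- Most expensive cached statements by total CPU, with execution counts.
    SELECT TOP (20)
           qs.execution_count,
           qs.total_worker_time  / 1000 AS total_cpu_ms,
           qs.total_elapsed_time / 1000 AS total_elapsed_ms,
           SUBSTRING(st.text, qs.statement_start_offset / 2 + 1,
                     (CASE qs.statement_end_offset
                          WHEN -1 THEN DATALENGTH(st.text)
                          ELSE qs.statement_end_offset
                      END - qs.statement_start_offset) / 2 + 1) AS statement_text
    FROM sys.dm_exec_query_stats AS qs
    CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
    ORDER BY qs.total_worker_time DESC;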


A clustered index has advantages for range queries (WHERE KeyColumn BETWEEN ... AND ...).
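
For example (made-up names), with the table clustered on OrderDate, the following reads one contiguous range of clustered index pages:

    SELECT OrderId, CustomerId, TotalDue
    FROM dbo.Orders
    WHERE OrderDate BETWEEN '2010-01-01' AND '2010-01-31';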

In your CustomerId example there is absolutely no gain in adding the clustered key columns. A non-clustered index already stores the clustered index keys as its row locators.
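
To make that concrete (made-up names): because the row locator of every non-clustered index on a clustered table is the clustered key itself, the plain index below already carries KeyColA and KeyColB:

    CREATE NONCLUSTERED INDEX IX_BigTable_CustomerId
        ON dbo.BigTable (CustomerId);

    -- Satisfied by a seek on the index above alone; no INCLUDE needed,
    -- because the clustered keys are stored implicitly as row locators:
    SELECT KeyColA, KeyColB
    FROM dbo.BigTable
    WHERE CustomerId = 42;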

Actually, your question does not contain enough information to base good advice on. You would be better off starting with profiling to find the actual bottlenecks.


Only change to a clustered index if the data is INSERTed sequentially with respect to the clustered columns. If the columns you cluster on are not unique, SQL Server adds a hidden 4-byte uniqueifier to rows with duplicate key values, so be sure they are unique.
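
A minimal sketch of both cases (made-up names; these are alternatives, since a table can have only one clustered index):

    -- Unique clustered key: no uniqueifier needed.
    CREATE UNIQUE CLUSTERED INDEX CIX_BigTable_Key
        ON dbo.BigTable (KeyColA, KeyColB);

    -- Non-unique clustered key: SQL Server silently appends a hidden
    -- 4-byte uniqueifier to rows whose key values are duplicated.
    CREATE CLUSTERED INDEX CIX_BigTable_InsertDate
        ON dbo.BigTable (InsertDate);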

Clustered Index Design Guidelines
