
Defining indexes: Which Columns, and Performance Impact?

I know how to use indexes (clustered and non-clustered), but when should I use non-clustered indexes in my tables? What scenarios call for making a column a non-clustered index? I have gone through the MSDN guidelines but am still a little confused.

Should I make only unique columns NC, or should other columns be NC as well?

If I overload my table with NC indexes, will that also decrease my performance?

Should I use a composite non-clustered index on columns that are foreign keys?

I know the primary key should be clustered and unique keys should be NC, but what about foreign keys?


The clustered index defines your table's physical structure (to a certain degree) - e.g. it defines in what order the data is ordered. Think of the phonebook, which is "clustered" by (LastName,FirstName) - at least in most countries it is.

You only get one clustered index per table - so choose it wisely! According to the gospel of the Queen of Indexing, Kimberly Tripp, the clustering key should be narrow, stable (never change), unique (yes!) and ideally ever-increasing.

It should be narrow, because the clustering key will be added to each and every entry of each and every non-clustered index - after all, the clustering key is the value used to ultimately find the actual data.

It should be stable, since constantly updating lots of index values is a costly affair - especially since the clustering key would have to be updated in all non-clustered indices as well.

It needs to be unique, since again - it's ultimately the value used to locate the actual data. If you choose a column that is not guaranteed to be unique, SQL Server will "uniquify" your clustering key by adding a 4-byte value to it - not a good thing.

And ideally, the clustering key should be ever-increasing since that causes the least page and index fragmentation and thus is best for performance.

The ideal candidate for a clustering key would be an INT (or BIGINT) IDENTITY - it ideally fulfills all those requirements.
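A minimal sketch of such a clustering key (the table and column names here are illustrative, not from the question):

```sql
-- IDENTITY column as the clustering key: narrow (4 bytes), stable,
-- unique, and ever-increasing, so new rows always append at the end
-- of the table rather than fragmenting existing pages.
CREATE TABLE dbo.Customers
(
    CustomerId  INT IDENTITY(1,1) NOT NULL,
    LastName    NVARCHAR(100)     NOT NULL,
    FirstName   NVARCHAR(100)     NOT NULL,
    CONSTRAINT PK_Customers PRIMARY KEY CLUSTERED (CustomerId)
);
```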

As for non-clustered indices - use and choose them wisely! There's only one general rule I can give you: all columns that are part of a foreign key (referencing another table) should be in an index - SQL Server will not (contrary to popular belief and lots of myths) put such an index in place automatically - never has, never does.
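To illustrate that rule (again with hypothetical table names): declaring the foreign key constraint does not create an index, so you add one explicitly.

```sql
-- Declaring the FK constraint does NOT create an index on CustomerId;
-- SQL Server never does that automatically.
CREATE TABLE dbo.Orders
(
    OrderId     INT IDENTITY(1,1) NOT NULL,
    CustomerId  INT               NOT NULL,
    OrderDate   DATETIME2         NOT NULL,
    CONSTRAINT PK_Orders PRIMARY KEY CLUSTERED (OrderId),
    CONSTRAINT FK_Orders_Customers FOREIGN KEY (CustomerId)
        REFERENCES dbo.Customers (CustomerId)
);

-- Create the index yourself: it speeds up joins to the parent table
-- and the checks SQL Server runs when parent rows are deleted.
CREATE NONCLUSTERED INDEX IX_Orders_CustomerId
    ON dbo.Orders (CustomerId);
```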

Other than that - you need to watch your system and see what kind of queries you have. All columns that show up in a WHERE or ORDER BY clause are potential candidates to be indexed - but too many indices isn't a good thing either.


You can only have one clustered index per table. It doesn't have to be the primary key, but in most cases it will be.

Beyond that - it really depends on the queries & the tipping point for what indexes will be used. But defining indexes also means there will be an impact to DML - inserts, updates & deletes will take a slight performance hit.

Should I use composite non clustered index(es) on columns that are foreign keys?

Doesn't matter what the column is, it's the usage that matters for the optimizer to determine what index, clustered or otherwise, to use.


Yes, you can overload your tables with too many indexes. In general, every additional index costs performance time in terms of index maintenance. Tables that are heavily updated should generally have fewer indexes.

Another broad rule (from Richard Campbell, on RunAs Radio and DotNetRocks), is that a few broad indexes will perform better than a larger number of narrow indexes. A broad index will cover a wider range of queries, and there's less for the query optimizer to investigate. Remember that the query optimizer has a limited time to run.
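A rough sketch of that rule, using hypothetical tables and columns: a single broader composite index can serve several query shapes via its leftmost key prefix, where two separate narrow indexes would each serve only one.

```sql
-- One broader index...
CREATE NONCLUSTERED INDEX IX_Orders_CustomerId_OrderDate
    ON dbo.Orders (CustomerId, OrderDate);

-- ...can support both of these queries (leftmost-prefix rule):
--   SELECT ... FROM dbo.Orders WHERE CustomerId = @c;
--   SELECT ... FROM dbo.Orders WHERE CustomerId = @c
--                                AND OrderDate >= @d;
-- whereas separate single-column indexes on CustomerId and OrderDate
-- would each help only one of them, and give the optimizer more
-- alternatives to weigh.
```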

Investigate SQL Server Profiler and the Database Engine Tuning Advisor (formerly the stand-alone Index Tuning Wizard). These tools can analyze workloads and make indexing recommendations, which will usually be better choices than indexes picked "intuitively."


If you have queries that reference columns that are not in your index, the SQL Server engine will have to perform a lookup against the actual table to fetch the non-included columns.

If you run such queries often, you should create non-clustered indexes that "cover" the query by including all the referenced columns in the index - non-unique columns included.
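A sketch of such a covering index (table and column names, including TotalAmount, are hypothetical):

```sql
-- The WHERE column is the index key; the other referenced columns
-- ride along as non-key INCLUDEd columns, so the query below can be
-- answered from the index alone - no lookup into the base table.
CREATE NONCLUSTERED INDEX IX_Orders_OrderDate_Covering
    ON dbo.Orders (OrderDate)
    INCLUDE (CustomerId, TotalAmount);

-- Covered query:
--   SELECT CustomerId, TotalAmount
--   FROM dbo.Orders
--   WHERE OrderDate >= @d;
```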

Adding indexes to a table always decreases write performance, since the index will have to be updated every time the table is updated.


What fields are you doing lookups on? Searching? Etc. Determine what fields you are using when running your queries (WHERE clause) and they could possibly be good candidates.

For instance, think of a library. The book catalog has a clustered index on the ISBN number and non-clustered indexes on, say, publishing year, etc.

Also what helped me is something that Bart Duncan posted a long time ago. He deserves the credit for this.

The article was entitled "Are you using SQL's Missing Index DMV's?". Look it up and run this query:

SELECT
  migs.avg_total_user_cost * (migs.avg_user_impact / 100.0) * (migs.user_seeks + migs.user_scans) AS improvement_measure,
  'CREATE INDEX [missing_index_' + CONVERT (varchar, mig.index_group_handle) + '_' + CONVERT (varchar, mid.index_handle)
  + '_' + LEFT (PARSENAME(mid.statement, 1), 32) + ']'
  + ' ON ' + mid.statement
  + ' (' + ISNULL (mid.equality_columns, '')
    + CASE WHEN mid.equality_columns IS NOT NULL AND mid.inequality_columns IS NOT NULL THEN ',' ELSE '' END
    + ISNULL (mid.inequality_columns, '')
  + ')'
  + ISNULL (' INCLUDE (' + mid.included_columns + ')', '') AS create_index_statement,
  migs.*, mid.database_id, mid.[object_id]
FROM sys.dm_db_missing_index_groups mig
INNER JOIN sys.dm_db_missing_index_group_stats migs ON migs.group_handle = mig.index_group_handle
INNER JOIN sys.dm_db_missing_index_details mid ON mig.index_handle = mid.index_handle
WHERE migs.avg_total_user_cost * (migs.avg_user_impact / 100.0) * (migs.user_seeks + migs.user_scans) > 10
ORDER BY migs.avg_total_user_cost * migs.avg_user_impact * (migs.user_seeks + migs.user_scans) DESC

It is not the ultimate solution, but it will help you identify some candidate indexes. The link to the article: http://blogs.msdn.com/bartd/archive/2007/07/19/are-you-using-sql-s-missing-index-dmvs.aspx. By default, when you create a PK in SQL Server it becomes the clustered index. It doesn't have to be, but it generally is.


Whether or not you should create a clustered index depends on your workload (usually dominated by the number and kind of SELECT statements hitting your table).

A clustered index will force the disk storage order of the rows to be according to the clustered index values. (For this reason, there can be only 1 clustered index per table, as rows are stored on disk only once) This makes sense if most of your queries are always demanding a group of related rows.

Example: suppose you are storing CustomerOrders, and you frequently want to know the number of CustomerOrders (regardless of the customer) in a certain time period. In this case it may be useful to create a clustered index with OrderDate as the first column. If, on the other hand, you frequently look up all CustomerOrders with the same CustomerId, it makes more sense to put CustomerId as the first column of your clustered index.
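The two alternatives might look like this (column names beyond those in the example are illustrative; remember only one of these can exist on the table at a time):

```sql
-- Date-range reporting dominates: cluster on OrderDate first, so all
-- orders in a period sit on physically adjacent pages.
CREATE CLUSTERED INDEX CIX_CustomerOrders_OrderDate
    ON dbo.CustomerOrders (OrderDate, OrderId);

-- Per-customer lookups dominate: cluster on CustomerId first instead,
-- so each customer's orders are stored together.
-- (Shown commented out, since a table allows only one clustered index.)
-- CREATE CLUSTERED INDEX CIX_CustomerOrders_CustomerId
--     ON dbo.CustomerOrders (CustomerId, OrderId);
```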

The disadvantage of clustered indexes lies not in the clustered index itself, but in the secondary indexes: secondary indexes are themselves not clustered (by definition, since the rows can only be stored once, in the order of the clustered index), and their index entries point to the entries of the clustered index. So retrieving a row via a secondary index requires two read operations: one of the secondary index, and then one of the clustered index it is pointing to.
