Index view: How to choose the Clustered Index?
I'm going to do an indexed view, based on three tables with inner and outer joins between them (SQL Server 2005). I will run all kind of queries against this view. So, I wonder what is the best way to choose which index to be clustered. What are the criteria or is there any tools to help me around.
(Sorry if my question is dull, I don't have a lot of experience in designing databases).
Thanks in advance!
EDIT: I should make clarification here, that the tables I use in the view are with very intense use and any overhead I tak开发者_JS百科e for maintenance of the indexes, should be paid-off.
Since it's an index, you have to pick a column (or set of columns) which is guaranteed to be non-null and unique in all cases. That's the biggest and most stringent criteria - anything that might be NULL or duplicate is out of the question right from the get-go.
Depending on the type of queries you'll be running on this indexed view, you might also want to see if you have any columns (e.g. a DATE or something) that you'll be running range queries against. That might make an interesting candidate for a clustering key.
But the main thing is: your clustering key must be unique and non-null in any circumstance. And in my personal experience, to reduce index size (and thus increase the number of entries per page), I'd try to use as small a key as possible - a single INT is best, or a combination of two INTs - or possibly a GUID - but don't use VARCHAR(500) fields in your clustering key!
UPDATE: to all those poster who keep telling us clustered indexes don't need to be unique - check out what the "Queen of Indexing", Kimberly Tripp, has to say on the topic:
Let's start with the key things that I look for in a clustering key:
* Unique * Narrow * Static
Why Unique?
A clustering key should be unique because a clustering key (when one exists) is used as the lookup key from all non-clustered indexes. Take for example an index in the back of a book - if you need to find the data that an index entry points to - that entry (the index entry) must be unique otherwise, which index entry would be the one you're looking for? So, when you create the clustered index - it must be unique. But, SQL Server doesn't require that your clustering key is created on a unique column. You can create it on any column(s) you'd like. Internally, if the clustering key is not unique then SQL Server will “uniquify” it by adding a 4-byte integer to the data. So if the clustered index is created on something which is not unique then not only is there additional overhead at index creation, there's wasted disk space, additional costs on INSERTs and UPDATEs, and in SQL Server 2000, there's an added cost on a clustereD index rebuild (which because of the poor choice for the clustering key is now more likely).
Source: http://www.sqlskills.com/blogs/kimberly/post/Ever-increasing-clustering-key-the-Clustered-Index-Debateagain!.aspx
The thumb rule: Select the columns which are you are probably going to use MOST in your queries as WHERE, GROUP etc. Those columns could be a good candidate for non-clustered indexes. Select a column (or a group of column) which would probably make your row unique, and that could be a good candidate for clustered index.
As mentioned by marc, a clustered index imposes a unique constraint, so it definately needed that the column you selct should not have any null and duplicate.
A clustered index does not have to be unique. The columns in it can even be nullable. For example, this will run without an error:
create table #test (col1 int identity, col2 int)
create clustered index ix_test on #test (col2)
insert into #test (col2) values (1)
insert into #test (col2) values (1) -- Duplicate in clustered index
insert into #test (col2) values (null)
A clustered index is part of the table structure on disk. As such, a clustered index uses no additional disk space.
By default, SQL Server clusters on the primary key, which is usually a good choice. You can change that if you have intensive queries with a lot of table lookups. Changing which index is clustered can eliminate table lookups.
精彩评论