开发者

What does `INCLUDE` do in an index?

What does INCLUDE in an unclustered index?

CREATE NONCLUSTERED INDEX [MyIndex] ON [dbo].[Individual] 
(
开发者_如何学编程    [IndivID] ASC
)
INCLUDE ( [LastName], [FirstName]) 
WITH (SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF) ON [PRIMARY]

I know the first part is used for the WHERE clause, but what do the INCLUDE columns do? What's the benefit of having them "added to the leaf level of the nonclustered index"?

edit Also if i already have a clustered PK index for IndivID, why does Tuning Advisor recommend this index?


INCLUDE columns include the associated fields WITH the index. They are not used FOR indexing, but they are placed in the leaf node of the B-tree that makes up the index.

In essence: The index is still ON [IndivID] and [IndivID] alone. However, if your query only needs a subset of [IndivID], [LastName] and [FirstName], SQL doesn't need to go back to the table after it's found the [IndivID] it's searching for in the Index.

SEE: Covering Index

EDIT: B-tree assumes MS SQL Server. I'm not positive other implementations use the same data structure

Tuning Advisor (Speculation):: A clustered index places the entire data row at the leaf node of the index's B-tree, and this takes up a lot of space. If the Tuning Advisor sees that you're never accessing more than those three fields ([IndivID] + INCLUDEs), it will attempt to save you space (and insert/update time) by downgrading it to a non-clustered index with the only "important" fields present.


INCLUDE adds those fields at the leaf-level of the index. Basically the bt-ree is not sorted by those fields, but once the index finds the row with the indexed field(s) it's looking for, it also has the other fields immediately.

If you use the phone book analogy, the INCLUDED fields in the phone book index (which is sorted by Lastname, Firstname) would be Phone Number and Address - you can't look up a person by those fields but once you have their name you can find them.

CLUSTERED indexes have all fields included already by design, so INCLUDE is invalid in a CLUSTER. You also shouldn't bother INCLUDEing the clustered field in a non-clustered index since it is already implicitly there as the row key.

I most often use the INCLUDE fields for aggregation. For instance, if I have an index on CalendarDate and CustomerID I can include PaidAmt and get

MAX(PAidAmt) Where CustomerId = x AND CalendarDate = 1/1/2011

At the most basic level they are used to avoid a bookmark or key lookup.


That is data that is included as payload in the index. It won't be used to filter, but it can be returned.

If you for example have a query that filters on age and return name:

select name
from persons
where age = 42

Then you could create an index for the age field, with the name field included. That way the database could use only the index to run the entire query, and doesn't have to read anything at all from the actual table.


From MSDN - CREATE INDEX (Transact-SQL):

INCLUDE (column [ ,... n ] )

Specifies the non-key columns to be added to the leaf level of the nonclustered index.

Meaning, you can add more columns to the unclustered index - if you are returning several fields every time you query on the key column, adding them to the index will improve performance as they are stored with it, aka a covering index.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜