What does `INCLUDE` do in an index?
What does INCLUDE
in an unclustered index?
CREATE NONCLUSTERED INDEX [MyIndex] ON [dbo].[Individual]
(
开发者_如何学编程 [IndivID] ASC
)
INCLUDE ( [LastName], [FirstName])
WITH (SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF) ON [PRIMARY]
I know the first part is used for the WHERE
clause, but what do the INCLUDE
columns do? What's the benefit of having them "added to the leaf level of the nonclustered index"?
edit Also if i already have a clustered PK index for IndivID
, why does Tuning Advisor recommend this index?
INCLUDE columns include the associated fields WITH the index. They are not used FOR indexing, but they are placed in the leaf node of the B-tree that makes up the index.
In essence: The index is still ON [IndivID] and [IndivID] alone. However, if your query only needs a subset of [IndivID], [LastName] and [FirstName], SQL doesn't need to go back to the table after it's found the [IndivID] it's searching for in the Index.
SEE: Covering Index
EDIT: B-tree assumes MS SQL Server. I'm not positive other implementations use the same data structure
Tuning Advisor (Speculation):: A clustered index places the entire data row at the leaf node of the index's B-tree, and this takes up a lot of space. If the Tuning Advisor sees that you're never accessing more than those three fields ([IndivID] + INCLUDEs), it will attempt to save you space (and insert/update time) by downgrading it to a non-clustered index with the only "important" fields present.
INCLUDE
adds those fields at the leaf-level of the index. Basically the bt-ree is not sorted by those fields, but once the index finds the row with the indexed field(s) it's looking for, it also has the other fields immediately.
If you use the phone book analogy, the INCLUDED
fields in the phone book index (which is sorted by Lastname
, Firstname
) would be Phone Number
and Address
- you can't look up a person by those fields but once you have their name you can find them.
CLUSTERED
indexes have all fields included already by design, so INCLUDE
is invalid in a CLUSTER
. You also shouldn't bother INCLUDE
ing the clustered field in a non-clustered index since it is already implicitly there as the row key.
I most often use the INCLUDE
fields for aggregation. For instance, if I have an index on CalendarDate
and CustomerID
I can include PaidAmt
and get
MAX(PAidAmt) Where CustomerId = x AND CalendarDate = 1/1/2011
At the most basic level they are used to avoid a bookmark or key lookup.
That is data that is included as payload in the index. It won't be used to filter, but it can be returned.
If you for example have a query that filters on age and return name:
select name
from persons
where age = 42
Then you could create an index for the age
field, with the name
field included. That way the database could use only the index to run the entire query, and doesn't have to read anything at all from the actual table.
From MSDN - CREATE INDEX (Transact-SQL):
INCLUDE (column [ ,... n ] )
Specifies the non-key columns to be added to the leaf level of the nonclustered index.
Meaning, you can add more columns to the unclustered index - if you are returning several fields every time you query on the key column, adding them to the index will improve performance as they are stored with it, aka a covering index.
精彩评论