Which non-clustered index should I use?
Here I am studying nonclustered indexes on SQL Server Management Studio.
I've created a table with more than 1 million records. This table has a primary key.
CREATE TABLE [dbo].[Customers](
[CustomerId] [int] IDENTITY(1,1) NOT NULL,
[CustomerName] [varchar](100) NOT NULL,
[Deleted] [bit] NOT NULL,
[Active] [bit] NOT NULL,
CONSTRAINT [PK_Customers] PRIMARY KEY CLUSTERED
(
[CustomerId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
This is the query I'll be using to see what execution plan is showing:
SELECT CustomerName FROM Customers
Well, executing this command with no additional non-clustered index, it leads the execution plan to show me:
I/O cost = 3.45646
Operator cost = 4.57715
开发者_高级运维
Now I'm trying to see if it's possible to improve performance, so I've created a non-clustered index for this table:
1) First non-clustered index
CREATE NONCLUSTERED INDEX [IX_CustomerID_CustomerName] ON [dbo].[Customers]
(
[CustomerId] ASC,
[CustomerName] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF,
IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON,
ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
Executing again the select against Customers table, the execution plan shows me:
I/O cost = 2.79942
Operator cost = 3.92001
It seems better. Now I've deleted this just created non-clustered index, in order to create a new one:
2) First non-clustered index
CREATE NONCLUSTERED INDEX [IX_CustomerIDIncludeCustomerName] ON [dbo].[Customers]
(
[CustomerId] ASC
)
INCLUDE ( [CustomerName]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF,
ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
With this new non-clustered index, I've executed the select statement again and the execution plan shows me the same result:
I/O cost = 2.79942
Operator cost = 3.92001
So, which non-clustered index should I use? Why the costs are the same on execution plan for I/O and Operator? Am I doing something wrong or this is expected?
thank you
This is because of "CustomerName" being INCLUDE
-ed in the second index (see this about INLCUDEd columns).
Basically, BOTH indexes work the same exact way for you - they are covered indexes with the first index column NOT matching the WHERE clause.
Meaning that the query in both cases will be scanning the index but NOT touching the table.
The index that I'd expect ti be better performing for that specific query would be an index on CustomerName alone.
You're not going to notice much difference with or without indexes until you make use of the index by filtering the results using WHERE, ordering the results with ORDER, or joining the results to another table on the indexed column.
Try doing a query like this without an index:
SELECT *
FROM Customers
WHERE CustomerName = 'Marcus Adams'
Then add an index on the CustomerName column and try again.
You'll also need enough rows in the table that the database system will actually use the index, and enough that you'll notice the difference between scanning the rows and using the index.
Neither of your two nonclustered indices makes a lot of sense, really.
The point is this: the column(s) of the clustered index - in your case CustomerId
- is already included in every single entry of every single non-clustered index you have. That clustering column after all is what is used for the actual data lookup, if an entry is found. So adding that to a non-clustered index typically is superfluous and just a waste of space.
The question is more: how do you select those rows that you want to have displayed? What columns will show up in the WHERE
clause?
If you find a pattern here (e.g. you always select by e.g. City
), then the non-clustered index to suit your needs would be
CREATE NONCLUSTERED INDEX [IX_Customer_City] ON [dbo].[Customers]
(
[City] ASC
)
INCLUDE ( [CustomerName])
That way, you give SQL Server a method to easily find rows that match a given city, and including the column that you want to have returned (CustomerName
) allows SQL Server to get the necessary information directly from the index page (making it a so-called covering index
- it covers your query, e.g. returns all the information needed) - you won't need to do a "bookmark lookup", e.g. fetch the entire Customer
data row from the actual data pages (finding it via the CustomerId
which is in the non-clustered index, too, since it's the clustering key).
精彩评论