What should be the order of filters in query to optimize queries to use Indexing
I have few doubts. I have heard from some of my colleagues that, if we have created composite Nonclustured index on following table order of all the filters should be in order of Indexed column then Filter.
开发者_运维百科MyTable (T1, T2, T3, T4, T5)
Non Clustured index(T1, T2) in the order T1 then T2
Questions
Which of the queries works faster ?
Does order of Indexed column affects performance ?
Should Indexed Columns come first to optimize queries ?
What is the order of processing a query. Which filter is taken first while processing a query ? Does it starts from Last filter and goes on to first filter ?
Query 1
Select * from MyTable WHERE T1=1 AND T2=2 AND T3=1
--(Indexing will be used) Fastest as T3 has been included after indexed columns
Query 2
Select * from MyTable WHERE T2=1 AND T1=2 AND T3=1
--(No Indexing will be used)
Query 3
Select * from MyTable WHERE T3=1 AND T1=1 AND T2=2
--(Indexing will be used) slower than Query 1 as indexed columns included afterwards
Query 4
Select * from MyTable WHERE T3=1 AND T2=1 AND T1=2
--(No Indexing will be used) slower than Query 2 as indexed columns occurs after non indexed column condition
EDIT :
Query 5
Select * from MyTable WHERE T3=1 AND T2=1
Query 6
Select * from MyTable WHERE T3=1 AND T1=1
Query 7
Select * from MyTable WHERE T3=1 AND T2=1 OR T1=2
Query 8
What does the following order in SSMS index creation means i.e. which filter should come first ?
When I tried to shuffle the order of columns shown in index columns. I saw Bookmark lookups were increasing and Clustured index scan was being used. But when I uses the same order as shown in image. Then bookmark lookup got removed and also got Index Seek. So I can see order of Columns in index is playing some vital role here but could not sense how.
The order of the predicates in the query makes no difference!
The usual analogy when considering column order in composite indexes is that of a phone book. This is ordered by (surname, firstname). This makes look ups by surname straightforward but doesn't help you looking up numbers by forename.
If you have an index on (T1, T2) it can be used to efficiently answer
Select * from MyTable WHERE T1=2
Or
Select * from MyTable WHERE T2=1 AND T1=2
But not to efficiently answer
Select * from MyTable WHERE T2=1
(That is not to say this index is always going to be entirely useless for this query. If the index is narrow compared to the size of the table and the statistics indicate that few matching rows will be returned the query optimiser might still opt to scan this index and do key lookups in preference to a clustered index scan that might need to cover a lot more pages.)
Example Script as per comments
CREATE TABLE MyTable (
ID INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,
T1 INT NOT NULL,
T2 INT NOT NULL,
Filler CHAR(8000) NOT NULL /*Just to make sure that the table is very wide!*/);
CREATE NONCLUSTERED INDEX IX ON MyTable (T1,T2);
WITH E00(N) AS (SELECT 1 UNION ALL SELECT 1),
E02(N) AS (SELECT 1 FROM E00 a, E00 b),
E04(N) AS (SELECT 1 FROM E02 a, E02 b),
E08(N) AS (SELECT 1 FROM E04 a, E04 b),
E16(N) AS (SELECT 1 FROM E08 a, E08 b),
E32(N) AS (SELECT 1 FROM E16 a, E16 b),
cteTally(N) AS (SELECT ROW_NUMBER() OVER (ORDER BY N) FROM E32)
INSERT INTO MyTable
SELECT N, N-1,''
FROM cteTally
WHERE N <= 1000;
SET STATISTICS IO ON
Select * from MyTable WHERE T2=1
/*
Table 'MyTable'. Scan count 1, logical reads 9, physical reads 0,
read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
*/
/*Force a clustered index scan just to compare...*/
Select * from MyTable WITH( INDEX (1) ) WHERE T2=1
/*
Table 'MyTable'. Scan count 1, logical reads 1005, physical reads 0,
read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.*/
Q8 (not numbered) What does the following order in SSMS index creation means i.e. which filter should come first ?
The order of index creation has nothing to do with which index will be used. The indexes, once created, just sit alongside the table data in index pages ready for use whenever it suits the query, whichever suits best rather than which was created first (chronologically).
Q 1. Which of the queries works faster ?
The order of the OR clauses in the query does not matter
. SQL Server will inspect all of them and use whatever index it can so all your first 4 queries will use exactly the same plan, therefore the same performance
.
Q 2. Does order of Indexed column affects performance ?
Not in the query, but it does if you are creating a composite index. Index on (A,B) will be good for queries that involve A, cannot (normally) be used for queries that involve B only. Index on (B,A) first groups by B, then by A in the index pages, so a query Where A=1 cannot use it. There are some special cases such as when the query only needs A and B in either SELECT/JOIN/WHERE clauses, so it may still use the index on (B,A) because it is easier to collect than the clustered key.
Q 3. Should Indexed Columns come first to optimize queries ?
Same as answer to 1.
Q 4. What is the order of processing a query. Which filter is taken first while processing a query ? Does it starts from Last filter and goes on to first filter ?
Same as answer to 1.
For EDITed questions 5-7
Q7 is the only one that is different. The others involve a straight AND, so the condition is agnostic to the order of the fields. The last one actually has two groups of conditions (A and B) or (C), so an index on both (A,B) and/or index on (C) will be used, if they are selective enough (will filter the data to a sufficiently smaller subset). If both indexes exist, two result streams may be collected and compared (hash/merge) to result in the final output.
Besides all other points made, if you always use SELECT * FROM MyTable
in your queries, more often than not SQL Server's query optimizer will opt to ignore any indices at all and do a table scan - even if there is an index.
Why?
If you ask SQL Server to return all columns in your table, it will eventually have to read the actual data page for each row anyway. Finding rows in the index, and then doing a bookmark lookup into the actual data pages, is a fairly expensive operation - so unless you are very selective with your WHERE clause (selecting a few percent only of the data), then most likely, SQL Server will do a full table scan to get all the data needed.
If you can, always use SELECT (list of columns) FROM Mytable
and keep that list of columns as small as possible. In that case, you might be able to have an index that has most of those columns already in it, and then you could add those additional two or three columns as included columns into your index, in order to make it a covering index. This would allow SQL Server to return the values you ask for by simply scanning the non-clustered index - instead of having to do either a lot of expensive bookmark lookup, or do a full table scan.
精彩评论