Why am I getting an index scan for a covered query using aggregate function?
I have a query:
select min(timestamp) from table
This table has 60+million rows, and daily I delete a few off the end. To determine whether or not there is any data old enough do delete I run the query above. There is an index on timestamp ascending, containing only one column, and the query plan in oracle causes this to be a full inde开发者_开发问答x scan. Should this not be the definition of a seek?
edit including plan:
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
| 2 | INDEX FULL SCAN (MIN/MAX)| NEVENTS_I2 | 1 | 8 | 4 (100)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 8 | | |
| 0 | SELECT STATEMENT | | 1 | 8 | 4 (0)| 00:00:01 |
Can you post the actual query plan? Are you sure that it is not doing a min/max index full scan? As you can see in this example, we're getting the MIN value from a 100,000 row table using a min/max index full scan with only a handful of consistent gets.
SQL> create table foo (
2 col1 date not null
3 );
Table created.
SQL> insert into foo
2 select sysdate + level
3 from dual
4 connect by level <= 100000;
100000 rows created.
SQL> create index idx_foo_col1
2 on foo( col1 );
Index created.
SQL> analyze table foo compute statistics for all indexed columns;
Table analyzed.
SQL> set autotrace on;
<<Note that I ran this statement once just to get the delayed block cleanout to
happen so that the consistent gets number wouldn't be skewed. You could run a
different query as well>>
1* select min(col1) from foo
SQL> /
MIN(COL1)
---------
02-FEB-11
Execution Plan
----------------------------------------------------------
Plan hash value: 817909383
--------------------------------------------------------------------------------
-----------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
Time |
--------------------------------------------------------------------------------
-----------
| 0 | SELECT STATEMENT | | 1 | 7 | 2 (0)|
00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 7 | |
|
| 2 | INDEX FULL SCAN (MIN/MAX)| IDX_FOO_COL1 | 1 | 7 | 2 (0)|
00:00:01 |
--------------------------------------------------------------------------------
-----------
Note
-----
- dynamic sampling used for this statement (level=2)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
2 consistent gets
0 physical reads
0 redo size
532 bytes sent via SQL*Net to client
524 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
At first I thought that the index would only be used if the column is declared NOT NULL. I tested with the following setup:
SQL> CREATE TABLE my_table (ts TIMESTAMP);
Table created
SQL> INSERT INTO my_table
2 SELECT systimestamp + ROWNUM * INTERVAL '1' SECOND
3 FROM dual CONNECT BY LEVEL <= 100000;
100000 rows inserted
SQL> CREATE INDEX ix ON my_table(ts);
Index created
SQL> EXPLAIN PLAN FOR SELECT MIN(ts) FROM my_table;
Explained
SQL> SELECT * FROM TABLE(dbms_xplan.display);
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 | 69 (2)| 00:00:0
| 1 | SORT AGGREGATE | | 1 | 13 | |
| 2 | INDEX FULL SCAN (MIN/MAX)| IX | 90958 | 1154K| |
--------------------------------------------------------------------------------
Here we notice that the index is used, but all rows from the index are read. If we specify that the column is not null we get a much better plan:
SQL> ALTER TABLE my_table MODIFY ts NOT NULL;
Table altered
SQL> EXPLAIN PLAN FOR SELECT MIN(ts) FROM my_table;
Explained
SQL> SELECT * FROM TABLE(dbms_xplan.display);
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 | 2 (0)| 00:00:0
| 1 | SORT AGGREGATE | | 1 | 13 | |
| 2 | INDEX FULL SCAN (MIN/MAX)| IX | 90958 | 1154K| 2 (0)| 00:00:0
--------------------------------------------------------------------------------
In fact this is the same plan that is also used if we add a WHERE clause (Oracle will read a single row from the index):
SQL> EXPLAIN PLAN FOR SELECT MIN(ts) FROM my_table WHERE ts IS NOT NULL;
Explained
SQL> SELECT * FROM TABLE(dbms_xplan.display);
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 | 2 (0)| 00:00:
| 1 | SORT AGGREGATE | | 1 | 13 | |
| 2 | FIRST ROW | | 90958 | 1154K| 2 (0)| 00:00:
| 3 | INDEX FULL SCAN (MIN/MAX)| IX | 90958 | 1154K| 2 (0)| 00:00:
--------------------------------------------------------------------------------
This last plan shows (line 2) that Oracle is indeed performing a "seek".
Just wanted to hone in on the fact that an "INDEX FULL SCAN (MIN/MAX)" is simply not the same as an "INDEX FULL SCAN". An INDEX FULL SCAN really does scan the entire index (possibly with filtering). However an INDEX FULL SCAN (MIN/MAX) or INDEX RANGE SCAN (MIN/MAX) only gets the smallest or largest leaf block (from the range), but can only be employed as long as the column is NOT NULL (which is a bit silly, and really a bug, since a NULL value is by definition neither the smallest nor largest value). The (MIN/MAX) optimization is an implicit FIRST_ROWS action, and doesn't need the "WHERE ... IS NOT NULL" query condition to perform the optimization. Interestingly the MIN/MAX optimization is normally not considered by the CBO for function-based indexes, that's another little bug.
精彩评论