开发者

DB2 Multiple Index on same column combinations

Lets say I've a table T with columns C1, C2 ... C10.

C1, C2 and C3 are most frequently refered columns in multiple queries. The order in which these columns are refered can't be controlled.

To improve query performance do I need to create multiple indexes such as (C1, C2, C3), (开发者_StackOverflowC1, C3, C2), (C2, C1, C3), (C2, C3, C1) etc.?

With example of two similar queries...

select * from t1, t2 where t1.c1 = t2.c1 and t1.c2 = t2.c2 and t1.c3 = t3.c3 and

select * from t1, t2 where t1.c2 = t2.c2 and t1.c1 = t2.c1 and t1.c3 = t3.c3

Do I need two indexes (C1, C2, C3) and (C2, C1, C3) for improving performance of above two different queries or just any one combination is enough for both the queries?

I guess it is not required and assume DB2 will be smart enough to rearrange columns in the query to match any one of the index combination as long as C1, C2 and C3 all are used.

Can anyone confirm and point me some articles related to that.

DB2 version used is 9.5 in AIX


Database tuning is not a set-and-forget operation, any more than is investing in Bernie Madhoff's hedge funds :-)

Yous should provide a moderately intelligent first guess (based on expected behavior) and then monitor, in production, the queries that are executed.

You should, at a minimum, start with indexes on the individual columns. Then, if someone complains that their queries are taking too long, have DB2 analyse the query and see where the bottleneck is (explain plan or Visual Explain). At that point, you should figure out whether the cost of adding the index is outweighed by the benefit.

That's the only way you can be sure that your database is tuned correctly.

Your assumption that DB2 can re-order thing to use other indexes depends entirely on the query - all but the simplest of queries will invalidate your assumption.

For example, let's say you have the index (c1,c2). The query

select * from t where c2 = '2009-01-01';

will not use that index since all the c2 keyparts are scattered around the index, not in one area.

It may be inteeligent enough to optimize

select * from t where c2 = '2009-01-01' and c1 = 'x';

since it can then just look up x2009-01-01 in the index. However,

select * from t where c2 <= '2009-01-01' and c2 <= '2009-01-05' and c1 = 'x';

suffers the same problem as the first query - the c2 keyparts are not contiguous in the index.

That's why you should measure, not guess.


In DB2 for Linux/UNIX/Windows, the order of the predicates and join criteria in a query does not matter from the optimizer's perspective. The query rewrite portion of the optimizer will account for this.

Depending on the query, the optimizer may select the index, whether it is on (C3,C2,C1) or (C1,C2,C3). However, that's not to say that the column order in the index doesn't matter -- it does. Generally you would want to put columns in the order of highest cardinality to lowest cardinality, but of course there are lots of exceptions. There are entire books written on proper index design (the book by Lahdenmaki / Leach is very good), so I would recommend that you look into these.


I am not very sure about DB2 but in oracle wat happens is when we create an index say

Create INDEX I1 on T1 (c1);

then it will create an index on column(c1) and also sort it.

Incase of a composite index as in your case :

Create INDEX I2 on T2 (c1, c2)

It will store the data in sorted format. So keeping that in mind i guess it will be faster to query the first column in the above index like below:

select c1 , c2 from T2

Please correct me if i am wrong in any part

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜