Maximizing performance of SQL indexes for VARCHAR columns with homogeneous prefixes

2023-01-25 07:39

I'm designing a DB2 table, one VARCHAR column of which will store an alpha-numeric product identifier. The first few characters of these IDs have very little variation. The column will be indexed, and I'm concerned that performance may suffer because of the common prefixes.

As far as I can tell, DB2 does not use hash codes for selecting VARCHARs. (At least basic DB2, I don't know about any extensions.)

If this turns out to be a problem, I can think of three obvious solutions.

1. Create an extra hash code column.
2. Store the text backward, to ensure good distribution of initial characters.
3. Break the product IDs into two columns, one containing a long enough prefix to produce better distribution in the remainder.

Each of these would be a hack, of course.

Solution #2 would provide the best key distribution. The backwards text could be stored in a separate column, or I could reverse the string after reading. Each approach involves overhead, which I would want to profile and compare.

With solution #3, the key distribution still would be non-optimal, and I'd need to concatenate the text after reading, or use 3 columns for the data.

If I leave my product IDs as-is, is my index likely to perform poorly? If so, what is the best method to optimize the performance?

I'm a SQL DBA, not DB2, but I wouldn't think that having common prefixes would hurt you at all, indexing-wise.

The index pages simply store a "from" and "to" range of key values with pointers to the actual pages. The fact that an index page happens to store FrobBar001291 to FrobBar009281 shouldn't matter in the slightest to the db engine.
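
To make that concrete, here is a rough Python sketch (not DB2 code, and a deliberate simplification: a flat sorted list searched by binary search stands in for the index's B-tree pages). The IDs and the `probes_to_find` helper are made up for illustration. It counts how many keys have to be examined to locate one ID, once for keys that share a prefix and once for keys that don't.

```python
def probes_to_find(sorted_keys, target):
    """Binary-search sorted_keys for target.

    Returns (index, probes): index is the position of target, or -1 if it
    is absent; probes is how many keys were compared against target.
    """
    lo, hi = 0, len(sorted_keys)
    probes = 0
    while lo < hi:
        mid = (lo + hi) // 2
        probes += 1
        if sorted_keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    found = lo < len(sorted_keys) and sorted_keys[lo] == target
    return (lo if found else -1), probes


# Hypothetical product IDs. Zero-padding keeps string order equal to numeric order.
shared = [f"FrobBar{i:06d}" for i in range(100_000)]          # common prefix
varied = sorted(f"{i:06d}FrobBar" for i in range(100_000))    # varied first characters

idx_shared, probes_shared = probes_to_find(shared, "FrobBar009281")
idx_varied, probes_varied = probes_to_find(varied, "009281FrobBar")

print(probes_shared, probes_varied)
```

Both lookups examine the same number of keys, because the count depends only on how many keys there are and where the target sorts among them. A shared prefix makes each individual string comparison scan a few more characters before it finds a difference, but it does not add probes.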

In fact, having these common prefixes allows other queries to take advantage of the index, such as:

SELECT * FROM Products WHERE ProductID LIKE 'FrobBar%'
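
The reason a prefix `LIKE` can use the index is that every key starting with the prefix sits in one contiguous run of the sorted order, so the match can be answered as a range scan. A small Python sketch of the idea, with a sorted list standing in for the index; `prefix_range` is a hypothetical helper, and it assumes a non-empty prefix and plain code-point ordering, which a real database collation may not match.

```python
from bisect import bisect_left


def prefix_range(sorted_keys, prefix):
    """Return all keys that start with prefix, using two binary searches."""
    # First key that is >= prefix.
    lo = bisect_left(sorted_keys, prefix)
    # Smallest string that sorts after every string starting with prefix:
    # bump the prefix's last character by one code point
    # (assumes that character is not already the maximum code point).
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    hi = bisect_left(sorted_keys, upper)
    return sorted_keys[lo:hi]


# Hypothetical index contents.
keys = sorted([
    "Acme000001",
    "FrobBar001291",
    "FrobBar009281",
    "FrobBaz000001",
    "Widget000001",
])

matches = prefix_range(keys, "FrobBar")  # the two FrobBar... IDs
```

In SQL terms this corresponds to rewriting the prefix match as `ProductID >= 'FrobBar' AND ProductID < 'FrobBas'`, which is only equivalent when the column's collation orders strings the same way.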

I agree with BradC: I don't think this is a problem at all, and even if there were some small benefit to the alternatives you suggest, I imagine the overhead and complexity would outweigh it.

If you're looking to understand and improve index performance, there are a number of topics in the Info Center that you should consider (the last two in particular seem relevant): http://publib.boulder.ibm.com/infocenter/db2luw/v9r7/nav/2_3_2_4_1

Index structure
Index cleanup and maintenance
Asynchronous index cleanup
Asynchronous index cleanup for MDC tables
Online index defragmentation
Using relational indexes to improve performance
Relational index planning tips
Relational index performance tips
